# Directory Structure

```
├── .env.example
├── .gitignore
├── chat_handler.py
├── models.py
├── prompts.py
├── pyproject.toml
├── README.md
├── requirements.txt
├── server.py
├── setup.sh
└── storage.py
```

# Files

--------------------------------------------------------------------------------
/.env.example:
--------------------------------------------------------------------------------

```
# μ-MCP Configuration
# Copy this file to .env and configure your settings

# OpenRouter API Key (required)
# Get yours at: https://openrouter.ai/keys
OPENROUTER_API_KEY=your_key_here

# Allowed Models (optional)
# Comma-separated list of models to enable
# Leave empty to allow all models
# Examples: gpt-5,gemini-2.5-pro,o3,deepseek-r1
# Full list: gpt-5, gpt-5-mini, gpt-4o, o3, o3-mini, o3-mini-high,
#           o4-mini, o4-mini-high, sonnet, opus, haiku,
#           gemini-2.5-pro, gemini-2.5-flash,
#           deepseek-chat, deepseek-r1, grok-4, grok-code-fast-1,
#           qwen3-max
OPENROUTER_ALLOWED_MODELS=

# Logging Level (optional)
# Options: DEBUG, INFO, WARNING, ERROR
# Default: INFO
LOG_LEVEL=INFO
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# Virtual Environments
venv/
ENV/
env/
.venv/
env.bak/
venv.bak/

# IDE
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store

# Testing
.coverage
.pytest_cache/
.mypy_cache/
.dmypy.json
dmypy.json
htmlcov/
.tox/
.nox/
coverage.xml
*.cover
*.log

# Environment variables
.env
.env.local
.env.*.local

# Database
*.db
*.sqlite
*.sqlite3

# Package managers
uv.lock
poetry.lock
Pipfile.lock

# Serena cache
.serena/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# Celery
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
# μ-MCP Server

**μ** (mu) = micro, minimal - in sardonic contrast to zen-mcp's 10,000+ lines of orchestration.

A pure MCP server that does one thing well: enable chat with AI models via OpenRouter.

## Philosophy

Following UNIX principles:
- **Do one thing well**: Provide access to AI models
- **No hardcoded control flow**: The AI agents decide everything
- **Minimal interface**: One tool, clean parameters
- **Persistent state**: Conversations persist across sessions
- **Model agnostic**: Support any OpenRouter model

## Features

- ✅ **Multi-model conversations** - Switch models mid-conversation
- ✅ **Persistent storage** - Conversations saved to disk
- ✅ **Model registry** - Curated models with capabilities
- ✅ **LLM-driven model selection** - Calling agent picks the best model
- ✅ **Reasoning effort control** - Simple pass-through to OpenRouter (low/medium/high)
- ✅ **MCP prompts** - Slash commands `/mu:chat`, `/mu:continue`, `/mu:challenge`, and `/mu:discuss`
- ✅ **Token-based budgeting** - Smart file truncation
- ✅ **Proper MIME types** - Correct image format handling

## What's NOT Included

- ❌ Workflow orchestration (let AI decide)
- ❌ Step tracking (unnecessary complexity)
- ❌ Confidence levels (trust the models)
- ❌ Expert validation (models are the experts)
- ❌ Hardcoded procedures (pure AI agency)
- ❌ Web search implementation (just ask Claude)
- ❌ Multiple providers (OpenRouter handles all)

## Setup

### Quick Install (with uv)

1. **Get OpenRouter API key**: https://openrouter.ai/keys

2. **Run setup script**:
   ```bash
   ./setup.sh
   ```
   
   This will:
   - Install `uv` if not present (blazing fast Python package manager)
   - Install dependencies
   - Show you the Claude Desktop config

3. **Add to Claude Desktop config** (`~/.config/claude/claude_desktop_config.json`):
   ```json
   {
     "mcpServers": {
       "mu-mcp": {
         "command": "uv",
         "args": ["--directory", "/path/to/mu-mcp", "run", "python", "/path/to/mu-mcp/server.py"],
         "env": {
           "OPENROUTER_API_KEY": "your-key-here"
         }
       }
     }
   }
   ```

4. **Restart Claude Desktop**

### Manual Install (traditional)

If you prefer pip/venv:
```bash
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

Then use this Claude Desktop config:
```json
{
  "mcpServers": {
    "mu-mcp": {
      "command": "/path/to/mu-mcp/venv/bin/python",
      "args": ["/path/to/mu-mcp/server.py"],
      "env": {
        "OPENROUTER_API_KEY": "your-key-here"
      }
    }
  }
}
```

## Usage

### Basic Chat
```
/mu:chat
Then specify model and prompt: "Use gpt-5 to explain quantum computing"
```

### Continue Conversations
```
/mu:continue
Claude sees your recent conversations and can intelligently continue them
Preserves full context even after Claude's memory is compacted
```

**Key Use Case**: Maintain context between different Claude sessions. When Claude's context gets compacted or you need to switch between tasks, `/mu:continue` allows Claude to see all your recent conversations (with titles, timestamps, and models used) and seamlessly resume where you left off. The agent intelligently selects or asks which conversation to continue based on your needs.
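
As an illustration, the conversation list surfaced by `/mu:continue` looks roughly like this (titles and IDs below are made up):

```
Recent Conversations (newest first):
1. [2 hours ago] Debugging async storage race
   Model: gpt-5 | ID: 3f2a9c1e-...
2. [1 day ago] API design review for chat handler
   Model: sonnet | ID: 7b41d0aa-...

To continue a conversation, use the chat tool with the desired continuation_id.
```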

### Challenge Mode
```
/mu:challenge
Encourages critical thinking and avoids reflexive agreement
```

### Multi-AI Discussion
```
/mu:discuss
Orchestrate multi-turn discussions among diverse AI models
```

### Model Selection
```
Chat with GPT-5 about code optimization
Chat with O3 Mini High for complex reasoning
Chat with DeepSeek R1 for systematic analysis  
Chat with Claude about API design
```

### Reasoning Effort Control
```
Chat with o3-mini using high reasoning effort for complex problems
Chat with gpt-5 using low reasoning effort for quick responses
Chat with o4-mini-high using medium reasoning effort for balanced analysis
```

Note: Reasoning effort is automatically ignored by models that don't support it.
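
Under the hood this is a straight pass-through: `chat_handler.py` adds a `reasoning` object to the OpenRouter request body. A trimmed sketch of the payload (field values are illustrative):

```json
{
  "model": "openai/o3-mini",
  "messages": [{"role": "user", "content": "..."}],
  "reasoning": {"effort": "high"}
}
```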

### With Files and Images
```
Review this code: /path/to/file.py
Analyze this diagram: /path/to/image.png
```

### Model Selection by LLM
The calling LLM agent (Claude) sees all available models with their descriptions and capabilities, allowing intelligent selection based on:
- Task requirements and complexity
- Performance vs cost trade-offs  
- Specific model strengths
- Context window needs
- Image support requirements
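
For reference, a single chat tool call from the agent reduces to a small set of arguments (values here are illustrative; provide exactly one of `title` or `continuation_id`):

```json
{
  "prompt": "Review the error handling in this module",
  "model": "gpt-5",
  "title": "Error handling review",
  "files": ["/absolute/path/to/module.py"],
  "reasoning_effort": "medium"
}
```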

## Architecture

```
server.py         # MCP server with prompt handlers
chat_handler.py   # Chat logic with multi-model support
models.py         # Model registry and capabilities
prompts.py        # System prompts for peer AI collaboration
storage.py        # Persistent conversation storage
.env.example      # Configuration template
```
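
The chat tool's result is a small JSON payload serialized by `server.py`; roughly (values are illustrative):

```json
{
  "content": "...peer AI response with wrapper...",
  "continuation_id": "a1b2c3d4-0000-0000-0000-000000000000",
  "model_used": "gpt-5"
}
```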

## Configuration

### Environment Variables

- `OPENROUTER_API_KEY` - Your OpenRouter API key (required)
- `OPENROUTER_ALLOWED_MODELS` - Comma-separated list of allowed models (optional)
- `LOG_LEVEL` - Logging verbosity (DEBUG, INFO, WARNING, ERROR)
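
For example, a `.env` restricting the agent to a few models might look like:

```
OPENROUTER_API_KEY=your_key_here
OPENROUTER_ALLOWED_MODELS=gpt-5,sonnet,deepseek-r1
LOG_LEVEL=INFO
```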

## Why μ-MCP?

### The Problem with zen-mcp-server

zen-mcp grew to **10,000+ lines** trying to control AI behavior:
- 15+ specialized tools with overlapping functionality
- Complex workflow orchestration that limits AI agency
- Hardcoded decision trees that prescribe solutions
- "Step tracking" and "confidence levels" that add noise
- Redundant schema fields and validation layers

### The μ-MCP Approach

**Less code, more capability**:
- Single tool that does one thing perfectly
- AI agents make all decisions
- Clean, persistent conversation state
- Model capabilities, not hardcoded behaviors
- Trust in AI intelligence over procedural control

### Philosophical Difference

- **zen-mcp**: "Let me orchestrate 12 steps for you to debug this code"
- **μ-mcp**: "Here's the model catalog. Pick what you need."

The best tool is the one that gets out of the way.

## Related Projects

- [zen-mcp-server](https://github.com/winnative/zen-mcp-server) - The bloated alternative we're reacting against

## License

MIT
```

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------

```
mcp>=1.0.0
aiohttp>=3.9.0
python-dotenv>=1.0.0
```

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[project]
name = "mu-mcp"
version = "2.0.0"
description = "Minimal MCP server for AI model interactions via OpenRouter"
requires-python = ">=3.10"
dependencies = [
    "mcp>=1.0.0",
    "aiohttp>=3.9.0",
    "python-dotenv>=1.0.0",
]

[tool.uv]
dev-dependencies = []

[project.scripts]
mu-mcp = "server:main"
```

--------------------------------------------------------------------------------
/setup.sh:
--------------------------------------------------------------------------------

```bash
#!/bin/bash

echo "🚀 μ-MCP Server Setup (with uv)"
echo "=============================="

# Check if uv is installed
if ! command -v uv &> /dev/null; then
    echo "📦 Installing uv..."
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
    # Add to PATH for current session
    export PATH="$HOME/.cargo/bin:$PATH"
    
    echo "✅ uv installed successfully"
else
    echo "✅ uv is already installed"
fi

# Install dependencies using uv
echo ""
echo "📥 Installing dependencies..."
uv pip sync requirements.txt

# Check for API key
if [ -z "$OPENROUTER_API_KEY" ]; then
    echo ""
    echo "⚠️  OPENROUTER_API_KEY not set!"
    echo "Please add to your shell profile:"
    echo "export OPENROUTER_API_KEY='your-api-key'"
    echo ""
    echo "Get your API key at: https://openrouter.ai/keys"
else
    echo "✅ OpenRouter API key found"
fi

# Create MCP config
echo ""
echo "📝 Add to Claude Desktop config (~/.config/claude/claude_desktop_config.json):"
echo ""
cat << EOF
{
  "mcpServers": {
    "mu-mcp": {
      "command": "uv",
      "args": ["--directory", "$(pwd)", "run", "python", "$(pwd)/server.py"],
      "env": {
        "OPENROUTER_API_KEY": "\$OPENROUTER_API_KEY"
      }
    }
  }
}
EOF

echo ""
echo "✅ Setup complete! Restart Claude Desktop to use μ-MCP server."
echo ""
echo "To test the server manually, run:"
echo "  uv run python server.py"
```

--------------------------------------------------------------------------------
/prompts.py:
--------------------------------------------------------------------------------

```python
"""System prompts for μ-MCP."""


def get_llm_system_prompt(model_name: str | None = None) -> str:
    """
    System prompt for the LLM being called.
    Modern, direct, without childish "you are" patterns.
    """
    return """Collaborate as a technical peer with Claude, the AI agent requesting assistance.

Core principles:
- Provide expert analysis and alternative perspectives
- Challenge assumptions constructively when warranted
- Share implementation details and edge cases
- Acknowledge uncertainty rather than guessing

When additional context would strengthen your response:
- Request Claude perform web searches for current documentation
- Ask Claude to provide specific files or code sections

Format code with proper syntax highlighting.
Maintain technical precision over conversational comfort.
Skip unnecessary preambles - dive directly into substance."""


def get_request_wrapper() -> str:
    """
    Wrapper text to inform the peer AI that this request is from Claude.
    """
    return """

---

REQUEST FROM CLAUDE: The following query comes from Claude, an AI assistant seeking peer collaboration."""


def get_response_wrapper(model_name: str) -> str:
    """
    Wrapper text for Claude to understand this is another AI's perspective.
    
    Args:
        model_name: Short model name (e.g., "gpt-5", "sonnet")
    """
    # Format short name for display (e.g., "gpt-5" -> "GPT 5")
    display_name = model_name.upper().replace("-", " ")
    return f"""

---

PEER AI RESPONSE ({display_name}): Evaluate this perspective critically and integrate valuable insights."""


def get_agent_tool_description() -> str:
    """
    Description for the calling agent (Claude) about how to use this tool.
    """
    return """Direct access to state-of-the-art AI models via OpenRouter.

Provide EXACTLY ONE:
- title: Start fresh (when switching topics, context too long, or isolating model contexts)
- continuation_id: Continue existing conversation (preserves full context)

When starting fresh: Model has no context - include background details or attach files
When continuing: Model has conversation history - don't repeat context

FILE ATTACHMENT BEST PRACTICES:
- Proactively attach relevant files when starting new conversations for context
- For long content (git diffs, logs, terminal output), save to a file and attach it rather than pasting verbatim in prompt
- Files are processed more efficiently and precisely than inline text"""
```

--------------------------------------------------------------------------------
/models.py:
--------------------------------------------------------------------------------

```python
"""OpenRouter model registry and capabilities."""

import os
from dataclasses import dataclass
from typing import Optional

# Load environment variables from .env file
from dotenv import load_dotenv
load_dotenv()


@dataclass
class ModelCapabilities:
    """Model metadata for routing and selection."""
    
    name: str  # Full OpenRouter model path
    description: str  # What the model is best for


# OpenRouter model registry - popular models with good support
OPENROUTER_MODELS = {
    # OpenAI Models
    "gpt-5": ModelCapabilities(
        name="openai/gpt-5",
        description="Most advanced OpenAI model with extended context. Excels at complex reasoning, coding, and multimodal understanding",
    ),
    "gpt-5-mini": ModelCapabilities(
        name="openai/gpt-5-mini",
        description="Efficient GPT-5 variant. Balances performance and cost for general-purpose tasks",
    ),
    "gpt-4o": ModelCapabilities(
        name="openai/gpt-4o",
        description="Multimodal model supporting text, image, audio, and video. Strong at creative writing and following complex instructions",
    ),
    "o3": ModelCapabilities(
        name="openai/o3",
        description="Advanced reasoning model with tool integration and visual reasoning. Excels at mathematical proofs and complex problem-solving",
    ),
    "o3-mini": ModelCapabilities(
        name="openai/o3-mini",
        description="Production-ready small reasoning model with function calling and structured outputs. Good for systematic problem-solving",
    ),
    "o3-mini-high": ModelCapabilities(
        name="openai/o3-mini-high",
        description="Enhanced O3 Mini with deeper reasoning, better accuracy vs standard O3 Mini",
    ),
    "o4-mini": ModelCapabilities(
        name="openai/o4-mini",
        description="Fast reasoning model optimized for speed. Exceptional at math, coding, and visual tasks with tool support",
    ),
    "o4-mini-high": ModelCapabilities(
        name="openai/o4-mini-high",
        description="Premium O4 Mini variant with enhanced reasoning depth and accuracy",
    ),
    
    # Anthropic Models
    "sonnet": ModelCapabilities(
        name="anthropic/claude-sonnet-4",
        description="Industry-leading coding model with superior instruction following. Excellent for software development and technical writing",
    ),
    "opus": ModelCapabilities(
        name="anthropic/claude-opus-4.1",
        description="Most capable Claude model for sustained complex work. Strongest at deep analysis and long-running tasks",
    ),
    "haiku": ModelCapabilities(
        name="anthropic/claude-3.5-haiku",
        description="Fast, efficient model matching previous flagship performance. Great for high-volume, quick-response scenarios",
    ),
    
    # Google Models
    "gemini-2.5-pro": ModelCapabilities(
        name="google/gemini-2.5-pro",
        description="Massive context window with thinking mode. Best for analyzing huge datasets, codebases, and STEM reasoning",
    ),
    "gemini-2.5-flash": ModelCapabilities(
        name="google/gemini-2.5-flash",
        description="Best price-performance with thinking capabilities. Ideal for high-volume tasks with multimodal and multilingual support",
    ),
    
    # DeepSeek Models
    "deepseek-chat": ModelCapabilities(
        name="deepseek/deepseek-chat-v3.1",
        description="Hybrid model switching between reasoning and direct modes. Strong multilingual support and code completion",
    ),
    "deepseek-r1": ModelCapabilities(
        name="deepseek/deepseek-r1",
        description="Open-source reasoning model with exceptional math capabilities. Highly cost-effective for complex reasoning tasks",
    ),
    
    # X.AI Models
    "grok-4": ModelCapabilities(
        name="x-ai/grok-4",
        description="Multimodal model with strong reasoning and analysis capabilities. Excellent at complex problem-solving and scientific tasks",
    ),
    "grok-code-fast-1": ModelCapabilities(
        name="x-ai/grok-code-fast-1",
        description="Ultra-fast coding specialist optimized for IDE integration. Best for rapid code generation and bug fixes",
    ),
    
    # Qwen Models
    "qwen3-max": ModelCapabilities(
        name="qwen/qwen3-max",
        description="Trillion-parameter model with ultra-long context. Excels at complex reasoning, structured data, and creative tasks",
    ),
}


def get_allowed_models() -> dict[str, ModelCapabilities]:
    """Get models filtered by OPENROUTER_ALLOWED_MODELS env var."""
    allowed = os.getenv("OPENROUTER_ALLOWED_MODELS", "")
    
    if not allowed:
        # No restrictions, return all models
        return OPENROUTER_MODELS
    
    # Parse comma-separated list
    allowed_names = [name.strip().lower() for name in allowed.split(",")]
    filtered = {}
    
    for key, model in OPENROUTER_MODELS.items():
        # Check main key
        if key.lower() in allowed_names:
            filtered[key] = model
            continue
                
        # Check full model name
        if model.name.split("/")[-1].lower() in allowed_names:
            filtered[key] = model
    
    return filtered


def resolve_model(name: str) -> Optional[str]:
    """Resolve a model name to the full OpenRouter model path."""
    if not name:
        return None
        
    name_lower = name.lower()
    
    # Check if it's already a full path
    if "/" in name:
        return name
    
    # Check available models
    models = get_allowed_models()
    
    # Direct key match
    if name_lower in models:
        return models[name_lower].name
    
    # Check by model name suffix
    for model in models.values():
        if model.name.endswith(f"/{name_lower}"):
            return model.name
    
    return None


def get_short_name(full_name: str) -> Optional[str]:
    """Get the short name (key) for a full model path.
    
    Args:
        full_name: Full OpenRouter model path (e.g., "openai/gpt-5")
        
    Returns:
        Short name key (e.g., "gpt-5") or None if not found
    """
    if not full_name:
        return None
    
    # Check available models for matching full name
    models = get_allowed_models()
    
    for key, model in models.items():
        if model.name == full_name:
            return key
    
    # If not found in registry, return None
    # This handles cases where a custom full path was used
    return None

```

--------------------------------------------------------------------------------
/chat_handler.py:
--------------------------------------------------------------------------------

```python
"""Chat handler for μ-MCP."""

import base64
import logging
import mimetypes
import os
import uuid
from pathlib import Path
from typing import Optional

import aiohttp

from models import (
    resolve_model,
    get_short_name,
)
from prompts import (
    get_llm_system_prompt,
    get_response_wrapper,
    get_request_wrapper,
)
from storage import ConversationStorage

logger = logging.getLogger(__name__)


class ChatHandler:
    """Handle chat interactions with OpenRouter models."""

    def __init__(self):
        self.api_key = os.getenv("OPENROUTER_API_KEY")
        if not self.api_key:
            raise ValueError("OPENROUTER_API_KEY environment variable not set")
        
        self.base_url = "https://openrouter.ai/api/v1/chat/completions"
        
        # Initialize persistent storage with default directory
        self.storage = ConversationStorage()

    async def chat(
        self,
        prompt: str,
        model: str,  # Now required
        title: Optional[str] = None,
        continuation_id: Optional[str] = None,
        files: Optional[list[str]] = None,
        images: Optional[list[str]] = None,
        reasoning_effort: Optional[str] = "medium",
    ) -> dict:
        """
        Chat with an AI model.
        
        Args:
            prompt: The user's message
            model: Model name (required)
            title: Title for a new conversation (provide this OR continuation_id, not both)
            continuation_id: UUID to continue existing conversation (provide this OR title, not both)
            files: List of file paths to include
            images: List of image paths to include
            reasoning_effort: Reasoning depth - "low", "medium", or "high" (for models that support it)
        
        Returns dict with:
        - content: The model's response with wrapper
        - continuation_id: UUID for continuing this conversation
        - model_used: The actual model that was used
        """
        # Resolve model name/alias
        resolved_model = resolve_model(model)
        if not resolved_model:
            # If not found in registry, use as-is (might be full path)
            resolved_model = model
        
        # Validate: exactly one of title or continuation_id must be provided
        if (title and continuation_id):
            return {
                "error": "Cannot provide both 'title' and 'continuation_id'. Use 'title' for new conversations or 'continuation_id' to continue existing ones.",
                "continuation_id": None,
                "model_used": None,
            }
        
        if (not title and not continuation_id):
            return {
                "error": "Must provide either 'title' for a new conversation or 'continuation_id' to continue an existing one.",
                "continuation_id": None,
                "model_used": None,
            }
        
        # Get or create conversation
        messages_with_metadata = []
        if continuation_id:
            # Try to load from persistent storage
            conversation_data = self.storage.load_conversation(continuation_id)
            if conversation_data:
                messages_with_metadata = conversation_data.get("messages", [])
            else:
                # Fail fast - conversation not found
                return {
                    "error": f"Conversation {continuation_id} not found. Please start a new conversation or use a valid continuation_id.",
                    "continuation_id": None,
                    "model_used": None,
                }
        else:
            # New conversation with title provided
            continuation_id = str(uuid.uuid4())

        # Build the user message with metadata and request wrapper
        wrapped_prompt = prompt + get_request_wrapper()
        user_content = self._build_user_content(wrapped_prompt, files, images)
        user_message = self.storage.add_metadata_to_message(
            {"role": "user", "content": user_content},
            {"target_model": resolved_model}
        )
        messages_with_metadata.append(user_message)
        
        # Get clean messages for API (without metadata)
        api_messages = self.storage.get_messages_for_api(messages_with_metadata)
        
        # Add system prompt for the LLM
        system_prompt = get_llm_system_prompt(resolved_model)
        api_messages.insert(0, {"role": "system", "content": system_prompt})

        # Make API call
        response_text = await self._call_openrouter(
            api_messages, resolved_model, reasoning_effort
        )

        # Add assistant response with metadata
        assistant_message = self.storage.add_metadata_to_message(
            {"role": "assistant", "content": response_text},
            {"model": resolved_model, "model_used": resolved_model}
        )
        messages_with_metadata.append(assistant_message)
        
        # Save conversation to persistent storage
        # Pass title only for new conversations (when title was provided)
        self.storage.save_conversation(
            continuation_id,
            messages_with_metadata,
            {"models_used": [resolved_model]},
            title=title  # Will be None for continuations, actual title for new conversations
        )
        
        # Get short name for agent interface
        short_name = get_short_name(resolved_model)
        # Fall back to resolved model if not in registry (custom path)
        display_name = short_name if short_name else resolved_model
        
        # Add response wrapper for Claude with model identification
        wrapped_response = response_text + get_response_wrapper(display_name)

        return {
            "content": wrapped_response,
            "continuation_id": continuation_id,
            "model_used": display_name,
        }

    def _build_user_content(
        self, prompt: str, files: Optional[list[str]], images: Optional[list[str]]
    ) -> str | list:
        """Build user message content with files and images."""
        content_parts = []

        # Add main prompt
        content_parts.append({"type": "text", "text": prompt})

        # Add files as text
        if files:
            file_content = self._read_files(files)
            if file_content:
                content_parts.append({"type": "text", "text": f"\n\nFiles:\n{file_content}"})

        # Add images as base64 with proper MIME type
        if images:
            for image_path in images:
                result = self._encode_image(image_path)
                if result:
                    encoded_data, mime_type = result
                    content_parts.append(
                        {
                            "type": "image_url",
                            "image_url": {"url": f"data:{mime_type};base64,{encoded_data}"},
                        }
                    )

        # If only text, return string; otherwise return multi-part content
        if len(content_parts) == 1:
            return prompt
        return content_parts

    def _read_files(self, file_paths: list[str]) -> str:
        """Read and combine file contents with token-based budgeting."""
        contents = []
        # Simple token estimation: ~4 chars per token
        # Reserve tokens for prompt and response
        max_file_tokens = 50_000  # ~200k chars
        total_tokens = 0

        for file_path in file_paths:
            try:
                path = Path(file_path)
                if path.exists() and path.is_file():
                    content = path.read_text(errors="ignore")
                    # Estimate tokens
                    file_tokens = len(content) // 4
                    
                    if total_tokens + file_tokens > max_file_tokens:
                        # Truncate if needed
                        remaining_tokens = max_file_tokens - total_tokens
                        if remaining_tokens > 100:  # Worth including partial
                            char_limit = remaining_tokens * 4
                            content = content[:char_limit] + "\n[File truncated]"
                            contents.append(f"\n--- {file_path} ---\n{content}")
                        break
                    
                    contents.append(f"\n--- {file_path} ---\n{content}")
                    total_tokens += file_tokens
            except Exception as e:
                logger.warning(f"Could not read file {file_path}: {e}")

        return "".join(contents)

    def _encode_image(self, image_path: str) -> Optional[tuple[str, str]]:
        """Encode image to base64 with proper MIME type."""
        try:
            path = Path(image_path)
            if path.exists() and path.is_file():
                # Detect MIME type
                mime_type, _ = mimetypes.guess_type(str(path))
                if not mime_type or not mime_type.startswith('image/'):
                    # Default to JPEG for unknown types
                    mime_type = 'image/jpeg'
                
                with open(path, "rb") as f:
                    encoded = base64.b64encode(f.read()).decode("utf-8")
                    return encoded, mime_type
        except Exception as e:
            logger.warning(f"Could not encode image {image_path}: {e}")
        return None

    async def _call_openrouter(
        self,
        messages: list,
        model: str,
        reasoning_effort: Optional[str],
    ) -> str:
        """Make API call to OpenRouter."""
        headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json",
            "HTTP-Referer": "https://github.com/mu-mcp",
            "X-Title": "μ-MCP Server",
        }

        data = {
            "model": model,
            "messages": messages,
        }

        # Add reasoning effort if specified
        # OpenRouter will automatically ignore this for non-reasoning models
        if reasoning_effort:
            data["reasoning"] = {
                "effort": reasoning_effort  # "low", "medium", or "high"
            }

        async with aiohttp.ClientSession() as session:
            async with session.post(self.base_url, headers=headers, json=data) as response:
                if response.status != 200:
                    error_text = await response.text()
                    raise Exception(f"OpenRouter API error: {response.status} - {error_text}")

                result = await response.json()
                return result["choices"][0]["message"]["content"]

```

--------------------------------------------------------------------------------
/storage.py:
--------------------------------------------------------------------------------

```python
"""Persistent conversation storage for μ-MCP."""

import json
import logging
from datetime import datetime
from pathlib import Path
from typing import Optional

from models import get_short_name

logger = logging.getLogger(__name__)


class ConversationStorage:
    """Handles persistent storage of multi-model conversations."""
    
    def __init__(self):
        """Initialize storage with default directory."""
        # Always use default ~/.mu-mcp/conversations
        self.storage_dir = Path.home() / ".mu-mcp" / "conversations"
        
        # Create directory if it doesn't exist
        self.storage_dir.mkdir(parents=True, exist_ok=True)
        
        # In-memory cache for conversations (no limit within MCP lifecycle)
        self._cache = {}
        
        # Track last conversation for "continue" command
        self._last_conversation_id = None
        self._last_model_used = None
        
        logger.info(f"Conversation storage initialized at: {self.storage_dir}")
    
    def save_conversation(self, conversation_id: str, messages: list, 
                         model_metadata: Optional[dict] = None, title: Optional[str] = None) -> bool:
        """
        Save a conversation to disk and update cache.
        
        Args:
            conversation_id: Unique conversation identifier
            messages: List of message dicts with role and content
            model_metadata: Optional metadata about models used
            title: Optional conversation title
        
        Returns:
            True if saved successfully
        """
        try:
            file_path = self.storage_dir / f"{conversation_id}.json"
            
            # Check if the conversation already exists so its created time is preserved
            existing = {}
            if file_path.exists():
                with open(file_path, "r") as f:
                    existing = json.load(f)
            created = existing.get("created") or datetime.utcnow().isoformat()
            
            # Prepare conversation data
            conversation_data = {
                "id": conversation_id,
                "created": created,
                "updated": datetime.utcnow().isoformat(),
                "messages": messages,
            }
            
            # Add title if provided or preserve existing title
            if title:
                conversation_data["title"] = title
            elif "title" in existing:
                conversation_data["title"] = existing["title"]
            
            # Add model metadata if provided
            if model_metadata:
                conversation_data["model_metadata"] = model_metadata
            
            # Write to file
            with open(file_path, "w") as f:
                json.dump(conversation_data, f, indent=2)
            
            # Update cache (write-through)
            self._cache[conversation_id] = conversation_data
            
            # Update last conversation tracking
            self._last_conversation_id = conversation_id
            # Extract the last model used from messages or metadata
            last_full_name = None
            if model_metadata and "models_used" in model_metadata:
                last_full_name = model_metadata["models_used"][-1] if model_metadata["models_used"] else None
            else:
                # Try to extract from the last assistant message
                for msg in reversed(messages):
                    if msg.get("role") == "assistant":
                        metadata = msg.get("metadata", {})
                        if "model" in metadata:
                            last_full_name = metadata["model"]
                            break
            
            # Convert to short name for agent interface
            if last_full_name:
                short_name = get_short_name(last_full_name)
                self._last_model_used = short_name if short_name else last_full_name
            else:
                self._last_model_used = None
            
            logger.debug(f"Saved conversation {conversation_id} with {len(messages)} messages")
            return True
            
        except Exception as e:
            logger.error(f"Failed to save conversation {conversation_id}: {e}")
            return False
    
    def load_conversation(self, conversation_id: str) -> Optional[dict]:
        """
        Load a conversation from cache or disk.
        
        Args:
            conversation_id: Unique conversation identifier
        
        Returns:
            Conversation data dict or None if not found
        """
        # Check cache first
        if conversation_id in self._cache:
            data = self._cache[conversation_id]
            logger.debug(f"Loaded conversation {conversation_id} from cache with {len(data.get('messages', []))} messages")
            return data
        
        # Not in cache, try loading from disk
        try:
            file_path = self.storage_dir / f"{conversation_id}.json"
            
            if not file_path.exists():
                logger.debug(f"Conversation {conversation_id} not found")
                return None
            
            with open(file_path, "r") as f:
                data = json.load(f)
            
            # Add to cache for future access
            self._cache[conversation_id] = data
            
            logger.debug(f"Loaded conversation {conversation_id} from disk with {len(data.get('messages', []))} messages")
            return data
            
        except Exception as e:
            logger.error(f"Failed to load conversation {conversation_id}: {e}")
            return None
    
    def get_last_conversation_info(self) -> tuple[Optional[str], Optional[str]]:
        """
        Get the last conversation ID and model used.
        
        Returns:
            Tuple of (conversation_id, model_used) or (None, None) if no conversations
        """
        return self._last_conversation_id, self._last_model_used
    
    
    def get_messages_for_api(self, messages: list) -> list:
        """
        Extract just role and content for API calls.
        Strips metadata that OpenRouter doesn't understand.
        
        Args:
            messages: List of message dicts potentially with metadata
        
        Returns:
            Clean list of messages for API
        """
        clean_messages = []
        
        for msg in messages:
            # Only include role and content for API
            clean_msg = {
                "role": msg.get("role"),
                "content": msg.get("content")
            }
            clean_messages.append(clean_msg)
        
        return clean_messages
    
    def add_metadata_to_message(self, message: dict, metadata: dict) -> dict:
        """
        Add metadata to a message for storage.
        
        Args:
            message: Basic message dict with role and content
            metadata: Metadata to add (timestamp, model, etc.)
        
        Returns:
            Message with metadata added
        """
        return {
            **message,
            "metadata": {
                "timestamp": datetime.utcnow().isoformat(),
                **metadata
            }
        }
    
    def list_recent_conversations(self, limit: int = 20) -> list[dict]:
        """
        List the most recently updated conversations.
        
        Args:
            limit: Maximum number of conversations to return
        
        Returns:
            List of conversation summaries sorted by update time (newest first)
        """
        conversations = []
        
        try:
            # Get all conversation files with their modification times
            files_with_mtime = []
            for file_path in self.storage_dir.glob("*.json"):
                try:
                    mtime = file_path.stat().st_mtime
                    files_with_mtime.append((mtime, file_path))
                except Exception as e:
                    logger.warning(f"Failed to stat file {file_path}: {e}")
                    continue
            
            # Sort by modification time (newest first) and take only the limit
            files_with_mtime.sort(key=lambda x: x[0], reverse=True)
            recent_files = files_with_mtime[:limit]
            
            # Now load only the recent files
            for _, file_path in recent_files:
                try:
                    with open(file_path, "r") as f:
                        data = json.load(f)
                        
                        # Extract key information
                        conv_summary = {
                            "id": data.get("id"),
                            "title": data.get("title"),
                            "created": data.get("created"),
                            "updated": data.get("updated"),
                        }
                        
                        # Extract model used from messages or metadata
                        model_full_name = None
                        if "model_metadata" in data and "models_used" in data["model_metadata"]:
                            models = data["model_metadata"]["models_used"]
                            model_full_name = models[-1] if models else None
                        else:
                            # Try to extract from the last assistant message
                            for msg in reversed(data.get("messages", [])):
                                if msg.get("role") == "assistant":
                                    metadata = msg.get("metadata", {})
                                    if "model" in metadata:
                                        model_full_name = metadata["model"]
                                        break
                        
                        # Convert to short name for agent interface
                        if model_full_name:
                            short_name = get_short_name(model_full_name)
                            model_used = short_name if short_name else model_full_name
                        else:
                            model_used = None
                        
                        conv_summary["model_used"] = model_used
                        
                        # If no title exists (should not happen with new version)
                        # just use a placeholder
                        if not conv_summary["title"]:
                            conv_summary["title"] = "[Untitled conversation]"
                        
                        conversations.append(conv_summary)
                        
                except Exception as e:
                    logger.warning(f"Failed to read conversation file {file_path}: {e}")
                    continue
            
            # The files are already in order from the filesystem sorting
            # But we should still sort by the actual "updated" field in case of discrepancies
            conversations.sort(key=lambda x: x.get("updated", ""), reverse=True)
            
            return conversations
            
        except Exception as e:
            logger.error(f"Failed to list recent conversations: {e}")
            return []
```

--------------------------------------------------------------------------------
/server.py:
--------------------------------------------------------------------------------

```python
#!/usr/bin/env python3
"""μ-MCP Server - Minimal MCP server for AI model interactions.

In contrast to zen-mcp's 10,000+ lines of orchestration,
μ-MCP provides pure model access with no hardcoded workflows.
"""

import asyncio
import json
import logging
import os
import sys
from typing import Any

# Load environment variables from .env file
from dotenv import load_dotenv
load_dotenv()

from mcp import McpError, types
from mcp.server import Server
from mcp.server.models import InitializationOptions
from mcp.types import (
    TextContent,
    Tool,
    ServerCapabilities,
    ToolsCapability,
    Prompt,
    GetPromptResult,
    PromptMessage,
    PromptsCapability,
)

from models import get_allowed_models
from prompts import get_agent_tool_description

# Configure logging
log_level = os.getenv("LOG_LEVEL", "INFO").upper()
logging.basicConfig(
    level=getattr(logging, log_level, logging.INFO),
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
logger = logging.getLogger(__name__)

app = Server("μ-mcp")


@app.list_tools()
async def list_tools() -> list[Tool]:
    """List available tools - just one: chat."""
    # Get the filtered model registry
    models = get_allowed_models()

    # Build model enum and per-model descriptions for the schema
    model_enum = []
    model_descriptions = []
    
    for key, model in models.items():
        # Use short name (key) in enum
        model_enum.append(key)
        # Show only short name in description, not full path
        model_descriptions.append(f"• {key}: {model.description}")
    
    # Build the combined description
    models_description = "Select the AI model that best fits your task:\n\n" + "\n".join(model_descriptions)
    
    return [
        Tool(
            name="chat",
            description=get_agent_tool_description(),
            inputSchema={
                "type": "object",
                "properties": {
                    "prompt": {
                        "type": "string",
                        "description": "Your message or question"
                    },
                    "model": {
                        "type": "string",
                        "enum": model_enum,
                        "description": models_description,
                    },
                    "title": {
                        "type": "string",
                        "description": "Title for new conversation (3-10 words). Provide this OR continuation_id, not both",
                    },
                    "continuation_id": {
                        "type": "string",
                        "description": "UUID to continue existing conversation. Provide this OR title, not both",
                    },
                    "files": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Absolute paths to files to include as context",
                    },
                    "images": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Absolute paths to images to include",
                    },
                    "reasoning_effort": {
                        "type": "string",
                        "enum": ["low", "medium", "high"],
                        "description": "Reasoning depth for models that support it (low=20%, medium=50%, high=80% of computation)",
                        "default": "medium",
                    },
                },
                "required": ["prompt", "model"],  # Model is now required
            },
        )
    ]


@app.list_prompts()
async def list_prompts() -> list[Prompt]:
    """List available prompts for slash commands."""
    return [
        Prompt(
            name="chat",
            description="Start a chat with AI models",
            arguments=[],
        ),
        Prompt(
            name="continue",
            description="Continue the previous conversation",
            arguments=[],
        ),
        Prompt(
            name="challenge",
            description="Encourage critical thinking and avoid reflexive agreement",
            arguments=[],
        ),
        Prompt(
            name="discuss",
            description="Orchestrate multi-turn discussion among multiple AIs",
            arguments=[],
        ),
    ]


@app.get_prompt()
async def get_prompt(name: str, arguments: dict[str, Any] | None = None) -> GetPromptResult:
    """Generate prompt text for slash commands."""
    if name == "chat":
        return GetPromptResult(
            description="Start a chat with AI models",
            messages=[
                PromptMessage(
                    role="user",
                    content=TextContent(
                        type="text",
                        text="Use the chat tool to interact with an AI model."
                    )
                )
            ],
        )
    elif name == "continue":
        # Get the list of recent conversations
        from chat_handler import ChatHandler
        from datetime import datetime
        
        handler = ChatHandler()
        recent_conversations = handler.storage.list_recent_conversations(20)
        
        if recent_conversations:
            # Format the conversation list
            conv_list = []
            for i, conv in enumerate(recent_conversations, 1):
                # Calculate relative time
                if conv.get("updated"):
                    try:
                        updated_time = datetime.fromisoformat(conv["updated"])
                        now = datetime.utcnow()
                        time_diff = now - updated_time
                        
                        # Format relative time
                        if time_diff.days > 0:
                            time_str = f"{time_diff.days} day{'s' if time_diff.days > 1 else ''} ago"
                        elif time_diff.seconds >= 3600:
                            hours = time_diff.seconds // 3600
                            time_str = f"{hours} hour{'s' if hours > 1 else ''} ago"
                        elif time_diff.seconds >= 60:
                            minutes = time_diff.seconds // 60
                            time_str = f"{minutes} minute{'s' if minutes > 1 else ''} ago"
                        else:
                            time_str = "just now"
                    except (ValueError, TypeError):
                        time_str = "unknown time"
                else:
                    time_str = "unknown time"
                
                # Get display text (title should always exist)
                display = conv.get("title", "[Untitled]")
                # model_used is already a short name from list_recent_conversations()
                model = conv.get("model_used", "unknown model")
                
                conv_list.append(
                    f"{i}. [{time_str}] {display}\n"
                    f"   Model: {model} | ID: {conv['id']}"
                )
            
            instruction_text = f"""Select a conversation to continue using the chat tool.

Recent Conversations (newest first):
{chr(10).join(conv_list)}

To continue a conversation, use the chat tool with the desired continuation_id.
Example: Use continuation_id: "{recent_conversations[0]['id']}" for the most recent conversation.

This allows you to access the full conversation history even if your context was compacted."""
        else:
            instruction_text = "No previous conversations found. Start a new conversation using the chat tool."
        
        return GetPromptResult(
            description="Continue a previous conversation",
            messages=[
                PromptMessage(
                    role="user",
                    content=TextContent(
                        type="text",
                        text=instruction_text
                    )
                )
            ],
        )
    elif name == "challenge":
        return GetPromptResult(
            description="Encourage critical thinking and avoid reflexive agreement",
            messages=[
                PromptMessage(
                    role="user",
                    content=TextContent(
                        type="text",
                        text="""CRITICAL REASSESSMENT MODE:

When using the chat tool, wrap your prompt with instructions for the AI to:
- Challenge ideas and think critically before responding
- Evaluate whether they actually agree or disagree
- Provide thoughtful analysis rather than reflexive agreement

Example: Instead of accepting a statement, ask the AI to examine it for accuracy, completeness, and reasoning flaws.
This promotes truth-seeking over compliance."""
                    )
                )
            ],
        )
    elif name == "discuss":
        return GetPromptResult(
            description="Orchestrate multi-turn discussion among multiple AIs",
            messages=[
                PromptMessage(
                    role="user",
                    content=TextContent(
                        type="text",
                        text="""MULTI-AI DISCUSSION MODE:

Use the chat tool to orchestrate a multi-turn discussion among diverse AI models.

Requirements:
1. Select models with complementary strengths based on the topic
2. Start fresh conversations (no continuation_id) for each model
3. Provide context about the topic and other participants' perspectives
4. Exchange key insights between models across multiple turns
5. Encourage constructive disagreement - not consensus for its own sake
6. Continue until either consensus emerges naturally OR sufficiently diverse perspectives are gathered

Do NOT stop after one round. Keep the discussion going through multiple exchanges until reaching a natural conclusion.
Synthesize findings, highlighting both agreements and valuable disagreements."""
                    )
                )
            ],
        )
    else:
        raise ValueError(f"Unknown prompt: {name}")


@app.call_tool()
async def call_tool(name: str, arguments: Any) -> list[TextContent]:
    """Handle tool calls - just chat."""
    if name != "chat":
        raise McpError(f"Unknown tool: {name}")

    from chat_handler import ChatHandler

    try:
        handler = ChatHandler()
        result = await handler.chat(**arguments)
        return [TextContent(type="text", text=json.dumps(result, indent=2))]
    except Exception as e:
        logger.error(f"Chat tool error: {e}")
        return [TextContent(type="text", text=f"Error: {str(e)}")]


async def main():
    """Run the MCP server."""
    # Check for API key
    if not os.getenv("OPENROUTER_API_KEY"):
        logger.error("OPENROUTER_API_KEY environment variable not set")
        logger.error("Get your API key at: https://openrouter.ai/keys")
        sys.exit(1)

    # Log configuration
    models = get_allowed_models()
    logger.info(f"Starting μ-MCP Server...")
    logger.info(f"Available models: {len(models)}")
    
    # Use stdio transport
    from mcp.server.stdio import stdio_server

    async with stdio_server() as (read_stream, write_stream):
        await app.run(
            read_stream,
            write_stream,
            InitializationOptions(
                server_name="μ-mcp",
                server_version="2.0.0",
                capabilities=ServerCapabilities(
                    tools=ToolsCapability(),
                    prompts=PromptsCapability(),
                ),
            ),
        )


if __name__ == "__main__":
    asyncio.run(main())

```