disler/aider-mcp-server # codebase.md

# Directory Structure

```
├── .claude
│   └── commands
│       ├── context_prime_w_aider.md
│       ├── context_prime.md
│       ├── jprompt_ultra_diff_review.md
│       └── multi_aider_sub_agent.md
├── .env.sample
├── .gitignore
├── .mcp.json
├── .python-version
├── ai_docs
│   ├── just-prompt-example-mcp-server.xml
│   └── programmable-aider-documentation.md
├── pyproject.toml
├── README.md
├── specs
│   └── init-aider-mcp-exp.md
├── src
│   └── aider_mcp_server
│       ├── __init__.py
│       ├── __main__.py
│       ├── atoms
│       │   ├── __init__.py
│       │   ├── data_types.py
│       │   ├── logging.py
│       │   ├── tools
│       │   │   ├── __init__.py
│       │   │   ├── aider_ai_code.py
│       │   │   └── aider_list_models.py
│       │   └── utils.py
│       ├── server.py
│       └── tests
│           ├── __init__.py
│           └── atoms
│               ├── __init__.py
│               ├── test_logging.py
│               └── tools
│                   ├── __init__.py
│                   ├── test_aider_ai_code.py
│                   └── test_aider_list_models.py
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------

```
3.12

```

--------------------------------------------------------------------------------
/.mcp.json:
--------------------------------------------------------------------------------

```json
{
  "mcpServers": {
    "aider-mcp-server": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "--directory",
        ".",
        "run",
        "aider-mcp-server",
        "--editor-model",
        "gemini/gemini-2.5-pro-preview-03-25",
        "--current-working-dir",
        "."
      ],
      "env": {}
    }
  }
}

```

--------------------------------------------------------------------------------
/.env.sample:
--------------------------------------------------------------------------------

```
# Environment Variables for just-prompt

# OpenAI API Key
OPENAI_API_KEY=your_openai_api_key_here

# Anthropic API Key
ANTHROPIC_API_KEY=your_anthropic_api_key_here

# Gemini API Key
GEMINI_API_KEY=your_gemini_api_key_here

# Groq API Key
GROQ_API_KEY=your_groq_api_key_here

# DeepSeek API Key
DEEPSEEK_API_KEY=your_deepseek_api_key_here

# OpenRouter API Key
OPENROUTER_API_KEY=your_openrouter_api_key_here

# Ollama endpoint (if not default)
OLLAMA_HOST=http://localhost:11434

FIREWORKS_API_KEY=
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Python-generated files
__pycache__/
*.py[oc]
build/
dist/
wheels/
*.egg-info

# Virtual environments
.venv

.env

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# Distribution / packaging
dist/
build/
*.egg-info/
*.egg

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Jupyter Notebook
.ipynb_checkpoints

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# IDE specific files
.idea/
.vscode/
*.swp
*.swo
.DS_Store


prompts/responses
.aider*

focus_output/

# Log files
logs/
*.log
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
# Aider MCP Server - Experimental
> Model context protocol server for offloading AI coding work to Aider, enhancing development efficiency and flexibility.

## Overview

This server allows Claude Code to offload AI coding tasks to Aider, the best open source AI coding assistant. By delegating certain coding tasks to Aider, we can reduce costs, gain control over our coding model and operate Claude Code in a more orchestrative way to review and revise code.

## Setup

0. Clone the repository:

```bash
git clone https://github.com/disler/aider-mcp-server.git
```

1. Install dependencies:

```bash
uv sync
```

2. Create your environment file:

```bash
cp .env.sample .env
```

3. Configure your API keys in the `.env` file (or use the mcpServers "env" section) to have the api key needed for the model you want to use in aider:

```
GEMINI_API_KEY=your_gemini_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
...see .env.sample for more
```

4. Copy and fill out the the `.mcp.json` into the root of your project and update the `--directory` to point to this project's root directory and the `--current-working-dir` to point to the root of your project.

```json
{
  "mcpServers": {
    "aider-mcp-server": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "--directory",
        "<path to this project>",
        "run",
        "aider-mcp-server",
        "--editor-model",
        "gpt-4o",
        "--current-working-dir",
        "<path to your project>"
      ],
      "env": {
        "GEMINI_API_KEY": "<your gemini api key>",
        "OPENAI_API_KEY": "<your openai api key>",
        "ANTHROPIC_API_KEY": "<your anthropic api key>",
        ...see .env.sample for more
      }
    }
  }
}
```

## Testing
> Tests run with gemini-2.5-pro-exp-03-25

To run all tests:

```bash
uv run pytest
```

To run specific tests:

```bash
# Test listing models
uv run pytest src/aider_mcp_server/tests/atoms/tools/test_aider_list_models.py

# Test AI coding
uv run pytest src/aider_mcp_server/tests/atoms/tools/test_aider_ai_code.py
```

Note: The AI coding tests require a valid API key for the Gemini model. Make sure to set it in your `.env` file before running the tests.

## Add this MCP server to Claude Code

### Add with `gemini-2.5-pro-exp-03-25`

```bash
claude mcp add aider-mcp-server -s local \
  -- \
  uv --directory "<path to the aider mcp server project>" \
  run aider-mcp-server \
  --editor-model "gemini/gemini-2.5-pro-exp-03-25" \
  --current-working-dir "<path to your project>"
```

### Add with `gemini-2.5-pro-preview-03-25`

```bash
claude mcp add aider-mcp-server -s local \
  -- \
  uv --directory "<path to the aider mcp server project>" \
  run aider-mcp-server \
  --editor-model "gemini/gemini-2.5-pro-preview-03-25" \
  --current-working-dir "<path to your project>"
```

### Add with `quasar-alpha`

```bash
claude mcp add aider-mcp-server -s local \
  -- \
  uv --directory "<path to the aider mcp server project>" \
  run aider-mcp-server \
  --editor-model "openrouter/openrouter/quasar-alpha" \
  --current-working-dir "<path to your project>"
```

### Add with `llama4-maverick-instruct-basic`

```bash
claude mcp add aider-mcp-server -s local \
  -- \
  uv --directory "<path to the aider mcp server project>" \
  run aider-mcp-server \
  --editor-model "fireworks_ai/accounts/fireworks/models/llama4-maverick-instruct-basic" \
  --current-working-dir "<path to your project>"
```

## Usage

This MCP server provides the following functionalities:

1. **Offload AI coding tasks to Aider**:
   - Takes a prompt and file paths
   - Uses Aider to implement the requested changes
   - Returns success or failure

2. **List available models**:
   - Provides a list of models matching a substring
   - Useful for discovering supported models


## Available Tools

This MCP server exposes the following tools:

### 1. `aider_ai_code`

This tool allows you to run Aider to perform AI coding tasks based on a provided prompt and specified files.

**Parameters:**

- `ai_coding_prompt` (string, required): The natural language instruction for the AI coding task.
- `relative_editable_files` (list of strings, required): A list of file paths (relative to the `current_working_dir`) that Aider is allowed to modify. If a file doesn't exist, it will be created.
- `relative_readonly_files` (list of strings, optional): A list of file paths (relative to the `current_working_dir`) that Aider can read for context but cannot modify. Defaults to an empty list `[]`.
- `model` (string, optional): The primary AI model Aider should use for generating code. Defaults to `"gemini/gemini-2.5-pro-exp-03-25"`. You can use the `list_models` tool to find other available models.
- `editor_model` (string, optional): The AI model Aider should use for editing/refining code, particularly when using architect mode. If not provided, the primary `model` might be used depending on Aider's internal logic. Defaults to `None`.

**Example Usage (within an MCP request):**

Claude Code Prompt:
```
Use the Aider AI Code tool to: Refactor the calculate_sum function in calculator.py to handle potential TypeError exceptions.
```

Result:
```json
{
  "name": "aider_ai_code",
  "parameters": {
    "ai_coding_prompt": "Refactor the calculate_sum function in calculator.py to handle potential TypeError exceptions.",
    "relative_editable_files": ["src/calculator.py"],
    "relative_readonly_files": ["docs/requirements.txt"],
    "model": "openai/gpt-4o"
  }
}
```

**Returns:**

- A simple dict: {success, diff}
  - `success`: boolean - Whether the operation was successful.
  - `diff`: string - The diff of the changes made to the file.

### 2. `list_models`

This tool lists available AI models supported by Aider that match a given substring.

**Parameters:**

- `substring` (string, required): The substring to search for within the names of available models.

**Example Usage (within an MCP request):**

Claude Code Prompt:
```
Use the Aider List Models tool to: List models that contain the substring "gemini".
```

Result:
```json
{
  "name": "list_models",
  "parameters": {
    "substring": "gemini"
  }
}
```

**Returns:**

- A list of model name strings that match the provided substring. Example: `["gemini/gemini-1.5-flash", "gemini/gemini-1.5-pro", "gemini/gemini-pro"]`

## Architecture

The server is structured as follows:

- **Server layer**: Handles MCP protocol communication
- **Atoms layer**: Individual, pure functional components
  - **Tools**: Specific capabilities (AI coding, listing models)
  - **Utils**: Constants and helper functions
  - **Data Types**: Type definitions using Pydantic

All components are thoroughly tested for reliability.

## Codebase Structure

The project is organized into the following main directories and files:

```
.
├── ai_docs                   # Documentation related to AI models and examples
│   ├── just-prompt-example-mcp-server.xml
│   └── programmable-aider-documentation.md
├── pyproject.toml            # Project metadata and dependencies
├── README.md                 # This file
├── specs                     # Specification documents
│   └── init-aider-mcp-exp.md
├── src                       # Source code directory
│   └── aider_mcp_server      # Main package for the server
│       ├── __init__.py       # Package initializer
│       ├── __main__.py       # Main entry point for the server executable
│       ├── atoms             # Core, reusable components (pure functions)
│       │   ├── __init__.py
│       │   ├── data_types.py # Pydantic models for data structures
│       │   ├── logging.py    # Custom logging setup
│       │   ├── tools         # Individual tool implementations
│       │   │   ├── __init__.py
│       │   │   ├── aider_ai_code.py # Logic for the aider_ai_code tool
│       │   │   └── aider_list_models.py # Logic for the list_models tool
│       │   └── utils.py      # Utility functions and constants (like default models)
│       ├── server.py         # MCP server logic, tool registration, request handling
│       └── tests             # Unit and integration tests
│           ├── __init__.py
│           └── atoms         # Tests for the atoms layer
│               ├── __init__.py
│               ├── test_logging.py # Tests for logging
│               └── tools     # Tests for the tools
│                   ├── __init__.py
│                   ├── test_aider_ai_code.py # Tests for AI coding tool
│                   └── test_aider_list_models.py # Tests for model listing tool
```

- **`src/aider_mcp_server`**: Contains the main application code.
  - **`atoms`**: Holds the fundamental building blocks. These are designed to be pure functions or simple classes with minimal dependencies.
    - **`tools`**: Each file here implements the core logic for a specific MCP tool (`aider_ai_code`, `list_models`).
    - **`utils.py`**: Contains shared constants like default model names.
    - **`data_types.py`**: Defines Pydantic models for request/response structures, ensuring data validation.
    - **`logging.py`**: Sets up a consistent logging format for console and file output.
  - **`server.py`**: Orchestrates the MCP server. It initializes the server, registers the tools defined in the `atoms/tools` directory, handles incoming requests, routes them to the appropriate tool logic, and sends back responses according to the MCP protocol.
  - **`__main__.py`**: Provides the command-line interface entry point (`aider-mcp-server`), parsing arguments like `--editor-model` and starting the server defined in `server.py`.
  - **`tests`**: Contains tests mirroring the structure of the `src` directory, ensuring that each component (especially atoms) works as expected.


```

--------------------------------------------------------------------------------
/src/aider_mcp_server/atoms/__init__.py:
--------------------------------------------------------------------------------

```python
# Atoms package initialization
```

--------------------------------------------------------------------------------
/src/aider_mcp_server/atoms/tools/__init__.py:
--------------------------------------------------------------------------------

```python
# Tools package initialization
```

--------------------------------------------------------------------------------
/src/aider_mcp_server/tests/__init__.py:
--------------------------------------------------------------------------------

```python
# Tests package initialization
```

--------------------------------------------------------------------------------
/src/aider_mcp_server/tests/atoms/__init__.py:
--------------------------------------------------------------------------------

```python
# Atoms tests package initialization
```

--------------------------------------------------------------------------------
/src/aider_mcp_server/tests/atoms/tools/__init__.py:
--------------------------------------------------------------------------------

```python
# Tools tests package initialization
```

--------------------------------------------------------------------------------
/src/aider_mcp_server/atoms/utils.py:
--------------------------------------------------------------------------------

```python
DEFAULT_EDITOR_MODEL = "openai/gpt-4.1"
DEFAULT_TESTING_MODEL = "openai/gpt-4.1"


```

--------------------------------------------------------------------------------
/src/aider_mcp_server/__init__.py:
--------------------------------------------------------------------------------

```python
from aider_mcp_server.__main__ import main

# This just re-exports the main function from __main__.py
```

--------------------------------------------------------------------------------
/.claude/commands/context_prime.md:
--------------------------------------------------------------------------------

```markdown
## Context

READ README.md, THEN run git ls-files and eza --git-ignore --tree to understand the context of the project don't read any other files.

## Commands & Feedback Loops

We're using `uv run pytest` to run tests.

You can validate the app works with `uv run aider-mcp-server --help`.
```

--------------------------------------------------------------------------------
/.claude/commands/multi_aider_sub_agent.md:
--------------------------------------------------------------------------------

```markdown
Run a multi aider_ai_code call with a sub agent call to fulfill the following tasks back to back in the most sensible order. If the given task(s) can be broken down into smaller tasks, do that. If tasks are dependent on certain changes to be made first, make sure to run the dependent tasks first. $ARGUMENTS
```

--------------------------------------------------------------------------------
/.claude/commands/context_prime_w_aider.md:
--------------------------------------------------------------------------------

```markdown
## Context

READ README.md, THEN run git ls-files and eza --git-ignore --tree to understand the context of the project don't read any other files.

## Commands & Feedback Loops

To validate code use `uv run pytest` to run tests. (don't run this now)

You can validate the app works with `uv run aider-mcp-server --help`.

## Coding

For coding always use the aider_ai_code tool.
```

--------------------------------------------------------------------------------
/src/aider_mcp_server/atoms/tools/aider_list_models.py:
--------------------------------------------------------------------------------

```python
from typing import List
from aider.models import fuzzy_match_models

def list_models(substring: str) -> List[str]:
    """
    List available models that match the provided substring.
    
    Args:
        substring (str): Substring to match against available models.
    
    Returns:
        List[str]: List of model names matching the substring.
    """
    return fuzzy_match_models(substring)
```

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[project]
name = "aider-mcp-server"
version = "0.1.0"
description = "Model context protocol server for offloading ai coding work to Aider"
readme = "README.md"
authors = [
    { name = "IndyDevDan", email = "[email protected]" }
]
requires-python = ">=3.12"
dependencies = [
    "aider-chat>=0.81.0",
    "boto3>=1.37.27",
    "mcp>=1.6.0",
    "pydantic>=2.11.2",
    "pytest>=8.3.5",
    "rich>=14.0.0",
]

[project.scripts]
aider-mcp-server = "aider_mcp_server:main"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

```

--------------------------------------------------------------------------------
/src/aider_mcp_server/__main__.py:
--------------------------------------------------------------------------------

```python
import argparse
import asyncio
from aider_mcp_server.server import serve
from aider_mcp_server.atoms.utils import DEFAULT_EDITOR_MODEL

def main():
    # Create the argument parser
    parser = argparse.ArgumentParser(description="Aider MCP Server - Offload AI coding tasks to Aider")
    
    # Add arguments
    parser.add_argument(
        "--editor-model", 
        type=str, 
        default=DEFAULT_EDITOR_MODEL,
        help=f"Editor model to use (default: {DEFAULT_EDITOR_MODEL})"
    )
    parser.add_argument(
        "--current-working-dir", 
        type=str, 
        required=True,
        help="Current working directory (must be a valid git repository)"
    )
    
    args = parser.parse_args()
    
    # Run the server asynchronously
    asyncio.run(serve(
        editor_model=args.editor_model,
        current_working_dir=args.current_working_dir
    ))

if __name__ == "__main__":
    main()
```

--------------------------------------------------------------------------------
/src/aider_mcp_server/tests/atoms/tools/test_aider_list_models.py:
--------------------------------------------------------------------------------

```python
import pytest
from aider_mcp_server.atoms.tools.aider_list_models import list_models

def test_list_models_openai():
    """Test that list_models returns GPT-4o model when searching for openai."""
    models = list_models("openai")
    assert any("gpt-4o" in model for model in models), "Expected to find GPT-4o model in the list"
    
def test_list_models_gemini():
    """Test that list_models returns Gemini models when searching for gemini."""
    models = list_models("gemini")
    assert any("gemini" in model.lower() for model in models), "Expected to find Gemini models in the list"
    
def test_list_models_empty():
    """Test that list_models with an empty string returns all models."""
    models = list_models("")
    assert len(models) > 0, "Expected to get at least some models with empty string"
    
def test_list_models_nonexistent():
    """Test that list_models with a nonexistent model returns an empty list."""
    models = list_models("this_model_does_not_exist_12345")
    assert len(models) == 0, "Expected to get no models with a nonexistent model name"
```

--------------------------------------------------------------------------------
/.claude/commands/jprompt_ultra_diff_review.md:
--------------------------------------------------------------------------------

```markdown
# Ultra Diff Review
> Execute each task in the order given to conduct a thorough code review.

## Task 1: Create diff.txt

Create a new file called diff.md.

At the top of the file, add the following markdown:

```md
# Code Review
- Review the diff, report on issues, bugs, and improvements. 
- End with a concise markdown table of any issues found, their solutions, and a risk assessment for each issue if applicable.
- Use emojis to convey the severity of each issue.

## Diff

```

## Task 2: git diff and append

Then run git diff and append the output to the file.

## Task 3: just-prompt multi-llm tool call

Then use that file as the input to this just-prompt tool call.

prompts_from_file_to_file(
    from_file = diff.md,
    models = "openai:o3-mini, anthropic:claude-3-7-sonnet-20250219:4k, gemini:gemini-2.0-flash-thinking-exp"
    output_dir = ultra_diff_review/
)

## Task 4: Read the output files and synthesize

Then read the output files and think hard to synthesize the results into a new single file called `ultra_diff_review/fusion_ultra_diff_review.md` following the original instructions plus any additional instructions or callouts you think are needed to create the best possible review.

## Task 5: Present the results

Then let me know which issues you think are worth resolving and we'll proceed from there.
```

--------------------------------------------------------------------------------
/src/aider_mcp_server/atoms/data_types.py:
--------------------------------------------------------------------------------

```python
from typing import List, Optional, Dict, Any, Union
from pydantic import BaseModel, Field

# MCP Protocol Base Types
class MCPRequest(BaseModel):
    """Base class for MCP protocol requests."""
    name: str
    parameters: Dict[str, Any]

class MCPResponse(BaseModel):
    """Base class for MCP protocol responses."""
    pass

class MCPErrorResponse(MCPResponse):
    """Error response for MCP protocol."""
    error: str

# Tool-specific request parameter models
class AICodeParams(BaseModel):
    """Parameters for the aider_ai_code tool."""
    ai_coding_prompt: str
    relative_editable_files: List[str]
    relative_readonly_files: List[str] = Field(default_factory=list)

class ListModelsParams(BaseModel):
    """Parameters for the list_models tool."""
    substring: str = ""

# Tool-specific response models
class AICodeResponse(MCPResponse):
    """Response for the aider_ai_code tool."""
    status: str  # 'success' or 'failure'
    message: Optional[str] = None

class ListModelsResponse(MCPResponse):
    """Response for the list_models tool."""
    models: List[str]

# Specific request types
class AICodeRequest(MCPRequest):
    """Request for the aider_ai_code tool."""
    name: str = "aider_ai_code"
    parameters: AICodeParams

class ListModelsRequest(MCPRequest):
    """Request for the list_models tool."""
    name: str = "list_models"
    parameters: ListModelsParams

# Union type for all possible MCP responses
MCPToolResponse = Union[AICodeResponse, ListModelsResponse, MCPErrorResponse]
```

--------------------------------------------------------------------------------
/ai_docs/programmable-aider-documentation.md:
--------------------------------------------------------------------------------

```markdown
# Aider is a programmable AI coding assistant

Here's how to use it in python to build tools that allow us to offload ai coding tasks to aider.

## Code Examples

```

class AICodeParams(BaseModel):
    architect: bool = True
    prompt: str
    model: str
    editor_model: Optional[str] = None
    editable_context: List[str]
    readonly_context: List[str] = []
    settings: Optional[dict]
    use_git: bool = True


def build_ai_coding_assistant(params: AICodeParams) -> Coder:
    """Create and configure a Coder instance based on provided parameters"""
    settings = params.settings or {}
    auto_commits = settings.get("auto_commits", False)
    suggest_shell_commands = settings.get("suggest_shell_commands", False)
    detect_urls = settings.get("detect_urls", False)

    # Extract budget_tokens setting once for both models
    budget_tokens = settings.get("budget_tokens")

    if params.architect:
        model = Model(model=params.model, editor_model=params.editor_model)
        extra_params = {}

        # Add reasoning_effort if available
        if settings.get("reasoning_effort"):
            extra_params["reasoning_effort"] = settings["reasoning_effort"]

        # Add thinking budget if specified
        if budget_tokens is not None:
            extra_params = add_thinking_budget_to_params(extra_params, budget_tokens)

        model.extra_params = extra_params
        return Coder.create(
            main_model=model,
            edit_format="architect",
            io=InputOutput(yes=True),
            fnames=params.editable_context,
            read_only_fnames=params.readonly_context,
            auto_commits=auto_commits,
            suggest_shell_commands=suggest_shell_commands,
            detect_urls=detect_urls,
            use_git=params.use_git,
        )
    else:
        model = Model(params.model)
        extra_params = {}

        # Add reasoning_effort if available
        if settings.get("reasoning_effort"):
            extra_params["reasoning_effort"] = settings["reasoning_effort"]

        # Add thinking budget if specified (consistent for both modes)
        if budget_tokens is not None:
            extra_params = add_thinking_budget_to_params(extra_params, budget_tokens)

        model.extra_params = extra_params
        return Coder.create(
            main_model=model,
            io=InputOutput(yes=True),
            fnames=params.editable_context,
            read_only_fnames=params.readonly_context,
            auto_commits=auto_commits,
            suggest_shell_commands=suggest_shell_commands,
            detect_urls=detect_urls,
            use_git=params.use_git,
        )


def ai_code(coder: Coder, params: AICodeParams):
    """Execute AI coding using provided coder instance and parameters"""
    # Execute the AI coding with the provided prompt
    coder.run(params.prompt)


```
```

--------------------------------------------------------------------------------
/specs/init-aider-mcp-exp.md:
--------------------------------------------------------------------------------

```markdown
# Aider Model Context Protocol (MCP) Experimental Server
> Here we detail how we'll build the experimental ai coding aider mcp server.

## Why?
Claude Code is a new, powerful agentic coding tool that is currently in beta. It's great but it's incredibly expensive.
We can offload some of the work to a simpler ai coding tool: Aider. The original AI Coding Assistant.

By discretely offloading work to Aider, we can not only reduce costs but use Claude Code (and auxillary LLM calls combined with aider) to better create more, reliable code through multiple - focused - LLM calls.

## Resources to ingest
> To understand how we'll build this, READ these files

ai_docs/just-prompt-example-mcp-server.xml
ai_docs/programmable-aider-documentation.md

## Implementation Notes

- We want to mirror the exact structure of the just-prompt codebase as closely as possible. Minus of course the tools that are specific to just-prompt.
- Every atom must be tested in a respective tests/*_test.py file.
- every atom/tools/*.py must only have a single responsibility - one method.
- when we run aider run in no commit mode, we should not commit any changes to the codebase.
- if architect_model is not provided, don't use architect mode.

## Application Structure

- src/
  - aider_mcp_server/
    - __init__.py
    - __main__.py
    - server.py
      - serve(editor_model: str = DEFAULT_EDITOR_MODEL, current_working_dir: str = ".", architect_model: str = None) -> None
    - atoms/
      - __init__.py
      - tools/
        - __init__.py
        - aider_ai_code.py
          - code_with_aider(ai_coding_prompt: str, relative_editable_files: List[str], relative_readonly_files: List[str] = []) -> str
            - runs one shot aider based on ai_docs/programmable-aider-documentation.md
            - outputs 'success' or 'failure'
        - aider_list_models.py
          - list_models(substring: str) -> List[str]
            - calls aider.models.fuzzy_match_models(substr: str) and returns the list of models
      - utils.py
        - DEFAULT_EDITOR_MODEL = "gemini/gemini-2.5-pro-exp-03-25"
        - DEFAULT_ARCHITECT_MODEL = "gemini/gemini-2.5-pro-exp-03-25"
      - data_types.py
    - tests/
      - __init__.py
      - atoms/
        - __init__.py
        - tools/
          - __init__.py
          - test_aider_ai_code.py
            - here create tests for basic 'math' functionality: 'add, 'subtract', 'multiply', 'divide'. Use temp dirs.
          - test_aider_list_models.py
            - here create a real call to list_models(openai) and assert gpt-4o substr in list.

## Commands

- if for whatever reason you need additional python packages use `uv add <package_name>`.

## Validation
- Use `uv run pytest <path_to_test_file.py>` to run tests. Every atom/ must be tested.
- Don't mock any tests - run real LLM calls. Make sure to test for failure paths.
- At the end run `uv run aider-mcp-server --help` to validate the server is working.


```

--------------------------------------------------------------------------------
/src/aider_mcp_server/atoms/logging.py:
--------------------------------------------------------------------------------

```python
import os
import logging
import time
from pathlib import Path
from typing import Optional, Union


class Logger:
    """Custom logger that writes to both console and file."""
    
    def __init__(
        self,
        name: str,
        log_dir: Optional[Union[str, Path]] = None,
        level: int = logging.INFO,
    ):
        """
        Initialize the logger.
        
        Args:
            name: Logger name
            log_dir: Directory to store log files (defaults to ./logs)
            level: Logging level
        """
        self.name = name
        self.level = level
        
        # Set up the logger
        self.logger = logging.getLogger(name)
        self.logger.setLevel(level)
        self.logger.propagate = False
        
        # Clear any existing handlers
        if self.logger.handlers:
            self.logger.handlers.clear()

        # Define a standard formatter
        log_formatter = logging.Formatter(
            '%(asctime)s [%(levelname)s] %(name)s: %(message)s',
            datefmt='%Y-%m-%d %H:%M:%S'
        )

        # Add console handler with standard formatting
        console_handler = logging.StreamHandler()
        console_handler.setFormatter(log_formatter)
        console_handler.setLevel(level)
        self.logger.addHandler(console_handler)

        # Add file handler if log_dir is provided
        if log_dir is not None:
            # Create log directory if it doesn't exist
            log_dir = Path(log_dir)
            log_dir.mkdir(parents=True, exist_ok=True)
            
            # Use a fixed log file name
            log_file_name = "aider_mcp_server.log"
            log_file_path = log_dir / log_file_name

            # Set up file handler to append
            file_handler = logging.FileHandler(log_file_path, mode='a')
            # Use the same formatter as the console handler
            file_handler.setFormatter(log_formatter)
            file_handler.setLevel(level)
            self.logger.addHandler(file_handler)

            self.log_file_path = log_file_path
            self.logger.info(f"Logging to: {log_file_path}")

    def debug(self, message: str, **kwargs):
        """Log a debug message."""
        self.logger.debug(message, **kwargs)
    
    def info(self, message: str, **kwargs):
        """Log an info message."""
        self.logger.info(message, **kwargs)
    
    def warning(self, message: str, **kwargs):
        """Log a warning message."""
        self.logger.warning(message, **kwargs)
    
    def error(self, message: str, **kwargs):
        """Log an error message."""
        self.logger.error(message, **kwargs)
    
    def critical(self, message: str, **kwargs):
        """Log a critical message."""
        self.logger.critical(message, **kwargs)
    
    def exception(self, message: str, **kwargs):
        """Log an exception message with traceback."""
        self.logger.exception(message, **kwargs)


def get_logger(
    name: str,
    log_dir: Optional[Union[str, Path]] = None,
    level: int = logging.INFO,
) -> Logger:
    """
    Get a configured logger instance.
    
    Args:
        name: Logger name
        log_dir: Directory to store log files (defaults to ./logs)
        level: Logging level

    Returns:
        Configured Logger instance
    """
    if log_dir is None:
        # Default log directory is ./logs
        log_dir = Path("./logs")
    
    return Logger(
        name=name,
        log_dir=log_dir,
        level=level,
    )

```

--------------------------------------------------------------------------------
/src/aider_mcp_server/tests/atoms/test_logging.py:
--------------------------------------------------------------------------------

```python
import pytest
import logging
from pathlib import Path

from aider_mcp_server.atoms.logging import Logger, get_logger


def test_logger_creation_and_file_output(tmp_path):
    """Test Logger instance creation using get_logger and log file existence with fixed name."""
    log_dir = tmp_path / "logs"
    logger_name = "test_logger_creation"
    expected_log_file = log_dir / "aider_mcp_server.log" # Fixed log file name

    # --- Test get_logger ---
    logger = get_logger(
        name=logger_name,
        log_dir=log_dir,
        level=logging.INFO,
    )
    assert logger is not None, "Logger instance from get_logger should be created"
    assert logger.name == logger_name

    # Log a message to ensure file handling is triggered
    logger.info("Initial log message.")

    # Verify log directory and file exist
    assert log_dir.exists(), f"Log directory should be created by get_logger at {log_dir}"
    assert log_dir.is_dir(), f"Log path created by get_logger should be a directory"
    assert expected_log_file.exists(), f"Log file should be created by get_logger at {expected_log_file}"
    assert expected_log_file.is_file(), f"Log path created by get_logger should point to a file"


def test_log_levels_and_output(tmp_path):
    """Test logging at different levels to the fixed log file using get_logger."""
    log_dir = tmp_path / "logs"
    logger_name = "test_logger_levels"
    expected_log_file = log_dir / "aider_mcp_server.log" # Fixed log file name

    # Instantiate our custom logger with DEBUG level using get_logger
    logger = get_logger(
        name=logger_name,
        log_dir=log_dir,
        level=logging.DEBUG,
    )

    # Log messages at different levels
    messages = {
        logging.DEBUG: "This is a debug message.",
        logging.INFO: "This is an info message.",
        logging.WARNING: "This is a warning message.",
        logging.ERROR: "This is an error message.",
        logging.CRITICAL: "This is a critical message.",
    }

    logger.debug(messages[logging.DEBUG])
    logger.info(messages[logging.INFO])
    logger.warning(messages[logging.WARNING])
    logger.error(messages[logging.ERROR])
    logger.critical(messages[logging.CRITICAL])

    # Verify file output
    assert expected_log_file.exists(), "Log file should exist for level testing"

    file_content = expected_log_file.read_text()

    # Verify file output contains messages and level indicators
    for level, msg in messages.items():
        level_name = logging.getLevelName(level)
        assert msg in file_content, f"Message '{msg}' not found in file content"
        assert level_name in file_content, f"Level '{level_name}' not found in file content"
        assert logger_name in file_content, f"Logger name '{logger_name}' not found in file content"


def test_log_level_filtering(tmp_path):
    """Test that messages below the set log level are filtered using get_logger."""
    log_dir = tmp_path / "logs"
    logger_name = "test_logger_filtering"
    expected_log_file = log_dir / "aider_mcp_server.log" # Fixed log file name

    # Instantiate the logger with WARNING level using get_logger
    logger = get_logger(
        name=logger_name,
        log_dir=log_dir,
        level=logging.WARNING,
    )

    # Log messages at different levels
    debug_msg = "This debug message should NOT appear."
    info_msg = "This info message should NOT appear."
    warning_msg = "This warning message SHOULD appear."
    error_msg = "This error message SHOULD appear."
    critical_msg = "This critical message SHOULD appear." # Add critical for completeness

    logger.debug(debug_msg)
    logger.info(info_msg)
    logger.warning(warning_msg)
    logger.error(error_msg)
    logger.critical(critical_msg)

    # Verify file output filtering
    assert expected_log_file.exists(), "Log file should exist for filtering testing"

    file_content = expected_log_file.read_text()

    assert debug_msg not in file_content, "Debug message should be filtered from file"
    assert info_msg not in file_content, "Info message should be filtered from file"
    assert warning_msg in file_content, "Warning message should appear in file"
    assert error_msg in file_content, "Error message should appear in file"
    assert critical_msg in file_content, "Critical message should appear in file"
    assert logging.getLevelName(logging.DEBUG) not in file_content, "DEBUG level indicator should be filtered from file"
    assert logging.getLevelName(logging.INFO) not in file_content, "INFO level indicator should be filtered from file"
    assert logging.getLevelName(logging.WARNING) in file_content, "WARNING level indicator should appear in file"
    assert logging.getLevelName(logging.ERROR) in file_content, "ERROR level indicator should appear in file"
    assert logging.getLevelName(logging.CRITICAL) in file_content, "CRITICAL level indicator should appear in file"
    assert logger_name in file_content, f"Logger name '{logger_name}' should appear in file content"


def test_log_appending(tmp_path):
    """Test that log messages are appended to the existing log file."""
    log_dir = tmp_path / "logs"
    logger_name_1 = "test_logger_append_1"
    logger_name_2 = "test_logger_append_2"
    expected_log_file = log_dir / "aider_mcp_server.log" # Fixed log file name

    # First logger instance and message
    logger1 = get_logger(
        name=logger_name_1,
        log_dir=log_dir,
        level=logging.INFO,
    )
    message1 = "First message to append."
    logger1.info(message1)

    # Ensure some time passes or context switches if needed, though file handler should manage appending
    # Second logger instance (or could reuse logger1) and message
    logger2 = get_logger(
        name=logger_name_2, # Can use a different name or the same
        log_dir=log_dir,
        level=logging.INFO,
    )
    message2 = "Second message to append."
    logger2.info(message2)

    # Verify both messages are in the file
    assert expected_log_file.exists(), "Log file should exist for appending test"
    file_content = expected_log_file.read_text()

    assert message1 in file_content, "First message not found in appended log file"
    assert logger_name_1 in file_content, "First logger name not found in appended log file"
    assert message2 in file_content, "Second message not found in appended log file"
    assert logger_name_2 in file_content, "Second logger name not found in appended log file"

```

--------------------------------------------------------------------------------
/src/aider_mcp_server/atoms/tools/aider_ai_code.py:
--------------------------------------------------------------------------------

```python
import json
from typing import List, Optional, Dict, Any, Union
import os
import os.path
import subprocess
from aider.models import Model
from aider.coders import Coder
from aider.io import InputOutput
from aider_mcp_server.atoms.logging import get_logger
from aider_mcp_server.atoms.utils import DEFAULT_EDITOR_MODEL

# Configure logging for this module
logger = get_logger(__name__)

# Type alias for response dictionary
ResponseDict = Dict[str, Union[bool, str]]


def _get_changes_diff_or_content(
    relative_editable_files: List[str], working_dir: str = None
) -> str:
    """
    Get the git diff for the specified files, or their content if git fails.

    Args:
        relative_editable_files: List of files to check for changes
        working_dir: The working directory where the git repo is located
    """
    diff = ""
    # Log current directory for debugging
    current_dir = os.getcwd()
    logger.info(f"Current directory during diff: {current_dir}")
    if working_dir:
        logger.info(f"Using working directory: {working_dir}")

    # Always attempt to use git
    files_arg = " ".join(relative_editable_files)
    logger.info(f"Attempting to get git diff for: {' '.join(relative_editable_files)}")

    try:
        # Use git -C to specify the repository directory
        if working_dir:
            diff_cmd = f"git -C {working_dir} diff -- {files_arg}"
        else:
            diff_cmd = f"git diff -- {files_arg}"

        logger.info(f"Running git command: {diff_cmd}")
        diff = subprocess.check_output(
            diff_cmd, shell=True, text=True, stderr=subprocess.PIPE
        )
        logger.info("Successfully obtained git diff.")
    except subprocess.CalledProcessError as e:
        logger.warning(
            f"Git diff command failed with exit code {e.returncode}. Error: {e.stderr.strip()}"
        )
        logger.warning("Falling back to reading file contents.")
        diff = "Git diff failed. Current file contents:\n\n"
        for file_path in relative_editable_files:
            full_path = (
                os.path.join(working_dir, file_path) if working_dir else file_path
            )
            if os.path.exists(full_path):
                try:
                    with open(full_path, "r") as f:
                        content = f.read()
                        diff += f"--- {file_path} ---\n{content}\n\n"
                        logger.info(f"Read content for {file_path}")
                except Exception as read_e:
                    logger.error(
                        f"Failed reading file {full_path} for content fallback: {read_e}"
                    )
                    diff += f"--- {file_path} --- (Error reading file)\n\n"
            else:
                logger.warning(f"File {full_path} not found during content fallback.")
                diff += f"--- {file_path} --- (File not found)\n\n"
    except Exception as e:
        logger.error(f"Unexpected error getting git diff: {str(e)}")
        diff = f"Error getting git diff: {str(e)}\n\n"  # Provide error in diff string as fallback
    return diff


def _check_for_meaningful_changes(
    relative_editable_files: List[str], working_dir: str = None
) -> bool:
    """
    Check if the edited files contain meaningful content.

    Args:
        relative_editable_files: List of files to check
        working_dir: The working directory where files are located
    """
    for file_path in relative_editable_files:
        # Use the working directory if provided
        full_path = os.path.join(working_dir, file_path) if working_dir else file_path
        logger.info(f"Checking for meaningful content in: {full_path}")

        if os.path.exists(full_path):
            try:
                with open(full_path, "r") as f:
                    content = f.read()
                    # Check if the file has more than just whitespace or a single comment line,
                    # or contains common code keywords. This is a heuristic.
                    stripped_content = content.strip()
                    if stripped_content and (
                        len(stripped_content.split("\n")) > 1
                        or any(
                            kw in content
                            for kw in [
                                "def ",
                                "class ",
                                "import ",
                                "from ",
                                "async def",
                            ]
                        )
                    ):
                        logger.info(f"Meaningful content found in: {file_path}")
                        return True
            except Exception as e:
                logger.error(
                    f"Failed reading file {full_path} during meaningful change check: {e}"
                )
                # If we can't read it, we can't confirm meaningful change from this file
                continue
        else:
            logger.info(
                f"File not found or empty, skipping meaningful check: {full_path}"
            )

    logger.info("No meaningful changes detected in any editable files.")
    return False


def _process_coder_results(
    relative_editable_files: List[str], working_dir: str = None
) -> ResponseDict:
    """
    Process the results after Aider has run, checking for meaningful changes
    and retrieving the diff or content.

    Args:
        relative_editable_files: List of files that were edited
        working_dir: The working directory where the git repo is located

    Returns:
        Dictionary with success status and diff output
    """
    diff_output = _get_changes_diff_or_content(relative_editable_files, working_dir)
    logger.info("Checking for meaningful changes in edited files...")
    has_meaningful_content = _check_for_meaningful_changes(
        relative_editable_files, working_dir
    )

    if has_meaningful_content:
        logger.info("Meaningful changes found. Processing successful.")
        return {"success": True, "diff": diff_output}
    else:
        logger.warning(
            "No meaningful changes detected. Processing marked as unsuccessful."
        )
        # Even if no meaningful content, provide the diff/content if available
        return {
            "success": False,
            "diff": diff_output
            or "No meaningful changes detected and no diff/content available.",
        }


def _format_response(response: ResponseDict) -> str:
    """
    Format the response dictionary as a JSON string.

    Args:
        response: Dictionary containing success status and diff output

    Returns:
        JSON string representation of the response
    """
    return json.dumps(response, indent=4)


def code_with_aider(
    ai_coding_prompt: str,
    relative_editable_files: List[str],
    relative_readonly_files: List[str] = [],
    model: str = DEFAULT_EDITOR_MODEL,
    working_dir: str = None,
) -> str:
    """
    Run Aider to perform AI coding tasks based on the provided prompt and files.

    Args:
        ai_coding_prompt (str): The prompt for the AI to execute.
        relative_editable_files (List[str]): List of files that can be edited.
        relative_readonly_files (List[str], optional): List of files that can be read but not edited. Defaults to [].
        model (str, optional): The model to use. Defaults to DEFAULT_EDITOR_MODEL.
        working_dir (str, required): The working directory where git repository is located and files are stored.

    Returns:
        Dict[str, Any]: {'success': True/False, 'diff': str with git diff output}
    """
    logger.info("Starting code_with_aider process.")
    logger.info(f"Prompt: '{ai_coding_prompt}'")

    # Working directory must be provided
    if not working_dir:
        error_msg = "Error: working_dir is required for code_with_aider"
        logger.error(error_msg)
        return json.dumps({"success": False, "diff": error_msg})

    logger.info(f"Working directory: {working_dir}")
    logger.info(f"Editable files: {relative_editable_files}")
    logger.info(f"Readonly files: {relative_readonly_files}")
    logger.info(f"Model: {model}")

    try:
        # Configure the model
        logger.info("Configuring AI model...")  # Point 1: Before init
        ai_model = Model(model)
        logger.info(f"Configured model: {model}")
        logger.info("AI model configured.")  # Point 2: After init

        # Create the coder instance
        logger.info("Creating Aider coder instance...")
        # Use working directory for chat history file if provided
        history_dir = working_dir
        abs_editable_files = [
            os.path.join(working_dir, file) for file in relative_editable_files
        ]
        abs_readonly_files = [
            os.path.join(working_dir, file) for file in relative_readonly_files
        ]
        chat_history_file = os.path.join(history_dir, ".aider.chat.history.md")
        logger.info(f"Using chat history file: {chat_history_file}")

        coder = Coder.create(
            main_model=ai_model,
            io=InputOutput(
                yes=True,
                chat_history_file=chat_history_file,
            ),
            fnames=abs_editable_files,
            read_only_fnames=abs_readonly_files,
            auto_commits=False,  # We'll handle commits separately
            suggest_shell_commands=False,
            detect_urls=False,
            use_git=True,  # Always use git
        )
        logger.info("Aider coder instance created successfully.")

        # Run the coding session
        logger.info("Starting Aider coding session...")  # Point 3: Before run
        result = coder.run(ai_coding_prompt)
        logger.info(f"Aider coding session result: {result}")
        logger.info("Aider coding session finished.")  # Point 4: After run

        # Process the results after the coder has run
        logger.info("Processing coder results...")  # Point 5: Processing results
        try:
            response = _process_coder_results(relative_editable_files, working_dir)
            logger.info("Coder results processed.")
        except Exception as e:
            logger.exception(
                f"Error processing coder results: {str(e)}"
            )  # Point 6: Error
            response = {
                "success": False,
                "diff": f"Error processing files after execution: {str(e)}",
            }

    except Exception as e:
        logger.exception(
            f"Critical Error in code_with_aider: {str(e)}"
        )  # Point 6: Error
        response = {
            "success": False,
            "diff": f"Unhandled Error during Aider execution: {str(e)}",
        }

    formatted_response = _format_response(response)
    logger.info(
        f"code_with_aider process completed. Success: {response.get('success')}"
    )
    logger.info(
        f"Formatted response: {formatted_response}"
    )  # Log complete response for debugging
    return formatted_response

```

--------------------------------------------------------------------------------
/src/aider_mcp_server/server.py:
--------------------------------------------------------------------------------

```python
import json
import sys
import os
import asyncio
import subprocess
import logging
from typing import Dict, Any, Optional, List, Tuple, Union

import mcp
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent

from aider_mcp_server.atoms.logging import get_logger
from aider_mcp_server.atoms.utils import DEFAULT_EDITOR_MODEL
from aider_mcp_server.atoms.tools.aider_ai_code import code_with_aider
from aider_mcp_server.atoms.tools.aider_list_models import list_models

# Configure logging
logger = get_logger(__name__)

# Define MCP tools
AIDER_AI_CODE_TOOL = Tool(
    name="aider_ai_code",
    description="Run Aider to perform AI coding tasks based on the provided prompt and files",
    inputSchema={
        "type": "object",
        "properties": {
            "ai_coding_prompt": {
                "type": "string",
                "description": "The prompt for the AI to execute",
            },
            "relative_editable_files": {
                "type": "array",
                "description": "LIST of relative paths to files that can be edited",
                "items": {"type": "string"},
            },
            "relative_readonly_files": {
                "type": "array",
                "description": "LIST of relative paths to files that can be read but not edited, add files that are not editable but useful for context",
                "items": {"type": "string"},
            },
            "model": {
                "type": "string",
                "description": "The primary AI model Aider should use for generating code, leave blank unless model is specified in the request",
            },
        },
        "required": ["ai_coding_prompt", "relative_editable_files"],
    },
)

LIST_MODELS_TOOL = Tool(
    name="list_models",
    description="List available models that match the provided substring",
    inputSchema={
        "type": "object",
        "properties": {
            "substring": {
                "type": "string",
                "description": "Substring to match against available models",
            }
        },
    },
)


def is_git_repository(directory: str) -> Tuple[bool, Union[str, None]]:
    """
    Check if the specified directory is a git repository.

    Args:
        directory (str): The directory to check.

    Returns:
        Tuple[bool, Union[str, None]]: A tuple containing a boolean indicating if it's a git repo,
                                      and an error message if it's not.
    """
    try:
        # Make sure the directory exists
        if not os.path.isdir(directory):
            return False, f"Directory does not exist: {directory}"

        # Use the git command with -C option to specify the working directory
        # This way we don't need to change our current directory
        result = subprocess.run(
            ["git", "-C", directory, "rev-parse", "--is-inside-work-tree"],
            capture_output=True,
            text=True,
            check=False,
        )

        if result.returncode == 0 and result.stdout.strip() == "true":
            return True, None
        else:
            return False, result.stderr.strip() or "Directory is not a git repository"

    except subprocess.SubprocessError as e:
        return False, f"Error checking git repository: {str(e)}"
    except Exception as e:
        return False, f"Unexpected error checking git repository: {str(e)}"


def process_aider_ai_code_request(
    params: Dict[str, Any],
    editor_model: str,
    current_working_dir: str,
) -> Dict[str, Any]:
    """
    Process an aider_ai_code request.

    Args:
        params (Dict[str, Any]): The request parameters.
        editor_model (str): The editor model to use.
        current_working_dir (str): The current working directory where git repo is located.

    Returns:
        Dict[str, Any]: The response data.
    """
    ai_coding_prompt = params.get("ai_coding_prompt", "")
    relative_editable_files = params.get("relative_editable_files", [])
    relative_readonly_files = params.get("relative_readonly_files", [])

    # Ensure relative_editable_files is a list
    if isinstance(relative_editable_files, str):
        logger.info(
            f"Converting single editable file string to list: {relative_editable_files}"
        )
        relative_editable_files = [relative_editable_files]

    # Ensure relative_readonly_files is a list
    if isinstance(relative_readonly_files, str):
        logger.info(
            f"Converting single readonly file string to list: {relative_readonly_files}"
        )
        relative_readonly_files = [relative_readonly_files]

    # Get the model from request parameters if provided
    request_model = params.get("model")

    # Log the request details
    logger.info(f"AI Coding Request: Prompt: '{ai_coding_prompt}'")
    logger.info(f"Editable files: {relative_editable_files}")
    logger.info(f"Readonly files: {relative_readonly_files}")
    logger.info(f"Editor model: {editor_model}")
    if request_model:
        logger.info(f"Request-specified model: {request_model}")

    # Use the model specified in the request if provided, otherwise use the editor model
    model_to_use = request_model if request_model else editor_model

    # Use the passed-in current_working_dir parameter
    logger.info(f"Using working directory for code_with_aider: {current_working_dir}")

    result_json = code_with_aider(
        ai_coding_prompt=ai_coding_prompt,
        relative_editable_files=relative_editable_files,
        relative_readonly_files=relative_readonly_files,
        model=model_to_use,
        working_dir=current_working_dir,
    )

    # Parse the JSON string result
    try:
        result_dict = json.loads(result_json)
    except json.JSONDecodeError as e:
        logger.error(f"Error: Failed to parse JSON response from code_with_aider: {e}")
        logger.error(f"Received raw response: {result_json}")
        return {"error": "Failed to process AI coding result"}

    logger.info(
        f"AI Coding Request Completed. Success: {result_dict.get('success', False)}"
    )
    return {
        "success": result_dict.get("success", False),
        "diff": result_dict.get("diff", "Error retrieving diff"),
    }


def process_list_models_request(params: Dict[str, Any]) -> Dict[str, Any]:
    """
    Process a list_models request.

    Args:
        params (Dict[str, Any]): The request parameters.

    Returns:
        Dict[str, Any]: The response data.
    """
    substring = params.get("substring", "")

    # Log the request details
    logger.info(f"List Models Request: Substring: '{substring}'")

    models = list_models(substring)
    logger.info(f"Found {len(models)} models matching '{substring}'")

    return {"models": models}


def handle_request(
    request: Dict[str, Any],
    current_working_dir: str,
    editor_model: str,
) -> Dict[str, Any]:
    """
    Handle incoming MCP requests according to the MCP protocol.

    Args:
        request (Dict[str, Any]): The request JSON.
        current_working_dir (str): The current working directory. Must be a valid git repository.
        editor_model (str): The editor model to use.

    Returns:
        Dict[str, Any]: The response JSON.
    """
    try:
        # Validate current_working_dir is provided and is a git repository
        if not current_working_dir:
            error_msg = "Error: current_working_dir is required. Please provide a valid git repository path."
            logger.error(error_msg)
            return {"error": error_msg}

        # MCP protocol requires 'name' and 'parameters' fields
        if "name" not in request:
            logger.error("Error: Received request missing 'name' field.")
            return {"error": "Missing 'name' field in request"}

        request_type = request.get("name")
        params = request.get("parameters", {})

        logger.info(
            f"Received request: Type='{request_type}', CWD='{current_working_dir}'"
        )

        # Validate that the current_working_dir is a git repository before changing to it
        is_git_repo, error_message = is_git_repository(current_working_dir)
        if not is_git_repo:
            error_msg = f"Error: The specified directory '{current_working_dir}' is not a valid git repository: {error_message}"
            logger.error(error_msg)
            return {"error": error_msg}

        # Set working directory
        logger.info(f"Changing working directory to: {current_working_dir}")
        os.chdir(current_working_dir)

        # Route to the appropriate handler based on request type
        if request_type == "aider_ai_code":
            return process_aider_ai_code_request(
                params, editor_model, current_working_dir
            )

        elif request_type == "list_models":
            return process_list_models_request(params)

        else:
            # Unknown request type
            logger.warning(f"Warning: Unknown request type received: {request_type}")
            return {"error": f"Unknown request type: {request_type}"}

    except Exception as e:
        # Handle any errors
        logger.exception(
            f"Critical Error: Unhandled exception during request processing: {str(e)}"
        )
        return {"error": f"Internal server error: {str(e)}"}


async def serve(
    editor_model: str = DEFAULT_EDITOR_MODEL,
    current_working_dir: str = None,
) -> None:
    """
    Start the MCP server following the Model Context Protocol.

    The server reads JSON requests from stdin and writes JSON responses to stdout.
    Each request should contain a 'name' field indicating the tool to invoke, and
    a 'parameters' field with the tool-specific parameters.

    Args:
        editor_model (str, optional): The editor model to use. Defaults to DEFAULT_EDITOR_MODEL.
        current_working_dir (str, required): The current working directory. Must be a valid git repository.

    Raises:
        ValueError: If current_working_dir is not provided or is not a git repository.
    """
    logger.info(f"Starting Aider MCP Server")
    logger.info(f"Editor Model: {editor_model}")

    # Validate current_working_dir is provided
    if not current_working_dir:
        error_msg = "Error: current_working_dir is required. Please provide a valid git repository path."
        logger.error(error_msg)
        raise ValueError(error_msg)

    logger.info(f"Initial Working Directory: {current_working_dir}")

    # Validate that the current_working_dir is a git repository
    is_git_repo, error_message = is_git_repository(current_working_dir)
    if not is_git_repo:
        error_msg = f"Error: The specified directory '{current_working_dir}' is not a valid git repository: {error_message}"
        logger.error(error_msg)
        raise ValueError(error_msg)

    logger.info(f"Validated git repository at: {current_working_dir}")

    # Set working directory
    logger.info(f"Setting working directory to: {current_working_dir}")
    os.chdir(current_working_dir)

    # Create the MCP server
    server = Server("aider-mcp-server")

    @server.list_tools()
    async def list_tools() -> List[Tool]:
        """Register all available tools with the MCP server."""
        return [AIDER_AI_CODE_TOOL, LIST_MODELS_TOOL]

    @server.call_tool()
    async def call_tool(name: str, arguments: Dict[str, Any]) -> List[TextContent]:
        """Handle tool calls from the MCP client."""
        logger.info(f"Received Tool Call: Name='{name}'")
        logger.info(f"Arguments: {arguments}")

        try:
            if name == "aider_ai_code":
                logger.info(f"Processing 'aider_ai_code' tool call...")
                result = process_aider_ai_code_request(
                    arguments, editor_model, current_working_dir
                )
                return [TextContent(type="text", text=json.dumps(result))]

            elif name == "list_models":
                logger.info(f"Processing 'list_models' tool call...")
                result = process_list_models_request(arguments)
                return [TextContent(type="text", text=json.dumps(result))]

            else:
                logger.warning(f"Warning: Received call for unknown tool: {name}")
                return [
                    TextContent(
                        type="text", text=json.dumps({"error": f"Unknown tool: {name}"})
                    )
                ]

        except Exception as e:
            logger.exception(f"Error: Exception during tool call '{name}': {e}")
            return [
                TextContent(
                    type="text",
                    text=json.dumps(
                        {"error": f"Error processing tool {name}: {str(e)}"}
                    ),
                )
            ]

    # Initialize and run the server
    try:
        options = server.create_initialization_options()
        logger.info("Initializing stdio server connection...")
        async with stdio_server() as (read_stream, write_stream):
            logger.info("Server running. Waiting for requests...")
            await server.run(read_stream, write_stream, options, raise_exceptions=True)
    except Exception as e:
        logger.exception(
            f"Critical Error: Server stopped due to unhandled exception: {e}"
        )
        raise
    finally:
        logger.info("Aider MCP Server shutting down.")

```

--------------------------------------------------------------------------------
/src/aider_mcp_server/tests/atoms/tools/test_aider_ai_code.py:
--------------------------------------------------------------------------------

```python
import os
import json
import tempfile
import pytest
import shutil
import subprocess
from aider_mcp_server.atoms.tools.aider_ai_code import code_with_aider
from aider_mcp_server.atoms.utils import DEFAULT_TESTING_MODEL

@pytest.fixture
def temp_dir():
    """Create a temporary directory with an initialized Git repository for testing."""
    tmp_dir = tempfile.mkdtemp()
    
    # Initialize git repository in the temp directory
    subprocess.run(["git", "init"], cwd=tmp_dir, capture_output=True, text=True, check=True)
    
    # Configure git user for the repository
    subprocess.run(["git", "config", "user.name", "Test User"], cwd=tmp_dir, capture_output=True, text=True, check=True)
    subprocess.run(["git", "config", "user.email", "[email protected]"], cwd=tmp_dir, capture_output=True, text=True, check=True)
    
    # Create and commit an initial file to have a valid git history
    with open(os.path.join(tmp_dir, "README.md"), "w") as f:
        f.write("# Test Repository\nThis is a test repository for Aider MCP Server tests.")
    
    subprocess.run(["git", "add", "README.md"], cwd=tmp_dir, capture_output=True, text=True, check=True)
    subprocess.run(["git", "commit", "-m", "Initial commit"], cwd=tmp_dir, capture_output=True, text=True, check=True)
    
    yield tmp_dir
    
    # Clean up
    shutil.rmtree(tmp_dir)

def test_addition(temp_dir):
    """Test that code_with_aider can create a file that adds two numbers."""
    # Create the test file
    test_file = os.path.join(temp_dir, "math_add.py")
    with open(test_file, "w") as f:
        f.write("# This file should implement addition\n")
    
    prompt = "Implement a function add(a, b) that returns the sum of a and b in the math_add.py file."
    
    # Run code_with_aider with working_dir
    result = code_with_aider(
        ai_coding_prompt=prompt,
        relative_editable_files=[test_file],
        working_dir=temp_dir  # Pass the temp directory as working_dir
    )
    
    # Parse the JSON result
    result_dict = json.loads(result)
    
    # Check that it succeeded
    assert result_dict["success"] is True, "Expected code_with_aider to succeed"
    assert "diff" in result_dict, "Expected diff to be in result"
    
    # Check that the file was modified correctly
    with open(test_file, "r") as f:
        content = f.read()
    
    assert any(x in content for x in ["def add(a, b):", "def add(a:"]), "Expected to find add function in the file"
    assert "return a + b" in content, "Expected to find return statement in the file"
    
    # Try to import and use the function
    import sys
    sys.path.append(temp_dir)
    from math_add import add
    assert add(2, 3) == 5, "Expected add(2, 3) to return 5"

def test_subtraction(temp_dir):
    """Test that code_with_aider can create a file that subtracts two numbers."""
    # Create the test file
    test_file = os.path.join(temp_dir, "math_subtract.py")
    with open(test_file, "w") as f:
        f.write("# This file should implement subtraction\n")
    
    prompt = "Implement a function subtract(a, b) that returns a minus b in the math_subtract.py file."
    
    # Run code_with_aider with working_dir
    result = code_with_aider(
        ai_coding_prompt=prompt,
        relative_editable_files=[test_file],
        working_dir=temp_dir  # Pass the temp directory as working_dir
    )
    
    # Parse the JSON result
    result_dict = json.loads(result)
    
    # Check that it succeeded
    assert result_dict["success"] is True, "Expected code_with_aider to succeed"
    assert "diff" in result_dict, "Expected diff to be in result"
    
    # Check that the file was modified correctly
    with open(test_file, "r") as f:
        content = f.read()
    
    assert any(x in content for x in ["def subtract(a, b):", "def subtract(a:"]), "Expected to find subtract function in the file"
    assert "return a - b" in content, "Expected to find return statement in the file"
    
    # Try to import and use the function
    import sys
    sys.path.append(temp_dir)
    from math_subtract import subtract
    assert subtract(5, 3) == 2, "Expected subtract(5, 3) to return 2"

def test_multiplication(temp_dir):
    """Test that code_with_aider can create a file that multiplies two numbers."""
    # Create the test file
    test_file = os.path.join(temp_dir, "math_multiply.py")
    with open(test_file, "w") as f:
        f.write("# This file should implement multiplication\n")
    
    prompt = "Implement a function multiply(a, b) that returns the product of a and b in the math_multiply.py file."
    
    # Run code_with_aider with working_dir
    result = code_with_aider(
        ai_coding_prompt=prompt,
        relative_editable_files=[test_file],
        working_dir=temp_dir  # Pass the temp directory as working_dir
    )
    
    # Parse the JSON result
    result_dict = json.loads(result)
    
    # Check that it succeeded
    assert result_dict["success"] is True, "Expected code_with_aider to succeed"
    assert "diff" in result_dict, "Expected diff to be in result"
    
    # Check that the file was modified correctly
    with open(test_file, "r") as f:
        content = f.read()
    
    assert any(x in content for x in ["def multiply(a, b):", "def multiply(a:"]), "Expected to find multiply function in the file"
    assert "return a * b" in content, "Expected to find return statement in the file"
    
    # Try to import and use the function
    import sys
    sys.path.append(temp_dir)
    from math_multiply import multiply
    assert multiply(2, 3) == 6, "Expected multiply(2, 3) to return 6"

def test_division(temp_dir):
    """Test that code_with_aider can create a file that divides two numbers."""
    # Create the test file
    test_file = os.path.join(temp_dir, "math_divide.py")
    with open(test_file, "w") as f:
        f.write("# This file should implement division\n")
    
    prompt = "Implement a function divide(a, b) that returns a divided by b in the math_divide.py file. Handle division by zero by returning None."
    
    # Run code_with_aider with working_dir
    result = code_with_aider(
        ai_coding_prompt=prompt,
        relative_editable_files=[test_file],
        working_dir=temp_dir  # Pass the temp directory as working_dir
    )
    
    # Parse the JSON result
    result_dict = json.loads(result)
    
    # Check that it succeeded
    assert result_dict["success"] is True, "Expected code_with_aider to succeed"
    assert "diff" in result_dict, "Expected diff to be in result"
    
    # Check that the file was modified correctly
    with open(test_file, "r") as f:
        content = f.read()
    
    assert any(x in content for x in ["def divide(a, b):", "def divide(a:"]), "Expected to find divide function in the file"
    assert "return" in content, "Expected to find return statement in the file"
    
    # Try to import and use the function
    import sys
    sys.path.append(temp_dir)
    from math_divide import divide
    assert divide(6, 3) == 2, "Expected divide(6, 3) to return 2"
    assert divide(1, 0) is None, "Expected divide(1, 0) to return None"

def test_failure_case(temp_dir):
    """Test that code_with_aider returns error information for a failure scenario."""
    
    try:
        # Ensure this test runs in a non-git directory
        os.chdir(temp_dir)
        
        # Create a test file in the temp directory
        test_file = os.path.join(temp_dir, "failure_test.py")
        with open(test_file, "w") as f:
            f.write("# This file should trigger a failure\n")
        
        # Use an invalid model name to ensure a failure
        prompt = "This prompt should fail because we're using a non-existent model."
        
        # Run code_with_aider with an invalid model name
        result = code_with_aider(
            ai_coding_prompt=prompt,
            relative_editable_files=[test_file],
            model="non_existent_model_123456789",  # This model doesn't exist
            working_dir=temp_dir  # Pass the temp directory as working_dir
        )
        
        # Parse the JSON result
        result_dict = json.loads(result)

        # Check the result - we're still expecting success=False but the important part
        # is that we get a diff that explains the error.
        # The diff should indicate that no meaningful changes were made,
        # often because the model couldn't be reached or produced no output.
        assert "diff" in result_dict, "Expected diff to be in result"
        diff_content = result_dict["diff"]
        assert "File contents after editing (git not used):" in diff_content or "No meaningful changes detected" in diff_content, \
               f"Expected error information like 'File contents after editing' or 'No meaningful changes' in diff, but got: {diff_content}"
    finally:
        # Make sure we go back to the main directory
        os.chdir("/Users/indydevdan/Documents/projects/aider-mcp-exp")

def test_complex_tasks(temp_dir):
    """Test that code_with_aider correctly implements more complex tasks."""
    # Create the test file for a calculator class
    test_file = os.path.join(temp_dir, "calculator.py")
    with open(test_file, "w") as f:
        f.write("# This file should implement a calculator class\n")
    
    # More complex prompt suitable for architect mode
    prompt = """
    Create a Calculator class with the following features:
    1. Basic operations: add, subtract, multiply, divide methods
    2. Memory functions: memory_store, memory_recall, memory_clear
    3. A history feature that keeps track of operations 
    4. A method to show_history
    5. Error handling for division by zero
    
    All methods should be well-documented with docstrings.
    """
    
    # Run code_with_aider with explicit model
    result = code_with_aider(
        ai_coding_prompt=prompt,
        relative_editable_files=[test_file],
        model=DEFAULT_TESTING_MODEL,  # Main model
        working_dir=temp_dir  # Pass the temp directory as working_dir
    )
    
    # Parse the JSON result
    result_dict = json.loads(result)
    
    # Check that it succeeded
    assert result_dict["success"] is True, "Expected code_with_aider with architect mode to succeed"
    assert "diff" in result_dict, "Expected diff to be in result"
    
    # Check that the file was modified correctly with expected elements
    with open(test_file, "r") as f:
        content = f.read()
    
    # Check for class definition and methods - relaxed assertions to accommodate type hints
    assert "class Calculator" in content, "Expected to find Calculator class definition"
    assert "add" in content, "Expected to find add method"
    assert "subtract" in content, "Expected to find subtract method"
    assert "multiply" in content, "Expected to find multiply method"
    assert "divide" in content, "Expected to find divide method"
    assert "memory_" in content, "Expected to find memory functions"
    assert "history" in content, "Expected to find history functionality"
    
    # Import and test basic calculator functionality
    import sys
    sys.path.append(temp_dir)
    from calculator import Calculator
    
    # Test the calculator
    calc = Calculator()
    
    # Test basic operations
    assert calc.add(2, 3) == 5, "Expected add(2, 3) to return 5"
    assert calc.subtract(5, 3) == 2, "Expected subtract(5, 3) to return 2"
    assert calc.multiply(2, 3) == 6, "Expected multiply(2, 3) to return 6"
    assert calc.divide(6, 3) == 2, "Expected divide(6, 3) to return 2"
    
    # Test division by zero error handling
    try:
        result = calc.divide(5, 0)
        assert result is None or isinstance(result, (str, type(None))), \
            "Expected divide by zero to return None, error message, or raise exception"
    except Exception:
        # It's fine if it raises an exception - that's valid error handling too
        pass
    
    # Test memory functions if implemented as expected
    try:
        calc.memory_store(10)
        assert calc.memory_recall() == 10, "Expected memory_recall() to return stored value"
        calc.memory_clear()
        assert calc.memory_recall() == 0 or calc.memory_recall() is None, \
            "Expected memory_recall() to return 0 or None after clearing"
    except (AttributeError, TypeError):
        # Some implementations might handle memory differently
        pass

def test_diff_output(temp_dir):
    """Test that code_with_aider produces proper git diff output when modifying existing files."""
    # Create an initial math file
    test_file = os.path.join(temp_dir, "math_operations.py")
    initial_content = """# Math operations module
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b
"""
    
    with open(test_file, "w") as f:
        f.write(initial_content)
    
    # Commit the initial file to git
    subprocess.run(["git", "add", "math_operations.py"], cwd=temp_dir, capture_output=True, text=True, check=True)
    subprocess.run(["git", "commit", "-m", "Add initial math operations"], cwd=temp_dir, capture_output=True, text=True, check=True)
    
    # Now modify the file using Aider
    prompt = "Add a multiply function that takes two parameters and returns their product. Also add a docstring to the existing add function."
    
    result = code_with_aider(
        ai_coding_prompt=prompt,
        relative_editable_files=["math_operations.py"],
        model=DEFAULT_TESTING_MODEL,
        working_dir=temp_dir
    )
    
    # Parse the JSON result
    result_dict = json.loads(result)
    
    # Check that it succeeded
    assert result_dict["success"] is True, "Expected code_with_aider to succeed"
    assert "diff" in result_dict, "Expected diff to be in result"
    
    # Verify the diff contains expected git diff markers
    diff_content = result_dict["diff"]
    assert "diff --git" in diff_content, "Expected git diff header in diff output"
    assert "@@" in diff_content, "Expected hunk headers (@@) in diff output"
    assert "+++ b/math_operations.py" in diff_content, "Expected new file marker in diff"
    assert "--- a/math_operations.py" in diff_content, "Expected old file marker in diff"
    
    # Verify the diff shows additions (lines starting with +)
    diff_lines = diff_content.split('\n')
    added_lines = [line for line in diff_lines if line.startswith('+') and not line.startswith('+++')]
    assert len(added_lines) > 0, "Expected to find added lines in diff"
    
    # Check that multiply function was actually added to the file
    with open(test_file, "r") as f:
        final_content = f.read()
    
    assert "def multiply" in final_content, "Expected multiply function to be added"
    assert "docstring" in final_content.lower() or '"""' in final_content, "Expected docstring to be added"

```

--------------------------------------------------------------------------------
/ai_docs/just-prompt-example-mcp-server.xml:
--------------------------------------------------------------------------------

```
This file is a merged representation of a subset of the codebase, containing files not matching ignore patterns, combined into a single document by Repomix.

<file_summary>
This section contains a summary of this file.

<purpose>
This file contains a packed representation of the entire repository's contents.
It is designed to be easily consumable by AI systems for analysis, code review,
or other automated processes.
</purpose>

<file_format>
The content is organized as follows:
1. This summary section
2. Repository information
3. Directory structure
4. Repository files, each consisting of:
  - File path as an attribute
  - Full contents of the file
</file_format>

<usage_guidelines>
- This file should be treated as read-only. Any changes should be made to the
  original repository files, not this packed version.
- When processing this file, use the file path to distinguish
  between different files in the repository.
- Be aware that this file may contain sensitive information. Handle it with
  the same level of security as you would the original repository.
</usage_guidelines>

<notes>
- Some files may have been excluded based on .gitignore rules and Repomix's configuration
- Binary files are not included in this packed representation. Please refer to the Repository Structure section for a complete list of file paths, including binary files
- Files matching these patterns are excluded: uv.lock, example_outputs/*, ai_docs
- Files matching patterns in .gitignore are excluded
- Files matching default ignore patterns are excluded
- Files are sorted by Git change count (files with more changes are at the bottom)
</notes>

<additional_info>

</additional_info>

</file_summary>

<directory_structure>
.claude/
  commands/
    context_prime_w_lead.md
    context_prime.md
    jprompt_ultra_diff_review.md
    project_hello_w_name.md
    project_hello.md
prompts/
  countdown_component.txt
  mock_bin_search.txt
  mock_ui_component.txt
specs/
  init-just-prompt.md
src/
  just_prompt/
    atoms/
      llm_providers/
        __init__.py
        anthropic.py
        deepseek.py
        gemini.py
        groq.py
        ollama.py
        openai.py
      shared/
        __init__.py
        data_types.py
        model_router.py
        utils.py
        validator.py
      __init__.py
    molecules/
      __init__.py
      list_models.py
      list_providers.py
      prompt_from_file_to_file.py
      prompt_from_file.py
      prompt.py
    tests/
      atoms/
        llm_providers/
          __init__.py
          test_anthropic.py
          test_deepseek.py
          test_gemini.py
          test_groq.py
          test_ollama.py
          test_openai.py
        shared/
          __init__.py
          test_model_router.py
          test_utils.py
          test_validator.py
        __init__.py
      molecules/
        __init__.py
        test_list_models.py
        test_list_providers.py
        test_prompt_from_file_to_file.py
        test_prompt_from_file.py
        test_prompt.py
      __init__.py
    __init__.py
    __main__.py
    server.py
ultra_diff_review/
  diff_anthropic_claude-3-7-sonnet-20250219_4k.md
  diff_gemini_gemini-2.0-flash-thinking-exp.md
  diff_openai_o3-mini.md
  fusion_ultra_diff_review.md
.env.sample
.gitignore
.mcp.json
.python-version
list_models.py
pyproject.toml
README.md
</directory_structure>

<files>
This section contains the contents of the repository's files.

<file path=".claude/commands/context_prime_w_lead.md">
READ README.md, THEN run git ls-files to understand the context of the project.

Be sure to also READ: $ARGUMENTS and nothing else.
</file>

<file path=".claude/commands/project_hello.md">
hi how are you
</file>

<file path="prompts/countdown_component.txt">
Create a countdown timer component that satisfies these requirements:

1. Framework implementations:
   - Vue.js
   - Svelte
   - React
   - Vanilla JavaScript

2. Component interface:
   - :start-time: number (starting time in seconds)
   - :format: number (display format, 0 = MM:SS, 1 = HH:MM:SS)

3. Features:
   - Count down from start-time to zero
   - Display remaining time in specified format
   - Stop counting when reaching zero
   - Emit/callback 'finished' event when countdown completes
   - Provide a visual indication when time is running low (< 10% of total)

4. Include:
   - Component implementation
   - Sample usage
   - Clear comments explaining key parts

Provide clean, well-structured code for each framework version.
</file>

<file path="prompts/mock_bin_search.txt">
python: return code exclusively: def binary_search(arr, target) -> Optional[int]:
</file>

<file path="prompts/mock_ui_component.txt">
Build vue, react, and svelte components for this component definition:

<TableOfContents :tree="tree" />

The tree is a json object that looks like this:

```json
{
    "name": "TableOfContents",
    "children": [
        {
            "name": "Item",
            "children": [
                {
                    "name": "Item",
                    "children": []
                }
            ]
        },
        {
            "name": "Item 2",
            "children": []
        }
    ]
}
```
</file>

<file path="specs/init-just-prompt.md">
# Specification for Just Prompt
> We're building a lightweight wrapper mcp server around openai, anthropic, gemini, groq, deepseek, and ollama.

## Implementation details

- First, READ ai_docs/* to understand the providers, models, and to see an example mcp server.
- Mirror the work done inside `of ai_docs/pocket-pick-mcp-server-example.xml`. Here we have a complete example of how to build a mcp server. We also have a complete codebase structure that we want to replicate. With some slight tweaks - see `Codebase Structure` below.
- Don't mock any tests - run simple "What is the capital of France?" tests and expect them to pass case insensitive.
- Be sure to use load_dotenv() in the tests.
- models_prefixed_by_provider look like this:
  - openai:gpt-4o
  - anthropic:claude-3-5-sonnet-20240620
  - gemini:gemini-1.5-flash
  - groq:llama-3.1-70b-versatile
  - deepseek:deepseek-coder
  - ollama:llama3.1
  - or using short names:
    - o:gpt-4o
    - a:claude-3-5-sonnet-20240620
    - g:gemini-1.5-flash
    - q:llama-3.1-70b-versatile
    - d:deepseek-coder
    - l:llama3.1
- Be sure to comment every function and class with clear doc strings.
- Don't explicitly write out the full list of models for a provider. Instead, use the `list_models` function.
- Create a 'magic' function somewhere using the weak_provider_and_model param - make sure this is callable. We're going to take the 'models_prefixed_by_provider' and pass it to this function running a custom prompt where we ask the model to return the right model for this given item. TO be clear the 'models_prefixed_by_provider' will be a natural language query and will sometimes be wrong, so we want to correct it after parsing the provider and update it to the right value by provider this weak model prompt the list_model() call for the provider, then add the to the prompt and ask it to return the right model ONLY IF the model (from the split : call) is not in the providers list_model() already. If we run this functionality be sure to log 'weak_provider_and_model' and the 'models_prefixed_by_provider' and the 'corrected_model' to the console. If we dont just say 'using <provider> and <model>'.
- For tests use these models
  - o:gpt-4o-mini
  - a:claude-3-5-haiku
  - g:gemini-2.0-flash
  - q:qwen-2.5-32b
  - d:deepseek-coder
  - l:gemma3:12b
- To implement list models read `list_models.py`.

## Tools we want to expose
> Here's the tools we want to expose:

prompt(text, models_prefixed_by_provider: List[str]) -> List[str] (return value is list of responses)

prompt_from_file(file, models_prefixed_by_provider: List[str]) -> List[str] (return value is list of responses)

prompt_from_file_to_file(file, models_prefixed_by_provider: List[str], output_dir: str = ".") -> List[str] (return value is a list of file paths)

list_providers() -> List[str]

list_models(provider: str) -> List[str]

## Codebase Structure

- .env.sample
- src/
  - just_prompt/
    - __init__.py
    - __main__.py
    - server.py
      - serve(weak_provider_and_model: str = "o:gpt-4o-mini") -> None
    - atoms/
      - __init__.py
      - llm_providers/
        - __init__.py
        - openai.py
          - prompt(text, model) -> str
          - list_models() -> List[str]
        - anthropic.py
          - ...same as openai.py
        - gemini.py
          - ...
        - groq.py
          - ...
        - deepseek.py
          - ...
        - ollama.py
          - ...
      - shared/
        - __init__.py
        - validator.py
          - validate_models_prefixed_by_provider(models_prefixed_by_provider: List[str]) -> raise error if a model prefix does not match a provider
        - utils.py
          - split_provider_and_model(model: str) -> Tuple[str, str] - be sure this only splits the first : in the model string and leaves the rest of the string as the model name. Models will have additional : in the string and we want to ignore them and leave them for the model name.
        - data_types.py
          - class PromptRequest(BaseModel) {text: str, models_prefixed_by_provider: List[str]}
          - class PromptResponse(BaseModel) {responses: List[str]}
          - class PromptFromFileRequest(BaseModel) {file: str, models_prefixed_by_provider: List[str]}
          - class PromptFromFileResponse(BaseModel) {responses: List[str]}
          - class PromptFromFileToFileRequest(BaseModel) {file: str, models_prefixed_by_provider: List[str], output_dir: str = "."}
          - class PromptFromFileToFileResponse(BaseModel) {file_paths: List[str]}
          - class ListProvidersRequest(BaseModel) {}
          - class ListProvidersResponse(BaseModel) {providers: List[str]} - returns all providers with long and short names
          - class ListModelsRequest(BaseModel) {provider: str}
          - class ListModelsResponse(BaseModel) {models: List[str]} - returns all models for a given provider
          - class ModelAlias(BaseModel) {provider: str, model: str}
          - class ModelProviders(Enum):
              OPENAI = ("openai", "o")
              ANTHROPIC = ("anthropic", "a")
              GEMINI = ("gemini", "g")
              GROQ = ("groq", "q")
              DEEPSEEK = ("deepseek", "d")
              OLLAMA = ("ollama", "l")
              
              def __init__(self, full_name, short_name):
                  self.full_name = full_name
                  self.short_name = short_name
                  
              @classmethod
              def from_name(cls, name):
                  for provider in cls:
                      if provider.full_name == name or provider.short_name == name:
                          return provider
                  return None
        - model_router.py
    - molecules/
      - __init__.py
      - prompt.py
      - prompt_from_file.py
      - prompt_from_file_to_file.py
      - list_providers.py
      - list_models.py
    - tests/
      - __init__.py
      - atoms/
        - __init__.py
        - llm_providers/
          - __init__.py
          - test_openai.py
          - test_anthropic.py
          - test_gemini.py
          - test_groq.py
          - test_deepseek.py
          - test_ollama.py
        - shared/
          - __init__.py
          - test_utils.py
      - molecules/
        - __init__.py
        - test_prompt.py
        - test_prompt_from_file.py
        - test_prompt_from_file_to_file.py
        - test_list_providers.py
        - test_list_models.py

## Per provider documentation

### OpenAI
See: `ai_docs/llm_providers_details.xml`

### Anthropic
See: `ai_docs/llm_providers_details.xml`

### Gemini
See: `ai_docs/llm_providers_details.xml`

### Groq

Quickstart
Get up and running with the Groq API in a few minutes.

Create an API Key
Please visit here to create an API Key.

Set up your API Key (recommended)
Configure your API key as an environment variable. This approach streamlines your API usage by eliminating the need to include your API key in each request. Moreover, it enhances security by minimizing the risk of inadvertently including your API key in your codebase.

In your terminal of choice:

export GROQ_API_KEY=<your-api-key-here>
Requesting your first chat completion
curl
JavaScript
Python
JSON
Install the Groq Python library:

pip install groq
Performing a Chat Completion:

import os

from groq import Groq

client = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Explain the importance of fast language models",
        }
    ],
    model="llama-3.3-70b-versatile",
)

print(chat_completion.choices[0].message.content)
Now that you have successfully received a chat completion, you can try out the other endpoints in the API.

Next Steps
Check out the Playground to try out the Groq API in your browser
Join our GroqCloud developer community on Discord
Chat with our Docs at lightning speed using the Groq API!
Add a how-to on your project to the Groq API Cookbook

### DeepSeek
See: `ai_docs/llm_providers_details.xml`

### Ollama
See: `ai_docs/llm_providers_details.xml`


## Validation (close the loop)

- Run `uv run pytest <path_to_test>` to validate the tests are passing - do this iteratively as you build out the tests.
- After code is written, run `uv run pytest` to validate all tests are passing.
- At the end Use `uv run just-prompt --help` to validate the mcp server works.
</file>

<file path="src/just_prompt/atoms/llm_providers/__init__.py">
# LLM Providers package - interfaces for various LLM APIs
</file>

<file path="src/just_prompt/atoms/llm_providers/deepseek.py">
"""
DeepSeek provider implementation.
"""

import os
from typing import List
import logging
from openai import OpenAI
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Configure logging
logger = logging.getLogger(__name__)

# Initialize DeepSeek client with OpenAI-compatible interface
client = OpenAI(
    api_key=os.environ.get("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com"
)


def prompt(text: str, model: str) -> str:
    """
    Send a prompt to DeepSeek and get a response.
    
    Args:
        text: The prompt text
        model: The model name
        
    Returns:
        Response string from the model
    """
    try:
        logger.info(f"Sending prompt to DeepSeek model: {model}")
        
        # Create chat completion
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": text}],
            stream=False,
        )
        
        # Extract response content
        return response.choices[0].message.content
    except Exception as e:
        logger.error(f"Error sending prompt to DeepSeek: {e}")
        raise ValueError(f"Failed to get response from DeepSeek: {str(e)}")


def list_models() -> List[str]:
    """
    List available DeepSeek models.
    
    Returns:
        List of model names
    """
    try:
        logger.info("Listing DeepSeek models")
        response = client.models.list()
        
        # Extract model IDs
        models = [model.id for model in response.data]
        
        return models
    except Exception as e:
        logger.error(f"Error listing DeepSeek models: {e}")
        # Return some known models if API fails
        logger.info("Returning hardcoded list of known DeepSeek models")
        return [
            "deepseek-coder",
            "deepseek-chat",
            "deepseek-reasoner",
            "deepseek-coder-v2",
            "deepseek-reasoner-lite"
        ]
</file>

<file path="src/just_prompt/atoms/llm_providers/gemini.py">
"""
Google Gemini provider implementation.
"""

import os
from typing import List
import logging
import google.generativeai as genai
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Configure logging
logger = logging.getLogger(__name__)

# Initialize Gemini
genai.configure(api_key=os.environ.get("GEMINI_API_KEY"))


def prompt(text: str, model: str) -> str:
    """
    Send a prompt to Google Gemini and get a response.
    
    Args:
        text: The prompt text
        model: The model name
        
    Returns:
        Response string from the model
    """
    try:
        logger.info(f"Sending prompt to Gemini model: {model}")
        
        # Create generative model
        gemini_model = genai.GenerativeModel(model_name=model)
        
        # Generate content
        response = gemini_model.generate_content(text)
        
        return response.text
    except Exception as e:
        logger.error(f"Error sending prompt to Gemini: {e}")
        raise ValueError(f"Failed to get response from Gemini: {str(e)}")


def list_models() -> List[str]:
    """
    List available Google Gemini models.
    
    Returns:
        List of model names
    """
    try:
        logger.info("Listing Gemini models")
        
        # Get the list of models
        models = []
        for m in genai.list_models():
            if "generateContent" in m.supported_generation_methods:
                models.append(m.name)
                
        # Format model names - strip the "models/" prefix if present
        formatted_models = [model.replace("models/", "") for model in models]
        
        return formatted_models
    except Exception as e:
        logger.error(f"Error listing Gemini models: {e}")
        # Return some known models if API fails
        logger.info("Returning hardcoded list of known Gemini models")
        return [
            "gemini-1.5-pro",
            "gemini-1.5-flash",
            "gemini-1.5-flash-latest",
            "gemini-1.0-pro",
            "gemini-2.0-flash"
        ]
</file>

<file path="src/just_prompt/atoms/llm_providers/groq.py">
"""
Groq provider implementation.
"""

import os
from typing import List
import logging
from groq import Groq
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Configure logging
logger = logging.getLogger(__name__)

# Initialize Groq client
client = Groq(api_key=os.environ.get("GROQ_API_KEY"))


def prompt(text: str, model: str) -> str:
    """
    Send a prompt to Groq and get a response.
    
    Args:
        text: The prompt text
        model: The model name
        
    Returns:
        Response string from the model
    """
    try:
        logger.info(f"Sending prompt to Groq model: {model}")
        
        # Create chat completion
        chat_completion = client.chat.completions.create(
            messages=[{"role": "user", "content": text}],
            model=model,
        )
        
        # Extract response content
        return chat_completion.choices[0].message.content
    except Exception as e:
        logger.error(f"Error sending prompt to Groq: {e}")
        raise ValueError(f"Failed to get response from Groq: {str(e)}")


def list_models() -> List[str]:
    """
    List available Groq models.
    
    Returns:
        List of model names
    """
    try:
        logger.info("Listing Groq models")
        response = client.models.list()
        
        # Extract model IDs
        models = [model.id for model in response.data]
        
        return models
    except Exception as e:
        logger.error(f"Error listing Groq models: {e}")
        # Return some known models if API fails
        logger.info("Returning hardcoded list of known Groq models")
        return [
            "llama-3.3-70b-versatile",
            "llama-3.1-70b-versatile",
            "llama-3.1-8b-versatile",
            "mixtral-8x7b-32768",
            "gemma-7b-it",
            "qwen-2.5-32b"
        ]
</file>

<file path="src/just_prompt/atoms/shared/__init__.py">
# Shared package - common utilities and data types
</file>

<file path="src/just_prompt/atoms/__init__.py">
# Atoms package - basic building blocks
</file>

<file path="src/just_prompt/molecules/__init__.py">
# Molecules package - higher-level functionality built from atoms
</file>

<file path="src/just_prompt/molecules/list_models.py">
"""
List models functionality for just-prompt.
"""

from typing import List
import logging
from ..atoms.shared.validator import validate_provider
from ..atoms.shared.model_router import ModelRouter

logger = logging.getLogger(__name__)


def list_models(provider: str) -> List[str]:
    """
    List available models for a provider.
    
    Args:
        provider: Provider name (full or short)
        
    Returns:
        List of model names
    """
    # Validate provider
    validate_provider(provider)
    
    # Get models from provider
    return ModelRouter.route_list_models(provider)
</file>

<file path="src/just_prompt/molecules/list_providers.py">
"""
List providers functionality for just-prompt.
"""

from typing import List, Dict
import logging
from ..atoms.shared.data_types import ModelProviders

logger = logging.getLogger(__name__)


def list_providers() -> List[Dict[str, str]]:
    """
    List all available providers with their full and short names.
    
    Returns:
        List of dictionaries with provider information
    """
    providers = []
    for provider in ModelProviders:
        providers.append({
            "name": provider.name,
            "full_name": provider.full_name,
            "short_name": provider.short_name
        })
    
    return providers
</file>

<file path="src/just_prompt/tests/atoms/llm_providers/__init__.py">
# LLM Providers tests package
</file>

<file path="src/just_prompt/tests/atoms/llm_providers/test_deepseek.py">
"""
Tests for DeepSeek provider.
"""

import pytest
import os
from dotenv import load_dotenv
from just_prompt.atoms.llm_providers import deepseek

# Load environment variables
load_dotenv()

# Skip tests if API key not available
if not os.environ.get("DEEPSEEK_API_KEY"):
    pytest.skip("DeepSeek API key not available", allow_module_level=True)


def test_list_models():
    """Test listing DeepSeek models."""
    models = deepseek.list_models()
    assert isinstance(models, list)
    assert len(models) > 0
    assert all(isinstance(model, str) for model in models)


def test_prompt():
    """Test sending prompt to DeepSeek."""
    response = deepseek.prompt("What is the capital of France?", "deepseek-coder")
    assert isinstance(response, str)
    assert len(response) > 0
    assert "paris" in response.lower() or "Paris" in response
</file>

<file path="src/just_prompt/tests/atoms/llm_providers/test_groq.py">
"""
Tests for Groq provider.
"""

import pytest
import os
from dotenv import load_dotenv
from just_prompt.atoms.llm_providers import groq

# Load environment variables
load_dotenv()

# Skip tests if API key not available
if not os.environ.get("GROQ_API_KEY"):
    pytest.skip("Groq API key not available", allow_module_level=True)


def test_list_models():
    """Test listing Groq models."""
    models = groq.list_models()
    assert isinstance(models, list)
    assert len(models) > 0
    assert all(isinstance(model, str) for model in models)


def test_prompt():
    """Test sending prompt to Groq."""
    response = groq.prompt("What is the capital of France?", "qwen-2.5-32b")
    assert isinstance(response, str)
    assert len(response) > 0
    assert "paris" in response.lower() or "Paris" in response
</file>

<file path="src/just_prompt/tests/atoms/shared/__init__.py">
# Shared tests package
</file>

<file path="src/just_prompt/tests/atoms/shared/test_utils.py">
"""
Tests for utility functions.
"""

import pytest
from just_prompt.atoms.shared.utils import split_provider_and_model, get_provider_from_prefix


def test_split_provider_and_model():
    """Test splitting provider and model from string."""
    # Test basic splitting
    provider, model = split_provider_and_model("openai:gpt-4")
    assert provider == "openai"
    assert model == "gpt-4"
    
    # Test short provider name
    provider, model = split_provider_and_model("o:gpt-4")
    assert provider == "o"
    assert model == "gpt-4"
    
    # Test model with colons
    provider, model = split_provider_and_model("ollama:llama3:latest")
    assert provider == "ollama"
    assert model == "llama3:latest"
    
    # Test invalid format
    with pytest.raises(ValueError):
        split_provider_and_model("invalid-model-string")


def test_get_provider_from_prefix():
    """Test getting provider from prefix."""
    # Test full names
    assert get_provider_from_prefix("openai") == "openai"
    assert get_provider_from_prefix("anthropic") == "anthropic"
    assert get_provider_from_prefix("gemini") == "gemini"
    assert get_provider_from_prefix("groq") == "groq"
    assert get_provider_from_prefix("deepseek") == "deepseek"
    assert get_provider_from_prefix("ollama") == "ollama"
    
    # Test short names
    assert get_provider_from_prefix("o") == "openai"
    assert get_provider_from_prefix("a") == "anthropic"
    assert get_provider_from_prefix("g") == "gemini"
    assert get_provider_from_prefix("q") == "groq"
    assert get_provider_from_prefix("d") == "deepseek"
    assert get_provider_from_prefix("l") == "ollama"
    
    # Test invalid prefix
    with pytest.raises(ValueError):
        get_provider_from_prefix("unknown")
</file>

<file path="src/just_prompt/tests/atoms/__init__.py">
# Atoms tests package
</file>

<file path="src/just_prompt/tests/molecules/__init__.py">
# Molecules tests package
</file>

<file path="src/just_prompt/tests/molecules/test_list_providers.py">
"""
Tests for list_providers functionality.
"""

import pytest
from just_prompt.molecules.list_providers import list_providers


def test_list_providers():
    """Test listing providers."""
    providers = list_providers()
    
    # Check basic structure
    assert isinstance(providers, list)
    assert len(providers) > 0
    assert all(isinstance(p, dict) for p in providers)
    
    # Check expected providers are present
    provider_names = [p["name"] for p in providers]
    assert "OPENAI" in provider_names
    assert "ANTHROPIC" in provider_names
    assert "GEMINI" in provider_names
    assert "GROQ" in provider_names
    assert "DEEPSEEK" in provider_names
    assert "OLLAMA" in provider_names
    
    # Check each provider has required fields
    for provider in providers:
        assert "name" in provider
        assert "full_name" in provider
        assert "short_name" in provider
        
        # Check full_name and short_name values
        if provider["name"] == "OPENAI":
            assert provider["full_name"] == "openai"
            assert provider["short_name"] == "o"
        elif provider["name"] == "ANTHROPIC":
            assert provider["full_name"] == "anthropic"
            assert provider["short_name"] == "a"
</file>

<file path="src/just_prompt/tests/__init__.py">
# Tests package
</file>

<file path="src/just_prompt/__init__.py">
# just-prompt - A lightweight wrapper MCP server for various LLM providers

__version__ = "0.1.0"
</file>

<file path="ultra_diff_review/diff_anthropic_claude-3-7-sonnet-20250219_4k.md">
# Code Review

I've analyzed the changes made to the `list_models.py` file. The diff shows a complete refactoring of the file that organizes model listing functionality into separate functions for different AI providers.

## Key Changes

1. **Code Organization:** The code has been restructured from a series of commented blocks into organized functions for each AI provider.
2. **Function Implementation:** Each provider now has a dedicated function for listing their available models.
3. **DeepSeek API Key:** A hardcoded API key is now present in the DeepSeek function.
4. **Function Execution:** All functions are defined but commented out at the bottom of the file.

## Issues and Improvements

### 1. Hardcoded API Key
The `list_deepseek_models()` function contains a hardcoded API key: `"sk-ds-3f422175ff114212a42d7107c3efd1e4"`. This is a significant security risk as API keys should never be stored in source code.

### 2. Inconsistent Environment Variable Usage
Most functions use environment variables for API keys, but the DeepSeek function does not follow this pattern.

### 3. Error Handling
None of the functions include error handling for API failures, network issues, or missing API keys.

### 4. Import Organization
Import statements are scattered throughout the functions instead of being consolidated at the top of the file.

### 5. No Main Function
There's no main function or entrypoint that would allow users to select which model list they want to see.

## Issue Summary

| Issue | Solution | Risk Assessment |
|-------|----------|-----------------|
| 🚨 Hardcoded API key in DeepSeek function | Replace with environment variable: `api_key=os.environ.get("DEEPSEEK_API_KEY")` | High - Security risk, potential unauthorized API usage and charges |
| ⚠️ No error handling | Add try/except blocks to handle API errors, network issues, and missing credentials | Medium - Code will fail without clear error messages |
| 🔧 Inconsistent environment variable usage | Standardize API key access across all providers | Low - Maintenance and consistency issue |
| 🔧 Scattered imports | Consolidate common imports at the top of the file | Low - Code organization issue |
| 💡 No main function or CLI | Add a main function with argument parsing to run specific provider functions | Low - Usability enhancement |
| 💡 Missing API key validation | Add checks to validate API keys are present before making API calls | Medium - Prevents unclear errors when keys are missing |

The most critical issue is the hardcoded API key which should be addressed immediately to prevent security risks.
</file>

<file path="ultra_diff_review/diff_gemini_gemini-2.0-flash-thinking-exp.md">
## Code Review

The diff introduces modularity and improves the structure of the script by encapsulating the model listing logic for each provider into separate functions. However, there are a few issues and areas for improvement.

**Issues, Bugs, and Improvements:**

1.  **🚨 Hardcoded API Key (DeepSeek):** The `list_deepseek_models` function includes a hardcoded API key for DeepSeek. This is a major security vulnerability as API keys should be kept secret and managed securely, preferably through environment variables.

2.  **⚠️ Lack of Error Handling:** The script lacks error handling. If API calls fail due to network issues, invalid API keys, or other reasons, the script will likely crash or produce uninformative error messages.  Robust error handling is crucial for production-ready code.

3.  **ℹ️ Inconsistent API Key Loading (Minor):** While `dotenv` is used for Anthropic and Gemini API keys, OpenAI, Groq, and DeepSeek (partially) rely directly on environment variables.  While functional, consistent use of `dotenv` for all API keys would enhance maintainability and project consistency.

4.  **ℹ️ Missing Function Docstrings (Minor):** The functions lack docstrings explaining their purpose, parameters (if any), and return values. Docstrings enhance code readability and make it easier to understand the function's role.

5.  **ℹ️ No Centralized Configuration (Minor):**  While using environment variables is good, having a more centralized configuration mechanism (even if it's just a `.env` file loaded by `dotenv`) could be beneficial for managing various settings in the future.

**Markdown Table of Issues:**

| Issue                      | Solution                                                    | Risk Assessment |
|----------------------------|-------------------------------------------------------------|-----------------|
| 🚨 **Hardcoded API Key (DeepSeek)** | Use environment variables to store and access the DeepSeek API key. | High            |
| ⚠️ **Lack of Error Handling**    | Implement `try-except` blocks to handle potential API errors. | Medium          |
| ℹ️ **Inconsistent API Key Loading** | Use `dotenv` consistently for all API keys.               | Low             |
| ℹ️ **Missing Function Docstrings** | Add docstrings to each function explaining its purpose.   | Low             |
| ℹ️ **No Centralized Config**    | Consider a more centralized configuration approach if needed. | Low             |
</file>

<file path="ultra_diff_review/diff_openai_o3-mini.md">
Below is a review of the changes with observations on potential issues, bugs, and improvements:

1. Hardcoded API key in list_deepseek_models  
 • Issue: The Deepseek function uses a hardcoded API key ("sk-ds-3f422175ff114212a42d7107c3efd1e4"). This exposes sensitive credentials in the source code.  
 • Recommendation: Retrieve the key from an environment variable (or a secure vault) as is done for other models.  
 • Severity: 🚨 Critical

2. Repeated load_dotenv calls  
 • Issue: Both list_anthropic_models and list_gemini_models call load_dotenv() even if they might be used in the same run.  
 • Recommendation: Consider loading environment variables once in a main entry point or in a shared initialization function.  
 • Severity: ⚠️ Moderate

3. Redundant API calls in list_gemini_models  
 • Issue: The Gemini function calls client.models.list() twice (once for generateContent and again for embedContent). This might be inefficient if each call performs network I/O.  
 • Recommendation: Cache the result of client.models.list() into a variable and reuse it for both loops.  
 • Severity: ⚠️ Low

4. Inconsistent variable naming and potential confusion  
 • Observation: In list_groq_models, the result of client.models.list() is stored in a variable named chat_completion even though the function is about listing models.  
 • Recommendation: Use a name such as models or model_list for clarity.  
 • Severity: ℹ️ Low

5. Lack of error handling for API calls  
 • Observation: All functions simply print the results of API calls without handling potential exceptions (e.g., network errors, invalid credentials).  
 • Recommendation: Wrap API calls in try-except blocks and add meaningful error messages.  
 • Severity: ⚠️ Moderate

6. Consistency in output formatting  
 • Observation: While some functions print header messages (like list_anthropic_models and list_gemini_models), others (like list_openai_models or list_deepseek_models) simply print the raw result.  
 • Recommendation: Add consistent formatting or output messages for clarity.  
 • Severity: ℹ️ Low

Below is a concise summary in a markdown table:

| Issue                                | Solution                                                                                 | Risk Assessment          |
|--------------------------------------|------------------------------------------------------------------------------------------|--------------------------|
| Hardcoded API key in Deepseek        | Use an environment variable (e.g., os.environ.get("DEEPSEEK_API_KEY"))                     | 🚨 Critical              |
| Multiple load_dotenv() calls         | Load environment variables once at program start instead of in each function               | ⚠️ Moderate             |
| Redundant API call in Gemini models  | Cache client.models.list() in a variable and reuse it for looping through supported actions | ⚠️ Low                  |
| Inconsistent variable naming (Groq)  | Rename variables (e.g., change "chat_completion" to "models" in list_groq_models)            | ℹ️ Low (cosmetic)       |
| Lack of error handling               | Wrap API calls in try-except blocks and log errors or provide user-friendly error messages  | ⚠️ Moderate             |

This review should help in making the code more secure, efficient, and maintainable.
</file>

<file path="ultra_diff_review/fusion_ultra_diff_review.md">
# Ultra Diff Review - Fusion Analysis

## Overview
This is a synthesized analysis combining insights from multiple LLM reviews of the changes made to `list_models.py`. The code has been refactored to organize model listing functionality into separate functions for different AI providers.

## Critical Issues

### 1. 🚨 Hardcoded API Key (DeepSeek)
**Description**: The `list_deepseek_models()` function contains a hardcoded API key (`"sk-ds-3f422175ff114212a42d7107c3efd1e4"`).
**Impact**: Major security vulnerability that could lead to unauthorized API usage and charges.
**Solution**: Use environment variables instead:
```python
api_key=os.environ.get("DEEPSEEK_API_KEY")
```

### 2. ⚠️ Lack of Error Handling
**Description**: None of the functions include error handling for API failures, network issues, or missing credentials.
**Impact**: Code will crash or produce uninformative errors with actual usage.
**Solution**: Implement try-except blocks for all API calls:
```python
try:
    client = DeepSeek(api_key=os.environ.get("DEEPSEEK_API_KEY"))
    models = client.models.list()
    # Process models
except Exception as e:
    print(f"Error fetching DeepSeek models: {e}")
```

## Medium Priority Issues

### 3. ⚠️ Multiple load_dotenv() Calls
**Description**: Both `list_anthropic_models()` and `list_gemini_models()` call `load_dotenv()` independently.
**Impact**: Redundant operations if multiple functions are called in the same run.
**Solution**: Move `load_dotenv()` to a single location at the top of the file.

### 4. ⚠️ Inconsistent API Key Access Patterns
**Description**: Different functions use different methods to access API keys.
**Impact**: Reduces code maintainability and consistency.
**Solution**: Standardize API key access patterns across all providers.

### 5. ⚠️ Redundant API Call in Gemini Function
**Description**: `list_gemini_models()` calls `client.models.list()` twice for different filtering operations.
**Impact**: Potential performance issue - may make unnecessary network calls.
**Solution**: Store results in a variable and reuse:
```python
models = client.models.list()
print("List of models that support generateContent:\n")
for m in models:
    # Filter for generateContent
    
print("List of models that support embedContent:\n")
for m in models:
    # Filter for embedContent
```

## Low Priority Issues

### 6. ℹ️ Inconsistent Variable Naming
**Description**: In `list_groq_models()`, the result of `client.models.list()` is stored in a variable named `chat_completion`.
**Impact**: Low - could cause confusion during maintenance.
**Solution**: Use a more appropriate variable name like `models` or `model_list`.

### 7. ℹ️ Inconsistent Output Formatting
**Description**: Some functions include descriptive print statements, while others just print raw results.
**Impact**: Low - user experience inconsistency.
**Solution**: Standardize output formatting across all functions.

### 8. ℹ️ Scattered Imports
**Description**: Import statements are scattered throughout functions rather than at the top of the file.
**Impact**: Low - code organization issue.
**Solution**: Consolidate imports at the top of the file.

### 9. ℹ️ Missing Function Docstrings
**Description**: Functions lack documentation describing their purpose and usage.
**Impact**: Low - reduces code readability and maintainability.
**Solution**: Add docstrings to all functions.

### 10. 💡 No Main Function
**Description**: There's no main function to coordinate the execution of different provider functions.
**Impact**: Low - usability enhancement needed.
**Solution**: Add a main function with argument parsing to run specific provider functions.

## Summary Table

| ID | Issue | Solution | Risk Assessment |
|----|-------|----------|-----------------|
| 1 | 🚨 Hardcoded API key (DeepSeek) | Use environment variables | High |
| 2 | ⚠️ No error handling | Add try/except blocks for API calls | Medium |
| 3 | ⚠️ Multiple load_dotenv() calls | Move to single location at file top | Medium |
| 4 | ⚠️ Inconsistent API key access | Standardize patterns across providers | Medium |
| 5 | ⚠️ Redundant API call (Gemini) | Cache API response in variable | Medium |
| 6 | ℹ️ Inconsistent variable naming | Rename variables appropriately | Low |
| 7 | ℹ️ Inconsistent output formatting | Standardize output format | Low |
| 8 | ℹ️ Scattered imports | Consolidate imports at file top | Low |
| 9 | ℹ️ Missing function docstrings | Add documentation to functions | Low |
| 10 | 💡 No main function | Add main() with argument parsing | Low |

## Recommendation
The hardcoded API key issue (#1) should be addressed immediately as it poses a significant security risk. Following that, implementing proper error handling (#2) would greatly improve the reliability of the code.
</file>

<file path=".python-version">
3.12
</file>

<file path=".claude/commands/context_prime.md">
READ README.md, THEN run git ls-files to understand the context of the project.
</file>

<file path=".claude/commands/project_hello_w_name.md">
hi how are you $ARGUMENTS
</file>

<file path="src/just_prompt/atoms/shared/data_types.py">
"""
Data types and models for just-prompt MCP server.
"""

from enum import Enum


class ModelProviders(Enum):
    """
    Enum of supported model providers with their full and short names.
    """
    OPENAI = ("openai", "o")
    ANTHROPIC = ("anthropic", "a")
    GEMINI = ("gemini", "g") 
    GROQ = ("groq", "q")
    DEEPSEEK = ("deepseek", "d")
    OLLAMA = ("ollama", "l")
    
    def __init__(self, full_name, short_name):
        self.full_name = full_name
        self.short_name = short_name
        
    @classmethod
    def from_name(cls, name):
        """
        Get provider enum from full or short name.
        
        Args:
            name: The provider name (full or short)
            
        Returns:
            ModelProviders: The corresponding provider enum, or None if not found
        """
        for provider in cls:
            if provider.full_name == name or provider.short_name == name:
                return provider
        return None
</file>

<file path="src/just_prompt/atoms/shared/validator.py">
"""
Validation utilities for just-prompt.
"""

from typing import List, Dict, Optional, Tuple
import logging
import os
from .data_types import ModelProviders
from .utils import split_provider_and_model, get_api_key

logger = logging.getLogger(__name__)


def validate_models_prefixed_by_provider(models_prefixed_by_provider: List[str]) -> bool:
    """
    Validate that provider prefixes in model strings are valid.
    
    Args:
        models_prefixed_by_provider: List of model strings in format "provider:model"
        
    Returns:
        True if all valid, raises ValueError otherwise
    """
    if not models_prefixed_by_provider:
        raise ValueError("No models provided")
    
    for model_string in models_prefixed_by_provider:
        try:
            provider_prefix, model_name = split_provider_and_model(model_string)
            provider = ModelProviders.from_name(provider_prefix)
            if provider is None:
                raise ValueError(f"Unknown provider prefix: {provider_prefix}")
        except Exception as e:
            logger.error(f"Validation error for model string '{model_string}': {str(e)}")
            raise
    
    return True


def validate_provider(provider: str) -> bool:
    """
    Validate that a provider name is valid.
    
    Args:
        provider: Provider name (full or short)
        
    Returns:
        True if valid, raises ValueError otherwise
    """
    provider_enum = ModelProviders.from_name(provider)
    if provider_enum is None:
        raise ValueError(f"Unknown provider: {provider}")
    
    return True


def validate_provider_api_keys() -> Dict[str, bool]:
    """
    Validate that API keys are available for each provider.
    
    Returns:
        Dictionary mapping provider names to availability status (True if available, False otherwise)
    """
    available_providers = {}
    
    # Check API keys for each provider
    for provider in ModelProviders:
        provider_name = provider.full_name
        
        # Special case for Ollama which uses OLLAMA_HOST instead of an API key
        if provider_name == "ollama":
            host = os.environ.get("OLLAMA_HOST")
            is_available = host is not None and host.strip() != ""
            available_providers[provider_name] = is_available
        else:
            # Get API key
            api_key = get_api_key(provider_name)
            is_available = api_key is not None and api_key.strip() != ""
            available_providers[provider_name] = is_available
    
    return available_providers


def print_provider_availability(detailed: bool = True) -> None:
    """
    Print information about which providers are available based on API keys.
    
    Args:
        detailed: Whether to print detailed information about missing keys
    """
    availability = validate_provider_api_keys()
    
    available = [p for p, status in availability.items() if status]
    unavailable = [p for p, status in availability.items() if not status]
    
    # Print availability information
    logger.info(f"Available LLM providers: {', '.join(available)}")
    
    if detailed and unavailable:
        env_vars = {
            "openai": "OPENAI_API_KEY",
            "anthropic": "ANTHROPIC_API_KEY",
            "gemini": "GEMINI_API_KEY", 
            "groq": "GROQ_API_KEY",
            "deepseek": "DEEPSEEK_API_KEY",
            "ollama": "OLLAMA_HOST"
        }
        
        logger.warning(f"The following providers are unavailable due to missing API keys:")
        for provider in unavailable:
            env_var = env_vars.get(provider)
            if env_var:
                logger.warning(f"  - {provider}: Missing environment variable {env_var}")
            else:
                logger.warning(f"  - {provider}: Missing configuration")
</file>

<file path="src/just_prompt/molecules/prompt_from_file.py">
"""
Prompt from file functionality for just-prompt.
"""

from typing import List
import logging
import os
from pathlib import Path
from .prompt import prompt

logger = logging.getLogger(__name__)


def prompt_from_file(file: str, models_prefixed_by_provider: List[str] = None) -> List[str]:
    """
    Read text from a file and send it as a prompt to multiple models.
    
    Args:
        file: Path to the text file
        models_prefixed_by_provider: List of model strings in format "provider:model"
                                    If None, uses the DEFAULT_MODELS environment variable
        
    Returns:
        List of responses from the models
    """
    file_path = Path(file)
    
    # Validate file
    if not file_path.exists():
        raise FileNotFoundError(f"File not found: {file}")
    
    if not file_path.is_file():
        raise ValueError(f"Not a file: {file}")
    
    # Read file content
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            text = f.read()
    except Exception as e:
        logger.error(f"Error reading file {file}: {e}")
        raise ValueError(f"Error reading file: {str(e)}")
    
    # Send prompt with file content
    return prompt(text, models_prefixed_by_provider)
</file>

<file path="src/just_prompt/tests/atoms/llm_providers/test_gemini.py">
"""
Tests for Gemini provider.
"""

import pytest
import os
from dotenv import load_dotenv
from just_prompt.atoms.llm_providers import gemini

# Load environment variables
load_dotenv()

# Skip tests if API key not available
if not os.environ.get("GEMINI_API_KEY"):
    pytest.skip("Gemini API key not available", allow_module_level=True)


def test_list_models():
    """Test listing Gemini models."""
    models = gemini.list_models()
    
    # Assertions
    assert isinstance(models, list)
    assert len(models) > 0
    assert all(isinstance(model, str) for model in models)
    
    # Check for at least one expected model containing gemini
    gemini_models = [model for model in models if "gemini" in model.lower()]
    assert len(gemini_models) > 0, "No Gemini models found"


def test_prompt():
    """Test sending prompt to Gemini."""
    # Using gemini-1.5-flash as the model for testing
    response = gemini.prompt("What is the capital of France?", "gemini-1.5-flash")
    
    # Assertions
    assert isinstance(response, str)
    assert len(response) > 0
    assert "paris" in response.lower() or "Paris" in response
</file>

<file path="src/just_prompt/tests/atoms/llm_providers/test_ollama.py">
"""
Tests for Ollama provider.
"""

import pytest
import os
from dotenv import load_dotenv
from just_prompt.atoms.llm_providers import ollama

# Load environment variables
load_dotenv()


def test_list_models():
    """Test listing Ollama models."""
    models = ollama.list_models()
    assert isinstance(models, list)
    assert isinstance(models[0], str)
    assert len(models) > 0


def test_prompt():
    """Test sending prompt to Ollama."""
    # Using llama3 as default model - adjust if needed based on your environment

    response = ollama.prompt("What is the capital of France?", "gemma3:12b")

    # Assertions
    assert isinstance(response, str)
    assert len(response) > 0
    assert "paris" in response.lower() or "Paris" in response
</file>

<file path="src/just_prompt/tests/atoms/llm_providers/test_openai.py">
"""
Tests for OpenAI provider.
"""

import pytest
import os
from dotenv import load_dotenv
from just_prompt.atoms.llm_providers import openai

# Load environment variables
load_dotenv()

# Skip tests if API key not available
if not os.environ.get("OPENAI_API_KEY"):
    pytest.skip("OpenAI API key not available", allow_module_level=True)


def test_list_models():
    """Test listing OpenAI models."""
    models = openai.list_models()
    
    # Assertions
    assert isinstance(models, list)
    assert len(models) > 0
    assert all(isinstance(model, str) for model in models)
    
    # Check for at least one expected model
    gpt_models = [model for model in models if "gpt" in model.lower()]
    assert len(gpt_models) > 0, "No GPT models found"


def test_prompt():
    """Test sending prompt to OpenAI."""
    response = openai.prompt("What is the capital of France?", "gpt-4o-mini")
    
    # Assertions
    assert isinstance(response, str)
    assert len(response) > 0
    assert "paris" in response.lower() or "Paris" in response
</file>

<file path="src/just_prompt/tests/atoms/shared/test_validator.py">
"""
Tests for validator functions.
"""

import pytest
import os
from unittest.mock import patch
from just_prompt.atoms.shared.validator import (
    validate_models_prefixed_by_provider, 
    validate_provider,
    validate_provider_api_keys,
    print_provider_availability
)


def test_validate_models_prefixed_by_provider():
    """Test validating model strings."""
    # Valid model strings
    assert validate_models_prefixed_by_provider(["openai:gpt-4o-mini"]) == True
    assert validate_models_prefixed_by_provider(["anthropic:claude-3-5-haiku"]) == True
    assert validate_models_prefixed_by_provider(["o:gpt-4o-mini", "a:claude-3-5-haiku"]) == True
    
    # Invalid model strings
    with pytest.raises(ValueError):
        validate_models_prefixed_by_provider([])
    
    with pytest.raises(ValueError):
        validate_models_prefixed_by_provider(["unknown:model"])
    
    with pytest.raises(ValueError):
        validate_models_prefixed_by_provider(["invalid-format"])


def test_validate_provider():
    """Test validating provider names."""
    # Valid providers
    assert validate_provider("openai") == True
    assert validate_provider("anthropic") == True
    assert validate_provider("o") == True
    assert validate_provider("a") == True
    
    # Invalid providers
    with pytest.raises(ValueError):
        validate_provider("unknown")
        
    with pytest.raises(ValueError):
        validate_provider("")


def test_validate_provider_api_keys():
    """Test validating provider API keys."""
    # Use mocked environment variables with a mix of valid, empty, and missing keys
    with patch.dict(os.environ, {
        "OPENAI_API_KEY": "test-key",
        "ANTHROPIC_API_KEY": "test-key",
        "GROQ_API_KEY": "test-key",  
        # GEMINI_API_KEY not defined
        "DEEPSEEK_API_KEY": "test-key",
        "OLLAMA_HOST": "http://localhost:11434"
    }):
        # Call the function to validate provider API keys
        availability = validate_provider_api_keys()
        
        # Check that each provider has the correct availability status
        assert availability["openai"] is True
        assert availability["anthropic"] is True
        assert availability["groq"] is True
        
        # This depends on the actual implementation. Since we're mocking the environment,
        # let's just assert that the keys exist rather than specific values
        assert "gemini" in availability
        assert "deepseek" in availability
        assert "ollama" in availability
        
        # Make sure all providers are included in the result
        assert set(availability.keys()) == {"openai", "anthropic", "gemini", "groq", "deepseek", "ollama"}


def test_validate_provider_api_keys_none():
    """Test validating provider API keys when none are available."""
    # Use mocked environment variables with no API keys
    with patch.dict(os.environ, {}, clear=True):
        # Call the function to validate provider API keys
        availability = validate_provider_api_keys()
        
        # Check that all providers are marked as unavailable
        assert all(status is False for status in availability.values())
        assert set(availability.keys()) == {"openai", "anthropic", "gemini", "groq", "deepseek", "ollama"}


def test_print_provider_availability():
    """Test printing provider availability."""
    # Mock the validate_provider_api_keys function to return a controlled result
    mock_availability = {
        "openai": True,
        "anthropic": False,
        "gemini": True,
        "groq": False,
        "deepseek": True,
        "ollama": False
    }
    
    with patch('just_prompt.atoms.shared.validator.validate_provider_api_keys', 
              return_value=mock_availability):
        
        # Mock the logger to verify the log messages
        with patch('just_prompt.atoms.shared.validator.logger') as mock_logger:
            # Call the function to print provider availability
            print_provider_availability(detailed=True)
            
            # Verify that info was called with a message about available providers
            mock_logger.info.assert_called_once()
            info_call_args = mock_logger.info.call_args[0][0]
            assert "Available LLM providers:" in info_call_args
            assert "openai" in info_call_args
            assert "gemini" in info_call_args
            assert "deepseek" in info_call_args
            
            # Check that warning was called multiple times
            assert mock_logger.warning.call_count >= 2
            
            # Check that the first warning is about missing API keys
            warning_calls = [call[0][0] for call in mock_logger.warning.call_args_list]
            assert "The following providers are unavailable due to missing API keys:" in warning_calls
</file>

<file path="src/just_prompt/tests/molecules/test_prompt_from_file.py">
"""
Tests for prompt_from_file functionality.
"""

import pytest
import os
import tempfile
from dotenv import load_dotenv
from just_prompt.molecules.prompt_from_file import prompt_from_file

# Load environment variables
load_dotenv()


def test_nonexistent_file():
    """Test with non-existent file."""
    with pytest.raises(FileNotFoundError):
        prompt_from_file("/non/existent/file.txt", ["o:gpt-4o-mini"])


def test_file_read():
    """Test that the file is read correctly and processes with real API call."""
    # Create temporary file with a simple question
    with tempfile.NamedTemporaryFile(mode='w+', delete=False) as temp:
        temp.write("What is the capital of France?")
        temp_path = temp.name
    
    try:
        # Make real API call
        response = prompt_from_file(temp_path, ["o:gpt-4o-mini"])
        
        # Assertions
        assert isinstance(response, list)
        assert len(response) == 1
        assert "paris" in response[0].lower() or "Paris" in response[0]
    finally:
        # Clean up
        os.unlink(temp_path)
</file>

<file path="src/just_prompt/tests/molecules/test_prompt.py">
"""
Tests for prompt functionality.
"""

import pytest
import os
from dotenv import load_dotenv
from just_prompt.molecules.prompt import prompt

# Load environment variables
load_dotenv()

def test_prompt_basic():
    """Test basic prompt functionality with a real API call."""
    # Define a simple test case
    test_prompt = "What is the capital of France?"
    test_models = ["openai:gpt-4o-mini"]

    # Call the prompt function with a real model
    response = prompt(test_prompt, test_models)

    # Assertions
    assert isinstance(response, list)
    assert len(response) == 1
    assert "paris" in response[0].lower() or "Paris" in response[0]

def test_prompt_multiple_models():
    """Test prompt with multiple models."""
    # Skip if API keys aren't available
    if not os.environ.get("OPENAI_API_KEY") or not os.environ.get("ANTHROPIC_API_KEY"):
        pytest.skip("Required API keys not available")
        
    # Define a simple test case
    test_prompt = "What is the capital of France?"
    test_models = ["openai:gpt-4o-mini", "anthropic:claude-3-5-haiku-20241022"]

    # Call the prompt function with multiple models
    response = prompt(test_prompt, test_models)

    # Assertions
    assert isinstance(response, list)
    assert len(response) == 2
    # Check all responses contain Paris
    for r in response:
        assert "paris" in r.lower() or "Paris" in r
</file>

<file path=".env.sample">
# Environment Variables for just-prompt

# OpenAI API Key
OPENAI_API_KEY=your_openai_api_key_here

# Anthropic API Key
ANTHROPIC_API_KEY=your_anthropic_api_key_here

# Gemini API Key
GEMINI_API_KEY=your_gemini_api_key_here

# Groq API Key
GROQ_API_KEY=your_groq_api_key_here

# DeepSeek API Key
DEEPSEEK_API_KEY=your_deepseek_api_key_here

# Ollama endpoint (if not default)
OLLAMA_HOST=http://localhost:11434
</file>

<file path="src/just_prompt/atoms/llm_providers/anthropic.py">
"""
Anthropic provider implementation.
"""

import os
import re
import anthropic
from typing import List, Tuple
import logging
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Configure logging
logger = logging.getLogger(__name__)

# Initialize Anthropic client
client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))


def parse_thinking_suffix(model: str) -> Tuple[str, int]:
    """
    Parse a model name to check for thinking token budget suffixes.
    Only works with the claude-3-7-sonnet-20250219 model.
    
    Supported formats:
    - model:1k, model:4k, model:16k
    - model:1000, model:1054, model:1333, etc. (any value between 1024-16000)
    
    Args:
        model: The model name potentially with a thinking suffix
        
    Returns:
        Tuple of (base_model_name, thinking_budget)
        If no thinking suffix is found, thinking_budget will be 0
    """
    # Look for patterns like ":1k", ":4k", ":16k" or ":1000", ":1054", etc.
    pattern = r'^(.+?)(?::(\d+)k?)?$'
    match = re.match(pattern, model)
    
    if not match:
        return model, 0
    
    base_model = match.group(1)
    thinking_suffix = match.group(2)
    
    # Validate the model - only claude-3-7-sonnet-20250219 supports thinking
    if base_model != "claude-3-7-sonnet-20250219":
        logger.warning(f"Model {base_model} does not support thinking, ignoring thinking suffix")
        return base_model, 0
    
    if not thinking_suffix:
        return model, 0
    
    # Convert to integer
    try:
        thinking_budget = int(thinking_suffix)
        # If a small number like 1, 4, 16 is provided, assume it's in "k" (multiply by 1024)
        if thinking_budget < 100:
            thinking_budget *= 1024
            
        # Adjust values outside the range
        if thinking_budget < 1024:
            logger.warning(f"Thinking budget {thinking_budget} below minimum (1024), using 1024 instead")
            thinking_budget = 1024
        elif thinking_budget > 16000:
            logger.warning(f"Thinking budget {thinking_budget} above maximum (16000), using 16000 instead")
            thinking_budget = 16000
            
        logger.info(f"Using thinking budget of {thinking_budget} tokens for model {base_model}")
        return base_model, thinking_budget
    except ValueError:
        logger.warning(f"Invalid thinking budget format: {thinking_suffix}, ignoring")
        return base_model, 0


def prompt_with_thinking(text: str, model: str, thinking_budget: int) -> str:
    """
    Send a prompt to Anthropic Claude with thinking enabled and get a response.
    
    Args:
        text: The prompt text
        model: The base model name (without thinking suffix)
        thinking_budget: The token budget for thinking
        
    Returns:
        Response string from the model
    """
    try:
        # Ensure max_tokens is greater than thinking_budget
        # Documentation requires this: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#max-tokens-and-context-window-size
        max_tokens = thinking_budget + 1000  # Adding 1000 tokens for the response
        
        logger.info(f"Sending prompt to Anthropic model {model} with thinking budget {thinking_budget}")
        message = client.messages.create(
            model=model,
            max_tokens=max_tokens,
            thinking={
                "type": "enabled",
                "budget_tokens": thinking_budget,
            },
            messages=[{"role": "user", "content": text}]
        )
        
        # Extract the response from the message content
        # Filter out thinking blocks and only get text blocks
        text_blocks = [block for block in message.content if block.type == "text"]
        
        if not text_blocks:
            raise ValueError("No text content found in response")
            
        return text_blocks[0].text
    except Exception as e:
        logger.error(f"Error sending prompt with thinking to Anthropic: {e}")
        raise ValueError(f"Failed to get response from Anthropic with thinking: {str(e)}")


def prompt(text: str, model: str) -> str:
    """
    Send a prompt to Anthropic Claude and get a response.
    
    Automatically handles thinking suffixes in the model name (e.g., claude-3-7-sonnet-20250219:4k)
    
    Args:
        text: The prompt text
        model: The model name, optionally with thinking suffix
        
    Returns:
        Response string from the model
    """
    # Parse the model name to check for thinking suffixes
    base_model, thinking_budget = parse_thinking_suffix(model)
    
    # If thinking budget is specified, use prompt_with_thinking
    if thinking_budget > 0:
        return prompt_with_thinking(text, base_model, thinking_budget)
    
    # Otherwise, use regular prompt
    try:
        logger.info(f"Sending prompt to Anthropic model: {base_model}")
        message = client.messages.create(
            model=base_model, max_tokens=4096, messages=[{"role": "user", "content": text}]
        )

        # Extract the response from the message content
        # Get only text blocks
        text_blocks = [block for block in message.content if block.type == "text"]
        
        if not text_blocks:
            raise ValueError("No text content found in response")
            
        return text_blocks[0].text
    except Exception as e:
        logger.error(f"Error sending prompt to Anthropic: {e}")
        raise ValueError(f"Failed to get response from Anthropic: {str(e)}")


def list_models() -> List[str]:
    """
    List available Anthropic models.
    
    Returns:
        List of model names
    """
    try:
        logger.info("Listing Anthropic models")
        response = client.models.list()

        models = [model.id for model in response.data]
        return models
    except Exception as e:
        logger.error(f"Error listing Anthropic models: {e}")
        # Return some known models if API fails
        logger.info("Returning hardcoded list of known Anthropic models")
        return [
            "claude-3-7-sonnet",
            "claude-3-5-sonnet",
            "claude-3-5-sonnet-20240620",
            "claude-3-opus-20240229",
            "claude-3-sonnet-20240229",
            "claude-3-haiku-20240307",
            "claude-3-5-haiku",
        ]
</file>

<file path="src/just_prompt/atoms/llm_providers/openai.py">
"""
OpenAI provider implementation.
"""

import os
from openai import OpenAI
from typing import List
import logging
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Configure logging
logger = logging.getLogger(__name__)

# Initialize OpenAI client
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))


def prompt(text: str, model: str) -> str:
    """
    Send a prompt to OpenAI and get a response.

    Args:
        text: The prompt text
        model: The model name

    Returns:
        Response string from the model
    """
    try:
        logger.info(f"Sending prompt to OpenAI model: {model}")
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": text}],
        )

        return response.choices[0].message.content
    except Exception as e:
        logger.error(f"Error sending prompt to OpenAI: {e}")
        raise ValueError(f"Failed to get response from OpenAI: {str(e)}")


def list_models() -> List[str]:
    """
    List available OpenAI models.

    Returns:
        List of model names
    """
    try:
        logger.info("Listing OpenAI models")
        response = client.models.list()

        # Return all models without filtering
        models = [model.id for model in response.data]

        return models
    except Exception as e:
        logger.error(f"Error listing OpenAI models: {e}")
        raise ValueError(f"Failed to list OpenAI models: {str(e)}")
</file>

<file path="src/just_prompt/atoms/shared/model_router.py">
"""
Model router for dispatching requests to the appropriate provider.
"""

import logging
from typing import List, Dict, Any, Optional
import importlib
from .utils import split_provider_and_model
from .data_types import ModelProviders

logger = logging.getLogger(__name__)


class ModelRouter:
    """
    Routes requests to the appropriate provider based on the model string.
    """

    @staticmethod
    def validate_and_correct_model(provider_name: str, model_name: str) -> str:
        """
        Validate a model name against available models for a provider, and correct it if needed.

        Args:
            provider_name: Provider name (full name)
            model_name: Model name to validate and potentially correct

        Returns:
            Validated and potentially corrected model name
        """
        # Early return for our thinking token model to bypass validation
        if "claude-3-7-sonnet-20250219" in model_name:
            return model_name

        try:
            # Import the provider module
            provider_module_name = f"just_prompt.atoms.llm_providers.{provider_name}"
            provider_module = importlib.import_module(provider_module_name)

            # Get available models
            available_models = provider_module.list_models()

            # Check if model is in available models
            if model_name in available_models:
                return model_name

            # Model needs correction - use the default correction model
            import os

            correction_model = os.environ.get(
                "CORRECTION_MODEL", "anthropic:claude-3-7-sonnet-20250219"
            )

            # Use magic model correction
            corrected_model = ModelRouter.magic_model_correction(
                provider_name, model_name, correction_model
            )

            if corrected_model != model_name:
                logger.info(
                    f"Corrected model name from '{model_name}' to '{corrected_model}' for provider '{provider_name}'"
                )
                return corrected_model

            return model_name
        except Exception as e:
            logger.warning(
                f"Error validating model '{model_name}' for provider '{provider_name}': {e}"
            )
            return model_name

    @staticmethod
    def route_prompt(model_string: str, text: str) -> str:
        """
        Route a prompt to the appropriate provider.

        Args:
            model_string: String in format "provider:model"
            text: The prompt text

        Returns:
            Response from the model
        """
        provider_prefix, model = split_provider_and_model(model_string)
        provider = ModelProviders.from_name(provider_prefix)

        if not provider:
            raise ValueError(f"Unknown provider prefix: {provider_prefix}")

        # Validate and potentially correct the model name
        validated_model = ModelRouter.validate_and_correct_model(
            provider.full_name, model
        )

        # Import the appropriate provider module
        try:
            module_name = f"just_prompt.atoms.llm_providers.{provider.full_name}"
            provider_module = importlib.import_module(module_name)

            # Call the prompt function
            return provider_module.prompt(text, validated_model)
        except ImportError as e:
            logger.error(f"Failed to import provider module: {e}")
            raise ValueError(f"Provider not available: {provider.full_name}")
        except Exception as e:
            logger.error(f"Error routing prompt to {provider.full_name}: {e}")
            raise

    @staticmethod
    def route_list_models(provider_name: str) -> List[str]:
        """
        Route a list_models request to the appropriate provider.

        Args:
            provider_name: Provider name (full or short)

        Returns:
            List of model names
        """
        provider = ModelProviders.from_name(provider_name)

        if not provider:
            raise ValueError(f"Unknown provider: {provider_name}")

        # Import the appropriate provider module
        try:
            module_name = f"just_prompt.atoms.llm_providers.{provider.full_name}"
            provider_module = importlib.import_module(module_name)

            # Call the list_models function
            return provider_module.list_models()
        except ImportError as e:
            logger.error(f"Failed to import provider module: {e}")
            raise ValueError(f"Provider not available: {provider.full_name}")
        except Exception as e:
            logger.error(f"Error listing models for {provider.full_name}: {e}")
            raise

    @staticmethod
    def magic_model_correction(provider: str, model: str, correction_model: str) -> str:
        """
        Correct a model name using a correction AI model if needed.

        Args:
            provider: Provider name
            model: Original model name
            correction_model: Model to use for the correction llm prompt, e.g. "o:gpt-4o-mini"

        Returns:
            Corrected model name
        """
        provider_module_name = f"just_prompt.atoms.llm_providers.{provider}"

        try:
            provider_module = importlib.import_module(provider_module_name)
            available_models = provider_module.list_models()

            # If model is already in available models, no correction needed
            if model in available_models:
                logger.info(f"Using {provider} and {model}")
                return model

            # Model needs correction - use correction model to correct it
            correction_provider, correction_model_name = split_provider_and_model(
                correction_model
            )
            correction_provider_enum = ModelProviders.from_name(correction_provider)

            if not correction_provider_enum:
                logger.warning(
                    f"Invalid correction model provider: {correction_provider}, skipping correction"
                )
                return model

            correction_module_name = (
                f"just_prompt.atoms.llm_providers.{correction_provider_enum.full_name}"
            )
            correction_module = importlib.import_module(correction_module_name)

            # Build prompt for the correction model
            prompt = f"""
Given a user-provided model name "{model}" for the provider "{provider}", and the list of actual available models below,
return the closest matching model name from the available models list.
Only return the exact model name, nothing else.

Available models: {', '.join(available_models)}
"""
            # Get correction from correction model
            corrected_model = correction_module.prompt(
                prompt, correction_model_name
            ).strip()

            # Verify the corrected model exists in the available models
            if corrected_model in available_models:
                logger.info(f"correction_model: {correction_model}")
                logger.info(f"models_prefixed_by_provider: {provider}:{model}")
                logger.info(f"corrected_model: {corrected_model}")
                return corrected_model
            else:
                logger.warning(
                    f"Corrected model {corrected_model} not found in available models"
                )
                return model

        except Exception as e:
            logger.error(f"Error in model correction: {e}")
            return model
</file>

<file path="src/just_prompt/atoms/shared/utils.py">
"""
Utility functions for just-prompt.
"""

from typing import Tuple, List
import os
from dotenv import load_dotenv
import logging

# Set up logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s [%(levelname)s] %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S'
)

# Load environment variables
load_dotenv()

# Default model constants
DEFAULT_MODEL = "anthropic:claude-3-7-sonnet-20250219"


def split_provider_and_model(model_string: str) -> Tuple[str, str]:
    """
    Split a model string into provider and model name.
    
    Note: This only splits the first colon in the model string and leaves the rest of the string
    as the model name. Models will have additional colons in the string and we want to ignore them
    and leave them for the model name.
    
    Args:
        model_string: String in format "provider:model"
        
    Returns:
        Tuple containing (provider, model)
    """
    parts = model_string.split(":", 1)
    if len(parts) != 2:
        raise ValueError(f"Invalid model string format: {model_string}. Expected format: 'provider:model'")
    
    provider, model = parts
    return provider, model


def get_provider_from_prefix(prefix: str) -> str:
    """
    Get the full provider name from a prefix.
    
    Args:
        prefix: Provider prefix (short or full name)
        
    Returns:
        Full provider name
    """
    from .data_types import ModelProviders
    
    provider = ModelProviders.from_name(prefix)
    if provider is None:
        raise ValueError(f"Unknown provider prefix: {prefix}")
    
    return provider.full_name


def get_models_prefixed_by_provider(provider_prefix: str, model_name: str) -> str:
    """
    Format a model string with provider prefix.
    
    Args:
        provider_prefix: The provider prefix (short or full name)
        model_name: The model name
        
    Returns:
        Formatted string in "provider:model" format
    """
    provider = get_provider_from_prefix(provider_prefix)
    return f"{provider}:{model_name}"


def get_api_key(provider: str) -> str:
    """
    Get the API key for a provider from environment variables.
    
    Args:
        provider: Provider name (full name)
        
    Returns:
        API key as string
    """
    key_mapping = {
        "openai": "OPENAI_API_KEY",
        "anthropic": "ANTHROPIC_API_KEY",
        "gemini": "GEMINI_API_KEY",
        "groq": "GROQ_API_KEY",
        "deepseek": "DEEPSEEK_API_KEY"
    }
    
    env_var = key_mapping.get(provider)
    if not env_var:
        return None
    
    return os.environ.get(env_var)
</file>

<file path="src/just_prompt/tests/atoms/llm_providers/test_anthropic.py">
"""
Tests for Anthropic provider.
"""

import pytest
import os
from dotenv import load_dotenv
from just_prompt.atoms.llm_providers import anthropic

# Load environment variables
load_dotenv()

# Skip tests if API key not available
if not os.environ.get("ANTHROPIC_API_KEY"):
    pytest.skip("Anthropic API key not available", allow_module_level=True)


def test_list_models():
    """Test listing Anthropic models."""
    models = anthropic.list_models()
    
    # Assertions
    assert isinstance(models, list)
    assert len(models) > 0
    assert all(isinstance(model, str) for model in models)
    
    # Check for at least one expected model
    claude_models = [model for model in models if "claude" in model.lower()]
    assert len(claude_models) > 0, "No Claude models found"


def test_prompt():
    """Test sending prompt to Anthropic."""
    # Use the correct model name from the available models
    response = anthropic.prompt("What is the capital of France?", "claude-3-5-haiku-20241022")
    
    # Assertions
    assert isinstance(response, str)
    assert len(response) > 0
    assert "paris" in response.lower() or "Paris" in response


def test_parse_thinking_suffix():
    """Test parsing thinking suffix from model names."""
    # Test cases with no suffix
    assert anthropic.parse_thinking_suffix("claude-3-7-sonnet") == ("claude-3-7-sonnet", 0)
    assert anthropic.parse_thinking_suffix("claude-3-5-haiku-20241022") == ("claude-3-5-haiku-20241022", 0)
    
    # Test cases with supported model and k suffixes
    assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:1k") == ("claude-3-7-sonnet-20250219", 1024)
    assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:4k") == ("claude-3-7-sonnet-20250219", 4096)
    assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:15k") == ("claude-3-7-sonnet-20250219", 15360)  # 15*1024=15360 < 16000
    
    # Test cases with supported model and numeric suffixes
    assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:1024") == ("claude-3-7-sonnet-20250219", 1024)
    assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:4096") == ("claude-3-7-sonnet-20250219", 4096)
    assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:8000") == ("claude-3-7-sonnet-20250219", 8000)
    
    # Test cases with non-supported model
    assert anthropic.parse_thinking_suffix("claude-3-7-sonnet:1k") == ("claude-3-7-sonnet", 0)
    assert anthropic.parse_thinking_suffix("claude-3-5-haiku:4k") == ("claude-3-5-haiku", 0)
    
    # Test cases with out-of-range values (should adjust to valid range)
    assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:500") == ("claude-3-7-sonnet-20250219", 1024)  # Below min 1024, should use 1024
    assert anthropic.parse_thinking_suffix("claude-3-7-sonnet-20250219:20000") == ("claude-3-7-sonnet-20250219", 16000)  # Above max 16000, should use 16000


def test_prompt_with_thinking():
    """Test sending prompt with thinking enabled."""
    # Test with 1k thinking tokens on the supported model
    response = anthropic.prompt("What is the capital of Spain?", "claude-3-7-sonnet-20250219:1k")
    
    # Assertions
    assert isinstance(response, str)
    assert len(response) > 0
    assert "madrid" in response.lower() or "Madrid" in response
    
    # Test with 2k thinking tokens on the supported model
    response = anthropic.prompt("What is the capital of Germany?", "claude-3-7-sonnet-20250219:2k")
    
    # Assertions
    assert isinstance(response, str)
    assert len(response) > 0
    assert "berlin" in response.lower() or "Berlin" in response
    
    # Test with out-of-range but auto-corrected thinking tokens
    response = anthropic.prompt("What is the capital of Italy?", "claude-3-7-sonnet-20250219:500")
    
    # Assertions (should still work with a corrected budget of 1024)
    assert isinstance(response, str)
    assert len(response) > 0
    assert "rome" in response.lower() or "Rome" in response
</file>

<file path="src/just_prompt/tests/atoms/shared/test_model_router.py">
"""
Tests for model router.
"""

import pytest
import os
from unittest.mock import patch, MagicMock
import importlib
from just_prompt.atoms.shared.model_router import ModelRouter
from just_prompt.atoms.shared.data_types import ModelProviders


@patch('importlib.import_module')
def test_route_prompt(mock_import_module):
    """Test routing prompts to the appropriate provider."""
    # Set up mock
    mock_module = MagicMock()
    mock_module.prompt.return_value = "Paris is the capital of France."
    mock_import_module.return_value = mock_module
    
    # Test with full provider name
    response = ModelRouter.route_prompt("openai:gpt-4o-mini", "What is the capital of France?")
    assert response == "Paris is the capital of France."
    mock_import_module.assert_called_with("just_prompt.atoms.llm_providers.openai")
    mock_module.prompt.assert_called_with("What is the capital of France?", "gpt-4o-mini")
    
    # Test with short provider name
    response = ModelRouter.route_prompt("o:gpt-4o-mini", "What is the capital of France?")
    assert response == "Paris is the capital of France."
    
    # Test invalid provider
    with pytest.raises(ValueError):
        ModelRouter.route_prompt("unknown:model", "What is the capital of France?")


@patch('importlib.import_module')
def test_route_list_models(mock_import_module):
    """Test routing list_models requests to the appropriate provider."""
    # Set up mock
    mock_module = MagicMock()
    mock_module.list_models.return_value = ["model1", "model2"]
    mock_import_module.return_value = mock_module
    
    # Test with full provider name
    models = ModelRouter.route_list_models("openai")
    assert models == ["model1", "model2"]
    mock_import_module.assert_called_with("just_prompt.atoms.llm_providers.openai")
    mock_module.list_models.assert_called_once()
    
    # Test with short provider name
    models = ModelRouter.route_list_models("o")
    assert models == ["model1", "model2"]
    
    # Test invalid provider
    with pytest.raises(ValueError):
        ModelRouter.route_list_models("unknown")


def test_validate_and_correct_model_shorthand():
    """Test validation and correction of shorthand model names like a:sonnet.3.7."""
    try:
        # Test with shorthand notation a:sonnet.3.7
        # This should be corrected to claude-3-7-sonnet-20250219
        # First, use the split_provider_and_model to get the provider and model
        from just_prompt.atoms.shared.utils import split_provider_and_model
        provider_prefix, model = split_provider_and_model("a:sonnet.3.7")
        
        # Get the provider enum
        provider = ModelProviders.from_name(provider_prefix)
        
        # Call validate_and_correct_model
        result = ModelRouter.magic_model_correction(provider.full_name, model, "anthropic:claude-3-7-sonnet-20250219")
        
        # The magic_model_correction method should correct sonnet.3.7 to claude-3-7-sonnet-20250219
        assert "claude-3-7" in result, f"Expected sonnet.3.7 to be corrected to a claude-3-7 model, got {result}"
        print(f"Shorthand model 'sonnet.3.7' was corrected to '{result}'")
    except Exception as e:
        pytest.fail(f"Test failed with error: {e}")
</file>

<file path="src/just_prompt/tests/molecules/test_list_models.py">
"""
Tests for list_models functionality for all providers.
"""

import pytest
import os
from dotenv import load_dotenv
from just_prompt.molecules.list_models import list_models

# Load environment variables
load_dotenv()

def test_list_models_openai():
    """Test listing OpenAI models with real API call."""
    # Skip if API key isn't available
    if not os.environ.get("OPENAI_API_KEY"):
        pytest.skip("OpenAI API key not available")
        
    # Test with full provider name
    models = list_models("openai")
    
    # Assertions
    assert isinstance(models, list)
    assert len(models) > 0
    
    # Check for specific model patterns that should exist
    assert any("gpt" in model.lower() for model in models)
    
def test_list_models_anthropic():
    """Test listing Anthropic models with real API call."""
    # Skip if API key isn't available
    if not os.environ.get("ANTHROPIC_API_KEY"):
        pytest.skip("Anthropic API key not available")
        
    # Test with full provider name
    models = list_models("anthropic")
    
    # Assertions
    assert isinstance(models, list)
    assert len(models) > 0
    
    # Check for specific model patterns that should exist
    assert any("claude" in model.lower() for model in models)

def test_list_models_gemini():
    """Test listing Gemini models with real API call."""
    # Skip if API key isn't available
    if not os.environ.get("GEMINI_API_KEY"):
        pytest.skip("Gemini API key not available")
        
    # Test with full provider name
    models = list_models("gemini")
    
    # Assertions
    assert isinstance(models, list)
    assert len(models) > 0
    
    # Check for specific model patterns that should exist
    assert any("gemini" in model.lower() for model in models)

def test_list_models_groq():
    """Test listing Groq models with real API call."""
    # Skip if API key isn't available
    if not os.environ.get("GROQ_API_KEY"):
        pytest.skip("Groq API key not available")
        
    # Test with full provider name
    models = list_models("groq")
    
    # Assertions
    assert isinstance(models, list)
    assert len(models) > 0
    
    # Check for specific model patterns (llama or mixtral are common in Groq)
    assert any(("llama" in model.lower() or "mixtral" in model.lower()) for model in models)

def test_list_models_deepseek():
    """Test listing DeepSeek models with real API call."""
    # Skip if API key isn't available
    if not os.environ.get("DEEPSEEK_API_KEY"):
        pytest.skip("DeepSeek API key not available")
        
    # Test with full provider name
    models = list_models("deepseek")
    
    # Assertions
    assert isinstance(models, list)
    assert len(models) > 0
    
    # Check for basic list return (no specific pattern needed)
    assert all(isinstance(model, str) for model in models)

def test_list_models_ollama():
    """Test listing Ollama models with real API call."""
    # Test with full provider name
    models = list_models("ollama")
    
    # Assertions
    assert isinstance(models, list)
    assert len(models) > 0
    
    # Check for basic list return (model entries could be anything)
    assert all(isinstance(model, str) for model in models)

def test_list_models_with_short_names():
    """Test listing models using short provider names."""
    # Test each provider with short name (only if API key available)
    
    # OpenAI - short name "o"
    if os.environ.get("OPENAI_API_KEY"):
        models = list_models("o")
        assert isinstance(models, list)
        assert len(models) > 0
        assert any("gpt" in model.lower() for model in models)
    
    # Anthropic - short name "a"
    if os.environ.get("ANTHROPIC_API_KEY"):
        models = list_models("a")
        assert isinstance(models, list)
        assert len(models) > 0
        assert any("claude" in model.lower() for model in models)
    
    # Gemini - short name "g"
    if os.environ.get("GEMINI_API_KEY"):
        models = list_models("g")
        assert isinstance(models, list)
        assert len(models) > 0
        assert any("gemini" in model.lower() for model in models)
    
    # Groq - short name "q"
    if os.environ.get("GROQ_API_KEY"):
        models = list_models("q")
        assert isinstance(models, list)
        assert len(models) > 0
    
    # DeepSeek - short name "d"
    if os.environ.get("DEEPSEEK_API_KEY"):
        models = list_models("d")
        assert isinstance(models, list)
        assert len(models) > 0
    
    # Ollama - short name "l"
    models = list_models("l")
    assert isinstance(models, list)
    assert len(models) > 0

def test_list_models_invalid_provider():
    """Test with invalid provider name."""
    # Test invalid provider
    with pytest.raises(ValueError):
        list_models("unknown_provider")
</file>

<file path="src/just_prompt/tests/molecules/test_prompt_from_file_to_file.py">
"""
Tests for prompt_from_file_to_file functionality.
"""

import pytest
import os
import tempfile
import shutil
from dotenv import load_dotenv
from just_prompt.molecules.prompt_from_file_to_file import prompt_from_file_to_file

# Load environment variables
load_dotenv()


def test_directory_creation_and_file_writing():
    """Test that the output directory is created and files are written with real API responses."""
    # Create temporary input file with a simple question
    with tempfile.NamedTemporaryFile(mode='w+', delete=False) as temp_file:
        temp_file.write("What is the capital of France?")
        input_path = temp_file.name
    
    # Create a deep non-existent directory path
    temp_dir = os.path.join(tempfile.gettempdir(), "just_prompt_test_dir", "output")
    
    try:
        # Make real API call
        file_paths = prompt_from_file_to_file(
            input_path, 
            ["o:gpt-4o-mini"],
            temp_dir
        )
        
        # Assertions
        assert isinstance(file_paths, list)
        assert len(file_paths) == 1
        
        # Check that the file exists
        assert os.path.exists(file_paths[0])
        
        # Check that the file has a .md extension
        assert file_paths[0].endswith('.md')
        
        # Check file content contains the expected response
        with open(file_paths[0], 'r') as f:
            content = f.read()
            assert "paris" in content.lower() or "Paris" in content
    finally:
        # Clean up
        os.unlink(input_path)
        # Remove the created directory and all its contents
        if os.path.exists(os.path.dirname(temp_dir)):
            shutil.rmtree(os.path.dirname(temp_dir))
</file>

<file path=".mcp.json">
{
  "mcpServers": {
    "just-prompt": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "--directory",
        ".",
        "run",
        "just-prompt",
        "--default-models",
        "anthropic:claude-3-7-sonnet-20250219,openai:o3-mini,gemini:gemini-2.5-pro-exp-03-25"
      ],
      "env": {}
    }
  }
}
</file>

<file path="pyproject.toml">
[project]
name = "just-prompt"
version = "0.1.0"
description = "A lightweight MCP server for various LLM providers"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "anthropic>=0.49.0",
    "google-genai>=1.7.0",
    "groq>=0.20.0",
    "ollama>=0.4.7",
    "openai>=1.68.0",
    "python-dotenv>=1.0.1",
    "pydantic>=2.0.0",
    "mcp>=0.1.5",
]

[project.scripts]
just-prompt = "just_prompt.__main__:main"

[project.optional-dependencies]
test = [
    "pytest>=7.3.1",
    "pytest-asyncio>=0.20.3",
]

[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
</file>

<file path="src/just_prompt/atoms/llm_providers/ollama.py">
"""
Ollama provider implementation.
"""

import os
from typing import List
import logging
import ollama
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Configure logging
logger = logging.getLogger(__name__)


def prompt(text: str, model: str) -> str:
    """
    Send a prompt to Ollama and get a response.

    Args:
        text: The prompt text
        model: The model name

    Returns:
        Response string from the model
    """
    try:
        logger.info(f"Sending prompt to Ollama model: {model}")

        # Create chat completion
        response = ollama.chat(
            model=model,
            messages=[
                {
                    "role": "user",
                    "content": text,
                },
            ],
        )

        # Extract response content
        return response.message.content
    except Exception as e:
        logger.error(f"Error sending prompt to Ollama: {e}")
        raise ValueError(f"Failed to get response from Ollama: {str(e)}")


def list_models() -> List[str]:
    """
    List available Ollama models.

    Returns:
        List of model names
    """
    logger.info("Listing Ollama models")
    response = ollama.list()

    # Extract model names from the models attribute
    models = [model.model for model in response.models]

    return models
</file>

<file path="src/just_prompt/molecules/prompt_from_file_to_file.py">
"""
Prompt from file to file functionality for just-prompt.
"""

from typing import List
import logging
import os
from pathlib import Path
from .prompt_from_file import prompt_from_file
from ..atoms.shared.utils import DEFAULT_MODEL

logger = logging.getLogger(__name__)


def prompt_from_file_to_file(file: str, models_prefixed_by_provider: List[str] = None, output_dir: str = ".") -> List[str]:
    """
    Read text from a file, send it as prompt to multiple models, and save responses to files.
    
    Args:
        file: Path to the text file
        models_prefixed_by_provider: List of model strings in format "provider:model"
                                    If None, uses the DEFAULT_MODELS environment variable
        output_dir: Directory to save response files
        
    Returns:
        List of paths to the output files
    """
    # Validate output directory
    output_path = Path(output_dir)
    if not output_path.exists():
        output_path.mkdir(parents=True, exist_ok=True)
    
    if not output_path.is_dir():
        raise ValueError(f"Not a directory: {output_dir}")
    
    # Get the base name of the input file
    input_file_name = Path(file).stem
    
    # Get responses
    responses = prompt_from_file(file, models_prefixed_by_provider)
    
    # Save responses to files
    output_files = []
    
    # Get the models that were actually used
    models_used = models_prefixed_by_provider
    if not models_used:
        default_models = os.environ.get("DEFAULT_MODELS", DEFAULT_MODEL)
        models_used = [model.strip() for model in default_models.split(",")]
    
    for i, (model_string, response) in enumerate(zip(models_used, responses)):
        # Sanitize model string for filename (replace colons with underscores)
        safe_model_name = model_string.replace(":", "_")
        
        # Create output filename with .md extension
        output_file = output_path / f"{input_file_name}_{safe_model_name}.md"
        
        # Write response to file as markdown
        try:
            with open(output_file, 'w', encoding='utf-8') as f:
                f.write(response)
            output_files.append(str(output_file))
        except Exception as e:
            logger.error(f"Error writing response to {output_file}: {e}")
            output_files.append(f"Error: {str(e)}")
    
    return output_files
</file>

<file path="src/just_prompt/molecules/prompt.py">
"""
Prompt functionality for just-prompt.
"""

from typing import List
import logging
import concurrent.futures
import os
from ..atoms.shared.validator import validate_models_prefixed_by_provider
from ..atoms.shared.utils import split_provider_and_model, DEFAULT_MODEL
from ..atoms.shared.model_router import ModelRouter

logger = logging.getLogger(__name__)


def _process_model_prompt(model_string: str, text: str) -> str:
    """
    Process a single model prompt.
    
    Args:
        model_string: String in format "provider:model"
        text: The prompt text
        
    Returns:
        Response from the model
    """
    try:
        return ModelRouter.route_prompt(model_string, text)
    except Exception as e:
        logger.error(f"Error processing prompt for {model_string}: {e}")
        return f"Error ({model_string}): {str(e)}"


def _correct_model_name(provider: str, model: str, correction_model: str) -> str:
    """
    Correct a model name using the correction model.
    
    Args:
        provider: Provider name
        model: Model name
        correction_model: Model to use for correction
        
    Returns:
        Corrected model name
    """
    try:
        return ModelRouter.magic_model_correction(provider, model, correction_model)
    except Exception as e:
        logger.error(f"Error correcting model name {provider}:{model}: {e}")
        return model


def prompt(text: str, models_prefixed_by_provider: List[str] = None) -> List[str]:
    """
    Send a prompt to multiple models using parallel processing.
    
    Args:
        text: The prompt text
        models_prefixed_by_provider: List of model strings in format "provider:model"
                                    If None, uses the DEFAULT_MODELS environment variable
        
    Returns:
        List of responses from the models
    """
    # Use default models if no models provided
    if not models_prefixed_by_provider:
        default_models = os.environ.get("DEFAULT_MODELS", DEFAULT_MODEL)
        models_prefixed_by_provider = [model.strip() for model in default_models.split(",")]
    # Validate model strings
    validate_models_prefixed_by_provider(models_prefixed_by_provider)
    
    # Prepare corrected model strings
    corrected_models = []
    for model_string in models_prefixed_by_provider:
        provider, model = split_provider_and_model(model_string)
        
        # Get correction model from environment
        correction_model = os.environ.get("CORRECTION_MODEL", DEFAULT_MODEL)
        
        # Check if model needs correction
        corrected_model = _correct_model_name(provider, model, correction_model)
        
        # Use corrected model
        if corrected_model != model:
            model_string = f"{provider}:{corrected_model}"
        
        corrected_models.append(model_string)
    
    # Process each model in parallel using ThreadPoolExecutor
    responses = []
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # Submit all tasks
        future_to_model = {
            executor.submit(_process_model_prompt, model_string, text): model_string
            for model_string in corrected_models
        }
        
        # Collect results in order
        for model_string in corrected_models:
            for future, future_model in future_to_model.items():
                if future_model == model_string:
                    responses.append(future.result())
                    break
    
    return responses
</file>

<file path="list_models.py">
def list_openai_models():
    from openai import OpenAI

    client = OpenAI()

    print(client.models.list())


def list_groq_models():
    import os
    from groq import Groq

    client = Groq(
        api_key=os.environ.get("GROQ_API_KEY"),
    )

    chat_completion = client.models.list()

    print(chat_completion)


def list_anthropic_models():
    import anthropic
    import os
    from dotenv import load_dotenv

    load_dotenv()

    client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
    models = client.models.list()
    print("Available Anthropic models:")
    for model in models.data:
        print(f"- {model.id}")


def list_gemini_models():
    import os
    from google import genai
    from dotenv import load_dotenv

    load_dotenv()

    client = genai.Client(api_key=os.environ.get("GEMINI_API_KEY"))

    print("List of models that support generateContent:\n")
    for m in client.models.list():
        for action in m.supported_actions:
            if action == "generateContent":
                print(m.name)

    print("List of models that support embedContent:\n")
    for m in client.models.list():
        for action in m.supported_actions:
            if action == "embedContent":
                print(m.name)


def list_deepseek_models():
    from openai import OpenAI

    # for backward compatibility, you can still use `https://api.deepseek.com/v1` as `base_url`.
    client = OpenAI(
        api_key="sk-ds-3f422175ff114212a42d7107c3efd1e4",  # fake
        base_url="https://api.deepseek.com",
    )
    print(client.models.list())


def list_ollama_models():
    import ollama

    print(ollama.list())


# Uncomment to run the functions
# list_openai_models()
# list_groq_models()
# list_anthropic_models()
# list_gemini_models()
# list_deepseek_models()
# list_ollama_models()
</file>

<file path="src/just_prompt/__main__.py">
"""
Main entry point for just-prompt.
"""

import argparse
import asyncio
import logging
import os
import sys
from dotenv import load_dotenv
from .server import serve
from .atoms.shared.utils import DEFAULT_MODEL
from .atoms.shared.validator import print_provider_availability

# Load environment variables
load_dotenv()

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s [%(levelname)s] %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S'
)
logger = logging.getLogger(__name__)


def main():
    """
    Main entry point for just-prompt.
    """
    parser = argparse.ArgumentParser(description="just-prompt - A lightweight MCP server for various LLM providers")
    parser.add_argument(
        "--default-models", 
        default=DEFAULT_MODEL,
        help="Comma-separated list of default models to use for prompts and model name correction, in format provider:model"
    )
    parser.add_argument(
        "--log-level", 
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
        default="INFO",
        help="Logging level"
    )
    parser.add_argument(
        "--show-providers",
        action="store_true",
        help="Show available providers and exit"
    )
    
    args = parser.parse_args()
    
    # Set logging level
    logging.getLogger().setLevel(getattr(logging, args.log_level))
    
    # Show provider availability
    print_provider_availability()
    
    # If --show-providers flag is provided, exit after showing provider info
    if args.show_providers:
        sys.exit(0)
    
    try:
        # Start server (asyncio)
        asyncio.run(serve(args.default_models))
    except Exception as e:
        logger.error(f"Error starting server: {e}")
        sys.exit(1)


if __name__ == "__main__":
    main()
</file>

<file path=".gitignore">
# Python-generated files
__pycache__/
*.py[oc]
build/
dist/
wheels/
*.egg-info

# Virtual environments
.venv

.env

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# Distribution / packaging
dist/
build/
*.egg-info/
*.egg

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Jupyter Notebook
.ipynb_checkpoints

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# IDE specific files
.idea/
.vscode/
*.swp
*.swo
.DS_Store


prompts/responses
.aider*

focus_output/
</file>

<file path=".claude/commands/jprompt_ultra_diff_review.md">
# Ultra Diff Review
> Execute each task in the order given to conduct a thorough code review.

## Task 1: Create diff.txt

Create a new file called diff.md.

At the top of the file, add the following markdown:

```md
# Code Review
- Review the diff, report on issues, bugs, and improvements. 
- End with a concise markdown table of any issues found, their solutions, and a risk assessment for each issue if applicable.
- Use emojis to convey the severity of each issue.

## Diff

```

## Task 2: git diff and append

Then run git diff and append the output to the file.

## Task 3: just-prompt multi-llm tool call

Then use that file as the input to this just-prompt tool call.

prompts_from_file_to_file(
    from_file = diff.md,
    models = "openai:o3-mini, anthropic:claude-3-7-sonnet-20250219:4k, gemini:gemini-2.0-flash-thinking-exp"
    output_dir = ultra_diff_review/
)

## Task 4: Read the output files and synthesize

Then read the output files and think hard to synthesize the results into a new single file called `ultra_diff_review/fusion_ultra_diff_review.md` following the original instructions plus any additional instructions or callouts you think are needed to create the best possible review.

## Task 5: Present the results

Then let me know which issues you think are worth resolving and we'll proceed from there.
</file>

<file path="src/just_prompt/server.py">
"""
MCP server for just-prompt.
"""

import asyncio
import logging
import os
from typing import List, Dict, Any, Optional
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent
from pydantic import BaseModel, Field
from .atoms.shared.utils import DEFAULT_MODEL
from .atoms.shared.validator import print_provider_availability
from .molecules.prompt import prompt
from .molecules.prompt_from_file import prompt_from_file
from .molecules.prompt_from_file_to_file import prompt_from_file_to_file
from .molecules.list_providers import list_providers as list_providers_func
from .molecules.list_models import list_models as list_models_func
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s [%(levelname)s] %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S'
)
logger = logging.getLogger(__name__)

# Tool names enum
class JustPromptTools:
    PROMPT = "prompt"
    PROMPT_FROM_FILE = "prompt_from_file"
    PROMPT_FROM_FILE_TO_FILE = "prompt_from_file_to_file"
    LIST_PROVIDERS = "list_providers"
    LIST_MODELS = "list_models"

# Schema classes for MCP tools
class PromptSchema(BaseModel):
    text: str = Field(..., description="The prompt text")
    models_prefixed_by_provider: Optional[List[str]] = Field(
        None, 
        description="List of models with provider prefixes (e.g., 'openai:gpt-4o' or 'o:gpt-4o'). If not provided, uses default models."
    )

class PromptFromFileSchema(BaseModel):
    file: str = Field(..., description="Path to the file containing the prompt")
    models_prefixed_by_provider: Optional[List[str]] = Field(
        None, 
        description="List of models with provider prefixes (e.g., 'openai:gpt-4o' or 'o:gpt-4o'). If not provided, uses default models."
    )

class PromptFromFileToFileSchema(BaseModel):
    file: str = Field(..., description="Path to the file containing the prompt")
    models_prefixed_by_provider: Optional[List[str]] = Field(
        None, 
        description="List of models with provider prefixes (e.g., 'openai:gpt-4o' or 'o:gpt-4o'). If not provided, uses default models."
    )
    output_dir: str = Field(
        default=".", 
        description="Directory to save the response files to (default: current directory)"
    )

class ListProvidersSchema(BaseModel):
    pass

class ListModelsSchema(BaseModel):
    provider: str = Field(..., description="Provider to list models for (e.g., 'openai' or 'o')")


async def serve(default_models: str = DEFAULT_MODEL) -> None:
    """
    Start the MCP server.
    
    Args:
        default_models: Comma-separated list of default models to use for prompts and corrections
    """
    # Set global default models for prompts and corrections
    os.environ["DEFAULT_MODELS"] = default_models
    
    # Parse default models into a list
    default_models_list = [model.strip() for model in default_models.split(",")]
    
    # Set the first model as the correction model
    correction_model = default_models_list[0] if default_models_list else "o:gpt-4o-mini"
    os.environ["CORRECTION_MODEL"] = correction_model
    
    logger.info(f"Starting server with default models: {default_models}")
    logger.info(f"Using correction model: {correction_model}")
    
    # Check and log provider availability
    print_provider_availability()
    
    # Create the MCP server
    server = Server("just-prompt")
    
    @server.list_tools()
    async def list_tools() -> List[Tool]:
        """Register all available tools with the MCP server."""
        return [
            Tool(
                name=JustPromptTools.PROMPT,
                description="Send a prompt to multiple LLM models",
                inputSchema=PromptSchema.schema(),
            ),
            Tool(
                name=JustPromptTools.PROMPT_FROM_FILE,
                description="Send a prompt from a file to multiple LLM models",
                inputSchema=PromptFromFileSchema.schema(),
            ),
            Tool(
                name=JustPromptTools.PROMPT_FROM_FILE_TO_FILE,
                description="Send a prompt from a file to multiple LLM models and save responses to files",
                inputSchema=PromptFromFileToFileSchema.schema(),
            ),
            Tool(
                name=JustPromptTools.LIST_PROVIDERS,
                description="List all available LLM providers",
                inputSchema=ListProvidersSchema.schema(),
            ),
            Tool(
                name=JustPromptTools.LIST_MODELS,
                description="List all available models for a specific LLM provider",
                inputSchema=ListModelsSchema.schema(),
            ),
        ]
    
    @server.call_tool()
    async def call_tool(name: str, arguments: Dict[str, Any]) -> List[TextContent]:
        """Handle tool calls from the MCP client."""
        logger.info(f"Tool call: {name}, arguments: {arguments}")
        
        try:
            if name == JustPromptTools.PROMPT:
                models_to_use = arguments.get("models_prefixed_by_provider")
                responses = prompt(arguments["text"], models_to_use)
                
                # Get the model names that were actually used
                models_used = models_to_use if models_to_use else [model.strip() for model in os.environ.get("DEFAULT_MODELS", DEFAULT_MODEL).split(",")]
                
                return [TextContent(
                    type="text",
                    text="\n".join([f"Model: {models_used[i]}\nResponse: {resp}" 
                                  for i, resp in enumerate(responses)])
                )]
                
            elif name == JustPromptTools.PROMPT_FROM_FILE:
                models_to_use = arguments.get("models_prefixed_by_provider")
                responses = prompt_from_file(arguments["file"], models_to_use)
                
                # Get the model names that were actually used
                models_used = models_to_use if models_to_use else [model.strip() for model in os.environ.get("DEFAULT_MODELS", DEFAULT_MODEL).split(",")]
                
                return [TextContent(
                    type="text",
                    text="\n".join([f"Model: {models_used[i]}\nResponse: {resp}" 
                                  for i, resp in enumerate(responses)])
                )]
                
            elif name == JustPromptTools.PROMPT_FROM_FILE_TO_FILE:
                output_dir = arguments.get("output_dir", ".")
                models_to_use = arguments.get("models_prefixed_by_provider")
                file_paths = prompt_from_file_to_file(
                    arguments["file"], 
                    models_to_use,
                    output_dir
                )
                return [TextContent(
                    type="text",
                    text=f"Responses saved to:\n" + "\n".join(file_paths)
                )]
                
            elif name == JustPromptTools.LIST_PROVIDERS:
                providers = list_providers_func()
                provider_text = "\nAvailable Providers:\n"
                for provider in providers:
                    provider_text += f"- {provider['name']}: full_name='{provider['full_name']}', short_name='{provider['short_name']}'\n"
                return [TextContent(
                    type="text",
                    text=provider_text
                )]
                
            elif name == JustPromptTools.LIST_MODELS:
                models = list_models_func(arguments["provider"])
                return [TextContent(
                    type="text",
                    text=f"Models for provider '{arguments['provider']}':\n" + 
                         "\n".join([f"- {model}" for model in models])
                )]
                
            else:
                return [TextContent(
                    type="text",
                    text=f"Unknown tool: {name}"
                )]
                
        except Exception as e:
            logger.error(f"Error handling tool call: {name}, error: {e}")
            return [TextContent(
                type="text",
                text=f"Error: {str(e)}"
            )]
    
    # Initialize and run the server
    try:
        options = server.create_initialization_options()
        async with stdio_server() as (read_stream, write_stream):
            await server.run(read_stream, write_stream, options, raise_exceptions=True)
    except Exception as e:
        logger.error(f"Error running server: {e}")
        raise
</file>

<file path="README.md">
# Just Prompt - A lightweight MCP server for LLM providers

`just-prompt` is a Model Control Protocol (MCP) server that provides a unified interface to various Large Language Model (LLM) providers including OpenAI, Anthropic, Google Gemini, Groq, DeepSeek, and Ollama.

## Tools

The following MCP tools are available in the server:

- **`prompt`**: Send a prompt to multiple LLM models
  - Parameters:
    - `text`: The prompt text
    - `models_prefixed_by_provider` (optional): List of models with provider prefixes. If not provided, uses default models.

- **`prompt_from_file`**: Send a prompt from a file to multiple LLM models
  - Parameters:
    - `file`: Path to the file containing the prompt
    - `models_prefixed_by_provider` (optional): List of models with provider prefixes. If not provided, uses default models.

- **`prompt_from_file_to_file`**: Send a prompt from a file to multiple LLM models and save responses as markdown files
  - Parameters:
    - `file`: Path to the file containing the prompt
    - `models_prefixed_by_provider` (optional): List of models with provider prefixes. If not provided, uses default models.
    - `output_dir` (default: "."): Directory to save the response markdown files to

- **`list_providers`**: List all available LLM providers
  - Parameters: None

- **`list_models`**: List all available models for a specific LLM provider
  - Parameters:
    - `provider`: Provider to list models for (e.g., 'openai' or 'o')

## Provider Prefixes
> every model must be prefixed with the provider name
>
> use the short name for faster referencing

- `o` or `openai`: OpenAI 
  - `o:gpt-4o-mini`
  - `openai:gpt-4o-mini`
- `a` or `anthropic`: Anthropic 
  - `a:claude-3-5-haiku`
  - `anthropic:claude-3-5-haiku`
- `g` or `gemini`: Google Gemini 
  - `g:gemini-2.5-pro-exp-03-25`
  - `gemini:gemini:gemini-2.5-pro-exp-03-25`
- `q` or `groq`: Groq 
  - `q:llama-3.1-70b-versatile`
  - `groq:llama-3.1-70b-versatile`
- `d` or `deepseek`: DeepSeek 
  - `d:deepseek-coder`
  - `deepseek:deepseek-coder`
- `l` or `ollama`: Ollama 
  - `l:llama3.1`
  - `ollama:llama3.1`

## Features

- Unified API for multiple LLM providers
- Support for text prompts from strings or files
- Run multiple models in parallel
- Automatic model name correction using the first model in the `--default-models` list
- Ability to save responses to files
- Easy listing of available providers and models

## Installation

```bash
# Clone the repository
git clone https://github.com/yourusername/just-prompt.git
cd just-prompt

# Install with pip
uv sync
```

### Environment Variables

Create a `.env` file with your API keys (you can copy the `.env.sample` file):

```bash
cp .env.sample .env
```

Then edit the `.env` file to add your API keys (or export them in your shell):

```
OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
GROQ_API_KEY=your_groq_api_key_here
DEEPSEEK_API_KEY=your_deepseek_api_key_here
OLLAMA_HOST=http://localhost:11434
```

## Claude Code Installation

Default model set to `anthropic:claude-3-7-sonnet-20250219`.

If you use Claude Code right out of the repository you can see in the .mcp.json file we set the default models to...

```
{
  "mcpServers": {
    "just-prompt": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "--directory",
        ".",
        "run",
        "just-prompt",
        "--default-models",
        "anthropic:claude-3-7-sonnet-20250219,openai:o3-mini,gemini:gemini-2.5-pro-exp-03-25"
      ],
      "env": {}
    }
  }
}
```

The `--default-models` parameter sets the models to use when none are explicitly provided to the API endpoints. The first model in the list is also used for model name correction when needed. This can be a list of models separated by commas.

When starting the server, it will automatically check which API keys are available in your environment and inform you which providers you can use. If a key is missing, the provider will be listed as unavailable, but the server will still start and can be used with the providers that are available.

### Using `mcp add-json`

Copy this and paste it into claude code with BUT don't run until you copy the json

```
claude mcp add just-prompt "$(pbpaste)"
```

JSON to copy

```
{
    "command": "uv",
    "args": ["--directory", ".", "run", "just-prompt"]
}
```

With a custom default model set to `openai:gpt-4o`.

```
{
    "command": "uv",
    "args": ["--directory", ".", "run", "just-prompt", "--default-models", "openai:gpt-4o"]
}
```

With multiple default models:

```
{
    "command": "uv",
    "args": ["--directory", ".", "run", "just-prompt", "--default-models", "anthropic:claude-3-7-sonnet-20250219,openai:gpt-4o,gemini:gemini-2.5-pro-exp-03-25"]
}
```

### Using `mcp add` with project scope

```bash
# With default model (anthropic:claude-3-7-sonnet-20250219)
claude mcp add just-prompt -s project \
  -- \
    uv --directory . \
    run just-prompt

# With custom default model
claude mcp add just-prompt -s project \
  -- \
  uv --directory . \
  run just-prompt --default-models "openai:gpt-4o"

# With multiple default models
claude mcp add just-prompt -s user \
  -- \
  uv --directory . \
  run just-prompt --default-models "anthropic:claude-3-7-sonnet-20250219:4k,openai:o3-mini,gemini:gemini-2.0-flash,openai:gpt-4.5-preview,gemini:gemini-2.5-pro-exp-03-25"
```


## `mcp remove`

claude mcp remove just-prompt

## Running Tests

```bash
uv run pytest
```

## Codebase Structure

```
.
├── ai_docs/                   # Documentation for AI model details
│   ├── llm_providers_details.xml
│   └── pocket-pick-mcp-server-example.xml
├── list_models.py             # Script to list available LLM models
├── pyproject.toml             # Python project configuration
├── specs/                     # Project specifications
│   └── init-just-prompt.md
├── src/                       # Source code directory
│   └── just_prompt/
│       ├── __init__.py
│       ├── __main__.py
│       ├── atoms/             # Core components
│       │   ├── llm_providers/ # Individual provider implementations
│       │   │   ├── anthropic.py
│       │   │   ├── deepseek.py
│       │   │   ├── gemini.py
│       │   │   ├── groq.py
│       │   │   ├── ollama.py
│       │   │   └── openai.py
│       │   └── shared/        # Shared utilities and data types
│       │       ├── data_types.py
│       │       ├── model_router.py
│       │       ├── utils.py
│       │       └── validator.py
│       ├── molecules/         # Higher-level functionality
│       │   ├── list_models.py
│       │   ├── list_providers.py
│       │   ├── prompt.py
│       │   ├── prompt_from_file.py
│       │   └── prompt_from_file_to_file.py
│       ├── server.py          # MCP server implementation
│       └── tests/             # Test directory
│           ├── atoms/         # Tests for atoms
│           │   ├── llm_providers/
│           │   └── shared/
│           └── molecules/     # Tests for molecules
```

## Context Priming
READ README.md, then run git ls-files, and 'eza --git-ignore --tree' to understand the context of the project.

## Thinking Tokens with Claude

The Anthropic Claude model `claude-3-7-sonnet-20250219` supports extended thinking capabilities using thinking tokens. This allows Claude to do more thorough thought processes before answering.

You can enable thinking tokens by adding a suffix to the model name in this format:
- `anthropic:claude-3-7-sonnet-20250219:1k` - Use 1024 thinking tokens
- `anthropic:claude-3-7-sonnet-20250219:4k` - Use 4096 thinking tokens
- `anthropic:claude-3-7-sonnet-20250219:8000` - Use 8000 thinking tokens

Example usage:
```bash
# Using 4k thinking tokens with Claude
uv run just-prompt prompt "Analyze the advantages and disadvantages of quantum computing vs classical computing" \
  --models-prefixed-by-provider anthropic:claude-3-7-sonnet-20250219:4k
```

Notes:
- Thinking tokens are only supported for the `claude-3-7-sonnet-20250219` model
- Valid thinking token budgets range from 1024 to 16000
- Values outside this range will be automatically adjusted to be within range
- You can specify the budget with k notation (1k, 4k, etc.) or with exact numbers (1024, 4096, etc.)

## Resources
- https://docs.anthropic.com/en/api/models-list?q=list+models
- https://github.com/googleapis/python-genai
- https://platform.openai.com/docs/api-reference/models/list
- https://api-docs.deepseek.com/api/list-models
- https://github.com/ollama/ollama-python
- https://github.com/openai/openai-python
</file>

</files>

```