# Directory Structure

```
├── .github
│   ├── FUNDING.yml
│   └── workflows
│       ├── claude-code-review.yml
│       └── claude.yml
├── .gitignore
├── LICENSE
├── package-lock.json
├── package.json
├── README.md
├── src
│   ├── autoloader.ts
│   ├── index.ts
│   ├── schemas.ts
│   ├── server.ts
│   ├── tools
│   │   ├── chat.ts
│   │   ├── copy.ts
│   │   ├── create.ts
│   │   ├── delete.ts
│   │   ├── embed.ts
│   │   ├── generate.ts
│   │   ├── list.ts
│   │   ├── ps.ts
│   │   ├── pull.ts
│   │   ├── push.ts
│   │   ├── show.ts
│   │   ├── web-fetch.ts
│   │   └── web-search.ts
│   ├── types.ts
│   └── utils
│       ├── http-error.ts
│       ├── response-formatter.ts
│       ├── retry-config.ts
│       └── retry.ts
├── tests
│   ├── autoloader.test.ts
│   ├── index.test.ts
│   ├── integration
│   │   └── server.test.ts
│   ├── schemas
│   │   └── chat-input.test.ts
│   ├── tools
│   │   ├── chat.test.ts
│   │   ├── copy.test.ts
│   │   ├── create.test.ts
│   │   ├── delete.test.ts
│   │   ├── embed.test.ts
│   │   ├── generate.test.ts
│   │   ├── list.test.ts
│   │   ├── ps.test.ts
│   │   ├── pull.test.ts
│   │   ├── push.test.ts
│   │   ├── show.test.ts
│   │   ├── web-fetch.test.ts
│   │   └── web-search.test.ts
│   └── utils
│       ├── http-error.test.ts
│       ├── response-formatter.test.ts
│       ├── retry-config.test.ts
│       └── retry.test.ts
├── tsconfig.json
└── vitest.config.ts
```

# Files

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Dependencies
node_modules/

# Compiled output
dist/

# Test coverage
coverage/

# Logs
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# Environment variables
.env

# IDE
.idea/
.vscode/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
<div align="center">

# 🦙 Ollama MCP Server

**Supercharge your AI assistant with local LLM access**

[![License: AGPL-3.0](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
[![TypeScript](https://img.shields.io/badge/TypeScript-5.7-blue)](https://www.typescriptlang.org/)
[![MCP](https://img.shields.io/badge/MCP-1.0-green)](https://github.com/anthropics/model-context-protocol)
[![Coverage](https://img.shields.io/badge/Coverage-96%25-brightgreen)](https://github.com/rawveg/ollama-mcp)

An MCP (Model Context Protocol) server that exposes the complete Ollama SDK as MCP tools, enabling seamless integration between your local LLM models and MCP-compatible applications like Claude Desktop and Cline.

[Features](#-features) • [Installation](#-installation) • [Available Tools](#-available-tools) • [Configuration](#-configuration) • [Retry Behavior](#-retry-behavior) • [Development](#-development)

</div>

---

## ✨ Features

- ☁️ **Ollama Cloud Support** - Full integration with Ollama's cloud platform
- 🔧 **13 Comprehensive Tools** - Full access to Ollama's SDK functionality
- 🔄 **Hot-Swap Architecture** - Automatic tool discovery with zero-config
- 🎯 **Type-Safe** - Built with TypeScript and Zod validation
- 📊 **High Test Coverage** - 96%+ coverage with comprehensive test suite
- 🚀 **Lightweight** - Minimal dependencies, maximum performance
- 🔌 **Drop-in Integration** - Works with Claude Desktop, Cline, and other MCP clients
- 🌐 **Web Search & Fetch** - Real-time web search and content extraction via Ollama Cloud
- 🔀 **Hybrid Mode** - Use local and cloud models seamlessly in one server

## 💡 Level Up Your Ollama Experience with Claude Code and Desktop

### The Complete Package: Tools + Knowledge

This MCP server gives Claude the **tools** to interact with Ollama - but you'll get even more value by also installing the **Ollama Skill** from the [Skillsforge Marketplace](https://github.com/rawveg/skillsforge-marketplace):

- 🚗 **This MCP = The Car** - All the tools and capabilities
- 🎓 **Ollama Skill = Driving Lessons** - Expert knowledge on how to use them effectively

The Ollama Skill teaches Claude:
- Best practices for model selection and configuration
- Optimal prompting strategies for different Ollama models
- When to use chat vs generate, embeddings, and other tools
- Performance optimization and troubleshooting
- Advanced features like tool calling and function support

**Install both for the complete experience:**
1. ✅ This MCP server (tools)
2. ✅ [Ollama Skill](https://github.com/rawveg/skillsforge-marketplace) (expertise)

Result: Claude doesn't just have the car - it knows how to drive! 🏎️

## 📦 Installation

### Quick Start with Claude Desktop

Add to your Claude Desktop config (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):

```json
{
  "mcpServers": {
    "ollama": {
      "command": "npx",
      "args": ["-y", "ollama-mcp"]
    }
  }
}
```

### Global Installation

```bash
npm install -g ollama-mcp
```

### For Cline (VS Code)

Add to your Cline MCP settings (`cline_mcp_settings.json`):

```json
{
  "mcpServers": {
    "ollama": {
      "command": "npx",
      "args": ["-y", "ollama-mcp"]
    }
  }
}
```

## 🛠️ Available Tools

### Model Management
| Tool | Description |
|------|-------------|
| `ollama_list` | List all available local models |
| `ollama_show` | Get detailed information about a specific model |
| `ollama_pull` | Download models from Ollama library |
| `ollama_push` | Push models to Ollama library |
| `ollama_copy` | Create a copy of an existing model |
| `ollama_delete` | Remove models from local storage |
| `ollama_create` | Create custom models from Modelfile |

### Model Operations
| Tool | Description |
|------|-------------|
| `ollama_ps` | List currently running models |
| `ollama_generate` | Generate text completions |
| `ollama_chat` | Interactive chat with models (supports tools/functions) |
| `ollama_embed` | Generate embeddings for text |

### Web Tools (Ollama Cloud)
| Tool | Description |
|------|-------------|
| `ollama_web_search` | Search the web with customizable result limits (requires `OLLAMA_API_KEY`) |
| `ollama_web_fetch` | Fetch and parse web page content (requires `OLLAMA_API_KEY`) |

> **Note:** Web tools require an Ollama Cloud API key. They connect to `https://ollama.com/api` for web search and fetch operations.

## ⚙️ Configuration

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `OLLAMA_HOST` | `http://127.0.0.1:11434` | Ollama server endpoint (use `https://ollama.com` for cloud) |
| `OLLAMA_API_KEY` | - | API key for Ollama Cloud (required for web tools and cloud models) |

### Custom Ollama Host

```json
{
  "mcpServers": {
    "ollama": {
      "command": "npx",
      "args": ["-y", "ollama-mcp"],
      "env": {
        "OLLAMA_HOST": "http://localhost:11434"
      }
    }
  }
}
```

### Ollama Cloud Configuration

To use Ollama's cloud platform with web search and fetch capabilities:

```json
{
  "mcpServers": {
    "ollama": {
      "command": "npx",
      "args": ["-y", "ollama-mcp"],
      "env": {
        "OLLAMA_HOST": "https://ollama.com",
        "OLLAMA_API_KEY": "your-ollama-cloud-api-key"
      }
    }
  }
}
```

**Cloud Features:**
- ☁️ Access cloud-hosted models
- 🔍 Web search with `ollama_web_search` (requires API key)
- 📄 Web fetch with `ollama_web_fetch` (requires API key)
- 🚀 Faster inference on cloud infrastructure

**Get your API key:** Visit [ollama.com](https://ollama.com) to sign up and obtain your API key.

### Hybrid Mode (Local + Cloud)

You can use both local and cloud models by pointing to your local Ollama instance while providing an API key:

```json
{
  "mcpServers": {
    "ollama": {
      "command": "npx",
      "args": ["-y", "ollama-mcp"],
      "env": {
        "OLLAMA_HOST": "http://127.0.0.1:11434",
        "OLLAMA_API_KEY": "your-ollama-cloud-api-key"
      }
    }
  }
}
```

This configuration:
- ✅ Runs local models from your Ollama instance
- ✅ Enables cloud-only web search and fetch tools
- ✅ Best of both worlds: privacy + web connectivity

## 🔄 Retry Behavior

The MCP server includes intelligent retry logic for handling transient failures when communicating with Ollama APIs:

### Automatic Retry Strategy

**Web Tools (`ollama_web_search` and `ollama_web_fetch`):**
- Automatically retry on rate limit errors (HTTP 429)
- Maximum of **3 retry attempts** (4 total requests including initial)
- **Request timeout:** 30 seconds per request (prevents hung connections)
- Respects the `Retry-After` header when provided by the API
- Falls back to exponential backoff with jitter when `Retry-After` is not present

### Retry-After Header Support

The server intelligently handles the standard HTTP `Retry-After` header in two formats:

**1. Delay-Seconds Format:**
```
Retry-After: 60
```
Waits exactly 60 seconds before retrying.

**2. HTTP-Date Format:**
```
Retry-After: Wed, 21 Oct 2025 07:28:00 GMT
```
Calculates delay until the specified timestamp.
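
For illustration, here is a minimal sketch of parsing both formats into a millisecond delay. The helper name is hypothetical; the project's actual logic lives in `src/utils/retry.ts`, which is not reproduced in this excerpt.

```typescript
/**
 * Hypothetical helper: convert a Retry-After header value to a delay in ms.
 * Returns undefined for missing or unparsable values so the caller can fall
 * back to exponential backoff.
 */
export function parseRetryAfterMs(retryAfter?: string): number | undefined {
  if (!retryAfter) return undefined;

  // Delay-seconds format, e.g. "60"
  const seconds = Number(retryAfter);
  if (Number.isFinite(seconds) && seconds >= 0) {
    return seconds * 1000;
  }

  // HTTP-date format, e.g. "Wed, 21 Oct 2025 07:28:00 GMT"
  const dateMs = Date.parse(retryAfter);
  if (!Number.isNaN(dateMs)) {
    return Math.max(0, dateMs - Date.now());
  }

  return undefined;
}
```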

### Exponential Backoff

When `Retry-After` is not provided or invalid:
- **Initial delay:** 1 second (default)
- **Maximum delay:** 10 seconds (default, configurable)
- **Strategy:** Exponential backoff with full jitter
- **Formula:** `random(0, min(initialDelay × 2^attempt, maxDelay))`

**Example retry delays:**
- 1st retry: 0-1 seconds
- 2nd retry: 0-2 seconds
- 3rd retry: 0-4 seconds (capped at 0-10s max)
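
As a rough sketch of the formula above (illustrative only, not the project's `retry.ts`):

```typescript
/**
 * Full-jitter exponential backoff: random(0, min(initialDelay * 2^attempt, maxDelay)).
 * attempt is zero-based: 0 for the first retry, 1 for the second, and so on.
 */
function backoffDelayMs(attempt: number, initialDelay = 1000, maxDelay = 10000): number {
  const cap = Math.min(initialDelay * 2 ** attempt, maxDelay);
  return Math.random() * cap;
}
```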

### Error Handling

**Retried Errors (transient failures):**
- HTTP 429 (Too Many Requests) - rate limiting
- HTTP 500 (Internal Server Error) - transient server issues
- HTTP 502 (Bad Gateway) - gateway/proxy received invalid response
- HTTP 503 (Service Unavailable) - server temporarily unable to handle request
- HTTP 504 (Gateway Timeout) - gateway/proxy did not receive timely response

**Non-Retried Errors (permanent failures):**
- Request timeouts (30 second limit exceeded)
- Network timeouts (no status code)
- Abort/cancel errors
- HTTP 4xx errors (except 429) - client errors requiring changes
- Other HTTP 5xx errors (501, 505, 506, 508, etc.) - configuration/implementation issues

The retry mechanism ensures robust handling of temporary API issues while respecting server-provided retry guidance and preventing excessive request rates. Transient 5xx errors (500, 502, 503, 504) are safe to retry for the idempotent POST operations used by `ollama_web_search` and `ollama_web_fetch`. Individual requests timeout after 30 seconds to prevent indefinitely hung connections.
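
One way to picture the per-request timeout is a minimal `AbortController` wrapper around `fetch` (illustrative only; the actual web tool implementations are not shown in this excerpt):

```typescript
// Illustrative sketch: abort a request that exceeds the 30-second limit.
async function fetchWithTimeout(
  url: string,
  init: RequestInit = {},
  timeoutMs = 30000
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, { ...init, signal: controller.signal });
  } finally {
    clearTimeout(timer);
  }
}
```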

## 🎯 Usage Examples

### Chat with a Model

```typescript
// MCP clients can invoke:
{
  "tool": "ollama_chat",
  "arguments": {
    "model": "llama3.2:latest",
    "messages": [
      { "role": "user", "content": "Explain quantum computing" }
    ]
  }
}
```

### Generate Embeddings

```typescript
{
  "tool": "ollama_embed",
  "arguments": {
    "model": "nomic-embed-text",
    "input": ["Hello world", "Embeddings are great"]
  }
}
```

### Web Search

```typescript
{
  "tool": "ollama_web_search",
  "arguments": {
    "query": "latest AI developments",
    "max_results": 5
  }
}
```

## 🏗️ Architecture

This server uses a **hot-swap autoloader** pattern:

```
src/
├── index.ts          # Entry point
├── server.ts         # MCP server creation
├── autoloader.ts     # Dynamic tool discovery
└── tools/            # Tool implementations
    ├── chat.ts       # Each exports toolDefinition
    ├── generate.ts
    └── ...
```

**Key Benefits:**
- Add new tools by dropping files in `src/tools/`
- Zero server code changes required
- Each tool is independently testable
- 100% function coverage on all tools
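
A hypothetical sketch of how the discovered tools could be wired into an MCP server follows. This is not the project's `server.ts` (not reproduced here); it assumes the standard MCP SDK handler registration and simplifies format selection to JSON for brevity.

```typescript
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import {
  ListToolsRequestSchema,
  CallToolRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
import { Ollama } from 'ollama';
import { discoverTools } from './autoloader.js';
import { ResponseFormat } from './types.js';

// Sketch: register every discovered tool without touching server code again.
const tools = await discoverTools();
const ollama = new Ollama({ host: process.env.OLLAMA_HOST });

const server = new Server(
  { name: 'ollama-mcp', version: '2.1.0' },
  { capabilities: { tools: {} } }
);

server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: tools.map(({ name, description, inputSchema }) => ({ name, description, inputSchema })),
}));

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const tool = tools.find((t) => t.name === request.params.name);
  if (!tool) throw new Error(`Unknown tool: ${request.params.name}`);
  const text = await tool.handler(ollama, request.params.arguments ?? {}, ResponseFormat.JSON);
  return { content: [{ type: 'text', text }] };
});
```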

## 🧪 Development

### Prerequisites

- Node.js v16+
- npm or pnpm
- Ollama running locally

### Setup

```bash
# Clone repository
git clone https://github.com/rawveg/ollama-mcp.git
cd ollama-mcp

# Install dependencies
npm install

# Build project
npm run build

# Run tests
npm test

# Run tests with coverage
npm run test:coverage
```

### Test Coverage

```
Statements   : 96.37%
Branches     : 84.82%
Functions    : 100%
Lines        : 96.37%
```

### Adding a New Tool

1. Create `src/tools/your-tool.ts`:

```typescript
import { ToolDefinition } from '../autoloader.js';
import { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';

export const toolDefinition: ToolDefinition = {
  name: 'ollama_your_tool',
  description: 'Your tool description',
  inputSchema: {
    type: 'object',
    properties: {
      param: { type: 'string' }
    },
    required: ['param']
  },
  handler: async (ollama, args, format) => {
    // Implementation
    return 'result';
  }
};
```

2. Create tests in `tests/tools/your-tool.test.ts`
3. Done! The autoloader discovers it automatically.

## 🤝 Contributing

Contributions are welcome! Please follow these guidelines:

1. **Fork** the repository
2. **Create** a feature branch (`git checkout -b feature/amazing-feature`)
3. **Write tests** - We maintain 96%+ coverage
4. **Commit** with clear messages (`git commit -m 'Add amazing feature'`)
5. **Push** to your branch (`git push origin feature/amazing-feature`)
6. **Open** a Pull Request

### Code Quality Standards

- All new tools must export `toolDefinition`
- Maintain ≥80% test coverage
- Follow existing TypeScript patterns
- Use Zod schemas for input validation

## 📄 License

This project is licensed under the **GNU Affero General Public License v3.0** (AGPL-3.0).

See [LICENSE](LICENSE) for details.

## 🔗 Related Projects

- [Skillsforge Marketplace](https://github.com/rawveg/skillsforge-marketplace) - Claude Code skills including the Ollama Skill
- [Ollama](https://ollama.ai) - Get up and running with large language models locally
- [Model Context Protocol](https://github.com/anthropics/model-context-protocol) - Open standard for AI assistant integration
- [Claude Desktop](https://claude.ai/desktop) - Anthropic's desktop application
- [Cline](https://github.com/cline/cline) - VS Code AI assistant

## 🙏 Acknowledgments

Built with:
- [Ollama SDK](https://github.com/ollama/ollama-js) - Official Ollama JavaScript library
- [MCP SDK](https://github.com/anthropics/model-context-protocol) - Model Context Protocol SDK
- [Zod](https://zod.dev) - TypeScript-first schema validation

---

<div align="center">

**[⬆ back to top](#-ollama-mcp-server)**

Made with ❤️ by [Tim Green](https://github.com/rawveg)

</div>

```

--------------------------------------------------------------------------------
/vitest.config.ts:
--------------------------------------------------------------------------------

```typescript
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    globals: true,
    environment: 'node',
    coverage: {
      provider: 'v8',
      reporter: ['text', 'html', 'lcov'],
      include: ['src/**/*.ts'],
      exclude: ['src/**/*.test.ts', 'src/**/*.d.ts', 'src/types.ts'],
      lines: 100,
      functions: 100,
      branches: 100,
      statements: 100,
    },
  },
});

```

--------------------------------------------------------------------------------
/tsconfig.json:
--------------------------------------------------------------------------------

```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "ES2022",
    "lib": ["ES2022"],
    "moduleResolution": "node",
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true,
    "resolveJsonModule": true,
    "declaration": true,
    "declarationMap": true,
    "sourceMap": true,
    "types": ["node", "vitest/globals"]
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist"]
}

```

--------------------------------------------------------------------------------
/src/utils/http-error.ts:
--------------------------------------------------------------------------------

```typescript
/**
 * Custom error class for HTTP errors with status codes
 */
export class HttpError extends Error {
  /**
   * Create an HTTP error
   * @param message - Error message
   * @param status - HTTP status code
   * @param retryAfter - Optional Retry-After header value (seconds or date string)
   */
  constructor(
    message: string,
    public status: number,
    public retryAfter?: string
  ) {
    super(message);
    this.name = 'HttpError';

    // Maintains proper stack trace for where our error was thrown (only available on V8)
    if (Error.captureStackTrace) {
      Error.captureStackTrace(this, HttpError);
    }
  }
}

```
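
A brief, hypothetical usage sketch showing how a non-OK fetch response could be surfaced together with its `Retry-After` hint (the call site below is illustrative, not taken from the web tool sources):

```typescript
import { HttpError } from './http-error.js';

// Hypothetical call site: raise an HttpError carrying status and Retry-After.
function assertOk(response: Response): void {
  if (!response.ok) {
    throw new HttpError(
      `Request failed with status ${response.status}`,
      response.status,
      response.headers.get('retry-after') ?? undefined
    );
  }
}
```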

--------------------------------------------------------------------------------
/.github/FUNDING.yml:
--------------------------------------------------------------------------------

```yaml
# These are supported funding model platforms

github: rawveg
patreon: # Replace with a single Patreon username
open_collective: # Replace with a single Open Collective username
ko_fi: # Replace with a single Ko-fi username
tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel
community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry
liberapay: # Replace with a single Liberapay username
issuehunt: # Replace with a single IssueHunt username
lfx_crowdfunding: # Replace with a single LFX Crowdfunding project-name e.g., cloud-foundry
polar: # Replace with a single Polar username
buy_me_a_coffee: rawveg
thanks_dev: # Replace with a single thanks.dev username
custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2']

```

--------------------------------------------------------------------------------
/src/utils/retry-config.ts:
--------------------------------------------------------------------------------

```typescript
/**
 * Retry configuration constants for web API calls
 *
 * These values are used by web-fetch and web-search tools when calling
 * the Ollama Cloud API endpoints. They align with the default values
 * in RetryOptions but are extracted here for consistency and maintainability.
 */

import { RetryOptions } from './retry.js';

/**
 * Standard retry configuration for web API calls
 *
 * - maxRetries: 3 retry attempts after the initial call
 * - initialDelay: 1000ms (1 second) before first retry
 * - maxDelay: Uses default from RetryOptions (10000ms)
 */
export const WEB_API_RETRY_CONFIG: RetryOptions = {
  maxRetries: 3,
  initialDelay: 1000,
} as const;

/**
 * Request timeout for web API calls
 *
 * Individual requests timeout after 30 seconds to prevent
 * indefinitely hung connections.
 */
export const WEB_API_TIMEOUT = 30000 as const;

```

--------------------------------------------------------------------------------
/tests/utils/retry-config.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect } from 'vitest';
import { WEB_API_RETRY_CONFIG, WEB_API_TIMEOUT } from '../../src/utils/retry-config.js';

describe('Retry Configuration Constants', () => {
  it('should define WEB_API_RETRY_CONFIG with correct default values', () => {
    // Assert - Verify configuration matches documented defaults
    expect(WEB_API_RETRY_CONFIG).toEqual({
      maxRetries: 3,
      initialDelay: 1000,
    });
  });

  it('should define WEB_API_TIMEOUT with correct value', () => {
    // Assert - Verify timeout matches documented 30 second limit
    expect(WEB_API_TIMEOUT).toBe(30000);
  });

  it('should use values consistent with retry.ts defaults', () => {
    // Assert - Ensure retry config doesn't override maxDelay
    // (maxDelay should use the default 10000ms from RetryOptions)
    expect(WEB_API_RETRY_CONFIG).not.toHaveProperty('maxDelay');
  });
});

```

--------------------------------------------------------------------------------
/tests/autoloader.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect } from 'vitest';
import { discoverTools } from '../src/autoloader.js';

describe('Tool Autoloader', () => {
  it('should discover all .ts files in tools directory', async () => {
    const tools = await discoverTools();

    expect(tools).toBeDefined();
    expect(Array.isArray(tools)).toBe(true);
    expect(tools.length).toBeGreaterThan(0);
  });

  it('should discover tool metadata from each file', async () => {
    const tools = await discoverTools();

    // Check that each tool has required metadata
    tools.forEach((tool) => {
      expect(tool).toHaveProperty('name');
      expect(tool).toHaveProperty('description');
      expect(tool).toHaveProperty('inputSchema');
      expect(tool).toHaveProperty('handler');

      expect(typeof tool.name).toBe('string');
      expect(typeof tool.description).toBe('string');
      expect(typeof tool.inputSchema).toBe('object');
      expect(typeof tool.handler).toBe('function');
    });
  });
});

```

--------------------------------------------------------------------------------
/src/tools/ps.ts:
--------------------------------------------------------------------------------

```typescript
import type { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';
import { formatResponse } from '../utils/response-formatter.js';
import type { ToolDefinition } from '../autoloader.js';
import { PsInputSchema } from '../schemas.js';

/**
 * List running models
 */
export async function listRunningModels(
  ollama: Ollama,
  format: ResponseFormat
): Promise<string> {
  const response = await ollama.ps();

  return formatResponse(JSON.stringify(response), format);
}

export const toolDefinition: ToolDefinition = {
  name: 'ollama_ps',
  description:
    'List running models. Shows which models are currently loaded in memory.',
  inputSchema: {
    type: 'object',
    properties: {
      format: {
        type: 'string',
        enum: ['json', 'markdown'],
        default: 'json',
      },
    },
  },
  handler: async (ollama: Ollama, args: Record<string, unknown>, format: ResponseFormat) => {
    PsInputSchema.parse(args);
    return listRunningModels(ollama, format);
  },
};

```

--------------------------------------------------------------------------------
/src/tools/list.ts:
--------------------------------------------------------------------------------

```typescript
import type { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';
import { formatResponse } from '../utils/response-formatter.js';
import type { ToolDefinition } from '../autoloader.js';

/**
 * List all available models
 */
export async function listModels(
  ollama: Ollama,
  format: ResponseFormat
): Promise<string> {
  const response = await ollama.list();

  return formatResponse(JSON.stringify(response), format);
}

/**
 * Tool metadata definition
 */
export const toolDefinition: ToolDefinition = {
  name: 'ollama_list',
  description:
    'List all available Ollama models installed locally. Returns model names, sizes, and modification dates.',
  inputSchema: {
    type: 'object',
    properties: {
      format: {
        type: 'string',
        enum: ['json', 'markdown'],
        description: 'Output format (default: json)',
        default: 'json',
      },
    },
  },
  handler: async (ollama: Ollama, args: Record<string, unknown>, format: ResponseFormat) => {
    return listModels(ollama, format);
  },
};

```

--------------------------------------------------------------------------------
/src/index.ts:
--------------------------------------------------------------------------------

```typescript
#!/usr/bin/env node

/**
 * Ollama MCP Server - Main entry point
 * Exposes Ollama SDK functionality through MCP tools
 */

import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { createServer } from './server.js';

/**
 * Main function to start the MCP server
 * Exported for testing purposes
 */
export async function main() {
  const server = createServer();
  const transport = new StdioServerTransport();
  await server.connect(transport);

  // Handle shutdown gracefully
  process.on('SIGINT', async () => {
    await server.close();
    process.exit(0);
  });

  return { server, transport };
}

// Only run if this is the main module (not being imported for testing)
// Check both direct execution and npx execution
const isMain = import.meta.url.startsWith('file://') &&
               (import.meta.url === `file://${process.argv[1]}` ||
                process.argv[1]?.includes('ollama-mcp'));

if (isMain) {
  main().catch((error) => {
    console.error('Fatal error:', error);
    process.exit(1);
  });
}

```

--------------------------------------------------------------------------------
/tests/tools/delete.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { Ollama } from 'ollama';
import { deleteModel, toolDefinition } from '../../src/tools/delete.js';
import { ResponseFormat } from '../../src/types.js';

describe('deleteModel', () => {
  let ollama: Ollama;
  let mockDelete: ReturnType<typeof vi.fn>;

  beforeEach(() => {
    mockDelete = vi.fn();
    ollama = {
      delete: mockDelete,
    } as any;
  });

  it('should delete a model', async () => {
    mockDelete.mockResolvedValue({
      status: 'success',
    });

    const result = await deleteModel(
      ollama,
      'my-model:latest',
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
    expect(mockDelete).toHaveBeenCalledWith({
      model: 'my-model:latest',
    });
  });

  it('should work through toolDefinition handler', async () => {
    const result = await toolDefinition.handler(
      ollama,
      { model: 'model-to-delete:latest', format: 'json' },
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
  });

});
```

--------------------------------------------------------------------------------
/tests/tools/push.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { Ollama } from 'ollama';
import { pushModel, toolDefinition } from '../../src/tools/push.js';
import { ResponseFormat } from '../../src/types.js';

describe('pushModel', () => {
  let ollama: Ollama;
  let mockPush: ReturnType<typeof vi.fn>;

  beforeEach(() => {
    mockPush = vi.fn();
    ollama = {
      push: mockPush,
    } as any;
  });

  it('should push a model to registry', async () => {
    mockPush.mockResolvedValue({
      status: 'success',
    });

    const result = await pushModel(
      ollama,
      'my-model:latest',
      false,
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
    expect(mockPush).toHaveBeenCalledWith({
      model: 'my-model:latest',
      insecure: false,
      stream: false,
    });
  });

  it('should work through toolDefinition handler', async () => {
    const result = await toolDefinition.handler(
      ollama,
      { model: 'my-model:latest', format: 'json' },
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
  });

});
```

--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------

```json
{
  "name": "ollama-mcp",
  "version": "2.1.0",
  "description": "MCP server for Ollama - exposes all Ollama SDK functionality through MCP tools",
  "type": "module",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",
  "bin": {
    "ollama-mcp": "dist/index.js"
  },
  "files": [
    "dist",
    "README.md",
    "LICENSE"
  ],
  "repository": {
    "type": "git",
    "url": "git+https://github.com/rawveg/ollama-mcp.git"
  },
  "bugs": {
    "url": "https://github.com/rawveg/ollama-mcp/issues"
  },
  "homepage": "https://github.com/rawveg/ollama-mcp#readme",
  "scripts": {
    "build": "tsc",
    "test": "vitest run",
    "test:coverage": "vitest run --coverage",
    "dev": "node dist/index.js",
    "prepare": "npm run build"
  },
  "keywords": [
    "mcp",
    "ollama",
    "ai",
    "llm"
  ],
  "author": "Tim Green <[email protected]>",
  "license": "AGPL-3.0",
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.0.4",
    "json2md": "^2.0.3",
    "markdown-table": "^3.0.4",
    "ollama": "^0.5.11",
    "zod": "^3.24.1"
  },
  "devDependencies": {
    "@types/node": "^22.10.5",
    "@vitest/coverage-v8": "^2.1.9",
    "typescript": "^5.7.3",
    "vitest": "^2.1.9"
  }
}

```

--------------------------------------------------------------------------------
/src/tools/delete.ts:
--------------------------------------------------------------------------------

```typescript
import type { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';
import { formatResponse } from '../utils/response-formatter.js';
import type { ToolDefinition } from '../autoloader.js';
import { DeleteModelInputSchema } from '../schemas.js';

/**
 * Delete a model
 */
export async function deleteModel(
  ollama: Ollama,
  model: string,
  format: ResponseFormat
): Promise<string> {
  const response = await ollama.delete({
    model,
  });

  return formatResponse(JSON.stringify(response), format);
}

export const toolDefinition: ToolDefinition = {
  name: 'ollama_delete',
  description:
    'Delete a model from local storage. Removes the model and frees up disk space.',
  inputSchema: {
    type: 'object',
    properties: {
      model: {
        type: 'string',
        description: 'Name of the model to delete',
      },
      format: {
        type: 'string',
        enum: ['json', 'markdown'],
        default: 'json',
      },
    },
    required: ['model'],
  },
  handler: async (ollama: Ollama, args: Record<string, unknown>, format: ResponseFormat) => {
    const validated = DeleteModelInputSchema.parse(args);
    return deleteModel(ollama, validated.model, format);
  },
};

```

--------------------------------------------------------------------------------
/tests/tools/pull.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { Ollama } from 'ollama';
import { pullModel, toolDefinition } from '../../src/tools/pull.js';
import { ResponseFormat } from '../../src/types.js';

describe('pullModel', () => {
  let ollama: Ollama;
  let mockPull: ReturnType<typeof vi.fn>;

  beforeEach(() => {
    mockPull = vi.fn();
    ollama = {
      pull: mockPull,
    } as any;
  });

  it('should pull a model from registry', async () => {
    mockPull.mockResolvedValue({
      status: 'success',
    });

    const result = await pullModel(
      ollama,
      'llama3.2:latest',
      false,
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
    expect(mockPull).toHaveBeenCalledTimes(1);
    expect(mockPull).toHaveBeenCalledWith({
      model: 'llama3.2:latest',
      insecure: false,
      stream: false,
    });

    const parsed = JSON.parse(result);
    expect(parsed).toHaveProperty('status');
  });

  it('should work through toolDefinition handler', async () => {
    const result = await toolDefinition.handler(
      ollama,
      { model: 'llama3.2:latest', format: 'json' },
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
  });

});
```

--------------------------------------------------------------------------------
/tests/tools/copy.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { Ollama } from 'ollama';
import { copyModel, toolDefinition } from '../../src/tools/copy.js';
import { ResponseFormat } from '../../src/types.js';

describe('copyModel', () => {
  let ollama: Ollama;
  let mockCopy: ReturnType<typeof vi.fn>;

  beforeEach(() => {
    mockCopy = vi.fn();
    ollama = {
      copy: mockCopy,
    } as any;
  });

  it('should copy a model', async () => {
    mockCopy.mockResolvedValue({
      status: 'success',
    });

    const result = await copyModel(
      ollama,
      'model-a:latest',
      'model-b:latest',
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
    expect(mockCopy).toHaveBeenCalledWith({
      source: 'model-a:latest',
      destination: 'model-b:latest',
    });
  });

  it('should work through toolDefinition handler', async () => {
    mockCopy.mockResolvedValue({
      status: 'success',
    });

    const result = await toolDefinition.handler(
      ollama,
      {
        source: 'model-a:latest',
        destination: 'model-b:latest',
        format: 'json',
      },
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
    expect(mockCopy).toHaveBeenCalledTimes(1);
  });
});

```

--------------------------------------------------------------------------------
/src/tools/show.ts:
--------------------------------------------------------------------------------

```typescript
import type { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';
import { formatResponse } from '../utils/response-formatter.js';
import type { ToolDefinition } from '../autoloader.js';
import { ShowModelInputSchema } from '../schemas.js';

/**
 * Show information about a specific model
 */
export async function showModel(
  ollama: Ollama,
  model: string,
  format: ResponseFormat
): Promise<string> {
  const response = await ollama.show({ model });

  return formatResponse(JSON.stringify(response), format);
}

export const toolDefinition: ToolDefinition = {
  name: 'ollama_show',
  description:
    'Show detailed information about a specific model including modelfile, parameters, and architecture details.',
  inputSchema: {
    type: 'object',
    properties: {
      model: {
        type: 'string',
        description: 'Name of the model to show',
      },
      format: {
        type: 'string',
        enum: ['json', 'markdown'],
        description: 'Output format (default: json)',
        default: 'json',
      },
    },
    required: ['model'],
  },
  handler: async (ollama: Ollama, args: Record<string, unknown>, format: ResponseFormat) => {
    const validated = ShowModelInputSchema.parse(args);
    return showModel(ollama, validated.model, format);
  },
};

```

--------------------------------------------------------------------------------
/tests/tools/ps.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { Ollama } from 'ollama';
import { listRunningModels, toolDefinition } from '../../src/tools/ps.js';
import { ResponseFormat } from '../../src/types.js';

describe('listRunningModels', () => {
  let ollama: Ollama;
  let mockPs: ReturnType<typeof vi.fn>;

  beforeEach(() => {
    mockPs = vi.fn();
    ollama = {
      ps: mockPs,
    } as any;
  });

  it('should list running models', async () => {
    mockPs.mockResolvedValue({
      models: [
        {
          name: 'llama3.2:latest',
          size: 5000000000,
        },
      ],
    });

    const result = await listRunningModels(ollama, ResponseFormat.JSON);

    expect(typeof result).toBe('string');
    expect(mockPs).toHaveBeenCalledTimes(1);

    const parsed = JSON.parse(result);
    expect(parsed).toHaveProperty('models');
  });

  it('should work through toolDefinition handler', async () => {
    mockPs.mockResolvedValue({
      models: [
        {
          name: 'llama3.2:latest',
          size: 5000000000,
        },
      ],
    });

    const result = await toolDefinition.handler(
      ollama,
      { format: 'json' },
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
    expect(mockPs).toHaveBeenCalledTimes(1);
  });
});

```

--------------------------------------------------------------------------------
/src/tools/copy.ts:
--------------------------------------------------------------------------------

```typescript
import type { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';
import { formatResponse } from '../utils/response-formatter.js';
import type { ToolDefinition } from '../autoloader.js';
import { CopyModelInputSchema } from '../schemas.js';

/**
 * Copy a model
 */
export async function copyModel(
  ollama: Ollama,
  source: string,
  destination: string,
  format: ResponseFormat
): Promise<string> {
  const response = await ollama.copy({
    source,
    destination,
  });

  return formatResponse(JSON.stringify(response), format);
}

export const toolDefinition: ToolDefinition = {
  name: 'ollama_copy',
  description:
    'Copy a model. Creates a duplicate of an existing model with a new name.',
  inputSchema: {
    type: 'object',
    properties: {
      source: {
        type: 'string',
        description: 'Name of the source model',
      },
      destination: {
        type: 'string',
        description: 'Name for the copied model',
      },
      format: {
        type: 'string',
        enum: ['json', 'markdown'],
        default: 'json',
      },
    },
    required: ['source', 'destination'],
  },
  handler: async (ollama: Ollama, args: Record<string, unknown>, format: ResponseFormat) => {
    const validated = CopyModelInputSchema.parse(args);
    return copyModel(ollama, validated.source, validated.destination, format);
  },
};

```

--------------------------------------------------------------------------------
/tests/tools/create.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { Ollama } from 'ollama';
import { createModel, toolDefinition } from '../../src/tools/create.js';
import { ResponseFormat } from '../../src/types.js';

describe('createModel', () => {
  let ollama: Ollama;
  let mockCreate: ReturnType<typeof vi.fn>;

  beforeEach(() => {
    mockCreate = vi.fn();
    ollama = {
      create: mockCreate,
    } as any;
  });

  it('should create a model with structured parameters', async () => {
    mockCreate.mockResolvedValue({
      status: 'success',
    });

    const result = await createModel(
      ollama,
      {
        model: 'my-model:latest',
        from: 'llama3.2',
        system: 'You are a helpful assistant',
      },
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
    expect(mockCreate).toHaveBeenCalledWith({
      model: 'my-model:latest',
      from: 'llama3.2',
      system: 'You are a helpful assistant',
      template: undefined,
      license: undefined,
      stream: false,
    });
  });

  it('should work through toolDefinition handler', async () => {
    mockCreate.mockResolvedValue({
      status: 'success',
    });

    const result = await toolDefinition.handler(
      ollama,
      { model: 'my-custom-model:latest', from: 'llama3.2', format: 'json' },
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
  });

});
```

--------------------------------------------------------------------------------
/src/tools/push.ts:
--------------------------------------------------------------------------------

```typescript
import type { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';
import { formatResponse } from '../utils/response-formatter.js';
import type { ToolDefinition } from '../autoloader.js';
import { PushModelInputSchema } from '../schemas.js';

/**
 * Push a model to the Ollama registry
 */
export async function pushModel(
  ollama: Ollama,
  model: string,
  insecure: boolean,
  format: ResponseFormat
): Promise<string> {
  const response = await ollama.push({
    model,
    insecure,
    stream: false,
  });

  return formatResponse(JSON.stringify(response), format);
}

export const toolDefinition: ToolDefinition = {
  name: 'ollama_push',
  description:
    'Push a model to the Ollama registry. Uploads a local model to make it available remotely.',
  inputSchema: {
    type: 'object',
    properties: {
      model: {
        type: 'string',
        description: 'Name of the model to push',
      },
      insecure: {
        type: 'boolean',
        description: 'Allow insecure connections',
        default: false,
      },
      format: {
        type: 'string',
        enum: ['json', 'markdown'],
        default: 'json',
      },
    },
    required: ['model'],
  },
  handler: async (ollama: Ollama, args: Record<string, unknown>, format: ResponseFormat) => {
    const validated = PushModelInputSchema.parse(args);
    return pushModel(ollama, validated.model, validated.insecure, format);
  },
};

```

--------------------------------------------------------------------------------
/src/tools/pull.ts:
--------------------------------------------------------------------------------

```typescript
import type { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';
import { formatResponse } from '../utils/response-formatter.js';
import type { ToolDefinition } from '../autoloader.js';
import { PullModelInputSchema } from '../schemas.js';

/**
 * Pull a model from the Ollama registry
 */
export async function pullModel(
  ollama: Ollama,
  model: string,
  insecure: boolean,
  format: ResponseFormat
): Promise<string> {
  const response = await ollama.pull({
    model,
    insecure,
    stream: false,
  });

  return formatResponse(JSON.stringify(response), format);
}

export const toolDefinition: ToolDefinition = {
  name: 'ollama_pull',
  description:
    'Pull a model from the Ollama registry. Downloads the model to make it available locally.',
  inputSchema: {
    type: 'object',
    properties: {
      model: {
        type: 'string',
        description: 'Name of the model to pull',
      },
      insecure: {
        type: 'boolean',
        description: 'Allow insecure connections',
        default: false,
      },
      format: {
        type: 'string',
        enum: ['json', 'markdown'],
        default: 'json',
      },
    },
    required: ['model'],
  },
  handler: async (ollama: Ollama, args: Record<string, unknown>, format: ResponseFormat) => {
    const validated = PullModelInputSchema.parse(args);
    return pullModel(ollama, validated.model, validated.insecure, format);
  },
};

```

--------------------------------------------------------------------------------
/tests/tools/embed.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { Ollama } from 'ollama';
import { embedWithModel, toolDefinition } from '../../src/tools/embed.js';
import { ResponseFormat } from '../../src/types.js';

describe('embedWithModel', () => {
  let ollama: Ollama;
  let mockEmbed: ReturnType<typeof vi.fn>;

  beforeEach(() => {
    mockEmbed = vi.fn();
    ollama = {
      embed: mockEmbed,
    } as any;
  });

  it('should generate embeddings for single input', async () => {
    mockEmbed.mockResolvedValue({
      embeddings: [[0.1, 0.2, 0.3, 0.4, 0.5]],
    });

    const result = await embedWithModel(
      ollama,
      'llama3.2:latest',
      'Hello world',
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
    expect(mockEmbed).toHaveBeenCalledTimes(1);
    expect(mockEmbed).toHaveBeenCalledWith({
      model: 'llama3.2:latest',
      input: 'Hello world',
    });

    const parsed = JSON.parse(result);
    expect(parsed).toHaveProperty('embeddings');
    expect(Array.isArray(parsed.embeddings)).toBe(true);
  });

  it('should work through toolDefinition handler', async () => {
    mockEmbed.mockResolvedValue({
      embeddings: [[0.1, 0.2, 0.3]],
    });

    const result = await toolDefinition.handler(
      ollama,
      { model: 'llama3.2:latest', input: 'Test input', format: 'json' },
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
  });

});
```

--------------------------------------------------------------------------------
/src/autoloader.ts:
--------------------------------------------------------------------------------

```typescript
import { readdir } from 'fs/promises';
import { join, dirname } from 'path';
import { fileURLToPath } from 'url';
import type { Ollama } from 'ollama';
import { ResponseFormat } from './types.js';

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

/**
 * Represents a tool's metadata and handler function
 */
export interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: 'object';
    properties: Record<string, unknown>;
    required?: string[];
  };
  handler: (
    ollama: Ollama,
    args: Record<string, unknown>,
    format: ResponseFormat
  ) => Promise<string>;
}

/**
 * Discover and load all tools from the tools directory
 */
export async function discoverTools(): Promise<ToolDefinition[]> {
  const toolsDir = join(__dirname, 'tools');
  const files = await readdir(toolsDir);

  // Filter for .js files (production) or .ts files (development)
  // Exclude test files and declaration files
  const toolFiles = files.filter(
    (file) =>
      (file.endsWith('.js') || file.endsWith('.ts')) &&
      !file.includes('.test.') &&
      !file.endsWith('.d.ts')
  );

  const tools: ToolDefinition[] = [];

  for (const file of toolFiles) {
    const toolPath = join(toolsDir, file);
    const module = await import(toolPath);

    // Check if module exports tool metadata
    if (module.toolDefinition) {
      tools.push(module.toolDefinition);
    }
  }

  return tools;
}

```

--------------------------------------------------------------------------------
/src/tools/embed.ts:
--------------------------------------------------------------------------------

```typescript
import type { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';
import { formatResponse } from '../utils/response-formatter.js';
import type { ToolDefinition } from '../autoloader.js';
import { EmbedInputSchema } from '../schemas.js';

/**
 * Generate embeddings for text input
 */
export async function embedWithModel(
  ollama: Ollama,
  model: string,
  input: string | string[],
  format: ResponseFormat
): Promise<string> {
  const response = await ollama.embed({
    model,
    input,
  });

  return formatResponse(JSON.stringify(response), format);
}

export const toolDefinition: ToolDefinition = {
  name: 'ollama_embed',
  description:
    'Generate embeddings for text input. Returns numerical vector representations.',
  inputSchema: {
    type: 'object',
    properties: {
      model: {
        type: 'string',
        description: 'Name of the model to use',
      },
      input: {
        type: 'string',
        description:
          'Text input. For batch processing, provide a JSON-encoded array of strings, e.g., ["text1", "text2"]',
      },
      format: {
        type: 'string',
        enum: ['json', 'markdown'],
        default: 'json',
      },
    },
    required: ['model', 'input'],
  },
  handler: async (ollama: Ollama, args: Record<string, unknown>, format: ResponseFormat) => {
    const validated = EmbedInputSchema.parse(args);
    return embedWithModel(ollama, validated.model, validated.input, format);
  },
};

```

--------------------------------------------------------------------------------
/tests/tools/list.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { Ollama } from 'ollama';
import { listModels } from '../../src/tools/list.js';
import { ResponseFormat } from '../../src/types.js';

describe('listModels', () => {
  let ollama: Ollama;
  let mockList: ReturnType<typeof vi.fn>;

  beforeEach(() => {
    mockList = vi.fn();
    ollama = {
      list: mockList,
    } as any;
  });

  it('should return formatted model list in JSON format', async () => {
    mockList.mockResolvedValue({
      models: [
        {
          name: 'llama3.2:latest',
          modified_at: '2024-01-01T00:00:00Z',
          size: 5000000000,
          digest: 'abc123',
        },
      ],
    });

    const result = await listModels(ollama, ResponseFormat.JSON);

    expect(typeof result).toBe('string');
    expect(mockList).toHaveBeenCalledTimes(1);

    const parsed = JSON.parse(result);
    expect(parsed).toHaveProperty('models');
    expect(Array.isArray(parsed.models)).toBe(true);
  });

  it('should return markdown format when specified', async () => {
    mockList.mockResolvedValue({
      models: [
        {
          name: 'llama3.2:latest',
          modified_at: '2024-01-01T00:00:00Z',
          size: 5000000000,
          digest: 'abc123',
        },
      ],
    });

    const result = await listModels(ollama, ResponseFormat.MARKDOWN);

    expect(typeof result).toBe('string');
    expect(mockList).toHaveBeenCalledTimes(1);
    // Markdown format should contain markdown table with headers
    expect(result).toContain('| name');
    expect(result).toContain('llama3.2:latest');
  });
});

```

--------------------------------------------------------------------------------
/tests/tools/show.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { Ollama } from 'ollama';
import { showModel, toolDefinition } from '../../src/tools/show.js';
import { ResponseFormat } from '../../src/types.js';

describe('showModel', () => {
  let ollama: Ollama;
  let mockShow: ReturnType<typeof vi.fn>;

  beforeEach(() => {
    mockShow = vi.fn();
    ollama = {
      show: mockShow,
    } as any;
  });

  it('should return model information in JSON format', async () => {
    mockShow.mockResolvedValue({
      modelfile: 'FROM llama3.2\nPARAMETER temperature 0.7',
      parameters: 'temperature 0.7',
      template: 'template content',
      details: {
        parent_model: '',
        format: 'gguf',
        family: 'llama',
        families: ['llama'],
        parameter_size: '3B',
        quantization_level: 'Q4_0',
      },
    });

    const result = await showModel(ollama, 'llama3.2:latest', ResponseFormat.JSON);

    expect(typeof result).toBe('string');
    expect(mockShow).toHaveBeenCalledWith({ model: 'llama3.2:latest' });
    expect(mockShow).toHaveBeenCalledTimes(1);

    const parsed = JSON.parse(result);
    expect(parsed).toHaveProperty('modelfile');
    expect(parsed).toHaveProperty('details');
  });

  it('should work through toolDefinition handler', async () => {
    mockShow.mockResolvedValue({
      modelfile: 'FROM llama3.2',
      details: { family: 'llama' },
    });

    const result = await toolDefinition.handler(
      ollama,
      { model: 'llama3.2:latest', format: 'json' },
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
  });

});
```

--------------------------------------------------------------------------------
/src/tools/generate.ts:
--------------------------------------------------------------------------------

```typescript
import type { Ollama } from 'ollama';
import type { GenerationOptions } from '../types.js';
import { ResponseFormat } from '../types.js';
import { formatResponse } from '../utils/response-formatter.js';
import type { ToolDefinition } from '../autoloader.js';
import { GenerateInputSchema } from '../schemas.js';

/**
 * Generate completion from a prompt
 */
export async function generateWithModel(
  ollama: Ollama,
  model: string,
  prompt: string,
  options: GenerationOptions,
  format: ResponseFormat
): Promise<string> {
  const response = await ollama.generate({
    model,
    prompt,
    options,
    format: format === ResponseFormat.JSON ? 'json' : undefined,
    stream: false,
  });

  return formatResponse(response.response, format);
}

export const toolDefinition: ToolDefinition = {
  name: 'ollama_generate',
  description:
    'Generate completion from a prompt. Simpler than chat, useful for single-turn completions.',
  inputSchema: {
    type: 'object',
    properties: {
      model: {
        type: 'string',
        description: 'Name of the model to use',
      },
      prompt: {
        type: 'string',
        description: 'The prompt to generate from',
      },
      options: {
        type: 'string',
        description: 'Generation options (optional). Provide as JSON object with settings like temperature, top_p, etc.',
      },
      format: {
        type: 'string',
        enum: ['json', 'markdown'],
        default: 'json',
      },
    },
    required: ['model', 'prompt'],
  },
  handler: async (ollama: Ollama, args: Record<string, unknown>, format: ResponseFormat) => {
    const validated = GenerateInputSchema.parse(args);
    return generateWithModel(
      ollama,
      validated.model,
      validated.prompt,
      validated.options || {},
      format
    );
  },
};

```

--------------------------------------------------------------------------------
/.github/workflows/claude.yml:
--------------------------------------------------------------------------------

```yaml
name: Claude Code

on:
  issue_comment:
    types: [created]
  pull_request_review_comment:
    types: [created]
  issues:
    types: [opened, assigned]
  pull_request_review:
    types: [submitted]

jobs:
  claude:
    if: |
      (github.event_name == 'issue_comment' && contains(github.event.comment.body, '@claude')) ||
      (github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude')) ||
      (github.event_name == 'pull_request_review' && contains(github.event.review.body, '@claude')) ||
      (github.event_name == 'issues' && (contains(github.event.issue.body, '@claude') || contains(github.event.issue.title, '@claude')))
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: read
      issues: read
      id-token: write
      actions: read # Required for Claude to read CI results on PRs
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 1

      - name: Run Claude Code
        id: claude
        uses: anthropics/claude-code-action@v1
        with:
          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}

          # This is an optional setting that allows Claude to read CI results on PRs
          additional_permissions: |
            actions: read

          # Optional: Give a custom prompt to Claude. If this is not specified, Claude will perform the instructions specified in the comment that tagged it.
          # prompt: 'Update the pull request description to include a summary of changes.'

          # Optional: Add claude_args to customize behavior and configuration
          # See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
          # or https://docs.claude.com/en/docs/claude-code/cli-reference for available options
          # claude_args: '--allowed-tools Bash(gh pr:*)'


```

--------------------------------------------------------------------------------
/.github/workflows/claude-code-review.yml:
--------------------------------------------------------------------------------

```yaml
name: Claude Code Review

on:
  pull_request:
    types: [opened, synchronize]
    # Optional: Only run on specific file changes
    # paths:
    #   - "src/**/*.ts"
    #   - "src/**/*.tsx"
    #   - "src/**/*.js"
    #   - "src/**/*.jsx"

jobs:
  claude-review:
    # Optional: Filter by PR author
    # if: |
    #   github.event.pull_request.user.login == 'external-contributor' ||
    #   github.event.pull_request.user.login == 'new-developer' ||
    #   github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR'

    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: read
      issues: read
      id-token: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 1

      - name: Run Claude Code Review
        id: claude-review
        uses: anthropics/claude-code-action@v1
        with:
          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          prompt: |
            REPO: ${{ github.repository }}
            PR NUMBER: ${{ github.event.pull_request.number }}

            Please review this pull request and provide feedback on:
            - Code quality and best practices
            - Potential bugs or issues
            - Performance considerations
            - Security concerns
            - Test coverage

            Use the repository's CLAUDE.md for guidance on style and conventions. Be constructive and helpful in your feedback.

            Use `gh pr comment` with your Bash tool to leave your review as a comment on the PR.

          # See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
          # or https://docs.claude.com/en/docs/claude-code/cli-reference for available options
          claude_args: '--allowed-tools "Bash(gh issue view:*),Bash(gh search:*),Bash(gh issue list:*),Bash(gh pr comment:*),Bash(gh pr diff:*),Bash(gh pr view:*),Bash(gh pr list:*)"'


```

--------------------------------------------------------------------------------
/tests/tools/generate.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { Ollama } from 'ollama';
import { generateWithModel, toolDefinition } from '../../src/tools/generate.js';
import { ResponseFormat } from '../../src/types.js';

describe('generateWithModel', () => {
  let ollama: Ollama;
  let mockGenerate: ReturnType<typeof vi.fn>;

  beforeEach(() => {
    mockGenerate = vi.fn();
    ollama = {
      generate: mockGenerate,
    } as any;
  });

  it('should generate completion from prompt', async () => {
    mockGenerate.mockResolvedValue({
      response: 'The sky appears blue because...',
      done: true,
    });

    const result = await generateWithModel(
      ollama,
      'llama3.2:latest',
      'Why is the sky blue?',
      {},
      ResponseFormat.MARKDOWN
    );

    expect(typeof result).toBe('string');
    expect(mockGenerate).toHaveBeenCalledTimes(1);
    expect(mockGenerate).toHaveBeenCalledWith({
      model: 'llama3.2:latest',
      prompt: 'Why is the sky blue?',
      options: {},
      stream: false,
    });
    expect(result).toContain('The sky appears blue because');
  });

  it('should use JSON format when ResponseFormat.JSON is specified', async () => {
    mockGenerate.mockResolvedValue({
      response: '{"answer": "test"}',
      done: true,
    });

    const result = await generateWithModel(
      ollama,
      'llama3.2:latest',
      'Test prompt',
      {},
      ResponseFormat.JSON
    );

    expect(mockGenerate).toHaveBeenCalledWith({
      model: 'llama3.2:latest',
      prompt: 'Test prompt',
      options: {},
      format: 'json',
      stream: false,
    });
  });

  it('should work through toolDefinition handler', async () => {
    mockGenerate.mockResolvedValue({ response: "test", done: true });
    const result = await toolDefinition.handler(
      ollama,
      { model: 'llama3.2:latest', prompt: 'Test prompt', format: 'json' },
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
  });

});
```

--------------------------------------------------------------------------------
/tests/index.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
import { main } from '../src/index.js';

describe('index (main entry point)', () => {
  let processOnSpy: ReturnType<typeof vi.fn>;
  let processExitSpy: ReturnType<typeof vi.fn>;
  let originalProcessOn: typeof process.on;
  let originalProcessExit: typeof process.exit;

  beforeEach(() => {
    // Store original process functions
    originalProcessOn = process.on;
    originalProcessExit = process.exit;

    // Mock process.on to capture SIGINT handler
    processOnSpy = vi.fn();
    process.on = processOnSpy as any;

    // Mock process.exit to prevent actual exits during testing
    processExitSpy = vi.fn();
    process.exit = processExitSpy as any;
  });

  afterEach(() => {
    // Restore original process functions
    process.on = originalProcessOn;
    process.exit = originalProcessExit;
    vi.restoreAllMocks();
  });

  it('should create and connect server', async () => {
    const result = await main();

    // Verify server and transport are returned
    expect(result).toHaveProperty('server');
    expect(result).toHaveProperty('transport');
    expect(result.server).toBeDefined();
    expect(result.transport).toBeDefined();
  });

  it('should register SIGINT handler for graceful shutdown', async () => {
    await main();

    // Verify SIGINT handler was registered
    expect(processOnSpy).toHaveBeenCalledWith('SIGINT', expect.any(Function));
  });

  it('should close server and exit on SIGINT', async () => {
    const result = await main();

    // Get the SIGINT handler that was registered
    const sigintHandler = processOnSpy.mock.calls.find(
      (call) => call[0] === 'SIGINT'
    )?.[1] as () => Promise<void>;

    expect(sigintHandler).toBeDefined();

    // Mock server.close
    const closeSpy = vi.spyOn(result.server, 'close').mockResolvedValue();

    // Call the SIGINT handler
    await sigintHandler();

    // Verify server.close was called and process.exit(0) was called
    expect(closeSpy).toHaveBeenCalled();
    expect(processExitSpy).toHaveBeenCalledWith(0);
  });
});

```

--------------------------------------------------------------------------------
/src/tools/create.ts:
--------------------------------------------------------------------------------

```typescript
import type { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';
import { formatResponse } from '../utils/response-formatter.js';
import type { ToolDefinition } from '../autoloader.js';
import { CreateModelInputSchema } from '../schemas.js';

export interface CreateModelOptions {
  model: string;
  from: string;
  system?: string;
  template?: string;
  license?: string;
}

/**
 * Create a model with structured parameters
 */
export async function createModel(
  ollama: Ollama,
  options: CreateModelOptions,
  format: ResponseFormat
): Promise<string> {
  const response = await ollama.create({
    model: options.model,
    from: options.from,
    system: options.system,
    template: options.template,
    license: options.license,
    stream: false,
  });

  return formatResponse(JSON.stringify(response), format);
}

export const toolDefinition: ToolDefinition = {
  name: 'ollama_create',
  description:
    'Create a new model with structured parameters. Allows customization of model behavior, system prompts, and templates.',
  inputSchema: {
    type: 'object',
    properties: {
      model: {
        type: 'string',
        description: 'Name for the new model',
      },
      from: {
        type: 'string',
        description: 'Base model to derive from (e.g., llama2, llama3)',
      },
      system: {
        type: 'string',
        description: 'System prompt for the model',
      },
      template: {
        type: 'string',
        description: 'Prompt template to use',
      },
      license: {
        type: 'string',
        description: 'License for the model',
      },
      format: {
        type: 'string',
        enum: ['json', 'markdown'],
        default: 'json',
      },
    },
    required: ['model', 'from'],
  },
  handler: async (ollama: Ollama, args: Record<string, unknown>, format: ResponseFormat) => {
    const validated = CreateModelInputSchema.parse(args);
    return createModel(
      ollama,
      {
        model: validated.model,
        from: validated.from,
        system: validated.system,
        template: validated.template,
        license: validated.license,
      },
      format
    );
  },
};

```
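
A minimal usage sketch of `toolDefinition.handler` as the MCP server invokes it: raw arguments are validated through `CreateModelInputSchema` before the SDK call. The host, model names, and system prompt are illustrative, and import paths are written as if the snippet sat alongside the tool in `src/tools/`.

```typescript
import { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';
import { toolDefinition as createTool } from './create.js';

// Assumption: a local Ollama instance on the default port.
const ollama = new Ollama({ host: 'http://127.0.0.1:11434' });

// Raw, untyped arguments as an MCP client would send them; the handler runs
// them through CreateModelInputSchema before calling ollama.create().
const output = await createTool.handler(
  ollama,
  {
    model: 'my-terse-llama',           // illustrative new model name
    from: 'llama3.2:latest',           // illustrative base model
    system: 'Answer in one sentence.',
    format: 'json',
  },
  ResponseFormat.JSON
);

console.log(output); // JSON string produced by formatResponse
```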

--------------------------------------------------------------------------------
/src/types.ts:
--------------------------------------------------------------------------------

```typescript
/**
 * Core types for Ollama MCP Server
 */

import type { Ollama } from 'ollama';

/**
 * Response format for tool outputs
 */
export enum ResponseFormat {
  MARKDOWN = 'markdown',
  JSON = 'json',
}

/**
 * Generation options that can be passed to Ollama models
 */
export interface GenerationOptions {
  temperature?: number;
  top_p?: number;
  top_k?: number;
  num_predict?: number;
  repeat_penalty?: number;
  seed?: number;
  stop?: string[];
}

/**
 * Message role for chat
 */
export type MessageRole = 'system' | 'user' | 'assistant';

/**
 * Chat message structure
 */
export interface ChatMessage {
  role: MessageRole;
  content: string;
  images?: string[];
  tool_calls?: ToolCall[];
}

/**
 * Tool definition for function calling
 */
export interface Tool {
  type: string;
  function: {
    name?: string;
    description?: string;
    parameters?: {
      type?: string;
      required?: string[];
      properties?: {
        [key: string]: {
          type?: string | string[];
          description?: string;
          enum?: any[];
        };
      };
    };
  };
}

/**
 * Tool call made by the model
 */
export interface ToolCall {
  function: {
    name: string;
    arguments: {
      [key: string]: any;
    };
  };
}

/**
 * Base tool context passed to all tool implementations
 */
export interface ToolContext {
  ollama: Ollama;
}

/**
 * Tool result with content and format
 */
export interface ToolResult {
  content: string;
  format: ResponseFormat;
}

/**
 * Error types specific to Ollama operations
 */
export class OllamaError extends Error {
  constructor(
    message: string,
    public readonly cause?: unknown
  ) {
    super(message);
    this.name = 'OllamaError';
  }
}

export class ModelNotFoundError extends OllamaError {
  constructor(modelName: string) {
    super(
      `Model not found: ${modelName}. Use ollama_list to see available models.`
    );
    this.name = 'ModelNotFoundError';
  }
}

export class NetworkError extends OllamaError {
  constructor(message: string, cause?: unknown) {
    super(message, cause);
    this.name = 'NetworkError';
  }
}

/**
 * Web search result
 */
export interface WebSearchResult {
  title: string;
  url: string;
  content: string;
}

/**
 * Web fetch result
 */
export interface WebFetchResult {
  title: string;
  content: string;
  links: string[];
}

```

--------------------------------------------------------------------------------
/src/tools/web-fetch.ts:
--------------------------------------------------------------------------------

```typescript
import type { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';
import { formatResponse } from '../utils/response-formatter.js';
import type { ToolDefinition } from '../autoloader.js';
import { WebFetchInputSchema } from '../schemas.js';
import { retryWithBackoff, fetchWithTimeout } from '../utils/retry.js';
import { HttpError } from '../utils/http-error.js';
import { WEB_API_RETRY_CONFIG, WEB_API_TIMEOUT } from '../utils/retry-config.js';

/**
 * Fetch a web page using Ollama's web fetch API
 */
export async function webFetch(
  ollama: Ollama,
  url: string,
  format: ResponseFormat
): Promise<string> {
  // Web fetch requires a direct API call as it's not in the SDK
  const apiKey = process.env.OLLAMA_API_KEY;
  if (!apiKey) {
    throw new Error(
      'OLLAMA_API_KEY environment variable is required for web fetch'
    );
  }

  return retryWithBackoff(
    async () => {
      const response = await fetchWithTimeout(
        'https://ollama.com/api/web_fetch',
        {
          method: 'POST',
          headers: {
            'Content-Type': 'application/json',
            Authorization: `Bearer ${apiKey}`,
          },
          body: JSON.stringify({
            url,
          }),
        },
        WEB_API_TIMEOUT
      );

      if (!response.ok) {
        const retryAfter = response.headers.get('retry-after') ?? undefined;
        throw new HttpError(
          `Web fetch failed: ${response.status} ${response.statusText}`,
          response.status,
          retryAfter
        );
      }

      const data = await response.json();
      return formatResponse(JSON.stringify(data), format);
    },
    WEB_API_RETRY_CONFIG
  );
}

export const toolDefinition: ToolDefinition = {
  name: 'ollama_web_fetch',
  description:
    'Fetch a web page by URL using Ollama\'s web fetch API. Returns the page title, content, and links. Requires OLLAMA_API_KEY environment variable.',
  inputSchema: {
    type: 'object',
    properties: {
      url: {
        type: 'string',
        description: 'The URL to fetch',
      },
      format: {
        type: 'string',
        enum: ['json', 'markdown'],
        default: 'json',
      },
    },
    required: ['url'],
  },
  handler: async (ollama: Ollama, args: Record<string, unknown>, format: ResponseFormat) => {
    const validated = WebFetchInputSchema.parse(args);
    return webFetch(ollama, validated.url, format);
  },
};

```
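
A sketch of calling `webFetch` directly and distinguishing its failure modes; it assumes `OLLAMA_API_KEY` is already set in the environment and uses an illustrative URL (imports as if the snippet sat in `src/tools/`).

```typescript
import { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';
import { HttpError } from '../utils/http-error.js';
import { webFetch } from './web-fetch.js';

const ollama = new Ollama(); // assumes the default local host

try {
  const page = await webFetch(ollama, 'https://example.com', ResponseFormat.MARKDOWN);
  console.log(page);
} catch (error) {
  if (error instanceof HttpError) {
    // Retries happen inside webFetch; an HttpError surfacing here means they
    // were exhausted (e.g. persistent 429) or the status was not retryable.
    console.error(`HTTP ${error.status}: ${error.message}`);
  } else {
    // Missing OLLAMA_API_KEY, request timeout, or a network-level failure.
    console.error(error);
  }
}
```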

--------------------------------------------------------------------------------
/tests/schemas/chat-input.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect } from 'vitest';
import { ChatInputSchema } from '../../src/schemas.js';

describe('ChatInputSchema', () => {
  it('should validate valid chat input with messages', () => {
    const input = {
      model: 'llama3.2:latest',
      messages: [
        { role: 'user', content: 'Hello' },
      ],
      format: 'json',
    };

    const result = ChatInputSchema.safeParse(input);
    expect(result.success).toBe(true);
  });

  it('should parse tools from JSON string', () => {
    const input = {
      model: 'llama3.2:latest',
      messages: [{ role: 'user', content: 'Test' }],
      tools: JSON.stringify([
        {
          type: 'function',
          function: { name: 'get_weather', description: 'Get weather' },
        },
      ]),
    };

    const result = ChatInputSchema.safeParse(input);
    expect(result.success).toBe(true);
    if (result.success) {
      expect(Array.isArray(result.data.tools)).toBe(true);
      expect(result.data.tools[0].function.name).toBe('get_weather');
    }
  });

  it('should default tools to empty array when not provided', () => {
    const input = {
      model: 'llama3.2:latest',
      messages: [{ role: 'user', content: 'Test' }],
    };

    const result = ChatInputSchema.safeParse(input);
    expect(result.success).toBe(true);
    if (result.success) {
      expect(result.data.tools).toEqual([]);
    }
  });

  it('should parse options from JSON string', () => {
    const input = {
      model: 'llama3.2:latest',
      messages: [{ role: 'user', content: 'Test' }],
      options: JSON.stringify({ temperature: 0.7, top_p: 0.9 }),
    };

    const result = ChatInputSchema.safeParse(input);
    expect(result.success).toBe(true);
    if (result.success) {
      expect(result.data.options).toEqual({ temperature: 0.7, top_p: 0.9 });
    }
  });

  it('should reject invalid JSON in tools field', () => {
    const input = {
      model: 'llama3.2:latest',
      messages: [{ role: 'user', content: 'Test' }],
      tools: 'not valid json{',
    };

    const result = ChatInputSchema.safeParse(input);
    expect(result.success).toBe(false);
  });

  it('should reject missing model field', () => {
    const input = {
      messages: [{ role: 'user', content: 'Test' }],
    };

    const result = ChatInputSchema.safeParse(input);
    expect(result.success).toBe(false);
  });

  it('should reject empty messages array', () => {
    const input = {
      model: 'llama3.2:latest',
      messages: [],
    };

    const result = ChatInputSchema.safeParse(input);
    expect(result.success).toBe(false);
  });
});

```

--------------------------------------------------------------------------------
/src/tools/web-search.ts:
--------------------------------------------------------------------------------

```typescript
import type { Ollama } from 'ollama';
import { ResponseFormat } from '../types.js';
import { formatResponse } from '../utils/response-formatter.js';
import type { ToolDefinition } from '../autoloader.js';
import { WebSearchInputSchema } from '../schemas.js';
import { retryWithBackoff, fetchWithTimeout } from '../utils/retry.js';
import { HttpError } from '../utils/http-error.js';
import { WEB_API_RETRY_CONFIG, WEB_API_TIMEOUT } from '../utils/retry-config.js';

/**
 * Perform a web search using Ollama's web search API
 */
export async function webSearch(
  ollama: Ollama,
  query: string,
  maxResults: number,
  format: ResponseFormat
): Promise<string> {
  // Web search requires a direct API call as it's not in the SDK
  const apiKey = process.env.OLLAMA_API_KEY;
  if (!apiKey) {
    throw new Error(
      'OLLAMA_API_KEY environment variable is required for web search'
    );
  }

  return retryWithBackoff(
    async () => {
      const response = await fetchWithTimeout(
        'https://ollama.com/api/web_search',
        {
          method: 'POST',
          headers: {
            'Content-Type': 'application/json',
            Authorization: `Bearer ${apiKey}`,
          },
          body: JSON.stringify({
            query,
            max_results: maxResults,
          }),
        },
        WEB_API_TIMEOUT
      );

      if (!response.ok) {
        const retryAfter = response.headers.get('retry-after') ?? undefined;
        throw new HttpError(
          `Web search failed: ${response.status} ${response.statusText}`,
          response.status,
          retryAfter
        );
      }

      const data = await response.json();
      return formatResponse(JSON.stringify(data), format);
    },
    WEB_API_RETRY_CONFIG
  );
}

export const toolDefinition: ToolDefinition = {
  name: 'ollama_web_search',
  description:
    'Perform a web search using Ollama\'s web search API. Augments models with latest information to reduce hallucinations. Requires OLLAMA_API_KEY environment variable.',
  inputSchema: {
    type: 'object',
    properties: {
      query: {
        type: 'string',
        description: 'The search query string',
      },
      max_results: {
        type: 'number',
        description: 'Maximum number of results to return (1-10, default 5)',
        default: 5,
      },
      format: {
        type: 'string',
        enum: ['json', 'markdown'],
        default: 'json',
      },
    },
    required: ['query'],
  },
  handler: async (ollama: Ollama, args: Record<string, unknown>, format: ResponseFormat) => {
    const validated = WebSearchInputSchema.parse(args);
    return webSearch(ollama, validated.query, validated.max_results, format);
  },
};

```
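
`webSearch` (like `webFetch`) delegates transient failures to `retryWithBackoff`. A standalone sketch of that utility with explicit `RetryOptions`; the endpoint and option values are illustrative, the defaults noted in the comment come from `src/utils/retry.ts`, and imports assume the snippet sits in `src/tools/`.

```typescript
import { HttpError } from '../utils/http-error.js';
import { retryWithBackoff, fetchWithTimeout } from '../utils/retry.js';

// Illustrative options; the utility's defaults are maxRetries: 3,
// initialDelay: 1000ms, maxDelay: 10000ms, and a 30000ms fetch timeout.
const body = await retryWithBackoff(
  async () => {
    const response = await fetchWithTimeout(
      'https://example.com/api', // illustrative endpoint
      { method: 'GET' },
      5000                       // per-attempt timeout in ms
    );

    if (!response.ok) {
      // Only 429/500/502/503/504 are retried; any other status is thrown
      // straight through to the caller.
      throw new HttpError(
        `Request failed: ${response.status} ${response.statusText}`,
        response.status,
        response.headers.get('retry-after') ?? undefined
      );
    }

    return response.text();
  },
  { maxRetries: 2, initialDelay: 500, maxDelay: 4000 }
);

console.log(body.length);
```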

--------------------------------------------------------------------------------
/tests/utils/http-error.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect } from 'vitest';
import { HttpError } from '../../src/utils/http-error.js';

describe('HttpError', () => {
  it('should create an HttpError with message and status', () => {
    // Arrange & Act
    const error = new HttpError('Not found', 404);

    // Assert
    expect(error).toBeInstanceOf(Error);
    expect(error).toBeInstanceOf(HttpError);
    expect(error.message).toBe('Not found');
    expect(error.status).toBe(404);
    expect(error.name).toBe('HttpError');
  });

  it('should create a rate limit error (429)', () => {
    // Arrange & Act
    const error = new HttpError('Rate limit exceeded', 429);

    // Assert
    expect(error.status).toBe(429);
    expect(error.message).toBe('Rate limit exceeded');
  });

  it('should create a server error (500)', () => {
    // Arrange & Act
    const error = new HttpError('Internal server error', 500);

    // Assert
    expect(error.status).toBe(500);
    expect(error.message).toBe('Internal server error');
  });

  it('should have correct error name', () => {
    // Arrange & Act
    const error = new HttpError('Bad request', 400);

    // Assert
    expect(error.name).toBe('HttpError');
  });

  it('should be throwable and catchable', () => {
    // Arrange
    const throwError = () => {
      throw new HttpError('Unauthorized', 401);
    };

    // Act & Assert
    expect(throwError).toThrow(HttpError);
    expect(throwError).toThrow('Unauthorized');

    try {
      throwError();
    } catch (error) {
      expect(error).toBeInstanceOf(HttpError);
      expect((error as HttpError).status).toBe(401);
    }
  });

  it('should preserve stack trace', () => {
    // Arrange & Act
    const error = new HttpError('Test error', 500);

    // Assert
    expect(error.stack).toBeDefined();
    expect(error.stack).toContain('HttpError');
  });

  it('should create an HttpError with Retry-After header', () => {
    // Arrange & Act
    const error = new HttpError('Rate limit exceeded', 429, '60');

    // Assert
    expect(error).toBeInstanceOf(HttpError);
    expect(error.status).toBe(429);
    expect(error.retryAfter).toBe('60');
  });

  it('should create an HttpError without Retry-After header', () => {
    // Arrange & Act
    const error = new HttpError('Rate limit exceeded', 429);

    // Assert
    expect(error).toBeInstanceOf(HttpError);
    expect(error.status).toBe(429);
    expect(error.retryAfter).toBeUndefined();
  });

  it('should handle Retry-After as HTTP-date string', () => {
    // Arrange & Act
    const dateString = 'Wed, 21 Oct 2025 07:28:00 GMT';
    const error = new HttpError('Rate limit exceeded', 429, dateString);

    // Assert
    expect(error.retryAfter).toBe(dateString);
  });
});

```

--------------------------------------------------------------------------------
/tests/utils/response-formatter.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect } from 'vitest';
import { formatResponse } from '../../src/utils/response-formatter.js';
import { ResponseFormat } from '../../src/types.js';

describe('formatResponse', () => {
  it('should return plain text as-is for markdown format', () => {
    const content = 'Hello, world!';
    const result = formatResponse(content, ResponseFormat.MARKDOWN);

    expect(result).toBe(content);
  });

  it('should convert JSON object to markdown format', () => {
    const jsonObject = { message: 'Hello', count: 42 };
    const content = JSON.stringify(jsonObject);
    const result = formatResponse(content, ResponseFormat.MARKDOWN);

    expect(result).toContain('**message:** Hello');
    expect(result).toContain('**count:** 42');
  });

  it('should convert JSON array to markdown table', () => {
    const content = JSON.stringify({
      models: [
        { name: 'model1', size: 100 },
        { name: 'model2', size: 200 },
      ],
    });
    const result = formatResponse(content, ResponseFormat.MARKDOWN);

    // Check for markdown table elements (markdown-table adds proper spacing)
    expect(result).toContain('| name');
    expect(result).toContain('| size');
    expect(result).toContain('model1');
    expect(result).toContain('model2');
    expect(result).toContain('100');
    expect(result).toContain('200');
  });

  it('should parse and stringify JSON content', () => {
    const jsonObject = { message: 'Hello', count: 42 };
    const content = JSON.stringify(jsonObject);
    const result = formatResponse(content, ResponseFormat.JSON);

    const parsed = JSON.parse(result);
    expect(parsed).toEqual(jsonObject);
  });

  it('should wrap non-JSON content in error object for JSON format', () => {
    const content = 'This is not JSON';
    const result = formatResponse(content, ResponseFormat.JSON);

    const parsed = JSON.parse(result);
    expect(parsed).toHaveProperty('error');
    expect(parsed.error).toContain('Invalid JSON');
    expect(parsed).toHaveProperty('raw_content');
  });

  it('should format object with array value', () => {
    const content = JSON.stringify({
      name: 'test',
      items: ['a', 'b', 'c'],
    });
    const result = formatResponse(content, ResponseFormat.MARKDOWN);

    expect(result).toContain('**name:** test');
    expect(result).toContain('**items:**');
    expect(result).toContain('- a');
  });

  it('should format object with nested object value', () => {
    const content = JSON.stringify({
      user: 'alice',
      details: { age: 30, city: 'NYC' },
    });
    const result = formatResponse(content, ResponseFormat.MARKDOWN);

    expect(result).toContain('**user:** alice');
    expect(result).toContain('**details:**');
    expect(result).toContain('**age:** 30');
  });
});

```

--------------------------------------------------------------------------------
/src/server.ts:
--------------------------------------------------------------------------------

```typescript
/**
 * MCP Server creation and configuration
 */

import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
import { Ollama } from 'ollama';
import { discoverTools } from './autoloader.js';
import { ResponseFormat } from './types.js';

/**
 * Create and configure the MCP server with tool handlers
 */
export function createServer(ollamaInstance?: Ollama): Server {
  // Initialize Ollama client
  const ollamaConfig: {
    host: string;
    headers?: Record<string, string>;
  } = {
    host: process.env.OLLAMA_HOST || 'http://127.0.0.1:11434',
  };

  // Add API key header if OLLAMA_API_KEY is set
  if (process.env.OLLAMA_API_KEY) {
    ollamaConfig.headers = {
      Authorization: `Bearer ${process.env.OLLAMA_API_KEY}`,
    };
  }

  const ollama = ollamaInstance || new Ollama(ollamaConfig);

  // Create MCP server
  const server = new Server(
    {
      name: 'ollama-mcp',
      version: '2.0.2',
    },
    {
      capabilities: {
        tools: {},
      },
    }
  );

  // Register tool list handler
  server.setRequestHandler(ListToolsRequestSchema, async () => {
    const tools = await discoverTools();

    return {
      tools: tools.map((tool) => ({
        name: tool.name,
        description: tool.description,
        inputSchema: tool.inputSchema,
      })),
    };
  });

  // Register tool call handler
  server.setRequestHandler(CallToolRequestSchema, async (request) => {
    try {
      const { name, arguments: args } = request.params;

      // Discover all tools
      const tools = await discoverTools();

      // Find the matching tool
      const tool = tools.find((t) => t.name === name);

      if (!tool) {
        throw new Error(`Unknown tool: ${name}`);
      }

      // Determine format from args (arguments may be omitted by the client)
      const toolArgs = (args ?? {}) as Record<string, unknown>;
      const formatArg = toolArgs.format;
      const format =
        formatArg === 'markdown' ? ResponseFormat.MARKDOWN : ResponseFormat.JSON;

      // Call the tool handler
      const result = await tool.handler(ollama, toolArgs, format);

      // Parse the result to extract structured data
      let structuredData: unknown = undefined;
      try {
        // Attempt to parse the result as JSON
        structuredData = JSON.parse(result);
      } catch {
        // If parsing fails, leave structuredData as undefined
        // This handles cases where the result is markdown or plain text
      }

      return {
        structuredContent: structuredData,
        content: [
          {
            type: 'text',
            text: result,
          },
        ],
      };
    } catch (error) {
      const errorMessage =
        error instanceof Error ? error.message : String(error);
      return {
        content: [
          {
            type: 'text',
            text: `Error: ${errorMessage}`,
          },
        ],
        isError: true,
      };
    }
  });

  return server;
}

```
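
`createServer` only builds the server; a transport still has to be attached. A minimal wiring sketch over stdio using the MCP SDK's `StdioServerTransport`, mirroring the shutdown behavior exercised in `tests/index.test.ts` (imports as if the snippet lived in `src/`).

```typescript
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { createServer } from './server.js';

// Build the server (OLLAMA_HOST / OLLAMA_API_KEY are read from the environment)
// and expose it over stdio so MCP clients such as Claude Desktop can connect.
const server = createServer();
const transport = new StdioServerTransport();

await server.connect(transport);

// Close the server and exit cleanly on Ctrl+C.
process.on('SIGINT', async () => {
  await server.close();
  process.exit(0);
});
```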

--------------------------------------------------------------------------------
/src/tools/chat.ts:
--------------------------------------------------------------------------------

```typescript
import type { Ollama } from 'ollama';
import type { ChatMessage, GenerationOptions, Tool } from '../types.js';
import { ResponseFormat } from '../types.js';
import { formatResponse } from '../utils/response-formatter.js';
import type { ToolDefinition } from '../autoloader.js';
import { ChatInputSchema } from '../schemas.js';

/**
 * Chat with a model using conversation messages
 */
export async function chatWithModel(
  ollama: Ollama,
  model: string,
  messages: ChatMessage[],
  options: GenerationOptions,
  format: ResponseFormat,
  tools?: Tool[]
): Promise<string> {
  // Determine format parameter for Ollama API
  let ollamaFormat: 'json' | undefined = undefined;
  if (format === ResponseFormat.JSON) {
    ollamaFormat = 'json';
  }

  const response = await ollama.chat({
    model,
    messages,
    tools,
    options,
    format: ollamaFormat,
    stream: false,
  });

  // Extract content with fallback
  let content = response.message.content;
  if (!content) {
    content = '';
  }

  const tool_calls = response.message.tool_calls;

  // If the response includes tool calls, include them in the output
  let hasToolCalls = false;
  if (tool_calls) {
    if (tool_calls.length > 0) {
      hasToolCalls = true;
    }
  }

  if (hasToolCalls) {
    const fullResponse = {
      content,
      tool_calls,
    };
    return formatResponse(JSON.stringify(fullResponse), format);
  }

  return formatResponse(content, format);
}

export const toolDefinition: ToolDefinition = {
  name: 'ollama_chat',
  description:
    'Chat with a model using conversation messages. Supports system messages, multi-turn conversations, tool calling, and generation options.',
  inputSchema: {
    type: 'object',
    properties: {
      model: {
        type: 'string',
        description: 'Name of the model to use',
      },
      messages: {
        type: 'array',
        description: 'Array of chat messages',
        items: {
          type: 'object',
          properties: {
            role: {
              type: 'string',
              enum: ['system', 'user', 'assistant'],
            },
            content: {
              type: 'string',
            },
            images: {
              type: 'array',
              items: { type: 'string' },
            },
          },
          required: ['role', 'content'],
        },
      },
      tools: {
        type: 'string',
        description: 'Tools that the model can call (optional). Provide as JSON array of tool objects.',
      },
      options: {
        type: 'string',
        description: 'Generation options (optional). Provide as JSON object with settings like temperature, top_p, etc.',
      },
      format: {
        type: 'string',
        enum: ['json', 'markdown'],
        default: 'json',
      },
    },
    required: ['model', 'messages'],
  },
  handler: async (ollama: Ollama, args: Record<string, unknown>, format: ResponseFormat) => {
    const validated = ChatInputSchema.parse(args);
    return chatWithModel(
      ollama,
      validated.model,
      validated.messages,
      validated.options || {},
      format,
      validated.tools.length > 0 ? validated.tools : undefined
    );
  },
};

```
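
A sketch of calling `chatWithModel` with a tools array, showing when the returned string embeds `tool_calls`; the model name and the weather tool are illustrative, and imports assume the snippet sits in `src/tools/`.

```typescript
import { Ollama } from 'ollama';
import type { Tool } from '../types.js';
import { ResponseFormat } from '../types.js';
import { chatWithModel } from './chat.js';

const ollama = new Ollama(); // assumes the default local host

// Illustrative function-calling tool matching the Tool interface.
const tools: Tool[] = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get the current weather for a city',
      parameters: {
        type: 'object',
        required: ['city'],
        properties: { city: { type: 'string', description: 'City name' } },
      },
    },
  },
];

const reply = await chatWithModel(
  ollama,
  'llama3.2:latest', // illustrative model
  [{ role: 'user', content: 'What is the weather in Paris?' }],
  { temperature: 0.2 },
  ResponseFormat.JSON,
  tools
);

// If the model chose to call get_weather, `reply` is a JSON string containing
// both `content` and `tool_calls`; otherwise it is just the formatted content.
console.log(reply);
```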

--------------------------------------------------------------------------------
/tests/integration/server.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, beforeAll, afterAll, vi } from 'vitest';
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { InMemoryTransport } from '@modelcontextprotocol/sdk/inMemory.js';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { Ollama } from 'ollama';

// Mock the Ollama SDK
vi.mock('ollama', () => {
  return {
    Ollama: vi.fn().mockImplementation(() => ({
      list: vi.fn().mockResolvedValue({
        models: [
          {
            name: 'llama2:latest',
            size: 3825819519,
            digest: 'abc123',
            modified_at: '2024-01-01T00:00:00Z',
          },
        ],
      }),
      ps: vi.fn().mockResolvedValue({
        models: [
          {
            name: 'llama2:latest',
            size: 3825819519,
            size_vram: 3825819519,
          },
        ],
      }),
    })),
  };
});

describe('MCP Server Integration', () => {
  let server: Server;
  let client: Client;
  let serverTransport: InMemoryTransport;
  let clientTransport: InMemoryTransport;

  beforeAll(async () => {
    // Create transport pair
    [serverTransport, clientTransport] = InMemoryTransport.createLinkedPair();

    // Import and create server
    const { createServer } = await import('../../src/server.js');

    // Create a mock Ollama instance
    const mockOllama = new Ollama({ host: 'http://localhost:11434' });
    server = createServer(mockOllama);

    // Create client
    client = new Client(
      {
        name: 'test-client',
        version: '1.0.0',
      },
      {
        capabilities: {},
      }
    );

    // Connect both
    await Promise.all([
      server.connect(serverTransport),
      client.connect(clientTransport),
    ]);
  });

  afterAll(async () => {
    await server.close();
    await client.close();
  });

  it('should list available tools', async () => {
    const response = await client.listTools();

    expect(response.tools).toBeDefined();
    expect(Array.isArray(response.tools)).toBe(true);
    expect(response.tools.length).toBeGreaterThan(0);

    // Check that tools have expected properties
    const tool = response.tools[0];
    expect(tool).toHaveProperty('name');
    expect(tool).toHaveProperty('description');
    expect(tool).toHaveProperty('inputSchema');
  });

  it('should call ollama_list tool', async () => {
    const response = await client.callTool({
      name: 'ollama_list',
      arguments: {
        format: 'json',
      },
    });

    expect(response.content).toBeDefined();
    expect(Array.isArray(response.content)).toBe(true);
    expect(response.content.length).toBeGreaterThan(0);
    expect(response.content[0].type).toBe('text');
  });

  it('should call ollama_ps tool', async () => {
    const response = await client.callTool({
      name: 'ollama_ps',
      arguments: {
        format: 'json',
      },
    });

    expect(response.content).toBeDefined();
    expect(Array.isArray(response.content)).toBe(true);
    expect(response.content.length).toBeGreaterThan(0);
    expect(response.content[0].type).toBe('text');
  });

  it('should return error for unknown tool', async () => {
    const response = await client.callTool({
      name: 'ollama_unknown',
      arguments: {},
    });

    expect(response.isError).toBe(true);
    expect(response.content[0].text).toContain('Unknown tool');
  });
});

```

--------------------------------------------------------------------------------
/src/utils/response-formatter.ts:
--------------------------------------------------------------------------------

```typescript
import { markdownTable } from 'markdown-table';
import { ResponseFormat } from '../types.js';

/**
 * Format response content based on the specified format
 */
export function formatResponse(
  content: string,
  format: ResponseFormat
): string {
  if (format === ResponseFormat.JSON) {
    // For JSON format, validate and potentially wrap errors
    try {
      // Try to parse to validate it's valid JSON
      JSON.parse(content);
      return content;
    } catch {
      // If not valid JSON, wrap in error object
      return JSON.stringify({
        error: 'Invalid JSON content',
        raw_content: content,
      });
    }
  }

  // Format as markdown
  try {
    const data = JSON.parse(content);
    return jsonToMarkdown(data);
  } catch {
    // If not valid JSON, return as-is
    return content;
  }
}

/**
 * Convert JSON data to markdown format
 */
function jsonToMarkdown(data: any, indent: string = ''): string {
  // Handle null/undefined
  if (data === null || data === undefined) {
    return `${indent}_null_`;
  }

  // Handle primitives
  if (typeof data !== 'object') {
    return `${indent}${String(data)}`;
  }

  // Handle arrays
  if (Array.isArray(data)) {
    if (data.length === 0) {
      return `${indent}_empty array_`;
    }

    // Check if array of objects with consistent keys (table format)
    if (
      data.length > 0 &&
      typeof data[0] === 'object' &&
      !Array.isArray(data[0]) &&
      data[0] !== null
    ) {
      return arrayToMarkdownTable(data, indent);
    }

    // Array of primitives or mixed types
    return data
      .map((item) => `${indent}- ${jsonToMarkdown(item, '')}`)
      .join('\n');
  }

  // Handle objects
  const entries = Object.entries(data);
  if (entries.length === 0) {
    return `${indent}_empty object_`;
  }

  return entries
    .map(([key, value]) => {
      const formattedKey = key.replace(/_/g, ' ');
      if (typeof value === 'object' && value !== null) {
        return `${indent}**${formattedKey}:**\n${jsonToMarkdown(value, indent + '  ')}`;
      }
      return `${indent}**${formattedKey}:** ${value}`;
    })
    .join('\n');
}

/**
 * Convert array of objects to markdown table using markdown-table library
 */
function arrayToMarkdownTable(data: any[], indent: string = ''): string {
  if (data.length === 0) return `${indent}_empty_`;

  // Get all unique keys from all objects
  const allKeys = new Set<string>();
  data.forEach((item) => {
    if (item && typeof item === 'object') {
      Object.keys(item).forEach((key) => allKeys.add(key));
    }
  });
  const keys = Array.from(allKeys);

  // Format headers (replace underscores with spaces)
  const headers = keys.map((k) => k.replace(/_/g, ' '));

  // Build table data
  const tableData = data.map((item) => {
    return keys.map((key) => {
      const value = item[key];
      if (value === null || value === undefined) return '';
      if (typeof value === 'object') return JSON.stringify(value);
      return String(value);
    });
  });

  // Generate markdown table
  const table = markdownTable([headers, ...tableData]);

  // Add indent to each line if needed
  if (indent) {
    return table
      .split('\n')
      .map((line) => indent + line)
      .join('\n');
  }

  return table;
}

```
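
A sketch of what the two formats produce for the same payload; the model names and sizes are illustrative, the markdown shown in the comment is approximate since `markdown-table` pads columns, and imports assume the snippet sits in `src/utils/`.

```typescript
import { ResponseFormat } from '../types.js';
import { formatResponse } from './response-formatter.js';

// Illustrative payload in the shape ollama.list() returns.
const payload = JSON.stringify({
  models: [
    { name: 'llama3.2:latest', size: 2019393189 },
    { name: 'qwen2.5:7b', size: 4683087332 },
  ],
});

// JSON format: validated and returned unchanged.
console.log(formatResponse(payload, ResponseFormat.JSON));

// Markdown format: the object key becomes a bold label and the array of
// objects becomes a table, roughly:
//
//   **models:**
//     | name            | size       |
//     | --------------- | ---------- |
//     | llama3.2:latest | 2019393189 |
//     | qwen2.5:7b      | 4683087332 |
console.log(formatResponse(payload, ResponseFormat.MARKDOWN));
```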

--------------------------------------------------------------------------------
/tests/tools/chat.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { Ollama } from 'ollama';
import { chatWithModel, toolDefinition } from '../../src/tools/chat.js';
import { ResponseFormat } from '../../src/types.js';

describe('chatWithModel', () => {
  let ollama: Ollama;
  let mockChat: ReturnType<typeof vi.fn>;

  beforeEach(() => {
    mockChat = vi.fn();
    ollama = {
      chat: mockChat,
    } as any;
  });

  it('should handle basic chat with single user message', async () => {
    mockChat.mockResolvedValue({
      message: {
        role: 'assistant',
        content: 'Hello! How can I help you today?',
      },
      done: true,
    });

    const messages = [{ role: 'user' as const, content: 'Hello' }];
    const result = await chatWithModel(
      ollama,
      'llama3.2:latest',
      messages,
      {},
      ResponseFormat.MARKDOWN
    );

    expect(typeof result).toBe('string');
    expect(mockChat).toHaveBeenCalledTimes(1);
    expect(mockChat).toHaveBeenCalledWith({
      model: 'llama3.2:latest',
      messages,
      options: {},
      stream: false,
    });
    expect(result).toContain('Hello! How can I help you today?');
  });

  it('should handle chat with system message and options', async () => {
    mockChat.mockResolvedValue({
      message: {
        role: 'assistant',
        content: 'I will be helpful and concise.',
      },
      done: true,
    });

    const messages = [
      { role: 'system' as const, content: 'Be helpful' },
      { role: 'user' as const, content: 'Hello' },
    ];
    const options = { temperature: 0.7, top_p: 0.9 };

    const result = await chatWithModel(
      ollama,
      'llama3.2:latest',
      messages,
      options,
      ResponseFormat.MARKDOWN
    );

    expect(typeof result).toBe('string');
    expect(mockChat).toHaveBeenCalledWith({
      model: 'llama3.2:latest',
      messages,
      options,
      stream: false,
    });
  });

  it('should use JSON format when ResponseFormat.JSON is specified', async () => {
    mockChat.mockResolvedValue({
      message: {
        role: 'assistant',
        content: '{"response": "test"}',
      },
      done: true,
    });

    const messages = [{ role: 'user' as const, content: 'Hello' }];
    const result = await chatWithModel(
      ollama,
      'llama3.2:latest',
      messages,
      {},
      ResponseFormat.JSON
    );

    expect(mockChat).toHaveBeenCalledWith({
      model: 'llama3.2:latest',
      messages,
      options: {},
      format: 'json',
      stream: false,
    });
  });

  it('should handle empty content with fallback', async () => {
    mockChat.mockResolvedValue({
      message: {
        role: 'assistant',
        content: '',
      },
      done: true,
    });

    const messages = [{ role: 'user' as const, content: 'Hello' }];
    const result = await chatWithModel(
      ollama,
      'llama3.2:latest',
      messages,
      {},
      ResponseFormat.MARKDOWN
    );

    expect(typeof result).toBe('string');
  });

  it('should handle tool_calls when present', async () => {
    mockChat.mockResolvedValue({
      message: {
        role: 'assistant',
        content: 'Checking weather',
        tool_calls: [{ function: { name: 'get_weather' } }],
      },
      done: true,
    });

    const messages = [{ role: 'user' as const, content: 'Weather?' }];
    const result = await chatWithModel(
      ollama,
      'llama3.2:latest',
      messages,
      {},
      ResponseFormat.JSON
    );

    expect(result).toContain('tool_calls');
  });

  it('should work through toolDefinition handler', async () => {
    mockChat.mockResolvedValue({ message: { content: "test" }, done: true });
    const result = await toolDefinition.handler(
      ollama,
      { model: 'llama3.2:latest', messages: [{ role: 'user', content: 'test' }], format: 'json' },
      ResponseFormat.JSON
    );

    expect(typeof result).toBe('string');
  });

});
```

--------------------------------------------------------------------------------
/src/schemas.ts:
--------------------------------------------------------------------------------

```typescript
/**
 * Zod schemas for MCP tool input validation
 */

import { z } from 'zod';

/**
 * Response format enum schema
 */
export const ResponseFormatSchema = z.enum(['markdown', 'json']);

/**
 * Generation options schema
 */
export const GenerationOptionsSchema = z
  .object({
    temperature: z.number().min(0).max(2).optional(),
    top_p: z.number().min(0).max(1).optional(),
    top_k: z.number().min(0).optional(),
    num_predict: z.number().int().positive().optional(),
    repeat_penalty: z.number().min(0).optional(),
    seed: z.number().int().optional(),
    stop: z.array(z.string()).optional(),
  })
  .optional();

/**
 * Tool schema for function calling
 */
export const ToolSchema = z.object({
  type: z.string(),
  function: z.object({
    name: z.string().optional(),
    description: z.string().optional(),
    parameters: z
      .object({
        type: z.string().optional(),
        required: z.array(z.string()).optional(),
        properties: z.record(z.any()).optional(),
      })
      .optional(),
  }),
});

/**
 * Chat message schema
 */
export const ChatMessageSchema = z.object({
  role: z.enum(['system', 'user', 'assistant']),
  content: z.string(),
  images: z.array(z.string()).optional(),
});

/**
 * Schema for ollama_list tool
 */
export const ListModelsInputSchema = z.object({
  format: ResponseFormatSchema.default('json'),
});

/**
 * Schema for ollama_show tool
 */
export const ShowModelInputSchema = z.object({
  model: z.string().min(1),
  format: ResponseFormatSchema.default('json'),
});

/**
 * Helper to parse JSON string or return default value
 */
const parseJsonOrDefault = <T>(defaultValue: T) =>
  z.string().optional().transform((val, ctx) => {
    if (!val || val.trim() === '') {
      return defaultValue;
    }
    try {
      return JSON.parse(val) as T;
    } catch (e) {
      ctx.addIssue({
        code: z.ZodIssueCode.custom,
        message: 'Invalid JSON format',
      });
      return z.NEVER;
    }
  });

/**
 * Schema for ollama_chat tool
 */
export const ChatInputSchema = z.object({
  model: z.string().min(1),
  messages: z.array(ChatMessageSchema).min(1),
  tools: parseJsonOrDefault([]).pipe(z.array(ToolSchema)),
  options: parseJsonOrDefault({}).pipe(GenerationOptionsSchema),
  format: ResponseFormatSchema.default('json'),
  stream: z.boolean().default(false),
});

/**
 * Schema for ollama_generate tool
 */
export const GenerateInputSchema = z.object({
  model: z.string().min(1),
  prompt: z.string(),
  options: parseJsonOrDefault({}).pipe(GenerationOptionsSchema),
  format: ResponseFormatSchema.default('json'),
  stream: z.boolean().default(false),
});

/**
 * Schema for ollama_embed tool
 */
export const EmbedInputSchema = z.object({
  model: z.string().min(1),
  input: z.string().transform((val, ctx) => {
    const trimmed = val.trim();
    // If it looks like a JSON array, try to parse it
    if (trimmed.startsWith('[') && trimmed.endsWith(']')) {
      try {
        const parsed = JSON.parse(trimmed);
        if (Array.isArray(parsed)) {
          // Validate all elements are strings
          const allStrings = parsed.every((item) => typeof item === 'string');
          if (allStrings) {
            return parsed as string[];
          } else {
            ctx.addIssue({
              code: z.ZodIssueCode.custom,
              message:
                'Input is a JSON array but contains non-string elements',
            });
            return z.NEVER;
          }
        }
      } catch (e) {
        // Failed to parse as JSON, treat as plain string
      }
    }
    // Return as plain string
    return trimmed;
  }),
  format: ResponseFormatSchema.default('json'),
});

/**
 * Schema for ollama_pull tool
 */
export const PullModelInputSchema = z.object({
  model: z.string().min(1),
  insecure: z.boolean().default(false),
  format: ResponseFormatSchema.default('json'),
});

/**
 * Schema for ollama_push tool
 */
export const PushModelInputSchema = z.object({
  model: z.string().min(1),
  insecure: z.boolean().default(false),
  format: ResponseFormatSchema.default('json'),
});

/**
 * Schema for ollama_create tool
 */
export const CreateModelInputSchema = z.object({
  model: z.string().min(1),
  from: z.string().min(1),
  system: z.string().optional(),
  template: z.string().optional(),
  license: z.string().optional(),
  format: ResponseFormatSchema.default('json'),
});

/**
 * Schema for ollama_delete tool
 */
export const DeleteModelInputSchema = z.object({
  model: z.string().min(1),
  format: ResponseFormatSchema.default('json'),
});

/**
 * Schema for ollama_copy tool
 */
export const CopyModelInputSchema = z.object({
  source: z.string().min(1),
  destination: z.string().min(1),
  format: ResponseFormatSchema.default('json'),
});

/**
 * Schema for ollama_ps tool (list running models)
 */
export const PsInputSchema = z.object({
  format: ResponseFormatSchema.default('json'),
});

/**
 * Schema for ollama_abort tool
 */
export const AbortRequestInputSchema = z.object({
  model: z.string().min(1),
});

/**
 * Schema for ollama_web_search tool
 */
export const WebSearchInputSchema = z.object({
  query: z.string().min(1),
  max_results: z.number().int().min(1).max(10).default(5),
  format: ResponseFormatSchema.default('json'),
});

/**
 * Schema for ollama_web_fetch tool
 */
export const WebFetchInputSchema = z.object({
  url: z.string().url().min(1),
  format: ResponseFormatSchema.default('json'),
});

```
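
A sketch of how these schemas coerce MCP-client input: `tools` and `options` arrive as JSON strings and are parsed by `parseJsonOrDefault`, while the embed `input` accepts either a plain string or a JSON array of strings. Model names are illustrative, and the import assumes the snippet lives in `src/`.

```typescript
import { ChatInputSchema, EmbedInputSchema } from './schemas.js';

// tools and options arrive from the MCP client as JSON *strings* and are
// coerced into real values before validation.
const chatInput = ChatInputSchema.parse({
  model: 'llama3.2:latest', // illustrative
  messages: [{ role: 'user', content: 'Hello' }],
  tools: '[{"type":"function","function":{"name":"get_weather"}}]',
  options: '{"temperature":0.7}',
});

console.log(Array.isArray(chatInput.tools));     // true
console.log(chatInput.options?.temperature);     // 0.7
console.log(chatInput.format, chatInput.stream); // 'json' false (schema defaults)

// The embed input accepts either a plain string or a JSON array of strings.
const single = EmbedInputSchema.parse({ model: 'all-minilm', input: 'hello world' });
const batch = EmbedInputSchema.parse({ model: 'all-minilm', input: '["a", "b", "c"]' });

console.log(typeof single.input);        // 'string'
console.log(Array.isArray(batch.input)); // true
```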

--------------------------------------------------------------------------------
/tests/tools/web-fetch.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
import { webFetch } from '../../src/tools/web-fetch.js';
import { ResponseFormat } from '../../src/types.js';
import type { Ollama } from 'ollama';
import { HttpError } from '../../src/utils/http-error.js';

// Mock fetch globally
global.fetch = vi.fn();

describe('webFetch', () => {
  const mockOllama = {} as Ollama;
  const testUrl = 'https://example.com';

  beforeEach(() => {
    vi.clearAllMocks();
    process.env.OLLAMA_API_KEY = 'test-api-key';
  });

  afterEach(() => {
    delete process.env.OLLAMA_API_KEY;
  });

  it('should throw error if OLLAMA_API_KEY is not set', async () => {
    // Arrange
    delete process.env.OLLAMA_API_KEY;

    // Act & Assert
    await expect(webFetch(mockOllama, testUrl, ResponseFormat.JSON))
      .rejects.toThrow('OLLAMA_API_KEY environment variable is required');
  });

  it('should successfully fetch web page', async () => {
    // Arrange
    const mockResponse = {
      title: 'Test Page',
      content: 'Test content',
      links: [],
    };

    (global.fetch as any).mockResolvedValueOnce({
      ok: true,
      json: async () => mockResponse,
    });

    // Act
    const result = await webFetch(mockOllama, testUrl, ResponseFormat.JSON);

    // Assert
    expect(global.fetch).toHaveBeenCalledWith(
      'https://ollama.com/api/web_fetch',
      {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          Authorization: 'Bearer test-api-key',
        },
        body: JSON.stringify({ url: testUrl }),
        signal: expect.any(AbortSignal),
      }
    );
    expect(result).toContain('Test Page');
  });

  it('should successfully complete on first attempt (no retry needed)', async () => {
    // Arrange
    const mockResponse = {
      title: 'Test Page',
      content: 'Test content',
    };

    (global.fetch as any).mockResolvedValueOnce({
      ok: true,
      json: async () => mockResponse,
    });

    // Act
    const result = await webFetch(mockOllama, testUrl, ResponseFormat.JSON);

    // Assert
    expect(result).toContain('Test Page');
    expect(global.fetch).toHaveBeenCalledTimes(1);
  });

  it('should retry on 429 rate limit error and eventually succeed', async () => {
    // Arrange
    const mockResponse = {
      title: 'Success after retry',
      content: 'content',
    };

    // Mock setTimeout to execute immediately (avoid real delays in tests)
    vi.spyOn(global, 'setTimeout').mockImplementation(((callback: any) => {
      Promise.resolve().then(() => callback());
      return 0 as any;
    }) as any);

    // First call returns 429, second call succeeds
    (global.fetch as any)
      .mockResolvedValueOnce({
        ok: false,
        status: 429,
        statusText: 'Too Many Requests',
        headers: {
          get: vi.fn().mockReturnValue(null),
        },
      })
      .mockResolvedValueOnce({
        ok: true,
        json: async () => mockResponse,
      });

    // Act
    const result = await webFetch(mockOllama, testUrl, ResponseFormat.JSON);

    // Assert
    expect(result).toContain('Success after retry');
    expect(global.fetch).toHaveBeenCalledTimes(2);
  });

  it('should throw error on non-retryable HTTP errors', async () => {
    // Arrange - 501 Not Implemented is not retried
    (global.fetch as any).mockResolvedValueOnce({
      ok: false,
      status: 501,
      statusText: 'Not Implemented',
      headers: {
        get: vi.fn().mockReturnValue(null),
      },
    });

    // Act & Assert
    await expect(webFetch(mockOllama, testUrl, ResponseFormat.JSON))
      .rejects.toThrow('Web fetch failed: 501 Not Implemented');
  });

  it('should throw error on network timeout (no status code)', async () => {
    // Arrange
    const networkError = new Error('Network timeout - no response from server');
    (global.fetch as any).mockRejectedValueOnce(networkError);

    // Act & Assert
    await expect(webFetch(mockOllama, testUrl, ResponseFormat.JSON))
      .rejects.toThrow('Network timeout - no response from server');

    // Should not retry network errors
    expect(global.fetch).toHaveBeenCalledTimes(1);
  });

  it('should throw error when response.json() fails (malformed JSON)', async () => {
    // Arrange
    (global.fetch as any).mockResolvedValueOnce({
      ok: true,
      json: async () => {
        throw new Error('Unexpected token < in JSON at position 0');
      },
    });

    // Act & Assert
    await expect(webFetch(mockOllama, testUrl, ResponseFormat.JSON))
      .rejects.toThrow('Unexpected token < in JSON at position 0');
  });

  it('should handle fetch abort/cancel errors', async () => {
    // Arrange
    const abortError = new Error('The operation was aborted');
    abortError.name = 'AbortError';
    (global.fetch as any).mockRejectedValueOnce(abortError);

    // Act & Assert
    // Note: fetchWithTimeout transforms AbortError to timeout message
    await expect(webFetch(mockOllama, testUrl, ResponseFormat.JSON))
      .rejects.toThrow('Request to https://ollama.com/api/web_fetch timed out after 30000ms');

    // Should not retry abort errors
    expect(global.fetch).toHaveBeenCalledTimes(1);
  });

  it('should eventually fail after multiple 429 retries', async () => {
    // Arrange
    vi.spyOn(global, 'setTimeout').mockImplementation(((callback: any) => {
      Promise.resolve().then(() => callback());
      return 0 as any;
    }) as any);

    // Always return 429 (will exhaust retries)
    (global.fetch as any).mockResolvedValue({
      ok: false,
      status: 429,
      statusText: 'Too Many Requests',
      headers: {
        get: vi.fn().mockReturnValue(null),
      },
    });

    // Act & Assert
    await expect(webFetch(mockOllama, testUrl, ResponseFormat.JSON))
      .rejects.toThrow('Web fetch failed: 429 Too Many Requests');

    // Should attempt initial + 3 retries = 4 total
    expect(global.fetch).toHaveBeenCalledTimes(4);
  });
});

```

--------------------------------------------------------------------------------
/src/utils/retry.ts:
--------------------------------------------------------------------------------

```typescript
import { HttpError } from './http-error.js';

/**
 * Options for retry behavior
 */
export interface RetryOptions {
  /** Number of retry attempts after the initial call (default: 3) */
  maxRetries?: number;
  /** Initial delay in milliseconds before first retry (default: 1000ms) */
  initialDelay?: number;
  /** Maximum delay in milliseconds to cap exponential backoff (default: 10000ms) */
  maxDelay?: number;
  /** Request timeout in milliseconds (default: 30000ms) */
  timeout?: number;
}

/**
 * Sleep for a specified duration
 */
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

/**
 * Check if an error is retryable based on HTTP status code
 *
 * Retryable errors include:
 * - 429 (Too Many Requests) - Rate limiting
 * - 500 (Internal Server Error) - Transient server issues
 * - 502 (Bad Gateway) - Gateway/proxy received invalid response
 * - 503 (Service Unavailable) - Server temporarily unable to handle request
 * - 504 (Gateway Timeout) - Gateway/proxy did not receive timely response
 *
 * These errors are typically transient and safe to retry for idempotent operations.
 * Other 5xx errors (501, 505, 506, 508, etc.) indicate permanent configuration
 * or implementation issues and should not be retried.
 */
function isRetryableError(error: unknown): boolean {
  if (!(error instanceof HttpError)) {
    return false;
  }

  const retryableStatuses = [429, 500, 502, 503, 504];
  return retryableStatuses.includes(error.status);
}

/**
 * Fetch with timeout support using AbortController
 *
 * Note: Creates an internal AbortController for timeout management.
 * External cancellation via options.signal is not supported - any signal
 * passed in options will be overridden by the internal timeout signal.
 *
 * @param url - URL to fetch
 * @param options - Fetch options (signal will be overridden)
 * @param timeout - Timeout in milliseconds (default: 30000ms)
 * @returns Fetch response
 * @throws Error if request times out
 */
export async function fetchWithTimeout(
  url: string,
  options?: RequestInit,
  timeout: number = 30000
): Promise<Response> {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeout);

  try {
    const response = await fetch(url, {
      ...options,
      signal: controller.signal,
    });
    return response;
  } catch (error: unknown) {
    if (error instanceof Error && error.name === 'AbortError') {
      throw new Error(`Request to ${url} timed out after ${timeout}ms`);
    }
    throw error;
  } finally {
    // Always clear timeout to prevent memory leaks and race conditions.
    // If fetch completes exactly at timeout boundary, clearTimeout ensures
    // the timeout callback doesn't execute after we've already returned.
    clearTimeout(timeoutId);
  }
}

/**
 * Parse Retry-After header value to milliseconds
 * Supports both delay-seconds and HTTP-date formats
 * @param retryAfter - Retry-After header value
 * @returns Delay in milliseconds, or null if invalid
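 *
 * @example
 * // Illustrative values for the two supported formats:
 * parseRetryAfter('5');                                       // 5000 (delay-seconds)
 * parseRetryAfter(new Date(Date.now() + 3000).toUTCString()); // milliseconds until that date
 * parseRetryAfter('invalid-value');                           // null -> caller falls back to backoff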
 */
function parseRetryAfter(retryAfter: string | undefined): number | null {
  if (!retryAfter) {
    return null;
  }

  // Try parsing as seconds (integer)
  const seconds = parseInt(retryAfter, 10);
  if (!isNaN(seconds) && seconds >= 0) {
    return seconds * 1000;
  }

  // Try parsing as HTTP-date
  const date = new Date(retryAfter);
  if (!isNaN(date.getTime())) {
    const delay = date.getTime() - Date.now();
    // Clamp past or present dates to zero (retry immediately)

    return delay > 0 ? delay : 0;
  }

  return null;
}

/**
 * Retry a function with exponential backoff on rate limit errors
 *
 * Uses exponential backoff with full jitter to prevent thundering herd:
 * - Attempt 0: 0-1 seconds (random in range [0, 1s])
 * - Attempt 1: 0-2 seconds (random in range [0, 2s])
 * - Attempt 2: 0-4 seconds (random in range [0, 4s])
 * - And so on...
 *
 * Retry attempts are logged to console.debug for debugging and telemetry purposes,
 * including attempt number, delay, and error message.
 *
 * @param fn - The function to retry
 * @param options - Retry options (maxRetries: number of retry attempts after initial call)
 * @returns The result of the function
 * @throws The last error if max retries exceeded or non-retryable error
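 *
 * @example
 * // Illustrative: `loadModels` is a hypothetical async call that throws HttpError on failure.
 * const models = await retryWithBackoff(() => loadModels(), {
 *   maxRetries: 3,
 *   initialDelay: 1000,
 *   maxDelay: 10000,
 * });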
 */
export async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  options: RetryOptions = {}
): Promise<T> {
  const { maxRetries = 3, initialDelay = 1000, maxDelay = 10000 } = options;

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Only retry on transient errors (429, 500, 502, 503, 504)
      // Throw immediately for any other error type
      if (!isRetryableError(error)) {
        throw error;
      }

      // Throw if we've exhausted all retry attempts
      if (attempt === maxRetries) {
        throw error;
      }

      // Check if error has Retry-After header
      let delay: number;
      const retryAfterDelay = error instanceof HttpError ? parseRetryAfter(error.retryAfter) : null;

      if (retryAfterDelay !== null) {
        // Use Retry-After header value, capped at maxDelay
        delay = Math.min(retryAfterDelay, maxDelay);
      } else {
        // Calculate delay with exponential backoff and full jitter, capped at maxDelay
        const exponentialDelay = Math.min(
          initialDelay * Math.pow(2, attempt),
          maxDelay
        );
        // Full jitter: random value between 0 and exponentialDelay
        delay = Math.random() * exponentialDelay;
      }

      // Log retry attempt for debugging/telemetry
      console.debug(
        `Retry attempt ${attempt + 1}/${maxRetries} after ${Math.round(delay)}ms delay`,
        {
          delay: Math.round(delay),
          error: error instanceof Error ? error.message : String(error)
        }
      );

      await sleep(delay);
    }
  }

  // This line is unreachable because every loop iteration either returns or throws,
  // but TypeScript's control-flow analysis cannot prove it, so an ending throw is required
  throw new Error('Unexpected: retry loop completed without return or throw');
}

```
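
The web tools compose these helpers: each request goes through `fetchWithTimeout`, a non-OK response is converted into an `HttpError` that carries the `Retry-After` header, and the whole call is wrapped in `retryWithBackoff` so only transient statuses are retried. A minimal sketch of that pattern, assuming the `web_search` endpoint and error-message format exercised by the tests below (the `searchOnce` name and import paths are illustrative):

```typescript
import { retryWithBackoff, fetchWithTimeout } from '../utils/retry.js';
import { HttpError } from '../utils/http-error.js';

// Sketch only: mirrors the fetch-with-retry pattern used by the web tools.
async function searchOnce(query: string, maxResults: number): Promise<unknown> {
  const apiKey = process.env.OLLAMA_API_KEY;
  if (!apiKey) {
    throw new Error('OLLAMA_API_KEY environment variable is required');
  }

  return retryWithBackoff(async () => {
    const response = await fetchWithTimeout('https://ollama.com/api/web_search', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ query, max_results: 5 }),
    });

    if (!response.ok) {
      // Carrying Retry-After lets retryWithBackoff honour the server's requested delay.
      throw new HttpError(
        `Web search failed: ${response.status} ${response.statusText}`,
        response.status,
        response.headers.get('Retry-After') ?? undefined
      );
    }

    return response.json();
  });
}
```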

--------------------------------------------------------------------------------
/tests/tools/web-search.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
import { webSearch } from '../../src/tools/web-search.js';
import { ResponseFormat } from '../../src/types.js';
import type { Ollama } from 'ollama';
import { HttpError } from '../../src/utils/http-error.js';

// Mock fetch globally
global.fetch = vi.fn();

describe('webSearch', () => {
  const mockOllama = {} as Ollama;
  const testQuery = 'test search query';
  const maxResults = 5;

  beforeEach(() => {
    vi.clearAllMocks();
    process.env.OLLAMA_API_KEY = 'test-api-key';
  });

  afterEach(() => {
    delete process.env.OLLAMA_API_KEY;
  });

  it('should throw error if OLLAMA_API_KEY is not set', async () => {
    // Arrange
    delete process.env.OLLAMA_API_KEY;

    // Act & Assert
    await expect(webSearch(mockOllama, testQuery, maxResults, ResponseFormat.JSON))
      .rejects.toThrow('OLLAMA_API_KEY environment variable is required');
  });

  it('should successfully perform web search', async () => {
    // Arrange
    const mockResponse = {
      results: [
        { title: 'Result 1', url: 'https://example.com/1', snippet: 'Test snippet 1' },
        { title: 'Result 2', url: 'https://example.com/2', snippet: 'Test snippet 2' },
      ],
    };

    (global.fetch as any).mockResolvedValueOnce({
      ok: true,
      json: async () => mockResponse,
    });

    // Act
    const result = await webSearch(mockOllama, testQuery, maxResults, ResponseFormat.JSON);

    // Assert
    expect(global.fetch).toHaveBeenCalledWith(
      'https://ollama.com/api/web_search',
      {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          Authorization: 'Bearer test-api-key',
        },
        body: JSON.stringify({ query: testQuery, max_results: maxResults }),
        signal: expect.any(AbortSignal),
      }
    );
    expect(result).toContain('Result 1');
    expect(result).toContain('Result 2');
  });

  it('should successfully complete on first attempt (no retry needed)', async () => {
    // Arrange
    const mockResponse = {
      results: [
        { title: 'Result 1', url: 'https://example.com/1', snippet: 'Test' },
      ],
    };

    (global.fetch as any).mockResolvedValueOnce({
      ok: true,
      json: async () => mockResponse,
    });

    // Act
    const result = await webSearch(mockOllama, testQuery, maxResults, ResponseFormat.JSON);

    // Assert
    expect(result).toContain('Result 1');
    expect(global.fetch).toHaveBeenCalledTimes(1);
  });

  it('should retry on 429 rate limit error and eventually succeed', async () => {
    // Arrange
    const mockResponse = {
      results: [
        { title: 'Success after retry', url: 'https://example.com', snippet: 'Test' },
      ],
    };

    // Mock setTimeout to execute immediately (avoid real delays in tests)
    vi.spyOn(global, 'setTimeout').mockImplementation(((callback: any) => {
      Promise.resolve().then(() => callback());
      return 0 as any;
    }) as any);

    // First call returns 429, second call succeeds
    (global.fetch as any)
      .mockResolvedValueOnce({
        ok: false,
        status: 429,
        statusText: 'Too Many Requests',
        headers: {
          get: vi.fn().mockReturnValue(null),
        },
      })
      .mockResolvedValueOnce({
        ok: true,
        json: async () => mockResponse,
      });

    // Act
    const result = await webSearch(mockOllama, testQuery, maxResults, ResponseFormat.JSON);

    // Assert
    expect(result).toContain('Success after retry');
    expect(global.fetch).toHaveBeenCalledTimes(2);
  });

  it('should throw error on non-retryable HTTP errors', async () => {
    // Arrange - 501 Not Implemented is not retried
    (global.fetch as any).mockResolvedValueOnce({
      ok: false,
      status: 501,
      statusText: 'Not Implemented',
      headers: {
        get: vi.fn().mockReturnValue(null),
      },
    });

    // Act & Assert
    await expect(webSearch(mockOllama, testQuery, maxResults, ResponseFormat.JSON))
      .rejects.toThrow('Web search failed: 501 Not Implemented');
  });

  it('should throw error on network timeout (no status code)', async () => {
    // Arrange
    const networkError = new Error('Network timeout - no response from server');
    (global.fetch as any).mockRejectedValueOnce(networkError);

    // Act & Assert
    await expect(webSearch(mockOllama, testQuery, maxResults, ResponseFormat.JSON))
      .rejects.toThrow('Network timeout - no response from server');

    // Should not retry network errors
    expect(global.fetch).toHaveBeenCalledTimes(1);
  });

  it('should throw error when response.json() fails (malformed JSON)', async () => {
    // Arrange
    (global.fetch as any).mockResolvedValueOnce({
      ok: true,
      json: async () => {
        throw new Error('Unexpected token < in JSON at position 0');
      },
    });

    // Act & Assert
    await expect(webSearch(mockOllama, testQuery, maxResults, ResponseFormat.JSON))
      .rejects.toThrow('Unexpected token < in JSON at position 0');
  });

  it('should handle fetch abort/cancel errors', async () => {
    // Arrange
    const abortError = new Error('The operation was aborted');
    abortError.name = 'AbortError';
    (global.fetch as any).mockRejectedValueOnce(abortError);

    // Act & Assert
    // Note: fetchWithTimeout transforms AbortError to timeout message
    await expect(webSearch(mockOllama, testQuery, maxResults, ResponseFormat.JSON))
      .rejects.toThrow('Request to https://ollama.com/api/web_search timed out after 30000ms');

    // Should not retry abort errors
    expect(global.fetch).toHaveBeenCalledTimes(1);
  });

  it('should eventually fail after multiple 429 retries', async () => {
    // Arrange
    vi.spyOn(global, 'setTimeout').mockImplementation(((callback: any) => {
      Promise.resolve().then(() => callback());
      return 0 as any;
    }) as any);

    // Always return 429 (will exhaust retries)
    (global.fetch as any).mockResolvedValue({
      ok: false,
      status: 429,
      statusText: 'Too Many Requests',
      headers: {
        get: vi.fn().mockReturnValue(null),
      },
    });

    // Act & Assert
    await expect(webSearch(mockOllama, testQuery, maxResults, ResponseFormat.JSON))
      .rejects.toThrow('Web search failed: 429 Too Many Requests');

    // Should attempt initial + 3 retries = 4 total
    expect(global.fetch).toHaveBeenCalledTimes(4);
  });
});

```

--------------------------------------------------------------------------------
/tests/utils/retry.test.ts:
--------------------------------------------------------------------------------

```typescript
import { describe, it, expect, vi, afterEach } from 'vitest';
import { retryWithBackoff, fetchWithTimeout } from '../../src/utils/retry.js';
import { HttpError } from '../../src/utils/http-error.js';

// Type-safe mock for setTimeout that executes callbacks immediately
type TimeoutCallback = (...args: any[]) => void;
type MockSetTimeout = (callback: TimeoutCallback, ms?: number) => NodeJS.Timeout;

const createMockSetTimeout = (delayTracker?: number[]): MockSetTimeout => {
  return ((callback: TimeoutCallback, delay?: number) => {
    if (delayTracker && delay !== undefined) {
      delayTracker.push(delay);
    }
    Promise.resolve().then(() => callback());
    return 0 as unknown as NodeJS.Timeout;
  }) as MockSetTimeout;
};

describe('retryWithBackoff', () => {
  afterEach(() => {
    vi.restoreAllMocks();
  });

  it('should successfully execute function on first attempt', async () => {
    // Arrange
    const mockFn = vi.fn().mockResolvedValue('success');

    // Act
    const result = await retryWithBackoff(mockFn);

    // Assert
    expect(result).toBe('success');
    expect(mockFn).toHaveBeenCalledTimes(1);
  });

  it('should retry on 429 rate limit error with exponential backoff', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429);

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout());

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce('success');

    // Act
    const result = await retryWithBackoff(mockFn, { maxRetries: 3, initialDelay: 1000 });

    // Assert
    expect(result).toBe('success');
    expect(mockFn).toHaveBeenCalledTimes(3);
  });

  it('should throw error after max retries exceeded', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429);

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout());

    const mockFn = vi.fn().mockRejectedValue(error429);

    // Act & Assert
    await expect(retryWithBackoff(mockFn, { maxRetries: 2, initialDelay: 100 }))
      .rejects.toThrow('Rate limit exceeded');
    expect(mockFn).toHaveBeenCalledTimes(3); // Initial + 2 retries
  });

  it('should not retry on non-retryable errors (e.g., 400, 404)', async () => {
    // Arrange
    const error404 = new HttpError('Not Found', 404);

    const mockFn = vi.fn().mockRejectedValue(error404);

    // Act & Assert
    await expect(retryWithBackoff(mockFn)).rejects.toThrow('Not Found');
    expect(mockFn).toHaveBeenCalledTimes(1);
  });

  it('should use exponential backoff with full jitter: 0-1s, 0-2s, 0-4s', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429);

    const delays: number[] = [];

    // Immediately execute callback to avoid timing issues
    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockRejectedValueOnce(error429)
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce('success');

    // Act
    await retryWithBackoff(mockFn, { maxRetries: 4, initialDelay: 1000 });

    // Assert
    // Check that delays follow exponential pattern with full jitter
    // Formula: Full jitter = random() * exponentialDelay
    // Attempt 0: 0 to 1000ms
    expect(delays[0]).toBeGreaterThanOrEqual(0);
    expect(delays[0]).toBeLessThan(1000);

    // Attempt 1: 0 to 2000ms
    expect(delays[1]).toBeGreaterThanOrEqual(0);
    expect(delays[1]).toBeLessThan(2000);

    // Attempt 2: 0 to 4000ms
    expect(delays[2]).toBeGreaterThanOrEqual(0);
    expect(delays[2]).toBeLessThan(4000);
  });

  it('should add jitter to prevent thundering herd', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429);

    const delays: number[] = [];

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    vi.spyOn(Math, 'random').mockReturnValue(0.5); // Fixed jitter for testing

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce('success');

    // Act
    await retryWithBackoff(mockFn, { maxRetries: 2, initialDelay: 1000 });

    // Assert
    // Full jitter: Delay should be 1000 * 0.5 = 500ms
    expect(delays[0]).toBe(500);
  });

  it('should respect maxRetries exactly (3 retries = 4 total attempts)', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429);

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout());

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockRejectedValueOnce(error429)
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce('success');

    // Act
    const result = await retryWithBackoff(mockFn, { maxRetries: 3, initialDelay: 100 });

    // Assert
    expect(result).toBe('success');
    expect(mockFn).toHaveBeenCalledTimes(4); // Initial + 3 retries
  });

  it('should throw after exactly maxRetries retries (not maxRetries + 1)', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429);

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout());

    const mockFn = vi.fn().mockRejectedValue(error429);

    // Act & Assert
    await expect(retryWithBackoff(mockFn, { maxRetries: 3, initialDelay: 100 }))
      .rejects.toThrow('Rate limit exceeded');

    // Should be called 4 times: initial attempt + 3 retries
    expect(mockFn).toHaveBeenCalledTimes(4);
  });

  it('should not retry on network timeout errors (no status code)', async () => {
    // Arrange
    const networkError = new Error('Network timeout');
    // Intentionally don't add status property - simulates timeout/network errors

    const mockFn = vi.fn().mockRejectedValue(networkError);

    // Act & Assert
    await expect(retryWithBackoff(mockFn, { maxRetries: 3 }))
      .rejects.toThrow('Network timeout');

    // Should only be called once (no retries for non-HTTP errors)
    expect(mockFn).toHaveBeenCalledTimes(1);
  });

  it('should not retry on non-HttpError errors', async () => {
    // Arrange
    const genericError = new Error('Something went wrong');

    const mockFn = vi.fn().mockRejectedValue(genericError);

    // Act & Assert
    await expect(retryWithBackoff(mockFn, { maxRetries: 3 }))
      .rejects.toThrow('Something went wrong');

    // Should only be called once
    expect(mockFn).toHaveBeenCalledTimes(1);
  });

  it('should handle maximum retries with high initial delay', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429);
    const delays: number[] = [];

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    const mockFn = vi.fn().mockRejectedValue(error429);

    // Act & Assert
    await expect(retryWithBackoff(mockFn, { maxRetries: 5, initialDelay: 10000 }))
      .rejects.toThrow('Rate limit exceeded');

    // With the default maxDelay (10000ms), every exponentialDelay is capped at 10000ms,
    // so each full-jitter delay falls in [0, 10000). The looser bounds below also hold
    // and confirm the delays never exceed the uncapped exponential schedule.
    // Attempt 0: 0 to 10000ms
    expect(delays[0]).toBeGreaterThanOrEqual(0);
    expect(delays[0]).toBeLessThan(10000);

    // Attempt 1: 0 to 10000ms (uncapped bound would be 20000ms)
    expect(delays[1]).toBeGreaterThanOrEqual(0);
    expect(delays[1]).toBeLessThan(20000);

    // Attempt 2: 0 to 10000ms (uncapped bound would be 40000ms)
    expect(delays[2]).toBeGreaterThanOrEqual(0);
    expect(delays[2]).toBeLessThan(40000);

    // Should attempt maxRetries + 1 times (initial + 5 retries)
    expect(mockFn).toHaveBeenCalledTimes(6);
  });

  it('should cap delays at maxDelay when specified', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429);
    const delays: number[] = [];

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    const mockFn = vi.fn().mockRejectedValue(error429);

    // Act & Assert
    await expect(retryWithBackoff(mockFn, {
      maxRetries: 5,
      initialDelay: 1000,
      maxDelay: 5000
    })).rejects.toThrow('Rate limit exceeded');

    // All delays should be capped at maxDelay (5000ms) with full jitter
    // Without cap: 1000, 2000, 4000, 8000, 16000 (exponentialDelay)
    // With cap: 1000, 2000, 4000, 5000, 5000 (exponentialDelay capped)
    // With full jitter: delays are random(0, exponentialDelay)

    // First three delays should follow exponential pattern (full jitter)
    expect(delays[0]).toBeGreaterThanOrEqual(0);
    expect(delays[0]).toBeLessThan(1000);

    expect(delays[1]).toBeGreaterThanOrEqual(0);
    expect(delays[1]).toBeLessThan(2000);

    expect(delays[2]).toBeGreaterThanOrEqual(0);
    expect(delays[2]).toBeLessThan(4000);

    // Fourth and fifth delays should be capped at maxDelay (full jitter: 0 to 5000)
    expect(delays[3]).toBeGreaterThanOrEqual(0);
    expect(delays[3]).toBeLessThan(5000);

    expect(delays[4]).toBeGreaterThanOrEqual(0);
    expect(delays[4]).toBeLessThan(5000);
  });

  it('should default maxDelay to 10000ms when not specified', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429);
    const delays: number[] = [];

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    // Mock Math.random to always return maximum value (0.999...) to test upper bound
    const originalRandom = Math.random;
    Math.random = vi.fn().mockReturnValue(0.9999);

    const mockFn = vi.fn().mockRejectedValue(error429);

    // Act & Assert
    await expect(retryWithBackoff(mockFn, {
      maxRetries: 5,
      initialDelay: 1000
      // maxDelay not specified - should default to 10000ms
    })).rejects.toThrow('Rate limit exceeded');

    // Restore Math.random
    Math.random = originalRandom;

    // Verify delays are capped at default 10000ms
    // With Math.random = 0.9999 (near maximum):
    // Attempt 0: ~999.9ms (0.9999 * 1000)
    // Attempt 1: ~1999.8ms (0.9999 * 2000)
    // Attempt 2: ~3999.6ms (0.9999 * 4000)
    // Attempt 3: ~7999.2ms (0.9999 * 8000)
    // Attempt 4: ~9999ms (0.9999 * min(16000, 10000)) - should be capped at 10000

    // If maxDelay defaulted to Infinity, attempt 4 would be ~15999ms (0.9999 * 16000);
    // this assertion therefore guards the documented 10000ms default cap
    expect(delays[4]).toBeLessThan(10000); // Capped at the default maxDelay
  });

  it('should allow unbounded growth when maxDelay is explicitly set to Infinity', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429);
    const delays: number[] = [];

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce('success');

    // Act - explicitly set maxDelay to Infinity to allow unbounded growth
    await retryWithBackoff(mockFn, { maxRetries: 3, initialDelay: 1000, maxDelay: Infinity });

    // Assert - the exponential full-jitter schedule applies with no cap on exponentialDelay
    // Attempt 0: 0 to 1000ms
    expect(delays[0]).toBeGreaterThanOrEqual(0);
    expect(delays[0]).toBeLessThan(1000);

    // Attempt 1: 0 to 2000ms
    expect(delays[1]).toBeGreaterThanOrEqual(0);
    expect(delays[1]).toBeLessThan(2000);
  });

  it('should handle maxDelay smaller than initialDelay', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429);
    const delays: number[] = [];

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce('success');

    // Act - maxDelay (500) is less than initialDelay (1000)
    await retryWithBackoff(mockFn, {
      maxRetries: 2,
      initialDelay: 1000,
      maxDelay: 500
    });

    // Assert - delay should be capped at maxDelay from the start (full jitter)
    // exponentialDelay would be min(1000, 500) = 500
    // With full jitter: 0 to 500ms
    expect(delays[0]).toBeGreaterThanOrEqual(0);
    expect(delays[0]).toBeLessThan(500);
  });

  it('should use Retry-After header value in seconds when provided', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429, '5'); // 5 seconds
    const delays: number[] = [];

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce('success');

    // Act
    await retryWithBackoff(mockFn, { maxRetries: 2, initialDelay: 1000 });

    // Assert - should use 5000ms from Retry-After instead of exponential backoff
    expect(delays[0]).toBe(5000);
  });

  it('should use Retry-After header value as HTTP-date when provided', async () => {
    // Arrange
    const futureDate = new Date(Date.now() + 3000); // 3 seconds from now
    const error429 = new HttpError('Rate limit exceeded', 429, futureDate.toUTCString());
    const delays: number[] = [];

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce('success');

    // Act
    await retryWithBackoff(mockFn, { maxRetries: 2, initialDelay: 1000 });

    // Assert - should use calculated delay from date (behavior: delay is non-zero and reasonable)
    // Test behavior: Retry-After date was parsed and used instead of exponential backoff
    expect(delays[0]).toBeGreaterThan(0);
    expect(mockFn).toHaveBeenCalledTimes(2); // Initial call + 1 retry
  });

  it('should fallback to exponential backoff if Retry-After is invalid', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429, 'invalid-value');
    const delays: number[] = [];

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce('success');

    // Act
    await retryWithBackoff(mockFn, { maxRetries: 2, initialDelay: 1000 });

    // Assert - should fallback to exponential backoff with full jitter (0 to 1000ms)
    expect(delays[0]).toBeGreaterThanOrEqual(0);
    expect(delays[0]).toBeLessThan(1000);
  });

  it('should respect maxDelay even with Retry-After header', async () => {
    // Arrange
    const error429 = new HttpError('Rate limit exceeded', 429, '20'); // 20 seconds
    const delays: number[] = [];

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce('success');

    // Act - maxDelay is 5000ms, but Retry-After says 20000ms
    await retryWithBackoff(mockFn, { maxRetries: 2, initialDelay: 1000, maxDelay: 5000 });

    // Assert - should be capped at maxDelay
    expect(delays[0]).toBe(5000);
  });

  it('should handle negative or past Retry-After dates gracefully', async () => {
    // Arrange
    const pastDate = new Date(Date.now() - 5000); // 5 seconds ago
    const error429 = new HttpError('Rate limit exceeded', 429, pastDate.toUTCString());
    const delays: number[] = [];

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce('success');

    // Act
    await retryWithBackoff(mockFn, { maxRetries: 2, initialDelay: 1000 });

    // Assert - should handle past date gracefully (behavior: retry still occurs)
    // Past dates return 0ms delay, but retry mechanism should still work
    expect(delays[0]).toBeGreaterThanOrEqual(0);
    expect(mockFn).toHaveBeenCalledTimes(2); // Initial call + 1 retry
  });

  it('should cap very large Retry-After values (3600+ seconds) at maxDelay', async () => {
    // Arrange - Server requests 1 hour (3600 seconds) delay
    const error429 = new HttpError('Rate limit exceeded', 429, '3600');
    const delays: number[] = [];

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce('success');

    // Act - maxDelay defaults to 10000ms, but Retry-After says 3600000ms (1 hour)
    await retryWithBackoff(mockFn, { maxRetries: 2, initialDelay: 1000 });

    // Assert - should be capped at default maxDelay (10000ms), not 3600000ms
    expect(delays[0]).toBe(10000);
  });

  it('should use standard full jitter (0 to exponentialDelay, not additive)', async () => {
    // Arrange - Test that jitter is in range [0, exponentialDelay] not [exponentialDelay, 2*exponentialDelay]
    const error429 = new HttpError('Rate limit exceeded', 429);
    const delays: number[] = [];

    vi.spyOn(global, 'setTimeout').mockImplementation(createMockSetTimeout(delays));

    // Mock Math.random to return 0.5 (50% of range)
    const originalRandom = Math.random;
    Math.random = vi.fn().mockReturnValue(0.5);

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429) // Attempt 0: exponentialDelay = 1000
      .mockRejectedValueOnce(error429) // Attempt 1: exponentialDelay = 2000
      .mockResolvedValueOnce('success');

    // Act
    await retryWithBackoff(mockFn, { maxRetries: 3, initialDelay: 1000 });

    // Restore Math.random
    Math.random = originalRandom;

    // Assert - Standard full jitter should produce delays at 50% of exponentialDelay
    // Attempt 0: exponentialDelay=1000, jitter should be ~500ms (0.5 * 1000)
    // Attempt 1: exponentialDelay=2000, jitter should be ~1000ms (0.5 * 2000)
    expect(delays[0]).toBeCloseTo(500, 0); // Should be ~500ms, not ~1500ms (additive)
    expect(delays[1]).toBeCloseTo(1000, 0); // Should be ~1000ms, not ~3000ms (additive)
  });

  it('should retry on 500 Internal Server Error', async () => {
    // Arrange
    const error500 = new HttpError('Internal Server Error', 500);
    const mockResponse = { ok: true };

    vi.spyOn(global, 'setTimeout').mockImplementation(((callback: any) => {
      Promise.resolve().then(() => callback());
      return 0 as any;
    }) as any);

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error500)
      .mockResolvedValueOnce(mockResponse);

    // Act
    const result = await retryWithBackoff(mockFn, { maxRetries: 2 });

    // Assert
    expect(result).toBe(mockResponse);
    expect(mockFn).toHaveBeenCalledTimes(2);
  });

  it('should retry on 502 Bad Gateway', async () => {
    // Arrange
    const error502 = new HttpError('Bad Gateway', 502);
    const mockResponse = { ok: true };

    vi.spyOn(global, 'setTimeout').mockImplementation(((callback: any) => {
      Promise.resolve().then(() => callback());
      return 0 as any;
    }) as any);

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error502)
      .mockResolvedValueOnce(mockResponse);

    // Act
    const result = await retryWithBackoff(mockFn, { maxRetries: 2 });

    // Assert
    expect(result).toBe(mockResponse);
    expect(mockFn).toHaveBeenCalledTimes(2);
  });

  it('should retry on 503 Service Unavailable', async () => {
    // Arrange
    const error503 = new HttpError('Service Unavailable', 503);
    const mockResponse = { ok: true };

    vi.spyOn(global, 'setTimeout').mockImplementation(((callback: any) => {
      Promise.resolve().then(() => callback());
      return 0 as any;
    }) as any);

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error503)
      .mockResolvedValueOnce(mockResponse);

    // Act
    const result = await retryWithBackoff(mockFn, { maxRetries: 2 });

    // Assert
    expect(result).toBe(mockResponse);
    expect(mockFn).toHaveBeenCalledTimes(2);
  });

  it('should retry on 504 Gateway Timeout', async () => {
    // Arrange
    const error504 = new HttpError('Gateway Timeout', 504);
    const mockResponse = { ok: true };

    vi.spyOn(global, 'setTimeout').mockImplementation(((callback: any) => {
      Promise.resolve().then(() => callback());
      return 0 as any;
    }) as any);

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error504)
      .mockResolvedValueOnce(mockResponse);

    // Act
    const result = await retryWithBackoff(mockFn, { maxRetries: 2 });

    // Assert
    expect(result).toBe(mockResponse);
    expect(mockFn).toHaveBeenCalledTimes(2);
  });

  it('should not retry on 501 Not Implemented (non-retryable 5xx)', async () => {
    // Arrange
    const error501 = new HttpError('Not Implemented', 501);

    const mockFn = vi.fn().mockRejectedValue(error501);

    // Act & Assert
    await expect(retryWithBackoff(mockFn, { maxRetries: 2 }))
      .rejects.toThrow('Not Implemented');

    // Should not retry - only 1 attempt
    expect(mockFn).toHaveBeenCalledTimes(1);
  });

  it('should handle real timing with exponential backoff (integration test)', async () => {
    // Arrange - Integration test with real timing (no mocked setTimeout)
    const error429 = new HttpError('Rate limit', 429);
    const mockResponse = { ok: true };

    // Don't mock setTimeout - use real timing
    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce(mockResponse);

    // Act - Use small delays for fast test execution
    const result = await retryWithBackoff(mockFn, {
      maxRetries: 3,
      initialDelay: 10,
      maxDelay: 50,
    });

    // Assert - Test behavior, not implementation details (timing)
    expect(result).toBe(mockResponse);
    expect(mockFn).toHaveBeenCalledTimes(3); // Initial call + 2 retries
  });

  it('should log retry attempts for debugging/telemetry', async () => {
    // Arrange
    const consoleDebugSpy = vi.spyOn(console, 'debug').mockImplementation(() => {});
    const error429 = new HttpError('Rate limit', 429);
    const mockResponse = { ok: true };

    // Mock setTimeout to execute immediately
    vi.spyOn(global, 'setTimeout').mockImplementation(((callback: any) => {
      Promise.resolve().then(() => callback());
      return 0 as any;
    }) as any);

    const mockFn = vi
      .fn()
      .mockRejectedValueOnce(error429)
      .mockRejectedValueOnce(error429)
      .mockResolvedValueOnce(mockResponse);

    // Act
    await retryWithBackoff(mockFn, { maxRetries: 3, initialDelay: 1000 });

    // Assert - Verify debug logs were called
    expect(consoleDebugSpy).toHaveBeenCalledTimes(2); // Two retries

    // First retry log
    expect(consoleDebugSpy).toHaveBeenNthCalledWith(
      1,
      expect.stringContaining('Retry attempt 1/3'),
      expect.objectContaining({
        delay: expect.any(Number),
        error: 'Rate limit'
      })
    );

    // Second retry log
    expect(consoleDebugSpy).toHaveBeenNthCalledWith(
      2,
      expect.stringContaining('Retry attempt 2/3'),
      expect.objectContaining({
        delay: expect.any(Number),
        error: 'Rate limit'
      })
    );

    consoleDebugSpy.mockRestore();
  });
});

describe('fetchWithTimeout', () => {
  afterEach(() => {
    vi.restoreAllMocks();
  });

  it('should successfully fetch when request completes before timeout', async () => {
    // Arrange
    const mockResponse = { ok: true, status: 200 } as Response;
    global.fetch = vi.fn().mockResolvedValue(mockResponse);

    // Act
    const result = await fetchWithTimeout('https://example.com', undefined, 5000);

    // Assert
    expect(result).toBe(mockResponse);
    expect(global.fetch).toHaveBeenCalledWith('https://example.com', {
      signal: expect.any(AbortSignal),
    });
  });

  it('should timeout when request takes too long', async () => {
    // Arrange - Create a fetch that respects abort signal
    const slowFetch = vi.fn((url: string, options?: RequestInit): Promise<Response> =>
      new Promise((resolve, reject) => {
        const timeoutId = setTimeout(() => resolve({ ok: true } as Response), 200);

        // Listen for abort signal
        if (options?.signal) {
          options.signal.addEventListener('abort', () => {
            clearTimeout(timeoutId);
            const error = new Error('The operation was aborted');
            error.name = 'AbortError';
            reject(error);
          });
        }
      })
    );
    global.fetch = slowFetch as typeof fetch;

    // Act & Assert
    await expect(fetchWithTimeout('https://example.com', undefined, 50))
      .rejects.toThrow('Request to https://example.com timed out after 50ms');
  });

  it('should pass through non-abort errors', async () => {
    // Arrange - Create a fetch that throws a network error
    const networkError = new Error('Network failure');
    global.fetch = vi.fn().mockRejectedValue(networkError);

    // Act & Assert
    await expect(fetchWithTimeout('https://example.com'))
      .rejects.toThrow('Network failure');
  });
});

```