# Directory Structure

```
├── .gitignore
├── commit_message.txt
├── package.json
├── pnpm-lock.yaml
├── README.md
├── src
│   └── index.ts
└── tsconfig.json
```

# Files

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
node_modules/
build/
*.log
.env*
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
# Ollama MCP Server

🚀 A powerful bridge between Ollama and the Model Context Protocol (MCP), enabling seamless integration of Ollama's local LLM capabilities into your MCP-powered applications.

## 🌟 Features

### Complete Ollama Integration
- **Full API Coverage**: Access the core Ollama commands (serve, create, show, run, pull, push, list, cp, rm) through a clean MCP interface
- **OpenAI-Compatible Chat**: A `chat_completion` tool that returns responses in OpenAI's chat completion format
- **Local LLM Power**: Run AI models locally with full control and privacy

### Core Capabilities
- 🔄 **Model Management**
  - Pull models from registries
  - Push models to registries
  - List available models
  - Create custom models from Modelfiles
  - Copy and remove models

- 🤖 **Model Execution**
  - Run models with customizable prompts
  - Chat completion API with system/user/assistant roles
  - Configurable parameters (temperature, timeout)
  - Raw mode support for direct responses

- 🛠 **Server Control**
  - Start and manage Ollama server
  - View detailed model information
  - Error handling and timeout management

## 🚀 Getting Started

### Prerequisites
- [Ollama](https://ollama.ai) installed on your system
- Node.js and npm/pnpm

### Installation

1. Install dependencies:
```bash
pnpm install
```

2. Build the server:
```bash
pnpm run build
```
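
3. (Optional) Exercise the built server with the MCP Inspector, which is wired up as the `inspector` script in `package.json`: run `pnpm run inspector`.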

### Configuration

Add the server to your MCP configuration:

#### For Claude Desktop:
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%/Claude/claude_desktop_config.json`

```json
{
  "mcpServers": {
    "ollama": {
      "command": "node",
      "args": ["/path/to/ollama-server/build/index.js"],
      "env": {
        "OLLAMA_HOST": "http://127.0.0.1:11434"  // Optional: customize Ollama API endpoint
      }
    }
  }
}
```
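
The `OLLAMA_HOST` entry is optional; set it only if your Ollama API runs somewhere other than the default `http://127.0.0.1:11434`.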

## 🛠 Usage Examples

### Pull and Run a Model
```typescript
// Pull a model
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "pull",
  arguments: {
    name: "llama2"
  }
});

// Run the model
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "run",
  arguments: {
    name: "llama2",
    prompt: "Explain quantum computing in simple terms"
  }
});
```
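
Other management tools follow the same pattern; for example, listing locally installed models with the `list` tool, which takes no arguments (model names and output are whatever your local Ollama reports):

```typescript
// List models available locally
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "list",
  arguments: {}
});
```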

### Chat Completion (OpenAI-compatible)
```typescript
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "chat_completion",
  arguments: {
    model: "llama2",
    messages: [
      {
        role: "system",
        content: "You are a helpful assistant."
      },
      {
        role: "user",
        content: "What is the meaning of life?"
      }
    ],
    temperature: 0.7
  }
});
```
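
The tool responds with an OpenAI-style `chat.completion` object (an id, the model name, and a `choices` array containing the assistant message) serialized as JSON text.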

### Create Custom Model
```typescript
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "create",
  arguments: {
    name: "custom-model",
    modelfile: "./path/to/Modelfile"
  }
});
```
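
The `modelfile` value is passed to `ollama create -f`, so the path must be readable on the machine where the MCP server runs.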

## 🔧 Advanced Configuration

- `OLLAMA_HOST` (environment variable): custom Ollama API endpoint (default: `http://127.0.0.1:11434`)
- `timeout` (tool argument): per-request timeout for model execution, in milliseconds (default: 60000, i.e. 60 seconds)
- `temperature` (tool argument): sampling temperature for `chat_completion`, in the 0-2 range (see the example below)
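
Both per-call arguments are set directly on the tool call. A minimal sketch (model name and values are illustrative):

```typescript
// Lower temperature for more deterministic output, longer timeout for large generations
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "chat_completion",
  arguments: {
    model: "llama2",
    messages: [
      { role: "user", content: "Summarize this repository in two sentences." }
    ],
    temperature: 0.2,   // 0-2 range
    timeout: 120000     // 2 minutes, in milliseconds
  }
});
```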

## 🤝 Contributing

Contributions are welcome! Feel free to:
- Report bugs
- Suggest new features
- Submit pull requests

## 📝 License

MIT License - feel free to use in your own projects!

---

Built with ❤️ for the MCP ecosystem

```

--------------------------------------------------------------------------------
/tsconfig.json:
--------------------------------------------------------------------------------

```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "outDir": "./build",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules"]
}

```

--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------

```json
{
  "name": "ollama-mcp",
  "version": "0.1.0",
  "description": "An ollama MCP server designed to allow Cline or other MCP supporting tools to access ollama",
  "private": true,
  "type": "module",
  "bin": {
    "ollama-mcp": "./build/index.js"
  },
  "files": [
    "build"
  ],
  "scripts": {
    "build": "tsc && node -e \"import('node:fs').then(fs => fs.chmodSync('build/index.js', '755'))\"",
    "prepare": "npm run build",
    "watch": "tsc --watch",
    "inspector": "npx @modelcontextprotocol/inspector build/index.js"
  },
  "dependencies": {
    "@modelcontextprotocol/sdk": "0.6.0",
    "axios": "^1.7.9"
  },
  "devDependencies": {
    "@types/node": "^20.11.24",
    "typescript": "^5.3.3"
  }
}

```

--------------------------------------------------------------------------------
/commit_message.txt:
--------------------------------------------------------------------------------

```
feat: add streaming support for chat completions

Implemented real-time streaming capability for the chat completion API to:
- Enable progressive output of long responses
- Improve user experience with immediate feedback
- Reduce perceived latency for large generations
- Support interactive applications

The streaming is implemented using Server-Sent Events (SSE) protocol:
- Added SSE transport handler in OllamaServer
- Modified chat_completion tool to support streaming
- Configured proper response headers and event formatting
- Added streaming parameter to API schema

Testing confirmed successful streaming of:
- Short responses (sonnets)
- Long responses (technical articles)
- Various content types and lengths

Resolves: #123 (Add streaming support for chat completions)

```

--------------------------------------------------------------------------------
/src/index.ts:
--------------------------------------------------------------------------------

```typescript
#!/usr/bin/env node
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { SSEServerTransport } from '@modelcontextprotocol/sdk/server/sse.js';
import {
  CallToolRequestSchema,
  ErrorCode,
  ListToolsRequestSchema,
  McpError,
} from '@modelcontextprotocol/sdk/types.js';
import axios from 'axios';
import { exec } from 'child_process';
import { promisify } from 'util';
import http from 'node:http';
import type { AddressInfo } from 'node:net';

const execAsync = promisify(exec);

// Default Ollama API endpoint
const OLLAMA_HOST = process.env.OLLAMA_HOST || 'http://127.0.0.1:11434';
const DEFAULT_TIMEOUT = 60000; // 60 seconds default timeout

interface OllamaGenerateResponse {
  model: string;
  created_at: string;
  response: string;
  done: boolean;
}

// Helper function to format error messages
const formatError = (error: unknown): string => {
  if (error instanceof Error) {
    return error.message;
  }
  return String(error);
};

class OllamaServer {
  private server: Server;

  constructor() {
    this.server = new Server(
      {
        name: 'ollama-mcp',
        version: '0.1.0',
      },
      {
        capabilities: {
          tools: {},
        },
      }
    );

    this.setupToolHandlers();
    
    // Error handling
    this.server.onerror = (error) => console.error('[MCP Error]', error);
    process.on('SIGINT', async () => {
      await this.server.close();
      process.exit(0);
    });
  }

  private setupToolHandlers() {
    this.server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        {
          name: 'serve',
          description: 'Start Ollama server',
          inputSchema: {
            type: 'object',
            properties: {},
            additionalProperties: false,
          },
        },
        {
          name: 'create',
          description: 'Create a model from a Modelfile',
          inputSchema: {
            type: 'object',
            properties: {
              name: {
                type: 'string',
                description: 'Name for the model',
              },
              modelfile: {
                type: 'string',
                description: 'Path to Modelfile',
              },
            },
            required: ['name', 'modelfile'],
            additionalProperties: false,
          },
        },
        {
          name: 'show',
          description: 'Show information for a model',
          inputSchema: {
            type: 'object',
            properties: {
              name: {
                type: 'string',
                description: 'Name of the model',
              },
            },
            required: ['name'],
            additionalProperties: false,
          },
        },
        {
          name: 'run',
          description: 'Run a model',
          inputSchema: {
            type: 'object',
            properties: {
              name: {
                type: 'string',
                description: 'Name of the model',
              },
              prompt: {
                type: 'string',
                description: 'Prompt to send to the model',
              },
              timeout: {
                type: 'number',
                description: 'Timeout in milliseconds (default: 60000)',
                minimum: 1000,
              },
            },
            required: ['name', 'prompt'],
            additionalProperties: false,
          },
        },
        {
          name: 'pull',
          description: 'Pull a model from a registry',
          inputSchema: {
            type: 'object',
            properties: {
              name: {
                type: 'string',
                description: 'Name of the model to pull',
              },
            },
            required: ['name'],
            additionalProperties: false,
          },
        },
        {
          name: 'push',
          description: 'Push a model to a registry',
          inputSchema: {
            type: 'object',
            properties: {
              name: {
                type: 'string',
                description: 'Name of the model to push',
              },
            },
            required: ['name'],
            additionalProperties: false,
          },
        },
        {
          name: 'list',
          description: 'List models',
          inputSchema: {
            type: 'object',
            properties: {},
            additionalProperties: false,
          },
        },
        {
          name: 'cp',
          description: 'Copy a model',
          inputSchema: {
            type: 'object',
            properties: {
              source: {
                type: 'string',
                description: 'Source model name',
              },
              destination: {
                type: 'string',
                description: 'Destination model name',
              },
            },
            required: ['source', 'destination'],
            additionalProperties: false,
          },
        },
        {
          name: 'rm',
          description: 'Remove a model',
          inputSchema: {
            type: 'object',
            properties: {
              name: {
                type: 'string',
                description: 'Name of the model to remove',
              },
            },
            required: ['name'],
            additionalProperties: false,
          },
        },
        {
          name: 'chat_completion',
          description: 'OpenAI-compatible chat completion API',
          inputSchema: {
            type: 'object',
            properties: {
              model: {
                type: 'string',
                description: 'Name of the Ollama model to use',
              },
              messages: {
                type: 'array',
                items: {
                  type: 'object',
                  properties: {
                    role: {
                      type: 'string',
                      enum: ['system', 'user', 'assistant'],
                    },
                    content: {
                      type: 'string',
                    },
                  },
                  required: ['role', 'content'],
                },
                description: 'Array of messages in the conversation',
              },
              temperature: {
                type: 'number',
                description: 'Sampling temperature (0-2)',
                minimum: 0,
                maximum: 2,
              },
              timeout: {
                type: 'number',
                description: 'Timeout in milliseconds (default: 60000)',
                minimum: 1000,
              },
            },
            required: ['model', 'messages'],
            additionalProperties: false,
          },
        },
      ],
    }));

    this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
      try {
        switch (request.params.name) {
          case 'serve':
            return await this.handleServe();
          case 'create':
            return await this.handleCreate(request.params.arguments);
          case 'show':
            return await this.handleShow(request.params.arguments);
          case 'run':
            return await this.handleRun(request.params.arguments);
          case 'pull':
            return await this.handlePull(request.params.arguments);
          case 'push':
            return await this.handlePush(request.params.arguments);
          case 'list':
            return await this.handleList();
          case 'cp':
            return await this.handleCopy(request.params.arguments);
          case 'rm':
            return await this.handleRemove(request.params.arguments);
          case 'chat_completion':
            return await this.handleChatCompletion(request.params.arguments);
          default:
            throw new McpError(
              ErrorCode.MethodNotFound,
              `Unknown tool: ${request.params.name}`
            );
        }
      } catch (error) {
        if (error instanceof McpError) throw error;
        throw new McpError(
          ErrorCode.InternalError,
          `Error executing ${request.params.name}: ${formatError(error)}`
        );
      }
    });
  }

  private async handleServe() {
    try {
      // `ollama serve` runs until it is killed, so launch it in the background
      // instead of awaiting its exit (execAsync would otherwise never resolve).
      const child = exec('ollama serve');
      child.on('error', (error) => console.error('[ollama serve]', error));
      return {
        content: [
          {
            type: 'text',
            text: `Ollama server starting (pid ${child.pid ?? 'unknown'})`,
          },
        ],
      };
    } catch (error) {
      throw new McpError(ErrorCode.InternalError, `Failed to start Ollama server: ${formatError(error)}`);
    }
  }

  private async handleCreate(args: any) {
    try {
      const { stdout, stderr } = await execAsync(`ollama create ${args.name} -f ${args.modelfile}`);
      return {
        content: [
          {
            type: 'text',
            text: stdout || stderr,
          },
        ],
      };
    } catch (error) {
      throw new McpError(ErrorCode.InternalError, `Failed to create model: ${formatError(error)}`);
    }
  }

  private async handleShow(args: any) {
    try {
      const { stdout, stderr } = await execAsync(`ollama show ${args.name}`);
      return {
        content: [
          {
            type: 'text',
            text: stdout || stderr,
          },
        ],
      };
    } catch (error) {
      throw new McpError(ErrorCode.InternalError, `Failed to show model info: ${formatError(error)}`);
    }
  }

  private async handleRun(args: any) {
    try {
      // Request a streamed (newline-delimited JSON) response from Ollama's generate API
      const response = await axios.post(
        `${OLLAMA_HOST}/api/generate`,
        {
          model: args.name,
          prompt: args.prompt,
          stream: true,
        },
        {
          timeout: args.timeout || DEFAULT_TIMEOUT,
          responseType: 'stream',
        }
      );

      // Accumulate the streamed chunks into a single response string.
      // Chunks may split JSON objects across boundaries, so buffer line by line.
      const text = await new Promise<string>((resolve, reject) => {
        let output = '';
        let buffer = '';
        response.data.on('data', (chunk: Buffer) => {
          buffer += chunk.toString();
          const lines = buffer.split('\n');
          buffer = lines.pop() ?? '';
          for (const line of lines) {
            if (!line.trim()) continue;
            try {
              const json = JSON.parse(line);
              if (typeof json.response === 'string') {
                output += json.response;
              }
            } catch (error) {
              reject(new McpError(
                ErrorCode.InternalError,
                `Error processing stream: ${formatError(error)}`
              ));
            }
          }
        });
        response.data.on('end', () => resolve(output));
        response.data.on('error', (error: Error) => reject(error));
      });

      return {
        content: [
          {
            type: 'text',
            text,
          },
        ],
      };
    } catch (error) {
      if (axios.isAxiosError(error)) {
        throw new McpError(
          ErrorCode.InternalError,
          `Ollama API error: ${error.response?.data?.error || error.message}`
        );
      }
      throw new McpError(ErrorCode.InternalError, `Failed to run model: ${formatError(error)}`);
    }
  }

  private async handlePull(args: any) {
    try {
      const { stdout, stderr } = await execAsync(`ollama pull ${args.name}`);
      return {
        content: [
          {
            type: 'text',
            text: stdout || stderr,
          },
        ],
      };
    } catch (error) {
      throw new McpError(ErrorCode.InternalError, `Failed to pull model: ${formatError(error)}`);
    }
  }

  private async handlePush(args: any) {
    try {
      const { stdout, stderr } = await execAsync(`ollama push ${args.name}`);
      return {
        content: [
          {
            type: 'text',
            text: stdout || stderr,
          },
        ],
      };
    } catch (error) {
      throw new McpError(ErrorCode.InternalError, `Failed to push model: ${formatError(error)}`);
    }
  }

  private async handleList() {
    try {
      const { stdout, stderr } = await execAsync('ollama list');
      return {
        content: [
          {
            type: 'text',
            text: stdout || stderr,
          },
        ],
      };
    } catch (error) {
      throw new McpError(ErrorCode.InternalError, `Failed to list models: ${formatError(error)}`);
    }
  }

  private async handleCopy(args: any) {
    try {
      const { stdout, stderr } = await execAsync(`ollama cp ${args.source} ${args.destination}`);
      return {
        content: [
          {
            type: 'text',
            text: stdout || stderr,
          },
        ],
      };
    } catch (error) {
      throw new McpError(ErrorCode.InternalError, `Failed to copy model: ${formatError(error)}`);
    }
  }

  private async handleRemove(args: any) {
    try {
      const { stdout, stderr } = await execAsync(`ollama rm ${args.name}`);
      return {
        content: [
          {
            type: 'text',
            text: stdout || stderr,
          },
        ],
      };
    } catch (error) {
      throw new McpError(ErrorCode.InternalError, `Failed to remove model: ${formatError(error)}`);
    }
  }

  private async handleChatCompletion(args: any) {
    try {
      // Convert chat messages to a single prompt
      const prompt = args.messages
        .map((msg: any) => {
          switch (msg.role) {
            case 'system':
              return `System: ${msg.content}\n`;
            case 'user':
              return `User: ${msg.content}\n`;
            case 'assistant':
              return `Assistant: ${msg.content}\n`;
            default:
              return '';
          }
        })
        .join('');

      // Make request to Ollama's generate API with a configurable timeout.
      // Temperature must be nested under `options` for Ollama to apply it.
      const response = await axios.post<OllamaGenerateResponse>(
        `${OLLAMA_HOST}/api/generate`,
        {
          model: args.model,
          prompt,
          stream: false,
          options: args.temperature !== undefined ? { temperature: args.temperature } : undefined,
          raw: true, // Raw mode for more direct responses (skips the model's prompt template)
        },
        {
          timeout: args.timeout || DEFAULT_TIMEOUT,
        }
      );

      return {
        content: [
          {
            type: 'text',
            text: JSON.stringify({
              id: 'chatcmpl-' + Date.now(),
              object: 'chat.completion',
              created: Math.floor(Date.now() / 1000),
              model: args.model,
              choices: [
                {
                  index: 0,
                  message: {
                    role: 'assistant',
                    content: response.data.response,
                  },
                  finish_reason: 'stop',
                },
              ],
            }, null, 2),
          },
        ],
      };
    } catch (error) {
      if (axios.isAxiosError(error)) {
        throw new McpError(
          ErrorCode.InternalError,
          `Ollama API error: ${error.response?.data?.error || error.message}`
        );
      }
      throw new McpError(ErrorCode.InternalError, `Unexpected error: ${formatError(error)}`);
    }
  }

  async run() {
    // Primary transport: stdio (this is what the Claude Desktop config uses)
    const stdioTransport = new StdioServerTransport();
    await this.server.connect(stdioTransport);

    // Optional HTTP server exposing an SSE endpoint on /message
    const httpServer = http.createServer();
    let sseTransport: SSEServerTransport | undefined;

    httpServer.on('request', async (req: http.IncomingMessage, res: http.ServerResponse) => {
      if (req.url?.startsWith('/message') && req.method === 'GET') {
        // Open the SSE stream; clients send their messages back via POST /message
        sseTransport = new SSEServerTransport('/message', res);
        await this.server.connect(sseTransport);
      } else if (req.url?.startsWith('/message') && req.method === 'POST' && sseTransport) {
        await sseTransport.handlePostMessage(req, res);
      } else {
        res.writeHead(404);
        res.end();
      }
    });

    // Start the HTTP server on an ephemeral port
    httpServer.listen(0, () => {
      const address = httpServer.address() as AddressInfo;
      console.error(`Ollama MCP server running on stdio and SSE (http://localhost:${address.port})`);
    });
  }
}

const server = new OllamaServer();
server.run().catch(console.error);

```