hannesrudolph/mcp-ragdocs # codebase.md

# Directory Structure

```
├── .gitignore
├── CHANGELOG.md
├── LICENSE
├── package-lock.json
├── package.json
├── README.md
├── src
│   ├── api-client.ts
│   ├── handler-registry.ts
│   ├── handlers
│   │   ├── add-documentation.ts
│   │   ├── base-handler.ts
│   │   ├── clear-queue.ts
│   │   ├── extract-urls.ts
│   │   ├── index.ts
│   │   ├── list-queue.ts
│   │   ├── list-sources.ts
│   │   ├── remove-documentation.ts
│   │   ├── run-queue.ts
│   │   └── search-documentation.ts
│   ├── index.ts
│   ├── tools
│   │   ├── base-tool.ts
│   │   ├── clear-queue.ts
│   │   ├── extract-urls.ts
│   │   ├── index.ts
│   │   ├── list-queue.ts
│   │   ├── list-sources.ts
│   │   ├── remove-documentation.ts
│   │   ├── run-queue.ts
│   │   └── search-documentation.ts
│   └── types.ts
└── tsconfig.json
```

# Files

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Dependencies
node_modules/
.pnp/
.pnp.js

# Build output
build/
dist/
*.tsbuildinfo

# Environment variables
.env
.env.local
.env.development.local
.env.test.local
.env.production.local

# Logs
logs/
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# Editor directories and files
.idea/
.vscode/
*.swp
*.swo
.DS_Store

# Test coverage
coverage/

# Local documentation files
INTERNAL.TXT
queue.txt
MCPguide.txt

```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
# RAG Documentation MCP Server

An MCP server implementation that provides tools for retrieving and processing documentation through vector search, enabling AI assistants to augment their responses with relevant documentation context.

<a href="https://glama.ai/mcp/servers/54hsrjhmq9"><img width="380" height="200" src="https://glama.ai/mcp/servers/54hsrjhmq9/badge" alt="mcp-ragdocs MCP server" /></a>

## Features

- Vector-based documentation search and retrieval
- Support for multiple documentation sources
- Semantic search capabilities
- Automated documentation processing
- Real-time context augmentation for LLMs

## Tools

### search_documentation
Search through stored documentation using natural language queries. Returns matching excerpts with context, ranked by relevance.

**Inputs:**
- `query` (string): The text to search for in the documentation. Can be a natural language query, specific terms, or code snippets.
- `limit` (number, optional): Maximum number of results to return (1-20, default: 5). Higher limits provide more comprehensive results but may take longer to process.

### list_sources
List all documentation sources currently stored in the system. Returns a comprehensive list of all indexed documentation including source URLs, titles, and last update times. Use this to understand what documentation is available for searching or to verify if specific sources have been indexed.

### extract_urls
Extract and analyze all URLs from a given web page. This tool crawls the specified webpage, identifies all hyperlinks, and optionally adds them to the processing queue.

**Inputs:**
- `url` (string): The complete URL of the webpage to analyze (must include protocol, e.g., https://). The page must be publicly accessible.
- `add_to_queue` (boolean, optional): If true, automatically add extracted URLs to the processing queue for later indexing. Use with caution on large sites to avoid excessive queuing.

### remove_documentation
Remove specific documentation sources from the system by their URLs. The removal is permanent and will affect future search results.

**Inputs:**
- `urls` (string[]): Array of URLs to remove from the database. Each URL must exactly match the URL used when the documentation was added.

### list_queue
List all URLs currently waiting in the documentation processing queue. Shows pending documentation sources that will be processed when run_queue is called. Use this to monitor queue status, verify URLs were added correctly, or check processing backlog.

### run_queue
Process and index all URLs currently in the documentation queue. Each URL is processed sequentially, with proper error handling and retry logic. Progress updates are provided as processing occurs. Long-running operations will process until the queue is empty or an unrecoverable error occurs.

### clear_queue
Remove all pending URLs from the documentation processing queue. Use this to reset the queue when you want to start fresh, remove unwanted URLs, or cancel pending processing. This operation is immediate and permanent - URLs will need to be re-added if you want to process them later.

## Usage

The RAG Documentation tool is designed for:

- Enhancing AI responses with relevant documentation
- Building documentation-aware AI assistants
- Creating context-aware tooling for developers
- Implementing semantic documentation search
- Augmenting existing knowledge bases

## Configuration

### Usage with Claude Desktop

Add this to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "rag-docs": {
      "command": "npx",
      "args": [
        "-y",
        "@hannesrudolph/mcp-ragdocs"
      ],
      "env": {
        "OPENAI_API_KEY": "",
        "QDRANT_URL": "",
        "QDRANT_API_KEY": ""
      }
    }
  }
}
```

You'll need to provide values for the following environment variables:
- `OPENAI_API_KEY`: Your OpenAI API key for embeddings generation
- `QDRANT_URL`: URL of your Qdrant vector database instance
- `QDRANT_API_KEY`: API key for authenticating with Qdrant

## License

This MCP server is licensed under the MIT License. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the MIT License. For more details, please see the LICENSE file in the project repository.

## Acknowledgments

This project is a fork of [qpd-v/mcp-ragdocs](https://github.com/qpd-v/mcp-ragdocs), originally developed by qpd-v. The original project provided the foundation for this implementation.

```

--------------------------------------------------------------------------------
/src/tools/index.ts:
--------------------------------------------------------------------------------

```typescript
export * from './search-documentation.js';
export * from './list-sources.js';
export * from './extract-urls.js';
export * from './remove-documentation.js';
export * from './list-queue.js';
export * from './run-queue.js';
export * from './clear-queue.js';
```

--------------------------------------------------------------------------------
/src/handlers/index.ts:
--------------------------------------------------------------------------------

```typescript
export * from './base-handler.js';
export * from './add-documentation.js';
export * from './search-documentation.js';
export * from './list-sources.js';
export * from './extract-urls.js';
export * from './remove-documentation.js';
export * from './list-queue.js';
export * from './run-queue.js';
export * from './clear-queue.js';
```

--------------------------------------------------------------------------------
/tsconfig.json:
--------------------------------------------------------------------------------

```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "outDir": "./build",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules"]
}

```

--------------------------------------------------------------------------------
/src/handlers/clear-queue.ts:
--------------------------------------------------------------------------------

```typescript
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { ApiClient } from '../api-client.js';
import { ClearQueueTool } from '../tools/clear-queue.js';

export class ClearQueueHandler extends ClearQueueTool {
  constructor(server: Server, apiClient: ApiClient) {
    super();
  }

  async handle(args: any) {
    return this.execute(args);
  }
}
```

--------------------------------------------------------------------------------
/src/handlers/base-handler.ts:
--------------------------------------------------------------------------------

```typescript
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { ApiClient } from '../api-client.js';
import { McpToolResponse } from '../types.js';

export abstract class BaseHandler {
  protected server: Server;
  protected apiClient: ApiClient;

  constructor(server: Server, apiClient: ApiClient) {
    this.server = server;
    this.apiClient = apiClient;
  }

  protected abstract handle(args: any): Promise<McpToolResponse>;
}
```

--------------------------------------------------------------------------------
/src/tools/base-tool.ts:
--------------------------------------------------------------------------------

```typescript
import { ToolDefinition, McpToolResponse } from '../types.js';

export abstract class BaseTool {
  abstract get definition(): ToolDefinition;
  abstract execute(args: unknown): Promise<McpToolResponse>;

  protected formatResponse(data: unknown): McpToolResponse {
    return {
      content: [
        {
          type: 'text',
          text: JSON.stringify(data, null, 2),
        },
      ],
    };
  }

  protected handleError(error: any): McpToolResponse {
    return {
      content: [
        {
          type: 'text',
          text: `Error: ${error}`,
        },
      ],
      isError: true,
    };
  }
}
```
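
As an illustration of how `BaseTool` is meant to be extended, here is a minimal sketch of a concrete tool. `EchoTool` and its `message` argument are hypothetical and do not exist in this repository; the interfaces and base class are condensed copies of the ones above so the example runs on its own.

```typescript
// Types copied from src/types.ts so the sketch is self-contained.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: string;
    properties: Record<string, unknown>;
    required: string[];
  };
}

interface McpToolResponse {
  content: Array<{ type: string; text: string }>;
  isError?: boolean;
}

// Condensed copy of BaseTool from src/tools/base-tool.ts.
abstract class BaseTool {
  abstract get definition(): ToolDefinition;
  abstract execute(args: unknown): Promise<McpToolResponse>;

  protected formatResponse(data: unknown): McpToolResponse {
    return { content: [{ type: 'text', text: JSON.stringify(data, null, 2) }] };
  }

  protected handleError(error: unknown): McpToolResponse {
    return { content: [{ type: 'text', text: `Error: ${error}` }], isError: true };
  }
}

// Hypothetical tool, for illustration only: echoes back a `message` argument.
class EchoTool extends BaseTool {
  get definition(): ToolDefinition {
    return {
      name: 'echo',
      description: 'Echo the provided message',
      inputSchema: {
        type: 'object',
        properties: { message: { type: 'string', description: 'Text to echo' } },
        required: ['message'],
      },
    };
  }

  async execute(args: unknown): Promise<McpToolResponse> {
    const message = (args as { message?: unknown } | null)?.message;
    if (typeof message !== 'string') {
      // Report bad input through the shared error formatter
      return this.handleError('message must be a string');
    }
    return this.formatResponse({ echoed: message });
  }
}

new EchoTool()
  .execute({ message: 'hello' })
  .then((res) => console.log(res.content[0].text));
```

A subclass only supplies `definition` and `execute`; the response envelope (`content`, `isError`) comes from the two protected helpers.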

--------------------------------------------------------------------------------
/src/types.ts:
--------------------------------------------------------------------------------

```typescript
export interface DocumentChunk {
  text: string;
  url: string;
  title: string;
  timestamp: string;
}

export interface DocumentPayload extends DocumentChunk {
  _type: 'DocumentChunk';
  [key: string]: unknown;
}

export function isDocumentPayload(payload: unknown): payload is DocumentPayload {
  if (!payload || typeof payload !== 'object') return false;
  const p = payload as Partial<DocumentPayload>;
  return (
    p._type === 'DocumentChunk' &&
    typeof p.text === 'string' &&
    typeof p.url === 'string' &&
    typeof p.title === 'string' &&
    typeof p.timestamp === 'string'
  );
}

export interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: string;
    properties: Record<string, any>;
    required: string[];
  };
}

export interface McpToolResponse {
  content: Array<{
    type: string;
    text: string;
  }>;
  isError?: boolean;
}
```
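
The `isDocumentPayload` type guard is what lets callers treat an untyped vector-store payload as a `DocumentPayload`. The sketch below shows one way it might be used to filter mixed payloads; the sample payloads and the `example.com` URL are made up for illustration, and the type definitions are copied from above so the snippet is self-contained.

```typescript
// Copied from src/types.ts so the example runs on its own.
interface DocumentChunk {
  text: string;
  url: string;
  title: string;
  timestamp: string;
}

interface DocumentPayload extends DocumentChunk {
  _type: 'DocumentChunk';
  [key: string]: unknown;
}

function isDocumentPayload(payload: unknown): payload is DocumentPayload {
  if (!payload || typeof payload !== 'object') return false;
  const p = payload as Partial<DocumentPayload>;
  return (
    p._type === 'DocumentChunk' &&
    typeof p.text === 'string' &&
    typeof p.url === 'string' &&
    typeof p.title === 'string' &&
    typeof p.timestamp === 'string'
  );
}

// Made-up payloads standing in for what a vector store might return.
const payloads: unknown[] = [
  {
    _type: 'DocumentChunk',
    text: 'Install with npm.',
    url: 'https://example.com/docs',
    title: 'Install',
    timestamp: '2024-01-01T00:00:00Z',
  },
  { _type: 'Other', text: 'wrong discriminator' },
  null,
];

// After filtering, TypeScript treats each element as a DocumentPayload,
// so `title` and `url` can be read without casts.
const validPayloads = payloads.filter(isDocumentPayload);
for (const doc of validPayloads) {
  console.log(`${doc.title} (${doc.url})`);
}
```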

--------------------------------------------------------------------------------
/src/index.ts:
--------------------------------------------------------------------------------

```typescript
#!/usr/bin/env node
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { ApiClient } from './api-client.js';
import { HandlerRegistry } from './handler-registry.js';

class RagDocsServer {
  private server: Server;
  private apiClient: ApiClient;
  private handlerRegistry: HandlerRegistry;

  constructor() {
    this.server = new Server(
      {
        name: 'mcp-ragdocs',
        version: '1.1.0', // keep in sync with "version" in package.json
      },
      {
        capabilities: {
          tools: {},
        },
      }
    );

    this.apiClient = new ApiClient();
    this.handlerRegistry = new HandlerRegistry(this.server, this.apiClient);
    
    // Error handling
    this.server.onerror = (error) => console.error('[MCP Error]', error);
    process.on('SIGINT', async () => {
      await this.cleanup();
      process.exit(0);
    });
  }

  private async cleanup() {
    await this.apiClient.cleanup();
    await this.server.close();
  }

  async run() {
    const transport = new StdioServerTransport();
    await this.server.connect(transport);
    console.error('RAG Docs MCP server running on stdio');
  }
}

const server = new RagDocsServer();
server.run().catch(console.error);
```

--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------

```json
{
  "name": "@hannesrudolph/mcp-ragdocs",
  "version": "1.1.0",
  "description": "An MCP server for semantic documentation search and retrieval using vector databases to augment LLM capabilities.",
  "private": false,
  "type": "module",
  "bin": {
    "@hannesrudolph/mcp-ragdocs": "./build/index.js"
  },
  "files": [
    "build",
    "README.md",
    "LICENSE"
  ],
  "scripts": {
    "build": "tsc && node -e \"require('fs').chmodSync('build/index.js', '755')\"",
    "prepare": "npm run build",
    "watch": "tsc --watch",
    "inspector": "npx @modelcontextprotocol/inspector build/index.js",
    "start": "node build/index.js"
  },
  "keywords": [
    "mcp",
    "model-context-protocol",
    "rag",
    "documentation",
    "vector-database",
    "qdrant",
    "claude",
    "llm"
  ],
  "author": "hannesrudolph",
  "license": "MIT",
  "repository": {
    "type": "git",
    "url": "git+https://github.com/hannesrudolph/mcp-ragdocs.git"
  },
  "bugs": {
    "url": "https://github.com/hannesrudolph/mcp-ragdocs/issues"
  },
  "homepage": "https://github.com/hannesrudolph/mcp-ragdocs#readme",
  "dependencies": {
    "@azure/openai": "2.0.0",
    "@modelcontextprotocol/sdk": "1.0.3",
    "@qdrant/js-client-rest": "1.12.0",
    "axios": "1.7.9",
    "cheerio": "1.0.0",
    "openai": "4.76.2",
    "playwright": "1.49.1"
  },
  "devDependencies": {
    "@types/node": "^20.17.10",
    "ts-node": "^10.9.2",
    "typescript": "^5.7.2"
  },
  "publishConfig": {
    "access": "public"
  }
}

```

--------------------------------------------------------------------------------
/src/handlers/list-queue.ts:
--------------------------------------------------------------------------------

```typescript
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { ApiClient } from '../api-client.js';
import { BaseHandler } from './base-handler.js';
import fs from 'fs/promises';
import path from 'path';
import { fileURLToPath } from 'url';

// Get current directory in ES modules
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
const QUEUE_FILE = path.join(__dirname, '..', '..', 'queue.txt');

export class ListQueueHandler extends BaseHandler {
  constructor(server: Server, apiClient: ApiClient) {
    super(server, apiClient);
  }

  async handle(_args: any) {
    try {
      // Check if queue file exists
      try {
        await fs.access(QUEUE_FILE);
      } catch {
        return {
          content: [
            {
              type: 'text',
              text: 'Queue is empty (queue file does not exist)',
            },
          ],
        };
      }

      // Read queue file
      const content = await fs.readFile(QUEUE_FILE, 'utf-8');
      const urls = content.split('\n').filter(url => url.trim() !== '');

      if (urls.length === 0) {
        return {
          content: [
            {
              type: 'text',
              text: 'Queue is empty',
            },
          ],
        };
      }

      return {
        content: [
          {
            type: 'text',
            text: `Queue contains ${urls.length} URLs:\n${urls.join('\n')}`,
          },
        ],
      };
    } catch (error) {
      return {
        content: [
          {
            type: 'text',
            text: `Failed to read queue: ${error}`,
          },
        ],
        isError: true,
      };
    }
  }
}
```

--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------

```markdown
# Changelog

## [1.1.0] - 2024-03-14

### Initial Feature Addition
- Implemented new clear_queue tool for queue management
  - Created src/tools/clear-queue.ts with core functionality
  - Added handler in src/handlers/clear-queue.ts
  - Integrated with existing queue management system
  - Added tool exports and registration

### Code Organization
- Improved tool ordering in handler-registry.ts
  - Moved remove_documentation before extract_urls
  - Enhanced logical grouping of related tools
  - Updated imports to match new ordering

### Documentation Enhancement Phase 1
- Enhanced tool descriptions in handler-registry.ts:
  1. search_documentation
     - Added natural language query support details
     - Clarified result ranking and context
     - Improved limit parameter documentation
  2. list_sources
     - Added details about indexed documentation
     - Clarified source information returned
  3. extract_urls
     - Enhanced URL crawling explanation
     - Added queue integration details
     - Clarified URL validation requirements
  4. remove_documentation
     - Added permanence warning
     - Clarified URL matching requirements
  5. list_queue
     - Added queue monitoring details
     - Clarified status checking capabilities
  6. run_queue
     - Added processing behavior details
     - Documented error handling
  7. clear_queue
     - Detailed queue clearing behavior
     - Added permanence warnings
     - Documented URL re-adding requirements

### Documentation Enhancement Phase 2
- Updated README.md
  - Removed add_documentation and queue_documentation tools
  - Updated tool descriptions to match handler-registry.ts
  - Added parameter format requirements
  - Enhanced usage guidance
```

--------------------------------------------------------------------------------
/src/tools/list-queue.ts:
--------------------------------------------------------------------------------

```typescript
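import { BaseTool } from './base-tool.js';
import { ToolDefinition, McpToolResponse } from '../types.js';
import fs from 'fs/promises';
import path from 'path';
import { fileURLToPath } from 'url';

// Resolve the queue file relative to this module (as clear-queue.ts and the
// handlers do) rather than process.cwd(), so every queue tool reads the same file
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
const QUEUE_FILE = path.join(__dirname, '..', '..', 'queue.txt');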

export class ListQueueTool extends BaseTool {
  constructor() {
    super();
  }

  get definition(): ToolDefinition {
    return {
      name: 'list_queue',
      description: 'List all URLs currently in the documentation processing queue',
      inputSchema: {
        type: 'object',
        properties: {},
        required: [],
      },
    };
  }

  async execute(_args: any): Promise<McpToolResponse> {
    try {
      // Check if queue file exists
      try {
        await fs.access(QUEUE_FILE);
      } catch {
        return {
          content: [
            {
              type: 'text',
              text: 'Queue is empty (queue file does not exist)',
            },
          ],
        };
      }

      // Read queue file
      const content = await fs.readFile(QUEUE_FILE, 'utf-8');
      const urls = content.split('\n').filter(url => url.trim() !== '');

      if (urls.length === 0) {
        return {
          content: [
            {
              type: 'text',
              text: 'Queue is empty',
            },
          ],
        };
      }

      return {
        content: [
          {
            type: 'text',
            text: `Queue contains ${urls.length} URLs:\n${urls.join('\n')}`,
          },
        ],
      };
    } catch (error) {
      return {
        content: [
          {
            type: 'text',
            text: `Failed to read queue: ${error}`,
          },
        ],
        isError: true,
      };
    }
  }
}
```

--------------------------------------------------------------------------------
/src/tools/clear-queue.ts:
--------------------------------------------------------------------------------

```typescript
import { BaseTool } from './base-tool.js';
import { ToolDefinition, McpToolResponse } from '../types.js';
import fs from 'fs/promises';
import path from 'path';
import { fileURLToPath } from 'url';

// Get current directory in ES modules
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
const QUEUE_FILE = path.join(__dirname, '..', '..', 'queue.txt');

export class ClearQueueTool extends BaseTool {
  get definition(): ToolDefinition {
    return {
      name: 'clear_queue',
      description: 'Clear all URLs from the queue',
      inputSchema: {
        type: 'object',
        properties: {},
        required: [],
      },
    };
  }

  async execute(_args: any): Promise<McpToolResponse> {
    try {
      // Check if queue file exists
      try {
        await fs.access(QUEUE_FILE);
      } catch {
        return {
          content: [
            {
              type: 'text',
              text: 'Queue is already empty (queue file does not exist)',
            },
          ],
        };
      }

      // Read current queue to get count of URLs being cleared
      const content = await fs.readFile(QUEUE_FILE, 'utf-8');
      const urlCount = content.split('\n').filter(url => url.trim() !== '').length;

      // Clear the queue by emptying the file
      await fs.writeFile(QUEUE_FILE, '');

      return {
        content: [
          {
            type: 'text',
            text: `Queue cleared successfully. Removed ${urlCount} URL${urlCount === 1 ? '' : 's'} from the queue.`,
          },
        ],
      };
    } catch (error) {
      return {
        content: [
          {
            type: 'text',
            text: `Failed to clear queue: ${error}`,
          },
        ],
        isError: true,
      };
    }
  }
}
```

--------------------------------------------------------------------------------
/src/handlers/search-documentation.ts:
--------------------------------------------------------------------------------

```typescript
import { McpError, ErrorCode } from '@modelcontextprotocol/sdk/types.js';
import { BaseHandler } from './base-handler.js';
import { McpToolResponse, isDocumentPayload } from '../types.js';

const COLLECTION_NAME = 'documentation';

export class SearchDocumentationHandler extends BaseHandler {
  async handle(args: any): Promise<McpToolResponse> {
    if (!args.query || typeof args.query !== 'string') {
      throw new McpError(ErrorCode.InvalidParams, 'Query is required');
    }

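    // Clamp to the 1-20 range documented in the README, defaulting to 5
    const limit = Math.min(Math.max(Math.floor(Number(args.limit)) || 5, 1), 20);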

    try {
      const queryEmbedding = await this.apiClient.getEmbeddings(args.query);
      
      const searchResults = await this.apiClient.qdrantClient.search(COLLECTION_NAME, {
        vector: queryEmbedding,
        limit,
        with_payload: true,
        with_vector: false, // Optimize network transfer by not retrieving vectors
        score_threshold: 0.7, // Only return relevant results
      });

      const formattedResults = searchResults.map(result => {
        if (!isDocumentPayload(result.payload)) {
          throw new Error('Invalid payload type');
        }
        return `[${result.payload.title}](${result.payload.url})\nScore: ${result.score.toFixed(3)}\nContent: ${result.payload.text}\n`;
      }).join('\n---\n');

      return {
        content: [
          {
            type: 'text',
            text: formattedResults || 'No results found matching the query.',
          },
        ],
      };
    } catch (error) {
      if (error instanceof Error) {
        if (error.message.includes('unauthorized')) {
          throw new McpError(
            ErrorCode.InvalidRequest,
            'Failed to authenticate with Qdrant cloud while searching'
          );
        } else if (error.message.includes('ECONNREFUSED') || error.message.includes('ETIMEDOUT')) {
          throw new McpError(
            ErrorCode.InternalError,
            'Connection to Qdrant cloud failed while searching'
          );
        }
      }
      return {
        content: [
          {
            type: 'text',
            text: `Search failed: ${error}`,
          },
        ],
        isError: true,
      };
    }
  }
}
```
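
The README documents `limit` for `search_documentation` as an optional number between 1 and 20 with a default of 5. A minimal sketch of enforcing that range in one place is shown below; `normalizeLimit` and the three constants are hypothetical names, not part of this repository.

```typescript
// Hypothetical helper: coerce a caller-supplied `limit` into the documented
// 1-20 range, falling back to the default of 5 when it is missing or invalid.
const DEFAULT_LIMIT = 5;
const MIN_LIMIT = 1;
const MAX_LIMIT = 20;

function normalizeLimit(raw: unknown): number {
  const n = typeof raw === 'number' ? Math.floor(raw) : NaN;
  if (!Number.isFinite(n)) return DEFAULT_LIMIT; // missing or non-numeric input
  return Math.min(Math.max(n, MIN_LIMIT), MAX_LIMIT);
}

console.log(normalizeLimit(undefined)); // 5
console.log(normalizeLimit(50));        // 20
console.log(normalizeLimit(-3));        // 1
```

Centralizing the check would keep the handler and tool variants of the search code from drifting apart.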

--------------------------------------------------------------------------------
/src/handlers/remove-documentation.ts:
--------------------------------------------------------------------------------

```typescript
import { McpError, ErrorCode } from '@modelcontextprotocol/sdk/types.js';
import { BaseHandler } from './base-handler.js';
import { McpToolResponse } from '../types.js';

const COLLECTION_NAME = 'documentation';

export class RemoveDocumentationHandler extends BaseHandler {
  async handle(args: any): Promise<McpToolResponse> {
    if (!args.urls || !Array.isArray(args.urls) || args.urls.length === 0) {
      throw new McpError(ErrorCode.InvalidParams, 'urls must be a non-empty array');
    }

    if (!args.urls.every((url: string) => typeof url === 'string')) {
      throw new McpError(ErrorCode.InvalidParams, 'All URLs must be strings');
    }

    try {
      // Delete using filter to match any of the provided URLs
      const result = await this.apiClient.qdrantClient.delete(COLLECTION_NAME, {
        filter: {
          should: args.urls.map((url: string) => ({
            key: 'url',
            match: { value: url }
          }))
        },
        wait: true // Ensure deletion is complete before responding
      });

      if (!['acknowledged', 'completed'].includes(result.status)) {
        throw new Error('Delete operation failed');
      }

      return {
        content: [
          {
            type: 'text',
            text: `Successfully removed documentation from ${args.urls.length} source${args.urls.length > 1 ? 's' : ''}: ${args.urls.join(', ')}`,
          },
        ],
      };
    } catch (error) {
      if (error instanceof Error) {
        if (error.message.includes('unauthorized')) {
          throw new McpError(
            ErrorCode.InvalidRequest,
            'Failed to authenticate with Qdrant cloud while removing documentation'
          );
        } else if (error.message.includes('ECONNREFUSED') || error.message.includes('ETIMEDOUT')) {
          throw new McpError(
            ErrorCode.InternalError,
            'Connection to Qdrant cloud failed while removing documentation'
          );
        }
      }
      return {
        content: [
          {
            type: 'text',
            text: `Failed to remove documentation: ${error}`,
          },
        ],
        isError: true,
      };
    }
  }
}
```
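
The delete call above matches points whose `url` payload equals any of the supplied URLs, because in a Qdrant filter the `should` clause requires at least one of its conditions to match (a logical OR). The sketch below isolates that filter construction; `buildUrlFilter` and the two interfaces are hypothetical names introduced for illustration.

```typescript
// Hypothetical types describing the filter shape built inline by the handler.
interface UrlCondition {
  key: string;
  match: { value: string };
}

interface UrlFilter {
  should: UrlCondition[];
}

// Build one exact-match condition per URL; a point is selected when at
// least one condition matches its `url` payload field.
function buildUrlFilter(urls: string[]): UrlFilter {
  return {
    should: urls.map((url) => ({ key: 'url', match: { value: url } })),
  };
}

console.log(
  JSON.stringify(
    buildUrlFilter(['https://example.com/a', 'https://example.com/b']),
    null,
    2,
  ),
);
```

Since each condition is an exact match, a URL that differs only by a trailing slash or query string will not be removed, which is why the README asks for the URL exactly as it was added.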

--------------------------------------------------------------------------------
/src/tools/list-sources.ts:
--------------------------------------------------------------------------------

```typescript
import { BaseTool } from './base-tool.js';
import { ToolDefinition, McpToolResponse, isDocumentPayload } from '../types.js';
import { ApiClient } from '../api-client.js';
import { ErrorCode, McpError } from '@modelcontextprotocol/sdk/types.js';

const COLLECTION_NAME = 'documentation';

export class ListSourcesTool extends BaseTool {
  private apiClient: ApiClient;

  constructor(apiClient: ApiClient) {
    super();
    this.apiClient = apiClient;
  }

  get definition(): ToolDefinition {
    return {
      name: 'list_sources',
      description: 'List all documentation sources currently stored',
      inputSchema: {
        type: 'object',
        properties: {},
        required: [],
      },
    };
  }

  async execute(args: any): Promise<McpToolResponse> {
    try {
      // Use pagination for better performance with large datasets
      const pageSize = 100;
      let offset: string | null = null;
      const sources = new Set<string>();
      
      while (true) {
        const scroll = await this.apiClient.qdrantClient.scroll(COLLECTION_NAME, {
          with_payload: true,
          with_vector: false, // Optimize network transfer
          limit: pageSize,
          offset,
        });

        if (scroll.points.length === 0) break;
        
        for (const point of scroll.points) {
          if (isDocumentPayload(point.payload)) {
            sources.add(`${point.payload.title} (${point.payload.url})`);
          }
        }

        if (scroll.points.length < pageSize) break;
        offset = scroll.points[scroll.points.length - 1].id as string;
      }

      return {
        content: [
          {
            type: 'text',
            text: Array.from(sources).join('\n') || 'No documentation sources found in the cloud collection.',
          },
        ],
      };
    } catch (error) {
      if (error instanceof Error) {
        if (error.message.includes('unauthorized')) {
          throw new McpError(
            ErrorCode.InvalidRequest,
            'Failed to authenticate with Qdrant cloud while listing sources'
          );
        } else if (error.message.includes('ECONNREFUSED') || error.message.includes('ETIMEDOUT')) {
          throw new McpError(
            ErrorCode.InternalError,
            'Connection to Qdrant cloud failed while listing sources'
          );
        }
      }
      return {
        content: [
          {
            type: 'text',
            text: `Failed to list sources: ${error}`,
          },
        ],
        isError: true,
      };
    }
  }
}
```

--------------------------------------------------------------------------------
/src/handlers/run-queue.ts:
--------------------------------------------------------------------------------

```typescript
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { ApiClient } from '../api-client.js';
import { BaseHandler } from './base-handler.js';
import { McpToolResponse } from '../types.js';
import { AddDocumentationHandler } from './add-documentation.js';
import fs from 'fs/promises';
import path from 'path';
import { fileURLToPath } from 'url';

// Get current directory in ES modules
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
const QUEUE_FILE = path.join(__dirname, '..', '..', 'queue.txt');

export class RunQueueHandler extends BaseHandler {
  private addDocHandler: AddDocumentationHandler;

  constructor(server: Server, apiClient: ApiClient) {
    super(server, apiClient);
    this.addDocHandler = new AddDocumentationHandler(server, apiClient);
  }

  async handle(_args: any): Promise<McpToolResponse> {
    try {
      // Check if queue file exists
      try {
        await fs.access(QUEUE_FILE);
      } catch {
        return {
          content: [
            {
              type: 'text',
              text: 'Queue is empty (queue file does not exist)',
            },
          ],
        };
      }

      let processedCount = 0;
      let failedCount = 0;
      const failedUrls: string[] = [];

      while (true) {
        // Read current queue
        const content = await fs.readFile(QUEUE_FILE, 'utf-8');
        const urls = content.split('\n').filter(url => url.trim() !== '');

        if (urls.length === 0) {
          break; // Queue is empty
        }

        const currentUrl = urls[0]; // Get first URL
        
        try {
          // Process the URL using add_documentation handler
          await this.addDocHandler.handle({ url: currentUrl });
          processedCount++;
        } catch (error) {
          failedCount++;
          failedUrls.push(currentUrl);
          console.error(`Failed to process URL ${currentUrl}:`, error);
        }

        // Remove the processed URL from queue
        const remainingUrls = urls.slice(1);
        await fs.writeFile(QUEUE_FILE, remainingUrls.join('\n') + (remainingUrls.length > 0 ? '\n' : ''));
      }

      let resultText = `Queue processing complete.\nProcessed: ${processedCount} URLs\nFailed: ${failedCount} URLs`;
      if (failedUrls.length > 0) {
        resultText += `\n\nFailed URLs:\n${failedUrls.join('\n')}`;
      }

      return {
        content: [
          {
            type: 'text',
            text: resultText,
          },
        ],
      };
    } catch (error) {
      return {
        content: [
          {
            type: 'text',
            text: `Failed to process queue: ${error}`,
          },
        ],
        isError: true,
      };
    }
  }
}
```
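The loop in `RunQueueHandler` treats `queue.txt` as a first-in, first-out list: read the file, take the first non-empty line, process it, then write back the remainder. A minimal sketch of that "take the head, keep the rest" step as a pure function is shown below. `takeNextUrl` is a hypothetical helper name used only for illustration; it is not part of the repository.

```typescript
// Hypothetical helper: split queue file contents into the next URL and the remaining body.
function takeNextUrl(content: string): { next: string | undefined; rest: string } {
  // Same filtering as the handler: drop blank and whitespace-only lines.
  const urls = content.split('\n').filter(url => url.trim() !== '');
  const remaining = urls.slice(1);
  return {
    next: urls[0],
    // Keep a trailing newline so that later appends start on a fresh line.
    rest: remaining.join('\n') + (remaining.length > 0 ? '\n' : ''),
  };
}

const queueStep = takeNextUrl('https://example.com/a\n\nhttps://example.com/b\n');
console.log(queueStep.next);
console.log(JSON.stringify(queueStep.rest));
```

Because the file is re-read on every iteration, a crash mid-run leaves the unprocessed URLs on disk, at the cost of one read and one write per URL.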

--------------------------------------------------------------------------------
/src/tools/search-documentation.ts:
--------------------------------------------------------------------------------

```typescript
import { BaseTool } from './base-tool.js';
import { ToolDefinition, McpToolResponse, isDocumentPayload } from '../types.js';
import { ApiClient } from '../api-client.js';
import { ErrorCode, McpError } from '@modelcontextprotocol/sdk/types.js';

const COLLECTION_NAME = 'documentation';

export class SearchDocumentationTool extends BaseTool {
  private apiClient: ApiClient;

  constructor(apiClient: ApiClient) {
    super();
    this.apiClient = apiClient;
  }

  get definition(): ToolDefinition {
    return {
      name: 'search_documentation',
      description: 'Search through stored documentation',
      inputSchema: {
        type: 'object',
        properties: {
          query: {
            type: 'string',
            description: 'Search query',
          },
          limit: {
            type: 'number',
            description: 'Maximum number of results to return',
            default: 5,
          },
        },
        required: ['query'],
      },
    };
  }

  async execute(args: any): Promise<McpToolResponse> {
    if (!args.query || typeof args.query !== 'string') {
      throw new McpError(ErrorCode.InvalidParams, 'Query is required');
    }

    const limit = args.limit || 5;

    try {
      const queryEmbedding = await this.apiClient.getEmbeddings(args.query);
      
      const searchResults = await this.apiClient.qdrantClient.search(COLLECTION_NAME, {
        vector: queryEmbedding,
        limit,
        with_payload: true,
        with_vector: false, // Optimize network transfer by not retrieving vectors
        score_threshold: 0.7, // Only return relevant results
      });

      const formattedResults = searchResults.map(result => {
        if (!isDocumentPayload(result.payload)) {
          throw new Error('Invalid payload type');
        }
        return `[${result.payload.title}](${result.payload.url})\nScore: ${result.score.toFixed(3)}\nContent: ${result.payload.text}\n`;
      }).join('\n---\n');

      return {
        content: [
          {
            type: 'text',
            text: formattedResults || 'No results found matching the query.',
          },
        ],
      };
    } catch (error) {
      if (error instanceof Error) {
        if (error.message.includes('unauthorized')) {
          throw new McpError(
            ErrorCode.InvalidRequest,
            'Failed to authenticate with Qdrant cloud while searching'
          );
        } else if (error.message.includes('ECONNREFUSED') || error.message.includes('ETIMEDOUT')) {
          throw new McpError(
            ErrorCode.InternalError,
            'Connection to Qdrant cloud failed while searching'
          );
        }
      }
      return {
        content: [
          {
            type: 'text',
            text: `Search failed: ${error}`,
          },
        ],
        isError: true,
      };
    }
  }
}
```
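The result formatting in `SearchDocumentationTool.execute` can be exercised without a Qdrant instance. The sketch below reproduces the same string layout for an in-memory list of hits. The `Hit` interface and `formatHits` name are assumptions made for this example; the real payload type is defined in `src/types.ts`, which is not shown here.

```typescript
// Assumed shape of a search hit, for illustration only.
interface Hit {
  score: number;
  payload: { title: string; url: string; text: string };
}

// Render hits as Markdown links with a three-decimal score, separated by '---'.
function formatHits(hits: Hit[]): string {
  const body = hits
    .map(hit =>
      `[${hit.payload.title}](${hit.payload.url})\nScore: ${hit.score.toFixed(3)}\nContent: ${hit.payload.text}\n`
    )
    .join('\n---\n');
  // An empty result list produces an empty string, so fall back to a message.
  return body || 'No results found matching the query.';
}

const sampleHits: Hit[] = [
  { score: 0.81234, payload: { title: 'Intro', url: 'https://example.com/intro', text: 'Hello' } },
];
console.log(formatHits(sampleHits));
```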

--------------------------------------------------------------------------------
/src/tools/remove-documentation.ts:
--------------------------------------------------------------------------------

```typescript
import { BaseTool } from './base-tool.js';
import { ToolDefinition, McpToolResponse } from '../types.js';
import { ApiClient } from '../api-client.js';
import { ErrorCode, McpError } from '@modelcontextprotocol/sdk/types.js';

const COLLECTION_NAME = 'documentation';

export class RemoveDocumentationTool extends BaseTool {
  private apiClient: ApiClient;

  constructor(apiClient: ApiClient) {
    super();
    this.apiClient = apiClient;
  }

  get definition(): ToolDefinition {
    return {
      name: 'remove_documentation',
      description: 'Remove one or more documentation sources by their URLs',
      inputSchema: {
        type: 'object',
        properties: {
          urls: {
            type: 'array',
            items: {
              type: 'string',
              description: 'URL of a documentation source to remove'
            },
            description: 'Array of URLs to remove. Can be a single URL or multiple URLs.',
            minItems: 1
          }
        },
        required: ['urls'],
      },
    };
  }

  async execute(args: { urls: string[] }): Promise<McpToolResponse> {
    if (!Array.isArray(args.urls) || args.urls.length === 0) {
      throw new McpError(ErrorCode.InvalidParams, 'At least one URL is required');
    }

    if (!args.urls.every(url => typeof url === 'string')) {
      throw new McpError(ErrorCode.InvalidParams, 'All URLs must be strings');
    }

    try {
      // Delete using filter to match any of the provided URLs
      const result = await this.apiClient.qdrantClient.delete(COLLECTION_NAME, {
        filter: {
          should: args.urls.map(url => ({
            key: 'url',
            match: { value: url }
          }))
        },
        wait: true
      });

      if (!['acknowledged', 'completed'].includes(result.status)) {
        throw new Error('Delete operation failed');
      }

      return {
        content: [
          {
            type: 'text',
            text: `Successfully removed documentation from ${args.urls.length} source${args.urls.length > 1 ? 's' : ''}: ${args.urls.join(', ')}`,
          },
        ],
      };
    } catch (error) {
      if (error instanceof Error) {
        if (error.message.includes('unauthorized')) {
          throw new McpError(
            ErrorCode.InvalidRequest,
            'Failed to authenticate with Qdrant cloud while removing documentation'
          );
        } else if (error.message.includes('ECONNREFUSED') || error.message.includes('ETIMEDOUT')) {
          throw new McpError(
            ErrorCode.InternalError,
            'Connection to Qdrant cloud failed while removing documentation'
          );
        }
      }
      return {
        content: [
          {
            type: 'text',
            text: `Failed to remove documentation: ${error}`,
          },
        ],
        isError: true,
      };
    }
  }
}
```
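The delete call above relies on a Qdrant filter whose `should` clauses are combined with OR, so a point is removed when its `url` payload field matches any one of the supplied URLs. A small sketch of how that filter object is assembled follows; `buildUrlFilter` and `UrlCondition` are hypothetical names introduced for this example.

```typescript
// One match condition on the 'url' payload field.
interface UrlCondition {
  key: 'url';
  match: { value: string };
}

// Hypothetical helper: build an OR filter with one condition per URL.
function buildUrlFilter(urls: string[]): { should: UrlCondition[] } {
  return {
    should: urls.map(url => ({ key: 'url' as const, match: { value: url } })),
  };
}

const urlFilter = buildUrlFilter(['https://example.com/a', 'https://example.com/b']);
console.log(JSON.stringify(urlFilter, null, 2));
```

The match is exact, so a stored URL that differs only by a trailing slash would not be matched by this filter.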

--------------------------------------------------------------------------------
/src/handlers/extract-urls.ts:
--------------------------------------------------------------------------------

```typescript
import { McpError, ErrorCode } from '@modelcontextprotocol/sdk/types.js';
import { BaseHandler } from './base-handler.js';
import { McpToolResponse } from '../types.js';
import * as cheerio from 'cheerio';
import fs from 'fs/promises';
import path from 'path';
import { fileURLToPath } from 'url';

// Get current directory in ES modules
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
const QUEUE_FILE = path.join(__dirname, '..', '..', 'queue.txt');

export class ExtractUrlsHandler extends BaseHandler {
  async handle(args: any): Promise<McpToolResponse> {
    if (!args.url || typeof args.url !== 'string') {
      throw new McpError(ErrorCode.InvalidParams, 'URL is required');
    }

    await this.apiClient.initBrowser();
    const page = await this.apiClient.browser.newPage();

    try {
      const baseUrl = new URL(args.url);
      const basePath = baseUrl.pathname.split('/').slice(0, 3).join('/'); // First two path segments (e.g., /3/library for /3/library/index.html)

      await page.goto(args.url, { waitUntil: 'networkidle' });
      const content = await page.content();
      const $ = cheerio.load(content);
      const urls = new Set<string>();

      $('a[href]').each((_, element) => {
        const href = $(element).attr('href');
        if (href) {
          try {
            const url = new URL(href, args.url);
            // Only include URLs from the same documentation section
            if (url.hostname === baseUrl.hostname && 
                url.pathname.startsWith(basePath) && 
                !url.hash && 
                !url.href.endsWith('#')) {
              urls.add(url.href);
            }
          } catch (e) {
            // Ignore invalid URLs
          }
        }
      });

      const urlArray = Array.from(urls);

      if (args.add_to_queue) {
        try {
          // Ensure queue file exists
          try {
            await fs.access(QUEUE_FILE);
          } catch {
            await fs.writeFile(QUEUE_FILE, '');
          }

          // Append URLs to queue
          const urlsToAdd = urlArray.join('\n') + (urlArray.length > 0 ? '\n' : '');
          await fs.appendFile(QUEUE_FILE, urlsToAdd);

          return {
            content: [
              {
                type: 'text',
                text: `Successfully added ${urlArray.length} URLs to the queue`,
              },
            ],
          };
        } catch (error) {
          return {
            content: [
              {
                type: 'text',
                text: `Failed to add URLs to queue: ${error}`,
              },
            ],
            isError: true,
          };
        }
      }

      return {
        content: [
          {
            type: 'text',
            text: urlArray.join('\n') || 'No URLs found on this page.',
          },
        ],
      };
    } catch (error) {
      return {
        content: [
          {
            type: 'text',
            text: `Failed to extract URLs: ${error}`,
          },
        ],
        isError: true,
      };
    } finally {
      await page.close();
    }
  }
}
```
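The section filter in `ExtractUrlsHandler` keeps a link only when it shares the page's hostname, starts with the page's first two path segments, and carries no fragment. The sketch below isolates that predicate so it can be tried on plain strings. `sectionPrefix` and `inSameSection` are hypothetical names, and unlike the handler this version does not catch invalid URLs.

```typescript
// First two path segments of the page URL, e.g. '/3/library'.
function sectionPrefix(pageUrl: string): string {
  return new URL(pageUrl).pathname.split('/').slice(0, 3).join('/');
}

// True when href (resolved against pageUrl) stays in the same documentation section.
function inSameSection(href: string, pageUrl: string): boolean {
  const base = new URL(pageUrl);
  const url = new URL(href, pageUrl);
  return (
    url.hostname === base.hostname &&
    url.pathname.startsWith(sectionPrefix(pageUrl)) &&
    !url.hash &&
    !url.href.endsWith('#')
  );
}

const docsPage = 'https://docs.python.org/3/library/index.html';
console.log(sectionPrefix(docsPage));
console.log(inSameSection('os.html', docsPage));
console.log(inSameSection('/3/tutorial/index.html', docsPage));
```

Note that the prefix test is a plain string comparison, so `/3/library-extras/` would also pass a `/3/library` prefix.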

--------------------------------------------------------------------------------
/src/api-client.ts:
--------------------------------------------------------------------------------

```typescript
import { QdrantClient } from '@qdrant/js-client-rest';
import OpenAI from 'openai';
import { chromium } from 'playwright';
import { McpError, ErrorCode } from '@modelcontextprotocol/sdk/types.js';

// Environment variables for configuration
const OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const QDRANT_URL = process.env.QDRANT_URL;
const QDRANT_API_KEY = process.env.QDRANT_API_KEY;

if (!QDRANT_URL) {
  throw new Error('QDRANT_URL environment variable is required for cloud storage');
}

if (!QDRANT_API_KEY) {
  throw new Error('QDRANT_API_KEY environment variable is required for cloud storage');
}

export class ApiClient {
  qdrantClient: QdrantClient;
  openaiClient?: OpenAI;
  browser: any;

  constructor() {
    // Initialize Qdrant client with cloud configuration
    this.qdrantClient = new QdrantClient({
      url: QDRANT_URL,
      apiKey: QDRANT_API_KEY,
    });

    // Initialize OpenAI client if API key is provided
    if (OPENAI_API_KEY) {
      this.openaiClient = new OpenAI({
        apiKey: OPENAI_API_KEY,
      });
    }
  }

  async initBrowser() {
    if (!this.browser) {
      this.browser = await chromium.launch();
    }
  }

  async cleanup() {
    if (this.browser) {
      await this.browser.close();
    }
  }

  async getEmbeddings(text: string): Promise<number[]> {
    if (!this.openaiClient) {
      throw new McpError(
        ErrorCode.InvalidRequest,
        'OpenAI API key not configured'
      );
    }

    try {
      const response = await this.openaiClient.embeddings.create({
        model: 'text-embedding-ada-002',
        input: text,
      });
      return response.data[0].embedding;
    } catch (error) {
      throw new McpError(
        ErrorCode.InternalError,
        `Failed to generate embeddings: ${error}`
      );
    }
  }

  async initCollection(COLLECTION_NAME: string) {
    try {
      const collections = await this.qdrantClient.getCollections();
      const exists = collections.collections.some(c => c.name === COLLECTION_NAME);

      if (!exists) {
        await this.qdrantClient.createCollection(COLLECTION_NAME, {
          vectors: {
            size: 1536, // OpenAI ada-002 embedding size
            distance: 'Cosine',
          },
          // Add optimized settings for cloud deployment
          optimizers_config: {
            default_segment_number: 2,
            memmap_threshold: 20000,
          },
          replication_factor: 2,
        });
      }
    } catch (error) {
      if (error instanceof Error) {
        if (error.message.includes('unauthorized')) {
          throw new McpError(
            ErrorCode.InvalidRequest,
            'Failed to authenticate with Qdrant cloud. Please check your API key.'
          );
        } else if (error.message.includes('ECONNREFUSED') || error.message.includes('ETIMEDOUT')) {
          throw new McpError(
            ErrorCode.InternalError,
            'Failed to connect to Qdrant cloud. Please check your QDRANT_URL.'
          );
        }
      }
      throw new McpError(
        ErrorCode.InternalError,
        `Failed to initialize Qdrant cloud collection: ${error}`
      );
    }
  }
}
```
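The same message-based error checks (`unauthorized`, `ECONNREFUSED`, `ETIMEDOUT`) recur in `initCollection` and in several tools and handlers. One way to picture that shared logic is a small classifier, sketched below. `classifyQdrantError` and the `QdrantFailure` type are hypothetical and do not exist in the repository; the sketch returns a label instead of throwing `McpError` so that it stays self-contained.

```typescript
type QdrantFailure = 'auth' | 'connection' | 'other';

// Hypothetical helper: map an unknown error to a coarse failure category
// using the same substring checks as the repository code.
function classifyQdrantError(error: unknown): QdrantFailure {
  if (error instanceof Error) {
    if (error.message.includes('unauthorized')) {
      return 'auth';
    }
    if (error.message.includes('ECONNREFUSED') || error.message.includes('ETIMEDOUT')) {
      return 'connection';
    }
  }
  return 'other';
}

console.log(classifyQdrantError(new Error('connect ECONNREFUSED 127.0.0.1:6333')));
```

`String.prototype.includes` is case-sensitive, so a message containing `Unauthorized` with a capital letter would fall through to `'other'`.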

--------------------------------------------------------------------------------
/src/tools/run-queue.ts:
--------------------------------------------------------------------------------

```typescript
import { BaseTool } from './base-tool.js';
import { ToolDefinition, McpToolResponse } from '../types.js';
import { ErrorCode, McpError } from '@modelcontextprotocol/sdk/types.js';
import fs from 'fs/promises';
import path from 'path';
import { fileURLToPath } from 'url';
import { ApiClient } from '../api-client.js';
import { AddDocumentationHandler } from '../handlers/add-documentation.js';
import { Server } from '@modelcontextprotocol/sdk/server/index.js';

// Get current directory in ES modules
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
const QUEUE_FILE = path.join(__dirname, '..', '..', 'queue.txt');

export class RunQueueTool extends BaseTool {
  private apiClient: ApiClient;
  private addDocHandler: AddDocumentationHandler;

  constructor(apiClient: ApiClient) {
    super();
    this.apiClient = apiClient;
    // Create a temporary server instance just for the handler
    const tempServer = new Server(
      { name: 'temp', version: '0.0.0' },
      { capabilities: { tools: {} } }
    );
    this.addDocHandler = new AddDocumentationHandler(tempServer, apiClient);
  }

  get definition(): ToolDefinition {
    return {
      name: 'run_queue',
      description: 'Process URLs from the queue one at a time until complete',
      inputSchema: {
        type: 'object',
        properties: {},
        required: [],
      },
    };
  }

  async execute(_args: any): Promise<McpToolResponse> {
    try {
      // Check if queue file exists
      try {
        await fs.access(QUEUE_FILE);
      } catch {
        return {
          content: [
            {
              type: 'text',
              text: 'Queue is empty (queue file does not exist)',
            },
          ],
        };
      }

      let processedCount = 0;
      let failedCount = 0;
      const failedUrls: string[] = [];

      while (true) {
        // Read current queue
        const content = await fs.readFile(QUEUE_FILE, 'utf-8');
        const urls = content.split('\n').filter(url => url.trim() !== '');

        if (urls.length === 0) {
          break; // Queue is empty
        }

        const currentUrl = urls[0]; // Get first URL
        
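        try {
          // Process the URL using the handler
          const result = await this.addDocHandler.handle({ url: currentUrl });
          if (result.isError) {
            // handle() reports non-McpError failures via isError instead of throwing
            failedCount++;
            failedUrls.push(currentUrl);
          } else {
            processedCount++;
          }
        } catch (error) {
          failedCount++;
          failedUrls.push(currentUrl);
          console.error(`Failed to process URL ${currentUrl}:`, error);
        }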

        // Remove the processed URL from queue
        const remainingUrls = urls.slice(1);
        await fs.writeFile(QUEUE_FILE, remainingUrls.join('\n') + (remainingUrls.length > 0 ? '\n' : ''));
      }

      let resultText = `Queue processing complete.\nProcessed: ${processedCount} URLs\nFailed: ${failedCount} URLs`;
      if (failedUrls.length > 0) {
        resultText += `\n\nFailed URLs:\n${failedUrls.join('\n')}`;
      }

      return {
        content: [
          {
            type: 'text',
            text: resultText,
          },
        ],
      };
    } catch (error) {
      return {
        content: [
          {
            type: 'text',
            text: `Failed to process queue: ${error}`,
          },
        ],
        isError: true,
      };
    }
  }
}

```

--------------------------------------------------------------------------------
/src/tools/extract-urls.ts:
--------------------------------------------------------------------------------

```typescript
import { BaseTool } from './base-tool.js';
import { ToolDefinition, McpToolResponse } from '../types.js';
import { ApiClient } from '../api-client.js';
import { ErrorCode, McpError } from '@modelcontextprotocol/sdk/types.js';
import * as cheerio from 'cheerio';
import fs from 'fs/promises';
import path from 'path';
import { fileURLToPath } from 'url';

// Get current directory in ES modules
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
const QUEUE_FILE = path.join(__dirname, '..', '..', 'queue.txt');

export class ExtractUrlsTool extends BaseTool {
  private apiClient: ApiClient;

  constructor(apiClient: ApiClient) {
    super();
    this.apiClient = apiClient;
  }

  get definition(): ToolDefinition {
    return {
      name: 'extract_urls',
      description: 'Extract all URLs from a given web page',
      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'URL of the page to extract URLs from',
          },
          add_to_queue: {
            type: 'boolean',
            description: 'If true, automatically add extracted URLs to the queue',
            default: false,
          },
        },
        required: ['url'],
      },
    };
  }

  async execute(args: any): Promise<McpToolResponse> {
    if (!args.url || typeof args.url !== 'string') {
      throw new McpError(ErrorCode.InvalidParams, 'URL is required');
    }

    await this.apiClient.initBrowser();
    const page = await this.apiClient.browser.newPage();

    try {
      await page.goto(args.url, { waitUntil: 'networkidle' });
      const content = await page.content();
      const $ = cheerio.load(content);
      const urls = new Set<string>();

      $('a[href]').each((_, element) => {
        const href = $(element).attr('href');
        if (href) {
          try {
            const url = new URL(href, args.url);
            // Only include URLs from the same domain to avoid external links
            if (url.origin === new URL(args.url).origin && !url.hash && !url.href.endsWith('#')) {
              urls.add(url.href);
            }
          } catch (e) {
            // Ignore invalid URLs
          }
        }
      });

      const urlArray = Array.from(urls);

      if (args.add_to_queue) {
        try {
          // Ensure queue file exists
          try {
            await fs.access(QUEUE_FILE);
          } catch {
            await fs.writeFile(QUEUE_FILE, '');
          }

          // Append URLs to queue
          const urlsToAdd = urlArray.join('\n') + (urlArray.length > 0 ? '\n' : '');
          await fs.appendFile(QUEUE_FILE, urlsToAdd);

          return {
            content: [
              {
                type: 'text',
                text: `Successfully added ${urlArray.length} URLs to the queue`,
              },
            ],
          };
        } catch (error) {
          return {
            content: [
              {
                type: 'text',
                text: `Failed to add URLs to queue: ${error}`,
              },
            ],
            isError: true,
          };
        }
      }

      return {
        content: [
          {
            type: 'text',
            text: urlArray.join('\n') || 'No URLs found on this page.',
          },
        ],
      };
    } catch (error) {
      return {
        content: [
          {
            type: 'text',
            text: `Failed to extract URLs: ${error}`,
          },
        ],
        isError: true,
      };
    } finally {
      await page.close();
    }
  }
}
```
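`ExtractUrlsTool` compares `origin`, whereas `ExtractUrlsHandler` compares `hostname`. The two are not equivalent: an origin includes the scheme and port, a hostname does not. The sketch below shows the difference using the standard `URL` class; the function names are hypothetical and exist only for this comparison.

```typescript
// Compare scheme + host + port.
function sameOrigin(a: string, b: string): boolean {
  return new URL(a).origin === new URL(b).origin;
}

// Compare the host name only, ignoring scheme and port.
function sameHostname(a: string, b: string): boolean {
  return new URL(a).hostname === new URL(b).hostname;
}

const httpsPage = 'https://example.com/docs/';
const httpLink = 'http://example.com/docs/page.html';
console.log(sameOrigin(httpsPage, httpLink));
console.log(sameHostname(httpsPage, httpLink));
```

In practice this means the tool skips `http://` links found on an `https://` page, while the handler's hostname check would accept them.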

--------------------------------------------------------------------------------
/src/handlers/add-documentation.ts:
--------------------------------------------------------------------------------

```typescript
import { McpError, ErrorCode } from '@modelcontextprotocol/sdk/types.js';
import { BaseHandler } from './base-handler.js';
import { DocumentChunk, McpToolResponse } from '../types.js';
import * as cheerio from 'cheerio';
import crypto from 'crypto';

const COLLECTION_NAME = 'documentation';

export class AddDocumentationHandler extends BaseHandler {
  async handle(args: any): Promise<McpToolResponse> {
    if (!args.url || typeof args.url !== 'string') {
      throw new McpError(ErrorCode.InvalidParams, 'URL is required');
    }

    try {
      const chunks = await this.fetchAndProcessUrl(args.url);
      
      // Batch process chunks for better performance
      const batchSize = 100;
      for (let i = 0; i < chunks.length; i += batchSize) {
        const batch = chunks.slice(i, i + batchSize);
        const points = await Promise.all(
          batch.map(async (chunk) => {
            const embedding = await this.apiClient.getEmbeddings(chunk.text);
            return {
              id: this.generatePointId(),
              vector: embedding,
              payload: {
                ...chunk,
                _type: 'DocumentChunk' as const,
              } as Record<string, unknown>,
            };
          })
        );

        try {
          await this.apiClient.qdrantClient.upsert(COLLECTION_NAME, {
            wait: true,
            points,
          });
        } catch (error) {
          if (error instanceof Error) {
            if (error.message.includes('unauthorized')) {
              throw new McpError(
                ErrorCode.InvalidRequest,
                'Failed to authenticate with Qdrant cloud while adding documents'
              );
            } else if (error.message.includes('ECONNREFUSED') || error.message.includes('ETIMEDOUT')) {
              throw new McpError(
                ErrorCode.InternalError,
                'Connection to Qdrant cloud failed while adding documents'
              );
            }
          }
          throw error;
        }
      }

      return {
        content: [
          {
            type: 'text',
            text: `Successfully added documentation from ${args.url} (${chunks.length} chunks processed in ${Math.ceil(chunks.length / batchSize)} batches)`,
          },
        ],
      };
    } catch (error) {
      if (error instanceof McpError) {
        throw error;
      }
      return {
        content: [
          {
            type: 'text',
            text: `Failed to add documentation: ${error}`,
          },
        ],
        isError: true,
      };
    }
  }

  private async fetchAndProcessUrl(url: string): Promise<DocumentChunk[]> {
    await this.apiClient.initBrowser();
    const page = await this.apiClient.browser.newPage();
    
    try {
      await page.goto(url, { waitUntil: 'networkidle' });
      const content = await page.content();
      const $ = cheerio.load(content);
      
      // Remove script, style, and noscript elements
      $('script').remove();
      $('style').remove();
      $('noscript').remove();
      
      // Extract main content
      const title = $('title').text() || url;
      const mainContent = $('main, article, .content, .documentation, body').text();
      
      // Split content into chunks
      const chunks = this.chunkText(mainContent, 1000);
      
      return chunks.map(chunk => ({
        text: chunk,
        url,
        title,
        timestamp: new Date().toISOString(),
      }));
    } catch (error) {
      throw new McpError(
        ErrorCode.InternalError,
        `Failed to fetch URL ${url}: ${error}`
      );
    } finally {
      await page.close();
    }
  }

  private chunkText(text: string, maxChunkSize: number): string[] {
    const words = text.split(/\s+/);
    const chunks: string[] = [];
    let currentChunk: string[] = [];
    
    for (const word of words) {
      currentChunk.push(word);
      const currentLength = currentChunk.join(' ').length;
      
      if (currentLength >= maxChunkSize) {
        chunks.push(currentChunk.join(' '));
        currentChunk = [];
      }
    }
    
    if (currentChunk.length > 0) {
      chunks.push(currentChunk.join(' '));
    }
    
    return chunks;
  }

  private generatePointId(): string {
    return crypto.randomBytes(16).toString('hex');
  }
}
```
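The word-based chunking in `chunkText` closes a chunk only after the word that reaches the limit has been added, so a chunk can be longer than `maxChunkSize`. The standalone sketch below shows that behavior. It is a variant rather than a copy: it tracks the running length instead of re-joining the words on every iteration, and it drops empty tokens produced by leading or trailing whitespace. `chunkWords` is a hypothetical name.

```typescript
// Split text into chunks of whole words; a chunk is closed once its joined
// length reaches maxChunkSize, so it may overshoot by part of one word.
function chunkWords(text: string, maxChunkSize: number): string[] {
  const words = text.split(/\s+/).filter(word => word.length > 0);
  const chunks: string[] = [];
  let current: string[] = [];
  let length = 0; // Length of current.join(' ')

  for (const word of words) {
    // Account for the separating space when the chunk already has words.
    length += word.length + (current.length > 0 ? 1 : 0);
    current.push(word);
    if (length >= maxChunkSize) {
      chunks.push(current.join(' '));
      current = [];
      length = 0;
    }
  }

  if (current.length > 0) {
    chunks.push(current.join(' '));
  }
  return chunks;
}

console.log(chunkWords('aaa bbb ccc ddd', 5));
```

With a limit of 5, each chunk here is 7 characters long, which illustrates that the limit acts as a threshold rather than a hard cap.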

--------------------------------------------------------------------------------
/src/handlers/list-sources.ts:
--------------------------------------------------------------------------------

```typescript
import { McpError, ErrorCode } from '@modelcontextprotocol/sdk/types.js';
import { BaseHandler } from './base-handler.js';
import { McpToolResponse, isDocumentPayload } from '../types.js';

const COLLECTION_NAME = 'documentation';

interface Source {
  title: string;
  url: string;
}

interface GroupedSources {
  [domain: string]: {
    [subdomain: string]: Source[];
  };
}

export class ListSourcesHandler extends BaseHandler {
  private groupSourcesByDomainAndSubdomain(sources: Source[]): GroupedSources {
    const grouped: GroupedSources = {};

    for (const source of sources) {
      try {
        const url = new URL(source.url);
        const domain = url.hostname;
        const pathParts = url.pathname.split('/').filter(p => p);
        const subdomain = pathParts[0] || '/';

        if (!grouped[domain]) {
          grouped[domain] = {};
        }
        if (!grouped[domain][subdomain]) {
          grouped[domain][subdomain] = [];
        }
        grouped[domain][subdomain].push(source);
      } catch (error) {
        console.error(`Invalid URL: ${source.url}`);
      }
    }

    return grouped;
  }

  private formatGroupedSources(grouped: GroupedSources): string {
    const output: string[] = [];
    let domainCounter = 1;

    for (const [domain, subdomains] of Object.entries(grouped)) {
      output.push(`${domainCounter}. ${domain}`);
      
      // Deduplicate sources by URL (a later entry with the same URL replaces an earlier one)
      const uniqueSources = new Map<string, Source>();
      for (const sources of Object.values(subdomains)) {
        for (const source of sources) {
          uniqueSources.set(source.url, source);
        }
      }

      // Convert to array and sort
      const sortedSources = Array.from(uniqueSources.values())
        .sort((a, b) => a.title.localeCompare(b.title));

      // Number each source under its domain (e.g., 1.1, 1.2)
      sortedSources.forEach((source, index) => {
        output.push(`${domainCounter}.${index + 1}. ${source.title} (${source.url})`);
      });

      output.push(''); // Add blank line between domains
      domainCounter++;
    }

    return output.join('\n');
  }

  async handle(): Promise<McpToolResponse> {
    try {
      await this.apiClient.initCollection(COLLECTION_NAME);
      
      const pageSize = 100;
      let offset = null;
      const sources: Source[] = [];
      
      while (true) {
        const scroll = await this.apiClient.qdrantClient.scroll(COLLECTION_NAME, {
          with_payload: true,
          with_vector: false,
          limit: pageSize,
          offset,
        });

        if (scroll.points.length === 0) break;
        
        for (const point of scroll.points) {
          if (point.payload && typeof point.payload === 'object' && 'url' in point.payload && 'title' in point.payload) {
            const payload = point.payload as any;
            sources.push({
              title: payload.title,
              url: payload.url
            });
          }
        }

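        if (scroll.points.length < pageSize) break;
        // Qdrant's scroll offset is inclusive, so resuming from the last point's id
        // re-reads that point; duplicates are collapsed by URL in formatGroupedSources.
        offset = scroll.points[scroll.points.length - 1].id;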
      }

      if (sources.length === 0) {
        return {
          content: [
            {
              type: 'text',
              text: 'No documentation sources found.',
            },
          ],
        };
      }

      const grouped = this.groupSourcesByDomainAndSubdomain(sources);
      const formattedOutput = this.formatGroupedSources(grouped);

      return {
        content: [
          {
            type: 'text',
            text: formattedOutput,
          },
        ],
      };
    } catch (error) {
      if (error instanceof Error) {
        if (error.message.includes('unauthorized')) {
          throw new McpError(
            ErrorCode.InvalidRequest,
            'Failed to authenticate with Qdrant cloud while listing sources'
          );
        } else if (error.message.includes('ECONNREFUSED') || error.message.includes('ETIMEDOUT')) {
          throw new McpError(
            ErrorCode.InternalError,
            'Connection to Qdrant cloud failed while listing sources'
          );
        }
      }
      return {
        content: [
          {
            type: 'text',
            text: `Failed to list sources: ${error}`,
          },
        ],
        isError: true,
      };
    }
  }
}
```
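Despite its name, the `subdomain` key in `groupSourcesByDomainAndSubdomain` is the first path segment of the URL, not a DNS subdomain. The sketch below reproduces the grouping on plain data so the resulting shape is easy to inspect. `groupByHostAndSection` is a hypothetical name, and the sample URLs are made up for the example.

```typescript
interface Source {
  title: string;
  url: string;
}

type Grouped = Record<string, Record<string, Source[]>>;

// Group sources by hostname, then by first path segment ('/' when there is none).
function groupByHostAndSection(sources: Source[]): Grouped {
  const grouped: Grouped = {};
  for (const source of sources) {
    let url: URL;
    try {
      url = new URL(source.url);
    } catch {
      continue; // Skip entries that are not valid absolute URLs.
    }
    const section = url.pathname.split('/').filter(part => part)[0] || '/';
    if (!grouped[url.hostname]) {
      grouped[url.hostname] = {};
    }
    if (!grouped[url.hostname][section]) {
      grouped[url.hostname][section] = [];
    }
    grouped[url.hostname][section].push(source);
  }
  return grouped;
}

const groupedSample = groupByHostAndSection([
  { title: 'os', url: 'https://docs.python.org/3/library/os.html' },
  { title: 'Tutorial', url: 'https://docs.python.org/3/tutorial/' },
  { title: 'Home', url: 'https://example.com/' },
  { title: 'Broken', url: 'not a url' },
]);
console.log(JSON.stringify(groupedSample, null, 2));
```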

--------------------------------------------------------------------------------
/src/handler-registry.ts:
--------------------------------------------------------------------------------

```typescript
import {
  CallToolRequestSchema,
  ErrorCode,
  ListToolsRequestSchema,
  McpError,
} from '@modelcontextprotocol/sdk/types.js';
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { ApiClient } from './api-client.js';
import { ToolDefinition } from './types.js';
import {
  AddDocumentationHandler,
  SearchDocumentationHandler,
  ListSourcesHandler,
  RemoveDocumentationHandler,
  ExtractUrlsHandler,
  ListQueueHandler,
  RunQueueHandler,
  ClearQueueHandler,
} from './handlers/index.js';

const COLLECTION_NAME = 'documentation';

export class HandlerRegistry {
  private server: Server;
  private apiClient: ApiClient;
  private handlers: Map<string, any>;

  constructor(server: Server, apiClient: ApiClient) {
    this.server = server;
    this.apiClient = apiClient;
    this.handlers = new Map();
    this.setupHandlers();
    this.registerHandlers();
  }

  private setupHandlers() {
    this.handlers.set('add_documentation', new AddDocumentationHandler(this.server, this.apiClient));
    this.handlers.set('search_documentation', new SearchDocumentationHandler(this.server, this.apiClient));
    this.handlers.set('list_sources', new ListSourcesHandler(this.server, this.apiClient));
    this.handlers.set('remove_documentation', new RemoveDocumentationHandler(this.server, this.apiClient));
    this.handlers.set('extract_urls', new ExtractUrlsHandler(this.server, this.apiClient));
    this.handlers.set('list_queue', new ListQueueHandler(this.server, this.apiClient));
    this.handlers.set('run_queue', new RunQueueHandler(this.server, this.apiClient));
    this.handlers.set('clear_queue', new ClearQueueHandler(this.server, this.apiClient));
  }

  private registerHandlers() {
    this.server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        {
          name: 'search_documentation',
          description: 'Search through stored documentation using natural language queries. Use this tool to find relevant information across all stored documentation sources. Returns matching excerpts with context, ranked by relevance. Useful for finding specific information, code examples, or related documentation.',
          inputSchema: {
            type: 'object',
            properties: {
              query: {
                type: 'string',
                description: 'The text to search for in the documentation. Can be a natural language query, specific terms, or code snippets.',
              },
              limit: {
                type: 'number',
                description: 'Maximum number of results to return (1-20). Higher limits provide more comprehensive results but may take longer to process. Default is 5.',
                default: 5,
              },
            },
            required: ['query'],
          },
        } as ToolDefinition,
        {
          name: 'list_sources',
          description: 'List all documentation sources currently stored in the system. Returns a comprehensive list of all indexed documentation including source URLs, titles, and last update times. Use this to understand what documentation is available for searching or to verify if specific sources have been indexed.',
          inputSchema: {
            type: 'object',
            properties: {},
          },
        } as ToolDefinition,
        {
          name: 'extract_urls',
          description: 'Extract and analyze all URLs from a given web page. This tool crawls the specified webpage, identifies all hyperlinks, and optionally adds them to the processing queue. Useful for discovering related documentation pages, API references, or building a documentation graph. Handles various URL formats and validates links before extraction.',
          inputSchema: {
            type: 'object',
            properties: {
              url: {
                type: 'string',
                description: 'The complete URL of the webpage to analyze (must include protocol, e.g., https://). The page must be publicly accessible.',
              },
              add_to_queue: {
                type: 'boolean',
                description: 'If true, automatically add extracted URLs to the processing queue for later indexing. This enables recursive documentation discovery. Use with caution on large sites to avoid excessive queuing.',
                default: false,
              },
            },
            required: ['url'],
          },
        } as ToolDefinition,
        {
          name: 'remove_documentation',
          description: 'Remove specific documentation sources from the system by their URLs. Use this tool to clean up outdated documentation, remove incorrect sources, or manage the documentation collection. The removal is permanent and will affect future search results. Supports removing multiple URLs in a single operation.',
          inputSchema: {
            type: 'object',
            properties: {
              urls: {
                type: 'array',
                items: {
                  type: 'string',
                  description: 'The complete URL of the documentation source to remove. Must exactly match the URL used when the documentation was added.',
                },
                description: 'Array of URLs to remove from the database',
              },
            },
            required: ['urls'],
          },
        } as ToolDefinition,
        {
          name: 'list_queue',
          description: 'List all URLs currently waiting in the documentation processing queue. Shows pending documentation sources that will be processed when run_queue is called. Use this to monitor queue status, verify URLs were added correctly, or check processing backlog. Returns URLs in the order they will be processed.',
          inputSchema: {
            type: 'object',
            properties: {},
          },
        } as ToolDefinition,
        {
          name: 'run_queue',
          description: 'Process and index all URLs currently in the documentation queue. Each URL is processed sequentially, with proper error handling and retry logic. Progress updates are provided as processing occurs. Use this after adding new URLs to ensure all documentation is indexed and searchable. Long-running operations will process until the queue is empty or an unrecoverable error occurs.',
          inputSchema: {
            type: 'object',
            properties: {},
          },
        } as ToolDefinition,
        {
          name: 'clear_queue',
          description: 'Remove all pending URLs from the documentation processing queue. Use this to reset the queue when you want to start fresh, remove unwanted URLs, or cancel pending processing. This operation is immediate and permanent - URLs will need to be re-added if you want to process them later. Returns the number of URLs that were cleared from the queue.',
          inputSchema: {
            type: 'object',
            properties: {},
          },
        } as ToolDefinition,
      ],
    }));

    this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
      // Resolve the handler first so an unknown tool name fails fast,
      // without touching the vector store.
      const handler = this.handlers.get(request.params.name);
      if (!handler) {
        throw new McpError(
          ErrorCode.MethodNotFound,
          `Unknown tool: ${request.params.name}`
        );
      }

      // Make sure the collection exists before any handler runs.
      await this.apiClient.initCollection(COLLECTION_NAME);

      const response = await handler.handle(request.params.arguments);
      return {
        _meta: {},
        ...response
      };
    });
  }
}
```
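
The dispatch pattern used by `HandlerRegistry` above (look up a handler by tool name, reject unknown names, then merge the handler's response with an empty `_meta`) can be sketched in isolation. This is a minimal illustration under stated assumptions, not the project's implementation: `MiniRegistry`, `ToolHandler`, `ToolResponse`, and the `echo` tool are hypothetical names introduced here, and a plain `Error` stands in for the SDK's `McpError`.

```typescript
// Hypothetical stand-ins for illustration only; none of these names come
// from the MCP SDK or from this repository.
type ToolResponse = { content: Array<{ type: 'text'; text: string }> };

interface ToolHandler {
  handle(args: Record<string, unknown>): Promise<ToolResponse>;
}

class MiniRegistry {
  private handlers = new Map<string, ToolHandler>();

  // Associate a tool name with the handler that serves it.
  register(name: string, handler: ToolHandler): void {
    this.handlers.set(name, handler);
  }

  // Look up the handler by name and reject unknown names before doing any work.
  async dispatch(name: string, args: Record<string, unknown> = {}) {
    const handler = this.handlers.get(name);
    if (!handler) {
      // The real registry throws McpError with ErrorCode.MethodNotFound here.
      throw new Error(`Unknown tool: ${name}`);
    }
    const response = await handler.handle(args);
    // Same response shape as the registry: an empty _meta plus the handler's fields.
    return { _meta: {}, ...response };
  }
}

// Example wiring with a single hypothetical tool that echoes its input.
const registry = new MiniRegistry();
registry.register('echo', {
  async handle(args) {
    return { content: [{ type: 'text', text: String(args.message ?? '') }] };
  },
});
```

Typing the map as `Map<string, ToolHandler>` rather than `Map<string, any>` lets the compiler check that every registered handler exposes a `handle` method; whether that fits the real handler classes depends on their actual signatures.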