# Directory Structure

```
├── .gitignore
├── package.json
├── pnpm-lock.yaml
├── README.md
├── src
│   └── index.ts
├── test-rag.ts
└── tsconfig.json
```

# Files

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
node_modules/
build/
*.log
.env*
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

````markdown
# mcp-docs-rag MCP Server

RAG (Retrieval-Augmented Generation) for documents in a local directory

This is a TypeScript-based MCP server that implements a RAG system for documents stored in a local directory. It lets users query those documents through an LLM, using locally stored Git repositories and text files as context.

## Features

### Resources
- List and access documents via `docs://` URIs
- Documents can be Git repositories or text files
- Plain text mime type for content access

### Tools
- `list_documents` - List all available documents in the `DOCS_PATH` directory
  - Returns a formatted list of all documents
  - Shows the total number of available documents
- `rag_query` - Query documents using RAG
  - Takes `document_id` and `query` as parameters
  - Returns AI-generated responses with context from documents
- `add_git_repository` - Clone a Git repository to the docs directory with optional sparse checkout
  - Takes `repository_url` as parameter
  - Optional `document_name` parameter to customize the name of the document (use simple descriptive names without a '-docs' suffix)
  - Optional `subdirectory` parameter for sparse checkout of specific directories
  - Automatically pulls the latest changes if the repository already exists
- `add_text_file` - Download a text file to the docs directory
  - Takes `file_url` as parameter
  - Uses wget to download the file

### Prompts
- `guide_documents_usage` - Guide on how to use documents and RAG functionality
  - Includes a list of available documents
  - Provides usage hints for RAG functionality

## Development

Install dependencies:
```bash
npm install
```

Build the server:
```bash
npm run build
```

For development with auto-rebuild:
```bash
npm run watch
```

## Setup

This server requires a local directory for storing documents. By default, it uses `~/docs`, but you can configure a different location with the `DOCS_PATH` environment variable.

### Document Structure

The documents directory can contain:
- Git repositories (cloned directories)
- Plain text files (with .txt extension)

Each document is indexed separately using LlamaIndex.TS (the `llamaindex` package) with Google's Gemini embeddings.

### API Keys

This server uses Google's Gemini API for document indexing and querying. You need to set your Gemini API key as an environment variable:

```bash
export GEMINI_API_KEY=your-api-key-here
```

You can obtain a Gemini API key from the [Google AI Studio](https://makersuite.google.com/app/apikey) website. Add this key to your shell profile or include it in the environment configuration for Claude Desktop.

## Installation

To use with Claude Desktop, add the server config:

- On macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- On Windows: `%APPDATA%/Claude/claude_desktop_config.json`
- On Linux: `~/.config/Claude/claude_desktop_config.json`

```json
{
  "mcpServers": {
    "docs-rag": {
      "command": "npx",
      "args": ["-y", "@kazuph/mcp-docs-rag"],
      "env": {
        "DOCS_PATH": "/Users/username/docs",
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
```

Make sure to replace `/Users/username/docs` with the actual path to your documents directory.

### Debugging

Since MCP servers communicate over stdio, debugging can be challenging. We recommend using the [MCP Inspector](https://github.com/modelcontextprotocol/inspector), which is available as a package script:

```bash
npm run inspector
```

The Inspector will provide a URL to access debugging tools in your browser.

## Usage

Once configured, you can use the server with Claude to:

1. **Add documents**:
   ```
   Add a new document from GitHub: https://github.com/username/repository
   ```
   or with a custom document name:
   ```
   Add GitHub repository https://github.com/username/repository-name and name it 'framework'
   ```
   or with sparse checkout of a specific directory:
   ```
   Add only the 'src/components' directory from https://github.com/username/repository
   ```
   or combine custom name and sparse checkout:
   ```
   Add the 'examples/demo' directory from https://github.com/username/large-repo and name it 'demo-app'
   ```
   or add a text file:
   ```
   Add this text file: https://example.com/document.txt
   ```

2. **Query documents**:
   ```
   What does the documentation say about X in the Y repository?
   ```

3. **List available documents**:
   ```
   What documents do you have access to?
   ```

The server will automatically handle indexing of documents for efficient retrieval.
````

--------------------------------------------------------------------------------
/tsconfig.json:
--------------------------------------------------------------------------------

```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "outDir": "./build",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules"]
}
```

--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------

```json
{
  "name": "@kazuph/mcp-docs-rag",
  "version": "0.5.0",
  "description": "RAG (Retrieval-Augmented Generation) MCP server for documents using Gemini",
  "author": "kazuph",
  "license": "MIT",
  "type": "module",
  "bin": {
    "@kazuph/mcp-docs-rag": "./build/index.js"
  },
  "files": [
    "build"
  ],
  "keywords": [
    "mcp",
    "model-context-protocol",
    "rag",
    "documents",
    "gemini",
    "llamaindex"
  ],
  "publishConfig": {
    "access": "public"
  },
  "repository": {
    "type": "git",
    "url": "https://github.com/kazuph/mcp-docs-rag.git"
  },
  "engines": {
    "node": ">=18.0.0"
  },
  "mcp": {
    "serverType": "docs-rag",
    "apiKeyRequirements": [
      "GEMINI_API_KEY"
    ],
    "environmentVariables": {
      "DOCS_PATH": "Path to documents directory"
    },
    "capabilities": [
      "rag",
      "documentManagement"
    ]
  },
  "scripts": {
    "build": "tsc && node -e \"require('fs').chmodSync('build/index.js', '755')\"",
    "prepare": "pnpm run build || npm run build",
    "watch": "tsc --watch",
    "inspector": "npx @modelcontextprotocol/inspector build/index.js"
  },
  "dependencies": {
    "@llamaindex/google": "^0.1.0",
    "@modelcontextprotocol/sdk": "0.6.0",
    "llamaindex": "^0.9.9"
  },
  "devDependencies": {
    "@types/node": "^20.11.24",
    "typescript": "^5.3.3",
    "dotenv": "^16.4.7"
  }
}
```
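
The README's Setup section says the documents directory defaults to `~/docs` unless `DOCS_PATH` is set, and that a document is either a cloned Git repository (a directory) or a `.txt` file. Below is a minimal sketch of that discovery step. It is not the contents of `src/index.ts`, which is not shown here. The helper names `resolveDocsPath` and `listDocuments`, the `DocumentEntry` shape, the `docs://<id>` URI layout, and the choice to drop the `.txt` extension from the document id are all assumptions made for illustration.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: use DOCS_PATH when it is set and non-empty,
// otherwise fall back to ~/docs as the README describes.
export function resolveDocsPath(
  env: Record<string, string | undefined> = process.env,
): string {
  const configured = env.DOCS_PATH;
  return configured && configured.length > 0
    ? path.resolve(configured)
    : path.join(os.homedir(), "docs");
}

// Hypothetical shape for one discovered document.
export interface DocumentEntry {
  id: string;
  kind: "repository" | "text";
  uri: string; // assumed layout: docs://<id>
}

// Hypothetical helper: every directory counts as a repository and every
// *.txt file counts as a text document; anything else is ignored.
export function listDocuments(docsPath: string): DocumentEntry[] {
  if (!fs.existsSync(docsPath)) {
    return [];
  }
  return fs
    .readdirSync(docsPath, { withFileTypes: true })
    .flatMap((entry): DocumentEntry[] => {
      if (entry.isDirectory()) {
        return [{ id: entry.name, kind: "repository", uri: `docs://${entry.name}` }];
      }
      if (entry.isFile() && entry.name.endsWith(".txt")) {
        const id = entry.name.slice(0, -".txt".length);
        return [{ id, kind: "text", uri: `docs://${id}` }];
      }
      return [];
    })
    .sort((a, b) => a.id.localeCompare(b.id));
}
```

Calling `listDocuments(resolveDocsPath())` would give a tool such as `list_documents` the entries it needs to report. Taking the environment as a parameter keeps the path resolution easy to test without mutating `process.env`.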