sagacious-satadru/documentation-mcp # codebase.md

# Directory Structure

```
├── .gitignore
├── .pylintrc
├── .python-version
├── main.py
├── MCP_arch_explained.png
├── mcp-diagram-bg.png
├── pyproject.toml
├── README.md
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------

```
1 | 3.11
2 | 
```

--------------------------------------------------------------------------------
/.pylintrc:
--------------------------------------------------------------------------------

```
1 | [MASTER]
2 | init-hook="from pylint.config import find_pylintrc; import os, sys; sys.path.append(os.path.dirname(find_pylintrc()))"
3 | 
4 | [MESSAGES CONTROL]
5 | disable=C0111,C0103
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
 1 | # --- Python ---
 2 | # Byte-compiled / optimized / DLL files
 3 | __pycache__/
 4 | *.pyc
 5 | *.pyo
 6 | *.pyd
 7 | 
 8 | # Distribution / packaging
 9 | dist/
10 | build/
11 | wheels/
12 | *.egg-info/
13 | *.egg
14 | *.tar.gz
15 | *.whl
16 | 
17 | # --- Virtual Environments ---
18 | # Common virtual environment directory names
19 | .venv/
20 | venv/
21 | env/
22 | ENV/
23 | */env/
24 | */venv/
25 | 
26 | # Configuration file specific to venv
27 | pyvenv.cfg
28 | 
29 | # Environment variables file (often contains secrets)
30 | .env*
31 | 
32 | # --- IDE / Editor Files ---
33 | # VS Code specific folder (user settings, state, launch configs etc.)
34 | # Only commit .vscode/settings.json, launch.json, tasks.json, extensions.json
35 | # if they contain project-specific configurations you want to share.
36 | .vscode/
37 | 
38 | # PyCharm specific folder
39 | .idea/
40 | 
41 | # --- Testing ---
42 | .pytest_cache/
43 | .tox/
44 | htmlcov/
45 | .coverage
46 | *.cover
47 | nosetests.xml
48 | coverage.xml
49 | 
50 | # --- Operating System Files ---
51 | # macOS
52 | .DS_Store
53 | ._*
54 | 
55 | # Windows
56 | Thumbs.db
57 | 
58 | # --- Specific Files from Your Project ---
59 | # Cache file seen in your project list
60 | CACHEHDIR.TAG
61 | 
62 | # --- Logs ---
63 | *.log
64 | logs/
65 | 
66 | # --- Other ---
67 | # Add any other generated files, temporary files, or sensitive data files here
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
  1 | # Documentation MCP Server 📚🔍
  2 | 
  3 | A Model Context Protocol (MCP) server that enables Claude to search and access documentation from popular libraries like LangChain, LlamaIndex, and OpenAI directly within conversations.
  4 | 
  5 | ## What is MCP? 🤔
  6 | 
  7 | MCP (Model Context Protocol) is an open protocol that standardizes how applications provide context to Large Language Models. Think of it as a universal connector that lets AI assistants like Claude access external data sources and tools.
  8 | 
  9 | ![MCP Architecture](MCP_arch_explained.png)
 10 | 
 11 | 
 12 | ![MCP Architecture](mcp-diagram-bg.png)
 13 | 
 14 | ## Features ✨
 15 | 
 16 | - **Documentation Search Tool**: Search through documentation of popular AI libraries
 17 | - **Supported Libraries**:
 18 |   - [LangChain](https://python.langchain.com/docs) 🔗
 19 |   - [LlamaIndex](https://docs.llamaindex.ai/en/stable) 🦙
 20 |   - [OpenAI](https://platform.openai.com/docs) 🤖
 21 | - **Smart Extraction**: Intelligently parses HTML content to extract the most relevant information
 22 | - **Configurable Results**: Limit the amount of text returned based on your needs
 23 | 
 24 | ## How It Works 🛠️
 25 | 
 26 | 1. The server uses the Serper API to perform Google searches with site-specific queries
 27 | 2. It fetches the content from the search results
 28 | 3. BeautifulSoup extracts the most relevant text from main content areas
 29 | 4. Claude can access this information through the `get_docs` tool
 30 | 
 31 | ## System Requirements 🖥️
 32 | 
 33 | - Python 3.11 or higher
 34 | - `uv` package manager
 35 | - A Serper API key
 36 | 
 37 | ## Setup Instructions 🚀
 38 | 
 39 | ### 1. Install uv Package Manager
 40 | 
 41 | ```bash
 42 | curl -LsSf https://astral.sh/uv/install.sh | sh
 43 | ```
 44 | 
 45 | ### 2. Clone and Set Up the Project
 46 | 
 47 | ```bash
 48 | # Clone or download the project
 49 | cd documentation
 50 | 
 51 | # Create and activate virtual environment
 52 | uv venv
 53 | # On Windows:
 54 | .venv\Scripts\activate
 55 | # On macOS/Linux:
 56 | source .venv/bin/activate
 57 | 
 58 | # Install dependencies
 59 | uv pip install -e .
 60 | ```
 61 | 
 62 | ### 3. Configure the Serper API Key
 63 | 
 64 | Create a `.env` file in the project directory with your Serper API key:
 65 | 
 66 | ```
 67 | SERPER_API_KEY=your_serper_api_key_here
 68 | ```
 69 | 
 70 | You can get a Serper API key by signing up at [serper.dev](https://serper.dev).
 71 | 
 72 | ### 4. Configure Claude Desktop
 73 | 
 74 | Edit your Claude Desktop configuration file at:
 75 | - Windows: `/C:/Users/[Your Username]/AppData/Roaming/Claude/claude_desktop_config.json`
 76 | 
 77 | - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
 78 | 
 79 | Add the following to the `mcpServers` section:
 80 | 
 81 | ```json
 82 | "documentation": {
 83 |   "command": "uv",
 84 |   "args": [
 85 |     "--directory",
 86 |     "/ABSOLUTE/PATH/TO/YOUR/documentation",
 87 |     "run",
 88 |     "main.py"
 89 |   ]
 90 | }
 91 | ```
 92 | 
 93 | Replace `/ABSOLUTE/PATH/TO/YOUR/documentation` with the absolute path to your project directory.
 94 | 
 95 | ### 5. Restart Claude Desktop
 96 | 
 97 | Close and reopen Claude Desktop to apply the new configuration.
 98 | 
 99 | ## Using the Documentation Tool 🧩
100 | 
101 | Once connected, you can ask Claude to use the documentation tool:
102 | 
103 | > "Can you look up information about vector stores in LangChain documentation?"
104 | 
105 | Claude will use the `get_docs` tool to search for relevant information and provide you with documentation excerpts.
106 | 
107 | ## Tool Parameters 📋
108 | 
109 | The `get_docs` tool accepts the following parameters:
110 | 
111 | - `query`: The search term (e.g., "vector stores", "embedding models")
112 | - `library`: Which library to search (langchain, llama-index, or openai)
113 | - `max_chars`: Maximum characters to return (default: 1000)
114 | 
115 | ## Troubleshooting 🛠️
116 | 
117 | - **Claude can't find the server**: Verify the path in `/C:/Users/fcbsa/AppData/Roaming/Claude/claude_desktop_config.json` is correct
118 | - **Search returns no results**: Check your Serper API key and internet connection
119 | - **Timeout errors**: The server might be experiencing connectivity issues or rate limits
120 | 
121 | ## License 📜
122 | 
123 | This project is provided as an educational example of MCP server implementation.
124 | 
125 | ## Acknowledgements 🙏
126 | 
127 | - Built using the [MCP SDK](https://github.com/modelcontextprotocol)
128 | - Powered by [Serper API](https://serper.dev) for Google search integration
129 | - Uses [BeautifulSoup4](https://www.crummy.com/software/BeautifulSoup/) for HTML parsing
130 | - Inspired by the growing MCP community
131 | 
132 | ---
133 | 
134 | *This MCP server enhances Claude's capabilities by providing direct access to documentation resources. Explore, learn, and build better AI applications with contextual knowledge from the docs!*
```

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
 1 | [project]
 2 | name = "documentation"
 3 | version = "0.1.0"
 4 | description = "Add your description here"
 5 | readme = "README.md"
 6 | requires-python = ">=3.11"
 7 | dependencies = [
 8 |     "beautifulsoup4>=4.13.3",
 9 |     "httpx>=0.28.1",
10 |     "mcp[cli]>=1.6.0",
11 | ]
12 | 
```

--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------

```python
 1 | from mcp.server.fastmcp import FastMCP
 2 | from dotenv import load_dotenv
 3 | import httpx
 4 | import os
 5 | import json
 6 | from bs4 import BeautifulSoup
 7 | 
 8 | load_dotenv()
 9 | mcp = FastMCP("docs")
10 | 
11 | USER_AGENT = "docs-app/1.0"
12 | SERPER_URL = "https://google.serper.dev/search"
13 | 
14 | docs_urls = {
15 |     "langchain": "python.langchain.com/docs",
16 |     "llama-index": "docs.llamaindex.ai/en/stable",
17 |     "openai": "platform.openai.com/docs",
18 | }
19 | 
20 | """
21 | Our agent is going to first search the web using the Serper API key for Google search, for the given query, and then use those search results to access the URLs returned in the search results and get the contents of the page from the URL 
22 | """
23 | 
24 | async def search_web(query: str) -> dict | None:
25 |     """
26 |     Search the web using the Serper API key for Google search, for the given query.
27 |     """
28 |     payload = json.dumps({"q": query, "num": 2})
29 |     headers = {
30 |         "X-API-KEY": os.getenv("SERPER_API_KEY"),
31 |         "Content-Type": "application/json",       
32 |     }
33 | 
34 |     async with httpx.AsyncClient() as client:
35 |         try:
36 |             response = await client.post(url=SERPER_URL, headers=headers, 
37 |                                          data=payload, timeout=30.0)
38 |             response.raise_for_status()
39 |             return response.json()
40 |         except httpx.TimeoutException:
41 |             print("Timeout occurred while searching the web.")
42 |             return {"organic": []}        
43 | 
44 | 
45 | async def fetch_url(url: str):
46 |     """
47 |     Fetch the content in the page of the URL using the Serper API key for Google search, 
48 |     for the given query.
49 |     """
50 |     async with httpx.AsyncClient() as client:        
51 |         try:
52 |             response = await client.get(url=url, timeout=30.0)
53 |             soup = BeautifulSoup(response.text, "html.parser")
54 |             # text = soup.get_text()
55 |             # return text
56 |             # Target main content areas instead of all text
57 |             main_content = soup.find("main") or soup.find("article") or soup
58 |             text = main_content.get_text(separator="\n\n", strip=True)
59 |             return text
60 |         except httpx.TimeoutException:
61 |             return "Timeout occurred while fetching the URL."
62 | 
63 | @mcp.tool()
64 | async def get_docs(query: str, library: str, max_chars: int = 1000):
65 |     """
66 |     Search the docs for a given query and library.
67 |     Supports langchain, llama-index, and openai.
68 | 
69 |     Args:
70 |         query: The query to search for (e.g.: "Chroma DB").
71 |         library: The library to search in. One of langchain, llama-index, openai.
72 |         max_chars: Maximum characters to return (default: 1000 for free tier).
73 | 
74 |     Returns:
75 |         Text from the documentation.
76 |     """
77 |     if library not in docs_urls:
78 |         raise ValueError(f"Library {library} not supported. Supported libraries are: {', '.join(docs_urls.keys())}")
79 | 
80 |     url = f"site:{docs_urls[library]} {query}"
81 |     results = await search_web(url)
82 |     if len(results["organic"]) == 0:
83 |         return "No results found."
84 |     text = ""
85 |     for result in results["organic"]:
86 |         text += await fetch_url(result["link"])
87 |     return text[:max_chars]  # Limit to max_chars characters
88 | 
89 | 
90 | 
91 | 
92 | if __name__ == "__main__":
93 |     mcp.run(transport="stdio")
94 | 
```