# Directory Structure

```
├── .gitignore
├── .pylintrc
├── .python-version
├── main.py
├── MCP_arch_explained.png
├── mcp-diagram-bg.png
├── pyproject.toml
├── README.md
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------

```
3.11

```

--------------------------------------------------------------------------------
/.pylintrc:
--------------------------------------------------------------------------------

```
[MASTER]
init-hook="from pylint.config import find_pylintrc; import os, sys; sys.path.append(os.path.dirname(find_pylintrc()))"

[MESSAGES CONTROL]
disable=C0111,C0103
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# --- Python ---
# Byte-compiled / optimized / DLL files
__pycache__/
*.pyc
*.pyo
*.pyd

# Distribution / packaging
dist/
build/
wheels/
*.egg-info/
*.egg
*.tar.gz
*.whl

# --- Virtual Environments ---
# Common virtual environment directory names
.venv/
venv/
env/
ENV/
*/env/
*/venv/

# Configuration file specific to venv
pyvenv.cfg

# Environment variables file (often contains secrets)
.env*

# --- IDE / Editor Files ---
# VS Code specific folder (user settings, state, launch configs etc.)
# Only commit .vscode/settings.json, launch.json, tasks.json, extensions.json
# if they contain project-specific configurations you want to share.
.vscode/

# PyCharm specific folder
.idea/

# --- Testing ---
.pytest_cache/
.tox/
htmlcov/
.coverage
*.cover
nosetests.xml
coverage.xml

# --- Operating System Files ---
# macOS
.DS_Store
._*

# Windows
Thumbs.db

# --- Cache Markers ---
# Cache directory tag file
CACHEDIR.TAG

# --- Logs ---
*.log
logs/

# --- Other ---
# Add any other generated files, temporary files, or sensitive data files here
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
# Documentation MCP Server 📚🔍

A Model Context Protocol (MCP) server that enables Claude to search and access documentation from popular libraries like LangChain, LlamaIndex, and OpenAI directly within conversations.

## What is MCP? 🤔

MCP (Model Context Protocol) is an open protocol that standardizes how applications provide context to Large Language Models. Think of it as a universal connector that lets AI assistants like Claude access external data sources and tools.

![MCP Architecture](MCP_arch_explained.png)

![MCP Diagram](mcp-diagram-bg.png)

## Features ✨

- **Documentation Search Tool**: Search through documentation of popular AI libraries
- **Supported Libraries**:
  - [LangChain](https://python.langchain.com/docs) 🔗
  - [LlamaIndex](https://docs.llamaindex.ai/en/stable) 🦙
  - [OpenAI](https://platform.openai.com/docs) 🤖
- **Smart Extraction**: Intelligently parses HTML content to extract the most relevant information
- **Configurable Results**: Limit the amount of text returned based on your needs

## How It Works 🛠️

1. The server uses the Serper API to perform Google searches with site-specific queries
2. It fetches the content from the search results
3. BeautifulSoup extracts the most relevant text from main content areas
4. Claude can access this information through the `get_docs` tool

## System Requirements 🖥️

- Python 3.11 or higher
- `uv` package manager
- A Serper API key

## Setup Instructions 🚀

### 1. Install uv Package Manager

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

### 2. Clone and Set Up the Project

```bash
# Clone or download the project, then change into its directory
cd documentation

# Create and activate virtual environment
uv venv
# On Windows:
.venv\Scripts\activate
# On macOS/Linux:
source .venv/bin/activate

# Install dependencies
uv pip install -e .
```

### 3. Configure the Serper API Key

Create a `.env` file in the project directory with your Serper API key:

```
SERPER_API_KEY=your_serper_api_key_here
```

You can get a Serper API key by signing up at [serper.dev](https://serper.dev).

### 4. Configure Claude Desktop

Edit your Claude Desktop configuration file at:
- Windows: `C:\Users\[Your Username]\AppData\Roaming\Claude\claude_desktop_config.json`
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`

Add the following to the `mcpServers` section:

```json
"documentation": {
  "command": "uv",
  "args": [
    "--directory",
    "/ABSOLUTE/PATH/TO/YOUR/documentation",
    "run",
    "main.py"
  ]
}
```

Replace `/ABSOLUTE/PATH/TO/YOUR/documentation` with the absolute path to your project directory.

### 5. Restart Claude Desktop

Close and reopen Claude Desktop to apply the new configuration.

## Using the Documentation Tool 🧩

Once connected, you can ask Claude to use the documentation tool:

> "Can you look up information about vector stores in LangChain documentation?"

Claude will use the `get_docs` tool to search for relevant information and provide you with documentation excerpts.

## Tool Parameters 📋

The `get_docs` tool accepts the following parameters:

- `query`: The search term (e.g., "vector stores", "embedding models")
- `library`: Which library to search (langchain, llama-index, or openai)
- `max_chars`: Maximum characters to return (default: 1000)

## Troubleshooting 🛠️

- **Claude can't find the server**: Verify that the `--directory` value in `claude_desktop_config.json` is the correct absolute path to this project
- **Search returns no results**: Check your Serper API key and internet connection
- **Timeout errors**: The server might be experiencing connectivity issues or rate limits

## License 📜

This project is provided as an educational example of MCP server implementation.

## Acknowledgements 🙏

- Built using the [MCP SDK](https://github.com/modelcontextprotocol)
- Powered by [Serper API](https://serper.dev) for Google search integration
- Uses [BeautifulSoup4](https://www.crummy.com/software/BeautifulSoup/) for HTML parsing
- Inspired by the growing MCP community

---

*This MCP server enhances Claude's capabilities by providing direct access to documentation resources. Explore, learn, and build better AI applications with contextual knowledge from the docs!*
```
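
Step 4 of the README shows only the `documentation` entry. As a minimal sketch, the script below merges that entry into an existing Claude Desktop config without overwriting other servers. The helper name `add_documentation_server` and the `other` server in the example are hypothetical; the `mcpServers` key, the command, and the arguments are taken from the README.

```python
import json


def add_documentation_server(config: dict, project_dir: str) -> dict:
    """Return a copy of config with the README's `documentation` entry added.

    Hypothetical helper for illustration; it is not part of this project.
    """
    merged = dict(config)
    # Copy the existing server table so the input dict is left untouched.
    servers = dict(merged.get("mcpServers", {}))
    servers["documentation"] = {
        "command": "uv",
        "args": ["--directory", project_dir, "run", "main.py"],
    }
    merged["mcpServers"] = servers
    return merged


# Example: an existing config that already registers another (made-up) server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = add_documentation_server(existing, "/ABSOLUTE/PATH/TO/YOUR/documentation")
print(json.dumps(updated, indent=2))
```

The printed JSON has the shape of a complete config file, with `documentation` nested under the top-level `mcpServers` key alongside any servers that were already there.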

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[project]
name = "documentation"
version = "0.1.0"
description = "MCP server that lets Claude search LangChain, LlamaIndex, and OpenAI documentation"
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
    "beautifulsoup4>=4.13.3",
    "httpx>=0.28.1",
    "mcp[cli]>=1.6.0",
    "python-dotenv>=1.0.0",
]

```

--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------

```python
import os
import sys

import httpx
from bs4 import BeautifulSoup
from dotenv import load_dotenv
from mcp.server.fastmcp import FastMCP

# Load SERPER_API_KEY from a local .env file, if one exists.
load_dotenv()
mcp = FastMCP("docs")

USER_AGENT = "docs-app/1.0"
SERPER_URL = "https://google.serper.dev/search"

# Maps each supported library name to the documentation site searched for it.
docs_urls = {
    "langchain": "python.langchain.com/docs",
    "llama-index": "docs.llamaindex.ai/en/stable",
    "openai": "platform.openai.com/docs",
}

# Flow: search Google through the Serper API for the given query, then fetch
# each URL in the search results and extract the text of the page.


async def search_web(query: str) -> dict:
    """
    Search Google for the given query through the Serper API.

    Returns the parsed JSON response, or {"organic": []} if the request fails.
    """
    payload = {"q": query, "num": 2}
    headers = {
        # Fall back to an empty string so a missing key produces an HTTP
        # error from the API instead of a None header value.
        "X-API-KEY": os.getenv("SERPER_API_KEY", ""),
        "Content-Type": "application/json",
    }

    async with httpx.AsyncClient() as client:
        try:
            response = await client.post(SERPER_URL, headers=headers,
                                         json=payload, timeout=30.0)
            response.raise_for_status()
            return response.json()
        except httpx.HTTPError as exc:
            # Log to stderr: with the stdio transport, stdout carries the
            # MCP protocol messages and must not receive stray output.
            print(f"Web search failed: {exc!r}", file=sys.stderr)
            return {"organic": []}


async def fetch_url(url: str) -> str:
    """
    Fetch the page at the given URL and return the text of its main content.
    """
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(url, timeout=30.0)
            soup = BeautifulSoup(response.text, "html.parser")
            # Target the main content area rather than all text on the page.
            main_content = soup.find("main") or soup.find("article") or soup
            text = main_content.get_text(separator="\n\n", strip=True)
            return text
        except httpx.TimeoutException:
            return "Timeout occurred while fetching the URL."

@mcp.tool()
async def get_docs(query: str, library: str, max_chars: int = 1000):
    """
    Search the docs for a given query and library.
    Supports langchain, llama-index, and openai.

    Args:
        query: The query to search for (e.g.: "Chroma DB").
        library: The library to search in. One of langchain, llama-index, openai.
        max_chars: Maximum characters to return (default: 1000 for free tier).

    Returns:
        Text from the documentation.
    """
    if library not in docs_urls:
        raise ValueError(f"Library {library} not supported. Supported libraries are: {', '.join(docs_urls.keys())}")

    # Restrict the Google search to the chosen library's documentation site.
    search_query = f"site:{docs_urls[library]} {query}"
    results = await search_web(search_query)
    organic = results.get("organic", [])
    if not organic:
        return "No results found."
    text = ""
    for result in organic:
        text += await fetch_url(result["link"])
    return text[:max_chars]  # Limit to max_chars characters




if __name__ == "__main__":
    mcp.run(transport="stdio")

```
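
As a rough illustration of how `get_docs` assembles its search query and trims its output, here is a sketch with the network calls replaced by canned strings. `build_site_query` and `combine_pages` are hypothetical names that mirror the logic in `main.py`; they are not functions in the project.

```python
DOCS_URLS = {
    "langchain": "python.langchain.com/docs",
    "llama-index": "docs.llamaindex.ai/en/stable",
    "openai": "platform.openai.com/docs",
}


def build_site_query(library: str, query: str) -> str:
    """Restrict a search to one library's documentation site."""
    if library not in DOCS_URLS:
        raise ValueError(f"Library {library} not supported.")
    return f"site:{DOCS_URLS[library]} {query}"


def combine_pages(pages: list[str], max_chars: int = 1000) -> str:
    """Concatenate page texts and keep only the first max_chars characters."""
    return "".join(pages)[:max_chars]


search_query = build_site_query("langchain", "vector stores")
print(search_query)

# Stand-ins for fetched pages: the first page alone exceeds the limit,
# so nothing from the second page survives the truncation.
excerpt = combine_pages(["a" * 1200, "b" * 300])
print(len(excerpt), "b" in excerpt)
```

Because truncation applies to the concatenated text, a long first result can crowd out the second one entirely. Passing a larger `max_chars` is the simplest way to see more of the later results.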