# Directory Structure

```
├── .gitignore
├── .python-version
├── document-agent
│   ├── document.py
│   ├── img
│   │   └── img-chat.png
│   ├── README.md
│   ├── server.py
│   └── submit_parse_job.py
├── multi-agent
│   └── server.py
├── pyproject.toml
├── README.md
├── single_agent
│   └── server.py
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------
```
3.10
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
```
# Python-generated files
__pycache__/
*.py[oc]
build/
dist/
wheels/
*.egg-info

# Virtual environments
.venv

# Credentials
.env

# Cursor
.cursor/
mcp.json
```

--------------------------------------------------------------------------------
/document-agent/README.md:
--------------------------------------------------------------------------------
```markdown
# Document Navigator Agent

An MCP server that enables agents to understand and navigate large, complex documents, using agentic RAG tools built on document metadata inferred by the Contextual AI [/parse API](https://docs.contextual.ai/api-reference/parse/parse-file).

This is a prototype showing how to:

- Get document comprehension right in Cursor (or any MCP client) using purely function calls
- Ask more complex queries than are possible with naive RAG
- Get interpretable attributions via tool call traces as the agent navigates a document

> **Note:** For faster iteration/caching, documents are referenced by job IDs obtained from Contextual AI's [/parse API](https://docs.contextual.ai/api-reference/parse/parse-file) after they've completed processing. The demo uses the [US Govt Financial Report FY2024](https://www.fiscal.treasury.gov/files/reports-statements/financial-report/2024/01-16-2025-FR-(Final).pdf).
## Quick Setup

A quick overview of the steps needed to get started:

1. Set up your local Python environment with `uv`
   1. Follow [this](../README.md#installation) to create an env and install dependencies
2. Create a `.cursor/mcp.json` file
   1. See details below on the config file and enable it in Cursor following [this](https://docs.cursor.com/context/model-context-protocol)
3. Get a Contextual AI API key
   1. Add it to a `.env` file in your Cursor workspace as `CTXL_API_KEY=key-XX`
4. Submit a `/parse` job with your document to get a job ID
   1. Use `uv run submit_parse_job.py "FILE_OR_URL"`

### MCP JSON config file

Get the path to your `uv` binary using `which uv` (e.g. `/Users/username/miniconda3/envs/envname/bin/uv`).

Add to your `.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "ContextualAI-DocumentNavigatorAgent": {
      "command": "/path/to/your/uv",
      "args": [
        "--directory",
        "/path/to/contextual-mcp-server",
        "run",
        "document-agent/server.py"
      ]
    }
  }
}
```

This can also be configured for use with other MCP clients, e.g. [Claude Desktop](https://modelcontextprotocol.io/quickstart/user).

## Key Components

### `server.py`

Main MCP server with three core tools:

- `initialize_document_agent(job_id)` - Switch between documents
- `read_hierarchy()` - Get document outline and structure
- `read_pages(rationale, start_index, end_index)` - Read specific page ranges

### `document.py`

Contains the `ParsedDocumentNavigator` class that wraps Contextual AI's parse output for easy navigation:

- Access the document hierarchy as a kind of [llms.txt](https://llmstxt.org/) file
- Page-based content retrieval

## Usage Examples

```python
# In Cursor, ask questions like:
"Initialize document agent with job ID abc-123"
"Can you give me an overview of the document with page numbers"
"Can you summarize parts of the document about US government debt?"
```

## Development

To extend functionality, add new `@mcp.tool()` decorated functions in `server.py`. Refer to [this](../README.md#development) for more.
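To make that extension pattern concrete, here is a hedged sketch of one possible new tool. `find_in_pages` is a hypothetical helper name, not part of this repo; inside `server.py` it would be decorated with `@mcp.tool()` and would read page markdown via the module-level `document_navigator` rather than a plain list of strings.

```python
# Hypothetical sketch of logic a new keyword-search tool might wrap.
# In server.py this function would carry @mcp.tool() and pull page markdown
# from document_navigator; here plain strings stand in for parsed pages.

def find_in_pages(page_markdowns: list[str], keyword: str) -> list[int]:
    """Return the page indexes whose markdown contains `keyword` (case-insensitive)."""
    needle = keyword.lower()
    return [i for i, md in enumerate(page_markdowns) if needle in md.lower()]

# Toy stand-in for the parsed document's per-page markdown:
pages = ["# Intro\nDebt overview", "# Details\nRevenue tables", "More on debt"]
print(find_in_pages(pages, "debt"))     # → [0, 2]
print(find_in_pages(pages, "revenue"))  # → [1]
```

The agent could then call `read_pages` on the returned indexes, keeping context usage focused on matching pages only.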
## Extensions

This is a simple prototype showing agentic RAG using purely function calls over tools enabled by document structure metadata inferred by the `/parse` API. In practice, combining this with text/semantic retrieval using [Contextual AI datastores](https://docs.contextual.ai/user-guides/beginner-guide) will let your agent scale its context to a corpus with 10-100x more such documents, while supporting agentic retrieval of context for complex synthesis and summarization.
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
```markdown
# Contextual MCP Server

A Model Context Protocol (MCP) server that provides RAG (Retrieval-Augmented Generation) capabilities using Contextual AI. This server integrates with a variety of MCP clients and gives you the flexibility to decide what functionality to offer in the server. In this readme, we show integration with both the Cursor IDE and Claude Desktop.

Contextual AI now offers a hosted server inside the platform, available at: https://mcp.app.contextual.ai/mcp/

After you connect to the server, you can use the tools, such as query, provided by the platform MCP server. For a complete walkthrough, check out the MCP [user guide](https://docs.contextual.ai/user-guides/mcp-server).

## Overview

An MCP server acts as a bridge between AI interfaces (Cursor IDE or Claude Desktop) and a specialized Contextual AI agent. It enables:

1. **Query Processing**: Direct your domain-specific questions to a dedicated Contextual AI agent
2. **Intelligent Retrieval**: Searches through comprehensive information in your knowledge base
3. **Context-Aware Responses**: Generates answers that are:
   - Grounded in source documentation
   - Include citations and attributions
   - Maintain conversation context

### Integration Flow

```
Cursor/Claude Desktop → MCP Server → Contextual AI RAG Agent
          ↑                ↓                    ↓
          └────────────────┴────────────────────┴── Response with citations
```

## Prerequisites

- Python 3.10 or higher
- Cursor IDE and/or Claude Desktop
- Contextual AI API key
- MCP-compatible environment

## Installation

1. Clone the repository:

```bash
git clone https://github.com/ContextualAI/contextual-mcp-server.git
cd contextual-mcp-server
```

2. Create and activate a virtual environment:

```bash
python -m venv .venv
source .venv/bin/activate  # On Windows, use `.venv\Scripts\activate`
```

3. Install dependencies:

```bash
pip install -e .
```

## Configuration

### Configure MCP Server

The server requires some settings to be customized before use. For example, the single_agent server should be customized with an appropriate docstring for your RAG agent. The docstring for your query tool is critical, as it helps the MCP client understand when to route questions to your RAG agent; make it specific to your knowledge domain. Here is an example:

```
A research tool focused on financial data on the largest US firms
```

or

```
A research tool focused on technical documents for Omaha semiconductors
```

The server also requires the following settings for your RAG agent:

- `API_KEY`: Your Contextual AI API key
- `AGENT_ID`: Your Contextual AI agent ID

If you'd like to store these values in a `.env` file, you can specify them like so:

```bash
cat > .env << EOF
API_KEY=key...
AGENT_ID=...
EOF
```

The repo also contains more advanced MCP servers for multi-agent systems, as well as a [document-agent](https://www.linkedin.com/feed/update/urn:li:activity:7346595035770929152/).

### AI Interface Integration

This MCP server can be integrated with a variety of clients.
To use with either Cursor IDE or Claude Desktop, create or modify the MCP configuration file in the appropriate location:

1. First, find the path to your `uv` installation:

```bash
UV_PATH=$(which uv)
echo $UV_PATH
# Example output: /Users/username/miniconda3/bin/uv
```

2. Create the configuration file using the full path from step 1. Note that JSON does not allow comments, so make sure `command` points at your actual `uv` path; the `${workspaceFolder}` placeholder will be replaced with your project path:

```bash
cat > mcp.json << EOF
{
  "mcpServers": {
    "ContextualAI-TechDocs": {
      "command": "$UV_PATH",
      "args": [
        "--directory",
        "\${workspaceFolder}",
        "run",
        "multi-agent/server.py"
      ]
    }
  }
}
EOF
```

3. Move the file to the correct location; see below for options:

```bash
mkdir -p .cursor/
mv mcp.json .cursor/
```

Configuration locations:

- For Cursor:
  - Project-specific: `.cursor/mcp.json` in your project directory
  - Global: `~/.cursor/mcp.json` for system-wide access
- For Claude Desktop:
  - Use the same configuration file format in the appropriate Claude Desktop configuration directory

### Environment Setup

This project uses `uv` for dependency management, which provides faster and more reliable Python package installation.

## Usage

The server provides Contextual AI RAG capabilities using the Python SDK, exposing a variety of commands accessible from MCP clients such as Cursor IDE and Claude Desktop. The current server focuses on the query command from the Contextual AI Python SDK, but you could extend it to support other features such as listing all the agents, updating retrieval settings, updating prompts, extracting retrievals, or downloading metrics.

### Example Usage

```python
# In Cursor, you might ask:
"Show me the code for initiating the RF345 microchip?"

# The MCP client will:
# 1. Determine if this should be routed to the MCP server

# Then the MCP server will:
# 1. Route the query to the Contextual AI agent
# 2. Retrieve relevant documentation
# 3. Generate a response with specific citations
# 4. Return the formatted answer to Cursor
```

### Key Benefits

1. **Accurate Responses**: All answers are grounded in your documentation
2. **Source Attribution**: Every response includes references to source documents
3. **Context Awareness**: The system maintains conversation context for follow-up questions
4. **Real-time Updates**: Responses reflect the latest documentation in your datastore

## Development

### Modifying the Server

To add new capabilities:

1. Add new tools by creating additional functions decorated with `@mcp.tool()`
2. Define the tool's parameters using Python type hints
3. Provide a clear docstring describing the tool's functionality

Example:

```python
@mcp.tool()
def new_tool(param: str) -> str:
    """Description of what the tool does"""
    result = do_something(param)  # implementation goes here
    return result
```

## Limitations

- The server runs locally and may not work in remote development environments
- Tool responses are subject to Contextual AI API limits and quotas
- Currently only supports stdio transport mode

For all the capabilities of Contextual AI, please check the [official documentation](https://docs.contextual.ai/).
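The multi-agent server in this repo routes each query to the best-matching agent using Contextual AI's rerank API. As a purely offline illustration of that routing idea, here is a toy word-overlap scorer; it stands in for the real `client.rerank.create` call, and `route_query` is a hypothetical name used only for this sketch.

```python
# Toy stand-in for rerank-based routing: pick the agent whose description
# shares the most words with the prompt. The real multi-agent server scores
# agent descriptions with Contextual AI's rerank API instead.

def route_query(prompt: str, agents: dict[str, str]) -> str:
    """Return the agent id whose description best overlaps with the prompt."""
    prompt_words = set(prompt.lower().split())

    def score(agent_id: str) -> int:
        return len(prompt_words & set(agents[agent_id].lower().split()))

    return max(agents, key=score)

agents = {
    "fin-1": "Financial data on the largest US firms",
    "chip-1": "Technical documents for Omaha semiconductors",
}
print(route_query("show me revenue data for US firms", agents))  # → fin-1
```

Once an agent id is selected, the query is forwarded to that agent exactly as in the single-agent server.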
```

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
```toml
[project]
name = "contextual-mcp-server"
version = "0.1.0"
description = "MCP server providing RAG capabilities using Contextual AI"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "contextual-client>=0.5.1",
    "httpx>=0.28.1",
    "mcp[cli]>=1.6.0",
    "dotenv",
    "uv",
    "tiktoken"
]
```

--------------------------------------------------------------------------------
/single_agent/server.py:
--------------------------------------------------------------------------------
```python
from contextual import ContextualAI
from mcp.server.fastmcp import FastMCP

API_KEY = ""
AGENT = ""

# Create an MCP server
mcp = FastMCP("Contextual AI RAG Platform")


# Add query tool to interact with Contextual agent
@mcp.tool()
def query(prompt: str) -> str:
    """An enterprise search tool that can answer questions about a specific knowledge base"""
    client = ContextualAI(api_key=API_KEY)
    query_result = client.agents.query.create(
        agent_id=AGENT,
        messages=[{
            "content": prompt,
            "role": "user"
        }]
    )
    return query_result.message.content


if __name__ == "__main__":
    # Initialize and run the server
    mcp.run(transport='stdio')
```

--------------------------------------------------------------------------------
/multi-agent/server.py:
--------------------------------------------------------------------------------
```python
from contextual import ContextualAI
from mcp.server.fastmcp import FastMCP

API_KEY = ""

# Create an MCP server
mcp = FastMCP("Contextual AI RAG Platform")


# Add query tool to interact with Contextual agent
@mcp.tool()
def query(prompt: str) -> str:
    """An enterprise search tool that can answer questions about any sort of knowledge base"""
    client = ContextualAI(api_key=API_KEY)
    instruction = "Rank documents based on their ability to answer the question/query"

    # Build an id -> description map of all available agents
    agents = {}
    for agent in client.agents.list():
        agents.update({agent.id: f"{agent.name} - {agent.description}"})
    documents = list(agents.values())

    # Rerank agent descriptions against the prompt and pick the top one
    results = client.rerank.create(
        model="ctxl-rerank-en-v1-instruct",
        instruction=instruction,
        query=prompt,
        documents=documents,
        top_n=1
    )
    agent_index = results.results[0].index
    agent_id = list(agents.keys())[agent_index]

    query_result = client.agents.query.create(
        agent_id=agent_id,
        messages=[{
            "content": prompt,
            "role": "user"
        }]
    )
    return query_result.message.content


if __name__ == "__main__":
    # Initialize and run the server
    mcp.run(transport='stdio')
```

--------------------------------------------------------------------------------
/document-agent/document.py:
--------------------------------------------------------------------------------
```python
class ParsedDocumentNavigator:
    """
    This class wraps `/parse` API output, exposing methods that enable an LLM
    agent to navigate and interact with the parsed document.
    """

    def __init__(self, parsed_document):
        self.parsed_document = parsed_document
        self.block_map = {
            block.id: block
            for page in self.parsed_document.pages
            for block in page.blocks
        }
        self.heading_block_map = {
            block.id: block
            for block in self.parsed_document.document_metadata.hierarchy.blocks
        }

    def read_document(self) -> str:
        """
        Read contents of the entire document as markdown (may be large)
        """
        return self.read_pages(list(range(len(self.parsed_document.pages))))

    def read_hierarchy(self) -> tuple[str, list[dict]]:
        """
        Read the parsed heading structure of the entire document.

        Result is a tuple of:
        (i) human/LLM readable document hierarchy with page indexes (a.k.a. table of contents)
        (ii) JSON list of headings in the document hierarchy
        """
        hierarchy_markdown = (
            self.parsed_document.document_metadata.hierarchy.table_of_contents
        )
        hierarchy_list = []
        for block in self.parsed_document.document_metadata.hierarchy.blocks:
            hierarchy_list.append(
                {
                    "block_id": block.id,  # might need to translate uuid to an LLM-friendly integer index instead
                    "hierarchy_level": block.hierarchy_level,
                    "markdown": block.markdown,
                    "page_index": block.page_index,
                }
            )
        return hierarchy_markdown, hierarchy_list

    def read_pages(self, page_indexes: list[int]) -> str:
        """
        Read the contents of the document for the provided page indexes
        """
        page_separator = "\n\n---\nPage index: {page_index}\n\n"
        content = ""
        for page_index in page_indexes:
            content += (
                page_separator.format(page_index=page_index)
                + self.parsed_document.pages[page_index].markdown
            )
        return content

    def read_heading_contents(self, heading_block_id: str) -> str:
        """
        Read the contents of the document that are children of the given
        heading block referenced by `heading_block_id`
        """
        heading_block = self.heading_block_map[heading_block_id]
        parent_path_prefix = heading_block.parent_ids + [heading_block_id]

        section_blocks = []
        for page in self.parsed_document.pages:
            for block in page.blocks:
                # filter for blocks that share the same parent path
                if block.parent_ids[: len(parent_path_prefix)] == parent_path_prefix:
                    section_blocks.append(block)

        section_content = "\n".join([block.markdown for block in section_blocks])
        section_prefix = "\n".join(
            [
                self.heading_block_map[block_id].markdown
                for block_id in parent_path_prefix
            ]
        )
        return section_prefix + "\n\n" + section_content
```

--------------------------------------------------------------------------------
/document-agent/submit_parse_job.py:
--------------------------------------------------------------------------------
```python
import argparse
import os
import time
from urllib.parse import urlparse

import httpx
from contextual import ContextualAI
from dotenv import load_dotenv

load_dotenv()
CTXL_API_KEY = os.getenv("CTXL_API_KEY")


def submit_parse_job(file_path: str, polling_interval_s: int = 30):
    """Submits a file to the /parse endpoint and waits for completion."""
    if not CTXL_API_KEY:
        raise ValueError("CTXL_API_KEY environment variable not set.")

    client = ContextualAI(api_key=CTXL_API_KEY)

    print(f"Submitting '{file_path}' for parsing...")
    with open(file_path, "rb") as fp:
        response = client.parse.create(
            raw_file=fp,
            parse_mode="standard",
            enable_document_hierarchy=True,
        )
    job_id = response.job_id
    print(f"Parse job submitted. Job ID: {job_id}")
    print(
        f"You can view the job in the UI at: https://app.contextual.ai/{{tenant}}/components/parse?job={job_id}"
    )
    print("(Remember to replace {tenant} with your workspace name)")

    print("Waiting for job to complete...")
    while True:
        try:
            result = client.parse.job_status(job_id)
            status = result.status
            print(f"Job status: {status}")
            if status == "completed":
                print(f"Job completed successfully. Job ID: {job_id}")
                break
            elif status in ["failed", "cancelled"]:
                print(f"Job {status}. Aborting.")
                break
            time.sleep(polling_interval_s)
        except Exception as e:
            print(f"An error occurred while checking job status: {e}")
            break
    return job_id


def download_file(url: str, output_dir: str = "."):
    """Downloads a file from a URL."""
    try:
        response = httpx.get(url, follow_redirects=True)
        response.raise_for_status()

        # get filename from URL
        parsed_url = urlparse(url)
        filename = os.path.basename(parsed_url.path)
        if not filename:
            filename = "downloaded_file"  # fallback

        file_path = os.path.join(output_dir, filename)
        with open(file_path, "wb") as f:
            f.write(response.content)
        print(f"File downloaded to {file_path}")
        return file_path
    except httpx.RequestError as e:
        print(f"Error downloading file: {e}")
        return None


def main():
    parser = argparse.ArgumentParser(
        description="Submit a document to the Contextual AI /parse API."
    )
    parser.add_argument("path_or_url", help="Local file path or URL to the document.")
    args = parser.parse_args()

    path_or_url = args.path_or_url
    downloaded_file_path = None
    try:
        if urlparse(path_or_url).scheme in ("http", "https"):
            print(f"Input is a URL: {path_or_url}")
            file_path = download_file(path_or_url)
            if not file_path:
                return
            downloaded_file_path = file_path
        elif os.path.isfile(path_or_url):
            print(f"Input is a local file: {path_or_url}")
            file_path = path_or_url
        else:
            print(f"Error: Input '{path_or_url}' is not a valid file path or URL.")
            return
        submit_parse_job(file_path)
    finally:
        if downloaded_file_path:
            # Clean up the downloaded file
            print(f"Cleaning up downloaded file: {downloaded_file_path}")
            os.remove(downloaded_file_path)


if __name__ == "__main__":
    main()
```

--------------------------------------------------------------------------------
/document-agent/server.py:
--------------------------------------------------------------------------------
```python
import os

from contextual import ContextualAI
from document import ParsedDocumentNavigator
from dotenv import load_dotenv
from mcp.server.fastmcp import FastMCP
from tiktoken import encoding_for_model

load_dotenv()
CTXL_API_KEY = os.getenv("CTXL_API_KEY")


def initialize_document_navigator(parse_job_id: str):
    ctxl_client = ContextualAI(api_key=CTXL_API_KEY)
    parsed_document = ctxl_client.parse.job_results(
        parse_job_id, output_types=["markdown-per-page", "blocks-per-page"]
    )
    document_navigator = ParsedDocumentNavigator(parsed_document)
    return document_navigator


def count_tokens_fast(text: str) -> int:
    """
    Count tokens in a string using a fast approximation.
    """
    multiplier, max_chars = 1.0, 80000  # ~20k tokens
    if len(text) > max_chars:
        multiplier = len(text) / max_chars
        text = text[:max_chars]
    n_tokens = len(encoding_for_model("gpt-4o").encode(text))
    return int(n_tokens * multiplier)


document_navigator = None

mcp = FastMCP(
    name="CTXL Document Navigator",
    instructions="""
    You are a document comprehension agent that uses tools to navigate, read and understand a document.
    """,
)


@mcp.tool()
def initialize_document_agent(job_id: str) -> str:
    """
    Initialize the document agent with a provided job id.

    Guidance:
    - When asked for an outline of the document, read the hierarchy and then look up an initial few pages of the document before answering.
    - Use this to request the user to provide a job id for a document so you can answer questions about it.
    - When answering questions, provide references to page indexes used in the answer.
    """
    global document_navigator
    document_navigator = initialize_document_navigator(job_id)
    message = f"Document agent initialized for job id: {job_id}"

    # add summary stats for the document
    n_pages = len(document_navigator.parsed_document.pages)
    n_doc_tokens = count_tokens_fast(document_navigator.read_document())
    n_hierarchy_tokens = count_tokens_fast(document_navigator.read_hierarchy()[0])
    stats = f"""
    - document has {n_doc_tokens} tokens, {n_pages} pages
    - hierarchy has {n_hierarchy_tokens} tokens
    """
    return f"{message}\n{stats}"


@mcp.tool()
def read_hierarchy() -> str:
    """
    Read a markdown nested list of the hierarchical structure of the document.
    This contains headings with their nesting as well as the page index where
    the section with this heading starts.

    Guidance:
    - Use these results to look up the start and end page indexes to read the contents of a specific section for further context.
    """
    return document_navigator.read_hierarchy()[
        0
    ]  # human/llm readable index structure for the document


@mcp.tool()
def read_pages(rationale: str, start_index: int, end_index: int) -> str:
    """
    Read the contents of the document between the start and end page indexes,
    both inclusive. Provide a brief 1-line rationale for what you are trying
    to read, e.g. the name of the section or other context.
    """
    page_indexes = list(range(start_index, end_index + 1))
    return document_navigator.read_pages(page_indexes)


# NOTE: not used, but could be exposed with some control over context utilization
# @mcp.tool()
# def read_document() -> str:
#     """
#     Read contents of the entire document as markdown (may be large)
#     """
#     return document_navigator.read_document()

# NOTE: not used, as reading by page indexes was more flexible and reliable
# than getting the LLM to reference headings by ID
# @mcp.tool()
# def read_heading_contents(heading_block_id: str) -> str:
#     """Read the contents of the document that are children of the given heading block referenced by `heading_block_id`"""
#     return document_navigator.read_heading_contents(heading_block_id)


if __name__ == "__main__":
    # Initialize and run the server
    mcp.run(transport="stdio")
```