# Directory Structure ``` ├── .gitignore ├── .python-version ├── document-agent │ ├── document.py │ ├── img │ │ └── img-chat.png │ ├── README.md │ ├── server.py │ └── submit_parse_job.py ├── multi-agent │ └── server.py ├── pyproject.toml ├── README.md ├── single_agent │ └── server.py └── uv.lock ``` # Files -------------------------------------------------------------------------------- /.python-version: -------------------------------------------------------------------------------- ``` 1 | 3.10 2 | ``` -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- ``` 1 | # Python-generated files 2 | __pycache__/ 3 | *.py[oc] 4 | build/ 5 | dist/ 6 | wheels/ 7 | *.egg-info 8 | 9 | # Virtual environments 10 | .venv 11 | 12 | # Credentials 13 | .env 14 | 15 | # Cursor 16 | .cursor/ 17 | mcp.json ``` -------------------------------------------------------------------------------- /document-agent/README.md: -------------------------------------------------------------------------------- ```markdown 1 | # Document Navigator Agent 2 | 3 | An MCP server that enables agents to understand and navigate large, complex documents, using agentic RAG tools powered by document metadata inferred by the Contextual AI [/parse API](https://docs.contextual.ai/api-reference/parse/parse-file). 4 | 5 | This is a prototype showing how to: 6 | - Get document comprehension right in Cursor (or any MCP client) purely with function calls 7 | - Ask more complex queries than are possible with naive RAG 8 | - Get interpretable attributions via tool call traces as the agent navigates a document 9 | 10 |  11 | 12 | > **Note:** For faster iteration/caching, documents are referenced by Job IDs obtained from Contextual AI's [/parse API](https://docs.contextual.ai/api-reference/parse/parse-file) after they've completed processing.
The demo uses the [US Govt Financial Report FY2024](https://www.fiscal.treasury.gov/files/reports-statements/financial-report/2024/01-16-2025-FR-(Final).pdf). 13 | 14 | 15 | ## Quick Setup 16 | 17 | A quick overview of the steps needed to get started: 18 | 1. Set up your local Python environment with `uv` 19 | 1. Follow [this](../README.md#installation) to create an env and install dependencies 20 | 2. Create a `.cursor/mcp.json` file 21 | 1. See details below on the config file and enable it in Cursor following [this](https://docs.cursor.com/context/model-context-protocol) 22 | 3. Get a Contextual AI API key 23 | 1. Add it to a `.env` file in your Cursor workspace as `CTXL_API_KEY=key-XX` 24 | 4. Submit a `/parse` job with your document to get a job ID 25 | 1. Use `uv run submit_parse_job.py "FILE_OR_URL"` 26 | 27 | 28 | ### MCP JSON config file 29 | 30 | Get the path to your `uv` binary using `which uv` (e.g. `/Users/username/miniconda3/envs/envname/bin/uv`) 31 | 32 | Add to your `.cursor/mcp.json`: 33 | ```json 34 | { 35 | "mcpServers": { 36 | "ContextualAI-DocumentNavigatorAgent": { 37 | "command": "/path/to/your/uv", 38 | "args": [ 39 | "--directory", 40 | "/path/to/contextual-mcp-server", 41 | "run", 42 | "document-agent/server.py" 43 | ] 44 | } 45 | } 46 | } 47 | ``` 48 | 49 | This can be configured for use with other MCP clients, e.g. [Claude Desktop](https://modelcontextprotocol.io/quickstart/user).
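If you want to confirm a parse job has finished before pointing the agent at it, here is a minimal sketch using the Contextual AI Python SDK. It mirrors the polling call in `submit_parse_job.py`; the helper names are illustrative and not part of this repo:

```python
import os


def describe_job(job_id: str, status: str) -> str:
    """Pure helper: render a one-line, human-readable job report."""
    return f"Parse job {job_id} is {status}"


def check_parse_job(job_id: str) -> str:
    """Look up the status of a /parse job (requires CTXL_API_KEY in your environment)."""
    # Imported lazily so this sketch loads even without the SDK installed
    from contextual import ContextualAI

    client = ContextualAI(api_key=os.getenv("CTXL_API_KEY"))
    result = client.parse.job_status(job_id)  # same call that submit_parse_job.py polls
    return describe_job(job_id, result.status)
```

Once the job reports `completed`, the job ID is ready to hand to the document agent in Cursor.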
50 | 51 | 52 | ## Key Components 53 | 54 | ### `server.py` 55 | Main MCP server with three core tools: 56 | - `initialize_document_agent(job_id)` - Switch between documents 57 | - `read_hierarchy()` - Get document outline and structure 58 | - `read_pages(rationale, start_index, end_index)` - Read specific page ranges 59 | 60 | ### `document.py` 61 | Contains the `ParsedDocumentNavigator` class, which wraps Contextual AI's parse output for easy navigation: 62 | - Access the document hierarchy as a kind of [llms.txt](https://llmstxt.org/) file 63 | - Page-based content retrieval 64 | 65 | ## Usage Examples 66 | 67 | ```python 68 | # In Cursor, ask questions like: 69 | "Initialize document agent with job ID abc-123" 70 | "Can you give me an overview of the document with page numbers" 71 | "Can you summarize parts of the document about US government debt?" 72 | ``` 73 | 74 | 75 | ## Development 76 | 77 | To extend functionality, add new `@mcp.tool()` decorated functions in `server.py`. Refer to [this](../README.md#development) for more. 78 | 79 | 80 | ## Extensions 81 | 82 | This is a simple prototype showing agentic RAG purely through function calls over tools enabled by document structure metadata inferred by the `/parse` API. In practice, combining it with text/semantic retrieval via [Contextual AI datastores](https://docs.contextual.ai/user-guides/beginner-guide) lets you scale your agent's context to a corpus with 10-100x more such documents, while still supporting agentic retrieval of context for complex synthesis and summarization. 83 | ``` -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- ```markdown 1 | # Contextual MCP Server 2 | 3 | A Model Context Protocol (MCP) server that provides RAG (Retrieval-Augmented Generation) capabilities using Contextual AI. This server integrates with a variety of MCP clients.
It provides flexibility: you can decide what functionality to offer in the server. In this README, we will show integration with both Cursor IDE and Claude Desktop. 4 | 5 | Contextual AI now offers a hosted server inside the platform, available at: https://mcp.app.contextual.ai/mcp/ 6 | After you connect to the server, you can use the tools, such as query, provided by the platform MCP server. 7 | For a complete walkthrough, check out the MCP [user guide](https://docs.contextual.ai/user-guides/mcp-server). 8 | 9 | 10 | ## Overview 11 | 12 | An MCP server acts as a bridge between AI interfaces (Cursor IDE or Claude Desktop) and a specialized Contextual AI agent. It enables: 13 | 14 | 1. **Query Processing**: Direct your domain-specific questions to a dedicated Contextual AI agent 15 | 2. **Intelligent Retrieval**: Searches through comprehensive information in your knowledge base 16 | 3. **Context-Aware Responses**: Generates answers that: 17 | - Are grounded in source documentation 18 | - Include citations and attributions 19 | - Maintain conversation context 20 | 21 | 22 | ### Integration Flow 23 | 24 | ``` 25 | Cursor/Claude Desktop → MCP Server → Contextual AI RAG Agent 26 | ↑ ↓ ↓ 27 | └──────────────────┴─────────────┴─────────────── Response with citations 28 | ``` 29 | 30 | ## Prerequisites 31 | 32 | - Python 3.10 or higher 33 | - Cursor IDE and/or Claude Desktop 34 | - Contextual AI API key 35 | - MCP-compatible environment 36 | 37 | 38 | ## Installation 39 | 40 | 1. Clone the repository: 41 | ```bash 42 | git clone https://github.com/ContextualAI/contextual-mcp-server.git 43 | cd contextual-mcp-server 44 | ``` 45 | 46 | 2. Create and activate a virtual environment: 47 | ```bash 48 | python -m venv .venv 49 | source .venv/bin/activate  # On Windows, use `.venv\Scripts\activate` 50 | ``` 51 | 52 | 3. Install dependencies: 53 | ```bash 54 | pip install -e .
55 | ``` 56 | 57 | ## Configuration 58 | 59 | ### Configure MCP Server 60 | 61 | The server requires some configuration before use. 62 | For example, the single_agent server should be customized with an appropriate docstring for your RAG Agent. 63 | 64 | The docstring for your query tool is critical, as it helps the MCP client understand when to route questions to your RAG agent. Make it specific to your knowledge domain. Here is an example: 65 | ``` 66 | A research tool focused on financial data on the largest US firms 67 | ``` 68 | or 69 | ``` 70 | A research tool focused on technical documents for Omaha semiconductors 71 | ``` 72 | 73 | The server also requires the following settings from your RAG Agent: 74 | - `API_KEY`: Your Contextual AI API key 75 | - `AGENT_ID`: Your Contextual AI agent ID 76 | 77 | If you'd like to store these settings in a `.env` file, you can specify them like so: 78 | 79 | ```bash 80 | cat > .env << EOF 81 | API_KEY=key... 82 | AGENT_ID=... 83 | EOF 84 | ``` 85 | 86 | The repo also contains more advanced MCP servers for multi-agent systems and a [document-agent](https://www.linkedin.com/feed/update/urn:li:activity:7346595035770929152/). 87 | 88 | ### AI Interface Integration 89 | 90 | This MCP server can be integrated with a variety of clients. To use it with either Cursor IDE or Claude Desktop, create or modify the MCP configuration file in the appropriate location: 91 | 92 | 1. First, find the path to your `uv` installation: 93 | ```bash 94 | UV_PATH=$(which uv) 95 | echo $UV_PATH 96 | # Example output: /Users/username/miniconda3/bin/uv 97 | ``` 98 | 99 | 2.
Create the configuration file using the full path from step 1: 100 | 101 | ```bash 102 | cat > mcp.json << EOF 103 | { 104 | "mcpServers": { 105 | "ContextualAI-TechDocs": { 106 | "command": "$UV_PATH", 107 | "args": [ 108 | "--directory", 109 | "\${workspaceFolder}", 110 | "run", 111 | "multi-agent/server.py" 112 | ] 113 | } 114 | } 115 | } 116 | EOF 117 | ``` 118 | Note: `$UV_PATH` is expanded by your shell when the file is written, and `${workspaceFolder}` will be replaced with your project path. JSON does not allow inline comments, so keep the file free of them. 119 | 3. Move the file to the correct folder location; see below for options: 120 | 121 | ```bash 122 | mkdir -p .cursor/ 123 | mv mcp.json .cursor/ 124 | ``` 125 | 126 | Configuration locations: 127 | - For Cursor: 128 | - Project-specific: `.cursor/mcp.json` in your project directory 129 | - Global: `~/.cursor/mcp.json` for system-wide access 130 | - For Claude Desktop: 131 | - Use the same configuration file format in the appropriate Claude Desktop configuration directory 132 | 133 | 134 | ### Environment Setup 135 | 136 | This project uses `uv` for dependency management, which provides faster and more reliable Python package installation. 137 | 138 | ## Usage 139 | 140 | The server provides Contextual AI RAG capabilities using the Python SDK, which makes a variety of commands available to MCP clients such as Cursor IDE and Claude Desktop. 141 | The current server focuses on the query command from the Contextual AI Python SDK; however, you could extend this to support other features such as listing all the agents, updating retrieval settings, updating prompts, extracting retrievals, or downloading metrics. 142 | 143 | ### Example Usage 144 | ```python 145 | # In Cursor, you might ask: 146 | "Show me the code for initiating the RF345 microchip?" 147 | 148 | # The MCP client will: 149 | 1. Determine if this should be routed to the MCP Server 150 | 151 | # Then the MCP server will: 152 | 1. Route the query to the Contextual AI agent 153 | 2. Retrieve relevant documentation 154 | 3.
Generate a response with specific citations 155 | 4. Return the formatted answer to Cursor 156 | ``` 157 | 158 | 159 | ### Key Benefits 160 | 1. **Accurate Responses**: All answers are grounded in your documentation 161 | 2. **Source Attribution**: Every response includes references to source documents 162 | 3. **Context Awareness**: The system maintains conversation context for follow-up questions 163 | 4. **Real-time Updates**: Responses reflect the latest documentation in your datastore 164 | 165 | 166 | ## Development 167 | 168 | ### Modifying the Server 169 | 170 | To add new capabilities: 171 | 172 | 1. Add new tools by creating additional functions decorated with `@mcp.tool()` 173 | 2. Define the tool's parameters using Python type hints 174 | 3. Provide a clear docstring describing the tool's functionality 175 | 176 | Example: 177 | ```python 178 | @mcp.tool() 179 | def new_tool(param: str) -> str: 180 | """Description of what the tool does""" 181 | # Implementation 182 | return result 183 | ``` 184 | 185 | ## Limitations 186 | 187 | - The server runs locally and may not work in remote development environments 188 | - Tool responses are subject to Contextual AI API limits and quotas 189 | - Currently only supports stdio transport mode 190 | 191 | 192 | For all the capabilities of Contextual AI, please check the [official documentation](https://docs.contextual.ai/). 
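As a concrete example of the extensions mentioned in the Usage section, here is a hedged sketch of a tool that lists all agents. It follows the `client.agents.list()` pattern from `multi-agent/server.py`; the `list_agents` name and the formatting helper are illustrative, not part of this repo:

```python
import os


def format_agent_listing(agents: list[tuple[str, str, str]]) -> str:
    """Pure helper: render (id, name, description) triples as a bullet list."""
    return "\n".join(
        f"- {name} ({agent_id}): {description}"
        for agent_id, name, description in agents
    )


def list_agents() -> str:
    """List the Contextual AI agents available to this API key."""
    # Imported lazily so this sketch loads even without the SDK installed
    from contextual import ContextualAI

    client = ContextualAI(api_key=os.getenv("API_KEY"))
    triples = [(a.id, a.name, a.description) for a in client.agents.list()]
    return format_agent_listing(triples)
```

In a real server, `list_agents` would be decorated with `@mcp.tool()` and registered on the `FastMCP` instance, exactly like the existing `query` tool.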
193 | ``` -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- ```toml 1 | [project] 2 | name = "contextual-mcp-server" 3 | version = "0.1.0" 4 | description = "Add your description here" 5 | readme = "README.md" 6 | requires-python = ">=3.10" 7 | dependencies = [ 8 | "contextual-client>=0.5.1", 9 | "httpx>=0.28.1", 10 | "mcp[cli]>=1.6.0", 11 | "dotenv", 12 | "uv", 13 | "tiktoken" 14 | ] 15 | ``` -------------------------------------------------------------------------------- /single_agent/server.py: -------------------------------------------------------------------------------- ```python 1 | from contextual import ContextualAI 2 | from mcp.server.fastmcp import FastMCP 3 | 4 | API_KEY = "" 5 | AGENT = "" 6 | 7 | # Create an MCP server 8 | mcp = FastMCP("Contextual AI RAG Platform") 9 | 10 | # Add query tool to interact with Contextual agent 11 | @mcp.tool() 12 | def query(prompt: str) -> str: 13 | """An enterprise search tool that can answer questions about a specific knowledge base""" 14 | client = ContextualAI( 15 | api_key=API_KEY, # This is the default and can be omitted 16 | ) 17 | query_result = client.agents.query.create( 18 | agent_id=AGENT, 19 | messages=[{ 20 | "content": prompt, 21 | "role": "user" 22 | }] 23 | ) 24 | return query_result.message.content 25 | 26 | if __name__ == "__main__": 27 | # Initialize and run the server 28 | mcp.run(transport='stdio') 29 | ``` -------------------------------------------------------------------------------- /multi-agent/server.py: -------------------------------------------------------------------------------- ```python 1 | from contextual import ContextualAI 2 | from mcp.server.fastmcp import FastMCP 3 | 4 | API_KEY = "" 5 | 6 | # Create an MCP server 7 | mcp = FastMCP("Contextual AI RAG Platform") 8 | 9 | # Add query tool to interact with Contextual agent 10 | @mcp.tool() 11 | def 
query(prompt: str) -> str: 12 | """An enterprise search tool that can answer questions about any sort of knowledge base""" 13 | 14 | client = ContextualAI( 15 | api_key=API_KEY, # This is the default and can be omitted 16 | ) 17 | 18 | instruction = "Rank documents based on their ability to answer the question/query" 19 | 20 | agents = {} 21 | for agent in client.agents.list(): 22 | agents.update({agent.id: f"{agent.name} - {agent.description}"}) 23 | documents = list(agents.values()) 24 | 25 | results = client.rerank.create( 26 | model="ctxl-rerank-en-v1-instruct", 27 | instruction=instruction, 28 | query=prompt, 29 | documents=documents, 30 | # note: rerank also accepts per-document metadata; omitted here 31 | top_n=1 32 | ) 33 | 34 | agent_index = results.results[0].index 35 | agent_id = list(agents.keys())[agent_index] 36 | 37 | query_result = client.agents.query.create( 38 | agent_id=agent_id, 39 | messages=[{ 40 | "content": prompt, 41 | "role": "user" 42 | }] 43 | ) 44 | return query_result.message.content 45 | 46 | if __name__ == "__main__": 47 | # Initialize and run the server 48 | mcp.run(transport='stdio') 49 | ``` -------------------------------------------------------------------------------- /document-agent/document.py: -------------------------------------------------------------------------------- ```python 1 | class ParsedDocumentNavigator: 2 | """ 3 | This class wraps `/parse` API output, exposing methods that enable an LLM agent to 4 | navigate and interact with the parsed document.
5 | """ 6 | 7 | def __init__(self, parsed_document): 8 | self.parsed_document = parsed_document 9 | self.block_map = { 10 | block.id: block 11 | for page in self.parsed_document.pages 12 | for block in page.blocks 13 | } 14 | self.heading_block_map = { 15 | block.id: block 16 | for block in self.parsed_document.document_metadata.hierarchy.blocks 17 | } 18 | 19 | def read_document(self) -> str: 20 | """ 21 | Read contents of the entire document as markdown (may be large) 22 | """ 23 | return self.read_pages(list(range(len(self.parsed_document.pages)))) 24 | 25 | def read_hierarchy(self) -> tuple[str, list[dict]]: 26 | """ 27 | Read the parsed heading structure of the entire document. 28 | 29 | Result is a tuple of: 30 | (i) human/LLM readable document hierarchy with page indexes (a.k.a. table of contents) 31 | (ii) JSON list of headings in the document hierarchy 32 | """ 33 | hierarchy_markdown = ( 34 | self.parsed_document.document_metadata.hierarchy.table_of_contents 35 | ) 36 | 37 | hierarchy_list = [] 38 | for block in self.parsed_document.document_metadata.hierarchy.blocks: 39 | hierarchy_list.append( 40 | { 41 | "block_id": block.id, # might need to translate uuid to an LLM-friendly integer index instead 42 | "hierarchy_level": block.hierarchy_level, 43 | "markdown": block.markdown, 44 | "page_index": block.page_index, 45 | } 46 | ) 47 | return hierarchy_markdown, hierarchy_list 48 | 49 | def read_pages(self, page_indexes: list[int]) -> str: 50 | """ 51 | Read the contents of the document for the provided page indexes 52 | """ 53 | page_separator = "\n\n---\nPage index: {page_index}\n\n" 54 | content = "" 55 | for page_index in page_indexes: 56 | content += ( 57 | page_separator.format(page_index=page_index) 58 | + self.parsed_document.pages[page_index].markdown 59 | ) 60 | return content 61 | 62 | def read_heading_contents(self, heading_block_id: str) -> str: 63 | """ 64 | Read the contents of the document that are children of the given heading block
referenced by `heading_block_id` 65 | """ 66 | heading_block = self.heading_block_map[heading_block_id] 67 | parent_path_prefix = heading_block.parent_ids + [heading_block_id] 68 | 69 | section_blocks = [] 70 | for page in self.parsed_document.pages: 71 | for block in page.blocks: 72 | # filter for blocks that share the same parent path 73 | if block.parent_ids[: len(parent_path_prefix)] == parent_path_prefix: 74 | section_blocks.append(block) 75 | 76 | section_content = "\n".join([block.markdown for block in section_blocks]) 77 | section_prefix = "\n".join( 78 | [ 79 | self.heading_block_map[block_id].markdown 80 | for block_id in parent_path_prefix 81 | ] 82 | ) 83 | 84 | return section_prefix + "\n\n" + section_content 85 | ``` -------------------------------------------------------------------------------- /document-agent/submit_parse_job.py: -------------------------------------------------------------------------------- ```python 1 | import argparse 2 | import os 3 | import time 4 | from urllib.parse import urlparse 5 | 6 | import httpx 7 | from contextual import ContextualAI 8 | from dotenv import load_dotenv 9 | 10 | load_dotenv() 11 | 12 | CTXL_API_KEY = os.getenv("CTXL_API_KEY") 13 | 14 | 15 | def submit_parse_job(file_path: str, polling_interval_s: int = 30): 16 | """Submits a file to the /parse endpoint and waits for completion.""" 17 | if not CTXL_API_KEY: 18 | raise ValueError("CTXL_API_KEY environment variable not set.") 19 | 20 | client = ContextualAI(api_key=CTXL_API_KEY) 21 | 22 | print(f"Submitting '{file_path}' for parsing...") 23 | with open(file_path, "rb") as fp: 24 | response = client.parse.create( 25 | raw_file=fp, 26 | parse_mode="standard", 27 | enable_document_hierarchy=True, 28 | ) 29 | 30 | job_id = response.job_id 31 | print(f"Parse job submitted. 
Job ID: {job_id}") 32 | print( 33 | f"You can view the job in the UI at: https://app.contextual.ai/{{tenant}}/components/parse?job={job_id}" 34 | ) 35 | print("(Remember to replace {tenant} with your workspace name)") 36 | 37 | print("Waiting for job to complete...") 38 | while True: 39 | try: 40 | result = client.parse.job_status(job_id) 41 | status = result.status 42 | print(f"Job status: {status}") 43 | if status == "completed": 44 | print(f"Job completed successfully. Job ID: {job_id}") 45 | break 46 | elif status in ["failed", "cancelled"]: 47 | print(f"Job {status}. Aborting.") 48 | break 49 | time.sleep(polling_interval_s) 50 | except Exception as e: 51 | print(f"An error occurred while checking job status: {e}") 52 | break 53 | 54 | return job_id 55 | 56 | 57 | def download_file(url: str, output_dir: str = "."): 58 | """Downloads a file from a URL.""" 59 | try: 60 | response = httpx.get(url, follow_redirects=True) 61 | response.raise_for_status() 62 | 63 | # get filename from URL 64 | parsed_url = urlparse(url) 65 | filename = os.path.basename(parsed_url.path) 66 | if not filename: 67 | filename = "downloaded_file" # fallback 68 | 69 | file_path = os.path.join(output_dir, filename) 70 | 71 | with open(file_path, "wb") as f: 72 | f.write(response.content) 73 | 74 | print(f"File downloaded to {file_path}") 75 | return file_path 76 | except httpx.RequestError as e: 77 | print(f"Error downloading file: {e}") 78 | return None 79 | 80 | 81 | def main(): 82 | parser = argparse.ArgumentParser( 83 | description="Submit a document to the Contextual AI /parse API." 
84 | ) 85 | parser.add_argument("path_or_url", help="Local file path or URL to the document.") 86 | args = parser.parse_args() 87 | 88 | path_or_url = args.path_or_url 89 | 90 | downloaded_file_path = None 91 | try: 92 | if urlparse(path_or_url).scheme in ("http", "https"): 93 | print(f"Input is a URL: {path_or_url}") 94 | file_path = download_file(path_or_url) 95 | if not file_path: 96 | return 97 | downloaded_file_path = file_path 98 | elif os.path.isfile(path_or_url): 99 | print(f"Input is a local file: {path_or_url}") 100 | file_path = path_or_url 101 | else: 102 | print(f"Error: Input '{path_or_url}' is not a valid file path or URL.") 103 | return 104 | 105 | submit_parse_job(file_path) 106 | 107 | finally: 108 | if downloaded_file_path: 109 | # Clean up the downloaded file 110 | print(f"Cleaning up downloaded file: {downloaded_file_path}") 111 | os.remove(downloaded_file_path) 112 | 113 | 114 | if __name__ == "__main__": 115 | main() 116 | ``` -------------------------------------------------------------------------------- /document-agent/server.py: -------------------------------------------------------------------------------- ```python 1 | import os 2 | 3 | from contextual import ContextualAI 4 | from document import ParsedDocumentNavigator 5 | from dotenv import load_dotenv 6 | from mcp.server.fastmcp import FastMCP 7 | from tiktoken import encoding_for_model 8 | 9 | load_dotenv() 10 | 11 | CTXL_API_KEY = os.getenv("CTXL_API_KEY") 12 | 13 | 14 | def initialize_document_navigator(parse_job_id: str): 15 | ctxl_client = ContextualAI(api_key=CTXL_API_KEY) 16 | parsed_document = ctxl_client.parse.job_results( 17 | parse_job_id, output_types=["markdown-per-page", "blocks-per-page"] 18 | ) 19 | document_navigator = ParsedDocumentNavigator(parsed_document) 20 | return document_navigator 21 | 22 | 23 | def count_tokens_fast(text: str) -> int: 24 | """ 25 | Count tokens in a string using a fast approximation. 
26 | """ 27 | multiplier, max_chars = 1.0, 80000 # ~20k tokens 28 | if len(text) > max_chars: 29 | multiplier = len(text) / max_chars 30 | text = text[:max_chars] 31 | n_tokens = len(encoding_for_model("gpt-4o").encode(text)) 32 | return int(n_tokens * multiplier) 33 | 34 | 35 | document_navigator = None 36 | mcp = FastMCP( 37 | name="CTXL Document Navigator", 38 | instructions=""" 39 | You are a document comprehension agent that uses tools to navigate, read and understand a document. 40 | """, 41 | ) 42 | 43 | 44 | @mcp.tool() 45 | def initialize_document_agent(job_id: str) -> str: 46 | """ 47 | Initialize the document agent with a provided job id. 48 | 49 | Guidance: 50 | - When asked for an outline of the document, read the hierarchy and then look up an initial few pages of the document before answering. 51 | - Use this to request the user to provide a job id for a document so you can answer questions about it. 52 | - When answering questions, provide references to page indexes used in the answer. 53 | """ 54 | global document_navigator 55 | document_navigator = initialize_document_navigator(job_id) 56 | message = f"Document agent initialized for job id: {job_id}" 57 | # add summary stats for the document 58 | n_pages = len(document_navigator.parsed_document.pages) 59 | n_doc_tokens = count_tokens_fast(document_navigator.read_document()) 60 | n_hierarchy_tokens = count_tokens_fast(document_navigator.read_hierarchy()[0]) 61 | stats = f""" 62 | - document has {n_doc_tokens} tokens, {n_pages} pages 63 | - hierarchy has {n_hierarchy_tokens} tokens 64 | """ 65 | return f"{message}\n{stats}" 66 | 67 | 68 | @mcp.tool() 69 | def read_hierarchy() -> str: 70 | """ 71 | Read a markdown nested list of the hierarchical structure of the document. 72 | This contains headings with their nesting as well as the page index where the section with this heading starts. 
73 | 74 | Guidance: 75 | - Use these results to look up the start and end page indexes to read the contents of a specific section for further context. 76 | """ 77 | return document_navigator.read_hierarchy()[ 78 | 0 79 | ] # human/llm readable index structure for the document 80 | 81 | 82 | @mcp.tool() 83 | def read_pages(rationale: str, start_index: int, end_index: int) -> str: 84 | """ 85 | Read the contents of the document between the start and end page indexes, both inclusive. 86 | Provide a brief 1-line rationale for what you are trying to read e.g. the name of the section or other context. 87 | """ 88 | page_indexes = list(range(start_index, end_index + 1)) 89 | return document_navigator.read_pages(page_indexes) 90 | 91 | 92 | # NOTE: not used, but could be exposed with some control over context utilization 93 | # @mcp.tool() 94 | # def read_document() -> str: 95 | # """ 96 | # Read contents of the entire document as markdown (may be large) 97 | # """ 98 | # return document_navigator.read_document() 99 | 100 | # NOTE: not used, as reading by page indexes was more flexible and reliable than getting LLM to reference headings by ID 101 | # @mcp.tool() 102 | # def read_heading_contents(heading_block_id: str) -> str: 103 | # """Read the contents of the document that are children of the given heading block referenced by `heading_block_id`""" 104 | # return document_navigator.read_heading_contents(heading_block_id) 105 | 106 | 107 | if __name__ == "__main__": 108 | # Initialize and run the server 109 | mcp.run(transport="stdio") 110 | ```