deepseekmine/mcp-pdf-reader # codebase.md

# Directory Structure

```
├── .idea
│   ├── vcs.xml
│   └── workspace.xml
├── pdf_server.py
├── README.md
└── requirements.txt
```

# Files

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
  1 | # 📄 MCP PDF Server
  2 | 
  3 | A PDF file reading server based on [FastMCP](https://github.com/minimaxir/fastmcp).
  4 | 
  5 | Supports PDF text extraction, OCR recognition, and image extraction via the MCP protocol, with a built-in web debugger for easy testing.
  6 | 
  7 | ---
  8 | 
  9 | ## 🚀 Features
 10 | 
 11 | - **read_pdf_text**  
 12 |   Extracts normal text from a PDF (page by page).
 13 | 
 14 | - **read_by_ocr**  
 15 |   Uses OCR to recognize text from scanned or image-based PDFs.
 16 | 
 17 | - **read_pdf_images**  
 18 |   Extracts all images from a specified PDF page (Base64 encoded output).
 19 | 
 20 | ---
 21 | 
 22 | ## 📂 Project Structure
 23 | 
 24 | ```
 25 | mcp-pdf-server/
 26 | ├── pdf_resources/        # Directory for uploaded and processed PDF files
 27 | ├── pdf_server.py         # Main server entry point
 28 | └── README.md             # Project documentation
 29 | ```
 30 | 
 31 | ---
 32 | 
 33 | ## ⚙️ Installation
 34 | 
 35 | Recommended Python version: 3.9+
 36 | 
 37 | ```bash
 38 | pip install pymupdf mcp
 39 | ```
 40 | 
 41 | > Note: OCR uses PyMuPDF's Tesseract integration, so Tesseract language data (tessdata) must be installed and discoverable, for example through the `TESSDATA_PREFIX` environment variable.
 42 | 
 43 | ---
 44 | 
 45 | ## 🔦 Start the Server
 46 | 
 47 | Run the following command:
 48 | 
 49 | ```bash
 50 | python pdf_server.py
 51 | ```
 52 | 
 53 | You should see logs like:
 54 | 
 55 | ```
 56 | Serving on http://127.0.0.1:6231
 57 | ```
 58 | 
 59 | ---
 60 | 
 61 | ## 🌐 Web Debugging Interface
 62 | 
 63 | Open your browser and visit:
 64 | 
 65 | ```
 66 | http://127.0.0.1:6231
 67 | ```
 68 | 
 69 | - Select a tool from the left panel
 70 | - Fill in parameters on the right panel
 71 | - Click "Run" to test the tool
 72 | 
 73 | No coding required — easily debug and test via the web UI.
 74 | 
 75 | ---
 76 | 
 77 | ## 🛠️ API Tool List
 78 | 
 79 | | Tool | Description | Input Parameters | Returns |
 80 | |:-----|:------------|:-----------------|:--------|
 81 | | `read_pdf_text` | Extracts normal text from PDF pages | `file_path`, `start_page`, `end_page` | Dict with `page_count` and `pages` (list of `{page_number, text}`) |
 82 | | `read_by_ocr` | Recognizes text via OCR | `file_path`, `start_page`, `end_page`, `language`, `dpi` | Dict with `text`, `page_count`, and `extracted_pages` |
 83 | | `read_pdf_images` | Extracts images from a PDF page | `file_path`, `page_number` | Dict with an `images` list (each entry has `image_id`, `width`, `height`, `format`, `image_b64`) |
 84 | 
 85 | ---
 86 | 
 87 | ## 📝 Example Usage
 88 | 
 89 | Extract text from pages 1 to 5:
 90 | 
 91 | ```bash
 92 | mcp run read_pdf_text --args '{"file_path": "pdf_resources/example.pdf", "start_page": 1, "end_page": 5}'
 93 | ```
 94 | 
 95 | Perform OCR recognition on page 1:
 96 | 
 97 | ```bash
 98 | mcp run read_by_ocr --args '{"file_path": "pdf_resources/example.pdf", "start_page": 1, "end_page": 1, "language": "eng"}'
 99 | ```
100 | 
101 | Extract all images from page 3:
102 | 
103 | ```bash
104 | mcp run read_pdf_images --args '{"file_path": "pdf_resources/example.pdf", "page_number": 3}'
105 | ```
106 | 
107 | ---
108 | 
109 | ## 📢 Notes
110 | 
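 111 | - `file_path` is used as given, so pass an absolute path or a path relative to the server's working directory (for example `pdf_resources/example.pdf`).
 112 | - OCR needs Tesseract language data for the requested `language` code; if OCR fails on a page, the server logs a warning and falls back to normal text extraction for that page.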
113 | - When processing large files, adjust memory and timeout settings as needed.
114 | 
115 | ---
116 | 
117 | ## 📜 License
118 | 
119 | This project is licensed under the MIT License.  
120 | For commercial use, please credit the original source.
121 | 
122 | ---
```
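The README above states that `read_pdf_images` returns Base64-encoded images. As a minimal client-side sketch, the snippet below decodes such a result and writes each image to disk. It assumes the result has the shape returned by `pdf_server.py` (an `images` list whose entries carry `image_id`, `format`, and `image_b64`). The helper name `save_images` is hypothetical and not part of the repository.

```python
import base64
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def save_images(result: Dict[str, List[Dict[str, Any]]], out_dir: Path) -> List[Path]:
    """Decode every Base64 image in a read_pdf_images-style result into out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for image in result.get("images", []):
        # The file name is built from the tool's image_id and format fields.
        target = out_dir / f"{image['image_id']}.{image['format']}"
        target.write_bytes(base64.b64decode(image["image_b64"]))
        written.append(target)
    return written


if __name__ == "__main__":
    # Stand-in payload; a real result would come from calling the tool.
    fake_result = {
        "images": [
            {
                "image_id": "p3_img0",
                "width": 1,
                "height": 1,
                "format": "png",
                "image_b64": base64.b64encode(b"placeholder bytes").decode("utf-8"),
            }
        ]
    }
    with tempfile.TemporaryDirectory() as tmp:
        for path in save_images(fake_result, Path(tmp)):
            print(path.name)
```

Because the payload is plain Base64, no PDF library is needed on the client side; only the server has to import PyMuPDF.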

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------

```
1 | pymupdf
2 | mcp
3 | 
```
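The server does `import fitz`, which is the module name provided by the PyMuPDF distribution (installed as `pymupdf` in the README). The sketch below is one way to check which of the expected distributions are present before starting the server. The helper name `installed_versions` is hypothetical, and the list of names is an assumption based on the README's install command.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names assumed from the README's `pip install pymupdf mcp`.
    for name, version in installed_versions(["pymupdf", "mcp"]).items():
        print(f"{name}: {version or 'not installed'}")
```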

--------------------------------------------------------------------------------
/.idea/vcs.xml:
--------------------------------------------------------------------------------

```
1 | <?xml version="1.0" encoding="UTF-8"?>
2 | <project version="4">
3 |   <component name="VcsDirectoryMappings">
4 |     <mapping directory="$PROJECT_DIR$/.." vcs="Git" />
5 |     <mapping directory="$PROJECT_DIR$" vcs="Git" />
6 |   </component>
7 | </project>
```

--------------------------------------------------------------------------------
/pdf_server.py:
--------------------------------------------------------------------------------

```python
  1 | from typing import Optional, List, Dict, Any
  2 | import os, json
  3 | import base64
  4 | import fitz
  5 | from mcp.server.fastmcp import FastMCP
  6 | import uuid
  7 | import logging
  8 | from datetime import datetime
  9 | 
 10 | logging.basicConfig(level=logging.INFO)
 11 | logger = logging.getLogger('mcp-pdf-server')
 12 | 
 13 | PDF_DIR = os.path.join(os.getcwd(), "pdf_resources")
 14 | os.makedirs(PDF_DIR, exist_ok=True)
 15 | 
 16 | mcp = FastMCP("PDF Reader", version="1.0.0", description="MCP server for PDF reading")
 17 | 
 18 | pdf_resources = {}
 19 | pdf_cache = {}
 20 | 
 21 | 
 22 | @mcp.tool()
 23 | def read_pdf_text(file_path: str, start_page: int = 1, end_page: Optional[int] = None) -> Dict[
 24 |     str, Any]:
 25 |     """
 26 |     Read normal text from a PDF file, one text per page.
 27 | 
 28 |     Args:
 29 |         file_path: Path to the PDF file
 30 |         start_page: Start page (1-based)
 31 |         end_page: End page (inclusive)
 32 | 
 33 |     Returns:
 34 |         Dict containing:
 35 |             - page_count: total number of pages
 36 |             - pages: list of {page_number, text}
 37 |     """
 38 |     if not os.path.exists(file_path):
 39 |         raise ValueError(f"File not found: {file_path}")
 40 | 
 41 |     doc = fitz.open(file_path)
 42 |     total_pages = len(doc)
 43 | 
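 44 |     # Clamp both bounds into 1..total_pages so an out-of-range start cannot index past the last page.
 45 |     start_page = max(1, min(start_page, total_pages))
 46 |     if end_page is None:
 47 |         end_page = total_pages
 48 |     end_page = max(1, min(end_page, total_pages))
 49 |     start_page, end_page = sorted((start_page, end_page))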
 50 | 
 51 |     pages = []
 52 | 
 53 |     for page_num in range(start_page - 1, end_page):
 54 |         page = doc[page_num]
 55 |         page_text = page.get_text()
 56 | 
 57 |         pages.append({"page_number": page_num + 1, "text": page_text.strip()})
 58 | 
 59 |     doc.close()
 60 | 
 61 |     return {"page_count": total_pages, "pages": pages}
 62 | 
 63 | 
 64 | @mcp.tool()
 65 | def read_by_ocr(file_path: str, start_page: int = 1, end_page: Optional[int] = None,
 66 |         language: str = "eng", dpi: int = 300) -> Dict[str, Any]:
 67 |     """
 68 |     Read text from PDF file using OCR.
 69 |     Args:
 70 |         file_path: Path to the PDF file
 71 |         start_page: Start page (1-based)
 72 |         end_page: End page (inclusive)
 73 |         language: OCR language code
 74 |         dpi: OCR DPI
 75 |     Returns:
 76 |         Dict with extracted text, page_count, extracted_pages
 77 |     """
 78 |     if not os.path.exists(file_path):
 79 |         raise ValueError(f"File not found: {file_path}")
 80 | 
 81 |     doc = fitz.open(file_path)
 82 |     total_pages = len(doc)
 83 | 
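 84 |     # Clamp both bounds into 1..total_pages so an out-of-range start cannot index past the last page.
 85 |     start_page = max(1, min(start_page, total_pages))
 86 |     if end_page is None:
 87 |         end_page = total_pages
 88 |     end_page = max(1, min(end_page, total_pages))
 89 |     start_page, end_page = sorted((start_page, end_page))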
 90 | 
 91 |     text_content = ""
 92 |     for page_num in range(start_page - 1, end_page):
 93 |         page = doc[page_num]
 94 |         try:
 95 |             textpage = page.get_textpage_ocr(flags=3, language=language, dpi=dpi, full=True)
 96 |             page_text = page.get_text(textpage=textpage)
 97 |         except Exception as e:
 98 |             logger.warning(f"OCR failed on page {page_num + 1}, fallback to normal text: {e}")
 99 |             page_text = page.get_text()
100 | 
101 |         text_content += page_text + "\n\n"
102 | 
103 |     doc.close()
104 | 
105 |     return {"text": text_content, "page_count": total_pages,
106 |         "extracted_pages": list(range(start_page, end_page + 1))}
107 | 
108 | 
109 | @mcp.tool()
110 | def read_pdf_images(file_path: str, page_number: int=1) -> Dict[str, List[Dict[str, Any]]]:
111 |     """
112 |     Extract images from a specific page in PDF.
113 |     Args:
114 |         file_path: Path to the PDF file
115 |         page_number: Page number (1-based)
116 |     Returns:
117 |         Dict with list of images (base64 format)
118 |     """
119 |     if not os.path.exists(file_path):
120 |         raise ValueError(f"File not found: {file_path}")
121 | 
122 |     doc = fitz.open(file_path)
123 |     total_pages = len(doc)
124 | 
125 |     if page_number < 1 or page_number > total_pages:
126 |         raise ValueError(f"Page number {page_number} out of range (1-{total_pages})")
127 | 
128 |     page = doc[page_number - 1]
129 |     image_list = page.get_images(full=True)
130 | 
131 |     images = []
132 |     for idx, img in enumerate(image_list):
133 |         xref = img[0]
134 |         base_image = doc.extract_image(xref)
135 |         image_data = base_image["image"]
136 |         image_ext = base_image["ext"]
137 |         image_b64 = base64.b64encode(image_data).decode('utf-8')
138 | 
139 |         images.append({"image_id": f"p{page_number}_img{idx}", "width": base_image["width"],
140 |             "height": base_image["height"], "format": image_ext, "image_b64": image_b64})
141 | 
142 |     doc.close()
143 | 
144 |     return {"images": images}
145 | 
146 | 
147 | if __name__ == "__main__":
148 |     logger.info("Starting MCP PDF Server...")
149 |     logger.info(f"PDF resources will be stored in: {PDF_DIR}")
150 |     mcp.run()
151 | 
```
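Both `read_pdf_text` and `read_by_ocr` accept a 1-based `start_page` and an inclusive `end_page`, then convert them to zero-based indices. The sketch below shows one way to factor that normalization into a pure function that can be tested without opening a PDF. The helper names `normalize_page_range` and `page_indices` are hypothetical, and clamping an out-of-range start to the last page (rather than raising) is an assumption, not documented behavior.

```python
from typing import Optional, Tuple


def normalize_page_range(
    start_page: int, end_page: Optional[int], total_pages: int
) -> Tuple[int, int]:
    """Return a 1-based, inclusive (start, end) pair clamped to 1..total_pages."""
    if total_pages < 1:
        raise ValueError("document has no pages")
    start = max(1, min(start_page, total_pages))
    end = total_pages if end_page is None else max(1, min(end_page, total_pages))
    # Accept reversed input such as (4, 2) by ordering the pair.
    return (start, end) if start <= end else (end, start)


def page_indices(start_page: int, end_page: Optional[int], total_pages: int) -> range:
    """Zero-based page indices covering the normalized range."""
    start, end = normalize_page_range(start_page, end_page, total_pages)
    return range(start - 1, end)


if __name__ == "__main__":
    print(normalize_page_range(10, None, 5))
    print(list(page_indices(2, 3, 5)))
```

Keeping the range logic separate from the PyMuPDF calls means both tools can share it and edge cases can be covered by plain unit tests.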