afrise/academic-search-mcp-server # codebase.md

# Directory Structure

```
├── .gitignore
├── .python-version
├── .vscode
│   └── launch.json
├── Dockerfile
├── launch.json
├── LICENSE
├── pyproject.toml
├── README.md
├── server.py
├── smithery.yaml
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------

```
3.10
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Python-generated files
__pycache__/
*.py[oc]
build/
dist/
wheels/
*.egg-info

logs/
docs/

# Virtual environments
.venv
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

````markdown
# Academic Paper Search MCP Server

[![smithery badge](https://smithery.ai/badge/@afrise/academic-search-mcp-server)](https://smithery.ai/server/@afrise/academic-search-mcp-server)

A [Model Context Protocol (MCP)](https://www.anthropic.com/news/model-context-protocol) server that enables searching and retrieving academic paper information from multiple sources.

The server provides LLMs with:
- Real-time academic paper search functionality
- Access to paper metadata and abstracts
- Ability to retrieve full-text content when available
- Structured data responses following the MCP specification

While primarily designed for integration with Anthropic's Claude Desktop client, the MCP specification allows for potential compatibility with other AI models and clients that support tool/function calling capabilities (e.g. OpenAI's API).

**Note**: This software is under active development. Features and functionality are subject to change.

<a href="https://glama.ai/mcp/servers/kzsu1zzz9j"><img width="380" height="200" src="https://glama.ai/mcp/servers/kzsu1zzz9j/badge" alt="Academic Paper Search Server MCP server" /></a>

## Features

This server exposes the following tools:
- `search_papers`: Search for academic papers across multiple sources
  - Parameters:
    - `query` (str): Search query text
    - `limit` (int, optional): Maximum number of results to return (default: 10)
  - Returns: Formatted string containing paper details

- `fetch_paper_details`: Retrieve detailed information for a specific paper
  - Parameters:
    - `paper_id` (str): Paper identifier (DOI for Crossref, paper ID for Semantic Scholar)
    - `source` (str, optional): Data source ("semantic_scholar" or "crossref", default: "semantic_scholar")
  - Returns: Formatted string with comprehensive paper metadata including:
    - Title, authors, year, DOI
    - Venue, open access status, PDF URL (Semantic Scholar only)
    - Abstract and TL;DR summary (Semantic Scholar only, when available)

- `search_by_topic`: Search for papers by topic with optional date range filter
  - Parameters:
    - `topic` (str): Search query text (truncated to 300 characters)
    - `year_start` (int, optional): Start year for date range
    - `year_end` (int, optional): End year for date range
    - `limit` (int, optional): Maximum number of results to return (default: 10)
  - Returns: Formatted string containing search results including:
    - Paper titles, authors, and years
    - Abstracts and TL;DR summaries when available
    - Venue and open access information

## Setup

### Installing via Smithery

To install Academic Paper Search Server for Claude Desktop automatically via [Smithery](https://smithery.ai/server/@afrise/academic-search-mcp-server):

```bash
npx -y @smithery/cli install @afrise/academic-search-mcp-server --client claude
```

**Note**: This method is largely untested because the Smithery server has been unreliable. Until that is resolved, follow the manual instructions below.

### Installing via uv (manual install)

1. Install dependencies:
```sh
uv add "mcp[cli]" httpx
```

2. Optionally set API keys in your environment or `.env` file. The server does not read these yet, so they currently have no effect:
```sh
#  These are not actually implemented
SEMANTIC_SCHOLAR_API_KEY=your_key_here
CROSSREF_API_KEY=your_key_here  # Optional but recommended
```

3. Run the server:
```sh
uv run server.py
```

## Usage with Claude Desktop

1. Add the server to your Claude Desktop configuration (`claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "academic-search": {
      "command": "uv",
      "args": ["run", "/path/to/server/server.py"],
      "env": {
        "SEMANTIC_SCHOLAR_API_KEY": "your_key_here",
        "CROSSREF_API_KEY": "your_key_here"
      }
    }
  }
}
```

2. Restart Claude Desktop

## Development

This server is built using:
- Python MCP SDK
- FastMCP for simplified server implementation
- httpx for API requests

## API Sources

- Semantic Scholar API
- Crossref API

## License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). This license ensures that:

- You can freely use, modify, and distribute this software
- Any modifications must be open-sourced under the same license
- Anyone providing network services using this software must make the source code available
- Commercial use is allowed, but the software and any derivatives must remain free and open source

See the [LICENSE](LICENSE) file for the full license text.

## Contributing

Contributions are welcome! Here's how you can help:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

Please note:
- Follow the existing code style and conventions
- Add tests for any new functionality
- Update documentation as needed
- Ensure your changes respect the AGPL-3.0 license terms

By contributing to this project, you agree that your contributions will be licensed under the AGPL-3.0 license.
````
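
The Claude Desktop entry shown in the README can also be generated programmatically, which helps avoid small typos such as stray whitespace inside `args`. The following is a minimal sketch and not part of this repository: `build_server_entry` and `merge_into_config` are hypothetical helper names, and the server path is a placeholder.

```python
from __future__ import annotations

import json


def build_server_entry(server_path: str, env: dict[str, str] | None = None) -> dict:
    """Build an `mcpServers` entry that launches server.py through uv."""
    entry: dict = {"command": "uv", "args": ["run", server_path]}
    if env:
        # Drop empty values so unset keys are not passed as blank strings
        entry["env"] = {name: value for name, value in env.items() if value}
    return entry


def merge_into_config(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of `config` with `entry` registered under `name`."""
    merged = dict(config)
    merged["mcpServers"] = {**config.get("mcpServers", {}), name: entry}
    return merged


if __name__ == "__main__":
    entry = build_server_entry("/path/to/server/server.py")
    print(json.dumps(merge_into_config({}, "academic-search", entry), indent=2))
```

The merge step copies any existing `mcpServers` entries, so other servers already registered in the configuration are left in place.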

--------------------------------------------------------------------------------
/launch.json:
--------------------------------------------------------------------------------

```json
```

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[project]
name = "academic-search"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "httpx>=0.28.1",
    "mcp[cli]>=1.2.1",
]
```

--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------

```dockerfile
# Generated by https://smithery.ai. See: https://smithery.ai/docs/config#dockerfile
# Use the official Python image with version 3.10
FROM python:3.10-slim

# Set working directory
WORKDIR /app

# Copy the project files into the container
COPY . /app

# Install dependencies from pyproject.toml using uv
# We will install uv first to use it for dependency management
RUN pip install uv

# Install the project's dependencies using the lockfile
RUN uv sync --frozen --no-install-project --no-dev --no-editable

# Set environment variables for the API keys
ENV SEMANTIC_SCHOLAR_API_KEY=your_key_here
ENV CROSSREF_API_KEY=your_key_here

# Expose the port that the server will run on
EXPOSE 8000

# Command to run the server
CMD ["uv", "run", "server.py"]
```
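
The `ENV` lines above define `SEMANTIC_SCHOLAR_API_KEY` and `CROSSREF_API_KEY`, but `server.py` never reads either variable. One way the Semantic Scholar key could be wired in is sketched below. This is an illustration under assumptions, not the project's implementation: `build_headers` is a hypothetical helper, the placeholder check mirrors the `your_key_here` value used in this repository, and the `x-api-key` header name is taken from Semantic Scholar's API documentation. Crossref is left out because it is not clear which credential its placeholder is meant to hold.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

USER_AGENT = "scientific-literature-app/1.0"
PLACEHOLDER = "your_key_here"  # value baked into the Dockerfile and README examples


def build_headers(environ: Mapping[str, str] | None = None) -> dict[str, str]:
    """Return request headers, adding the API key only when a real one is set."""
    env = os.environ if environ is None else environ
    headers = {"User-Agent": USER_AGENT}
    key = (env.get("SEMANTIC_SCHOLAR_API_KEY") or "").strip()
    # Skip empty values and the documentation placeholder
    if key and key != PLACEHOLDER:
        headers["x-api-key"] = key
    return headers


if __name__ == "__main__":
    print(build_headers())
```

Ignoring the placeholder means an image built from this Dockerfile without overriding the variable would still send unauthenticated requests rather than an invalid key.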

--------------------------------------------------------------------------------
/.vscode/launch.json:
--------------------------------------------------------------------------------

```json
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [

        {
            "name": "MCP Inspector",
            "type": "node",
            "request": "launch",
            "runtimeExecutable": "npx",
            "args": [
                "@modelcontextprotocol/inspector",
                "uv",
                "run",
                "G:/code/science/server.py"],
            "env": {
                "PYTHONIOENCODING": "utf-8",
                "LANG": "en_US.UTF-8"
            },
            "console": "integratedTerminal"
        }
    ]
}
```

--------------------------------------------------------------------------------
/smithery.yaml:
--------------------------------------------------------------------------------

```yaml
# Smithery configuration file: https://smithery.ai/docs/config#smitheryyaml

startCommand:
  type: stdio
  configSchema:
    # JSON Schema defining the configuration options for the MCP.
    # Both keys are optional; server.py does not read them yet.
    type: object
    properties:
      semanticScholarApiKey:
        type: string
        description: The API key for Semantic Scholar (optional).
      crossrefApiKey:
        type: string
        description: The API key for Crossref (optional).
  commandFunction:
    # A function that produces the CLI command to start the MCP on stdio.
    |-
    (config) => ({ command: 'uv', args: ['run', 'server.py'], env: { SEMANTIC_SCHOLAR_API_KEY: config.semanticScholarApiKey || '', CROSSREF_API_KEY: config.crossrefApiKey || '' } })
```

--------------------------------------------------------------------------------
/server.py:
--------------------------------------------------------------------------------

```python
import logging
import sys
import unicodedata
from typing import Any

import httpx
from mcp.server.fastmcp import FastMCP

# Log through the logging module (stderr by default); stdout carries the MCP stdio transport
logger = logging.getLogger(__name__)

# Make sure stdout is UTF-8 so non-ASCII titles and author names are emitted intact
if (sys.stdout.encoding or '').lower() != 'utf-8':
    sys.stdout.reconfigure(encoding='utf-8')

# Initialize FastMCP server
mcp = FastMCP("scientific_literature")

# Constants
SEMANTIC_SCHOLAR_API = "https://api.semanticscholar.org/graph/v1"
CROSSREF_API = "https://api.crossref.org/works"
USER_AGENT = "scientific-literature-app/1.0"
MAX_QUERY_LENGTH = 300
SEMANTIC_SCHOLAR_FIELDS = "title,authors,year,paperId,externalIds,abstract,venue,isOpenAccess,openAccessPdf,tldr"


async def make_api_request(url: str, headers: dict | None = None, params: dict | None = None) -> dict[str, Any] | None:
    """GET a JSON document from `url`; return None on any network, HTTP, or decoding error."""
    # Always send the User-Agent, even when the caller supplies extra headers
    request_headers = {"User-Agent": USER_AGENT, **(headers or {})}
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(url, headers=request_headers, params=params, timeout=30.0)
            response.raise_for_status()
            return response.json()
        except Exception as e:
            logger.warning("Request to %s failed: %s", url, e)
            return None


def format_paper_data(data: dict, source: str) -> str:
    """Format paper data from different sources into a consistent string format."""
    if not data:
        return "No paper data available"

    try:
        if source == "semantic_scholar":
            title = unicodedata.normalize('NFKD', str(data.get('title') or 'No title available'))
            authors = ', '.join(author.get('name') or 'Unknown Author' for author in data.get('authors') or [])
            year = data.get('year') or 'Year unknown'
            external_ids = data.get('externalIds') or {}
            doi = external_ids.get('DOI', 'No DOI available')
            venue = data.get('venue') or 'Venue unknown'
            abstract = data.get('abstract') or 'No abstract available'
            tldr = (data.get('tldr') or {}).get('text', '')
            is_open = "Yes" if data.get('isOpenAccess') else "No"
            pdf_data = data.get('openAccessPdf') or {}
            pdf_url = pdf_data.get('url', 'Not available')

        elif source == "crossref":
            title = (data.get('title') or ['No title available'])[0]
            authors = ', '.join([
                f"{author.get('given', '')} {author.get('family', '')}".strip() or 'Unknown Author'
                for author in data.get('author') or []
            ])
            # Online-only records have no 'published-print'; fall back to other date fields
            published = data.get('published-print') or data.get('published-online') or data.get('issued') or {}
            date_parts = published.get('date-parts') or [[None]]
            year = (date_parts[0][0] if date_parts[0] else None) or 'Year unknown'
            doi = data.get('DOI') or 'No DOI available'

        else:
            return f"Unsupported source: {source}"

        result = [
            f"Title: {title}",
            f"Authors: {authors}",
            f"Year: {year}",
            f"DOI: {doi}"
        ]

        if source == "semantic_scholar":
            result.extend([
                f"Venue: {venue}",
                f"Open Access: {is_open}",
                f"PDF URL: {pdf_url}",
                f"Abstract: {abstract}"
            ])
            if tldr:
                result.append(f"TL;DR: {tldr}")

        return "\n".join(result) + "\n"

    except Exception as e:
        return f"Error formatting paper data: {str(e)}"


@mcp.tool()
async def search_papers(query: str, limit: int = 10) -> str:
    """Search for papers across multiple sources.

    Args:
        query: the search query
        limit: the maximum number of results to return (default 10)
    """
    if not query.strip():
        return "Please provide a search query."

    # Truncate long queries
    query = query[:MAX_QUERY_LENGTH]

    try:
        # Search Semantic Scholar; passing params lets httpx URL-encode the query
        semantic_data = await make_api_request(
            f"{SEMANTIC_SCHOLAR_API}/paper/search",
            params={"query": query, "limit": limit, "fields": SEMANTIC_SCHOLAR_FIELDS},
        )

        # Search Crossref
        crossref_data = await make_api_request(CROSSREF_API, params={"query": query, "rows": limit})

        results = []

        # Semantic Scholar returns its matches under the 'data' key
        if semantic_data and semantic_data.get('data'):
            results.append("=== Semantic Scholar Results ===")
            for paper in semantic_data['data']:
                results.append(format_paper_data(paper, "semantic_scholar"))

        if crossref_data and crossref_data.get('message', {}).get('items'):
            results.append("\n=== Crossref Results ===")
            for paper in crossref_data['message']['items']:
                results.append(format_paper_data(paper, "crossref"))

        if not results:
            return "No results found or error occurred while fetching papers."

        return "\n".join(results)
    except Exception as e:
        logger.warning("search_papers failed: %s", e)
        return "Error searching papers."


@mcp.tool()
async def fetch_paper_details(paper_id: str, source: str = "semantic_scholar") -> str:
    """Get detailed information about a specific paper.

    Args:
        paper_id: Paper identifier (DOI for Crossref, paper ID for Semantic Scholar)
        source: Source database ("semantic_scholar" or "crossref")
    """
    if source == "semantic_scholar":
        url = f"{SEMANTIC_SCHOLAR_API}/paper/{paper_id}"
        # Without an explicit field list the API returns only the paper ID and title
        params = {"fields": SEMANTIC_SCHOLAR_FIELDS}
    elif source == "crossref":
        url = f"{CROSSREF_API}/{paper_id}"
        params = None
    else:
        return "Unsupported source. Please use 'semantic_scholar' or 'crossref'."

    data = await make_api_request(url, params=params)

    if not data:
        return f"Unable to fetch paper details from {source}."

    if source == "crossref":
        data = data.get('message', {})

    return format_paper_data(data, source)


@mcp.tool()
async def search_by_topic(topic: str, year_start: int | None = None, year_end: int | None = None, limit: int = 10) -> str:
    """Search for papers by topic with optional date range.

    Note: Query length is limited to 300 characters. Longer queries will be automatically truncated.

    Args:
        topic (str): Search query (max 300 chars)
        year_start (int, optional): Start year for date range
        year_end (int, optional): End year for date range
        limit (int, optional): Maximum number of results to return (default 10)

    Returns:
        str: Formatted search results or error message
    """
    try:
        # Truncate long queries to prevent API errors
        topic = topic[:MAX_QUERY_LENGTH]

        # Try Semantic Scholar API first
        params = {"query": topic, "limit": limit, "fields": SEMANTIC_SCHOLAR_FIELDS}
        # The year filter accepts "2016-2020", "2016-" (from), and "-2020" (until)
        if year_start or year_end:
            params["year"] = f"{year_start or ''}-{year_end or ''}"

        data = await make_api_request(
            f"{SEMANTIC_SCHOLAR_API}/paper/search",
            headers={"Accept": "application/json"},
            params=params,
        )

        if data and data.get('data'):
            results = ["=== Search Results ==="]
            for paper in data['data']:
                results.append(format_paper_data(paper, "semantic_scholar"))
            return "\n".join(results)

        # Fall back to the combined search, which does not apply the year filter
        return await search_papers(topic, limit)

    except Exception as e:
        logger.warning("search_by_topic failed: %s", e)
        return "Error searching papers!"


if __name__ == "__main__":
    # Initialize and run the server
    mcp.run(transport='stdio')
```
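
The search tools in `server.py` build Semantic Scholar requests from a topic, a result limit, and an optional year range. The sketch below isolates that step as a pure function so it can be checked without network access. `build_search_params` is a hypothetical helper rather than code from this repository; the 300-character cap comes from the source, the shortened field list is illustrative, and the open-ended range forms (`2016-`, `-2020`) are assumed from Semantic Scholar's documented `year` filter.

```python
from __future__ import annotations

from urllib.parse import urlencode

MAX_QUERY_LENGTH = 300
FIELDS = "title,authors,year,externalIds,abstract,venue"  # illustrative subset


def build_search_params(
    topic: str,
    year_start: int | None = None,
    year_end: int | None = None,
    limit: int = 10,
) -> dict[str, str | int]:
    """Build query parameters for a paper search, leaving URL encoding to the HTTP client."""
    params: dict[str, str | int] = {
        "query": topic.strip()[:MAX_QUERY_LENGTH],
        "limit": limit,
        "fields": FIELDS,
    }
    # Emit the filter when either bound is given; a missing bound is left open
    if year_start is not None or year_end is not None:
        start = "" if year_start is None else str(year_start)
        end = "" if year_end is None else str(year_end)
        params["year"] = f"{start}-{end}"
    return params


if __name__ == "__main__":
    params = build_search_params("graph neural networks & chemistry", year_start=2019)
    # urlencode shows what an HTTP client would put on the wire
    print(urlencode(params))
```

Passing a dictionary like this as `params` to `httpx.AsyncClient.get` keeps characters such as `&` or `#` in the topic from being interpreted as URL syntax, which is the risk with interpolating the raw query into the URL string.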