This is page 1 of 2. Use http://codebase.md/yuzongmin/semantic-scholar-fastmcp-mcp-server?lines=true&page={x} to view the full context. # Directory Structure ``` ├── .gitignore ├── Dockerfile ├── LICENSE ├── README.md ├── REFACTORING.md ├── requirements.txt ├── run.py ├── semantic_scholar │ ├── __init__.py │ ├── api │ │ ├── __init__.py │ │ ├── authors.py │ │ ├── papers.py │ │ └── recommendations.py │ ├── config.py │ ├── mcp.py │ ├── server.py │ └── utils │ ├── __init__.py │ ├── errors.py │ └── http.py ├── semantic_scholar_server.py ├── smithery.yaml ├── test │ ├── __init__.py │ ├── test_author.py │ ├── test_paper.py │ ├── test_recommend.py │ └── test_utils.py └── TOOLS.md ``` # Files -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- ``` 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | share/python-wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | MANIFEST 28 | 29 | # PyInstaller 30 | # Usually these files are written by a python script from a template 31 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
32 | *.manifest 33 | *.spec 34 | 35 | # Installer logs 36 | pip-log.txt 37 | pip-delete-this-directory.txt 38 | 39 | # Unit test / coverage reports 40 | htmlcov/ 41 | .tox/ 42 | .nox/ 43 | .coverage 44 | .coverage.* 45 | .cache 46 | nosetests.xml 47 | coverage.xml 48 | *.cover 49 | *.py,cover 50 | .hypothesis/ 51 | .pytest_cache/ 52 | cover/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | .pybuilder/ 76 | target/ 77 | 78 | # Jupyter Notebook 79 | .ipynb_checkpoints 80 | 81 | # IPython 82 | profile_default/ 83 | ipython_config.py 84 | 85 | # pyenv 86 | # For a library or package, you might want to ignore these files since the code is 87 | # intended to run in multiple environments; otherwise, check them in: 88 | # .python-version 89 | 90 | # pipenv 91 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 92 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 93 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 94 | # install all needed dependencies. 95 | #Pipfile.lock 96 | 97 | # UV 98 | # Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control. 99 | # This is especially recommended for binary packages to ensure reproducibility, and is more 100 | # commonly ignored for libraries. 101 | #uv.lock 102 | 103 | # poetry 104 | # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. 105 | # This is especially recommended for binary packages to ensure reproducibility, and is more 106 | # commonly ignored for libraries. 
107 | # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control 108 | #poetry.lock 109 | 110 | # pdm 111 | # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. 112 | #pdm.lock 113 | # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it 114 | # in version control. 115 | # https://pdm.fming.dev/latest/usage/project/#working-with-version-control 116 | .pdm.toml 117 | .pdm-python 118 | .pdm-build/ 119 | 120 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm 121 | __pypackages__/ 122 | 123 | # Celery stuff 124 | celerybeat-schedule 125 | celerybeat.pid 126 | 127 | # SageMath parsed files 128 | *.sage.py 129 | 130 | # Environments 131 | .env 132 | .venv 133 | env/ 134 | venv/ 135 | ENV/ 136 | env.bak/ 137 | venv.bak/ 138 | 139 | # Spyder project settings 140 | .spyderproject 141 | .spyproject 142 | 143 | # Rope project settings 144 | .ropeproject 145 | 146 | # mkdocs documentation 147 | /site 148 | 149 | # mypy 150 | .mypy_cache/ 151 | .dmypy.json 152 | dmypy.json 153 | 154 | # Pyre type checker 155 | .pyre/ 156 | 157 | # pytype static type analyzer 158 | .pytype/ 159 | 160 | # Cython debug symbols 161 | cython_debug/ 162 | 163 | # PyCharm 164 | # JetBrains specific template is maintained in a separate JetBrains.gitignore that can 165 | # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore 166 | # and can be added to the global gitignore or merged into this file. For a more nuclear 167 | # option (not recommended) you can uncomment the following to ignore the entire idea folder. 
168 | #.idea/ 169 | 170 | # Ruff stuff: 171 | .ruff_cache/ 172 | 173 | # PyPI configuration file 174 | .pypirc 175 | 176 | # Data files 177 | *.npy 178 | *.npz 179 | *.mat 180 | *.pkl 181 | 182 | # Checkpoint files 183 | _METADATA 184 | _CHECKPOINT_METADATA 185 | 186 | # Experimental results 187 | experimental_results/ 188 | saved_models/ 189 | 190 | # VS Code 191 | .vscode/ 192 | 193 | # macOS 194 | .DS_Store 195 | ``` -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- ```markdown 1 | # Semantic Scholar MCP Server 2 | 3 | [](https://smithery.ai/server/semantic-scholar-fastmcp-mcp-server) 4 | 5 | A FastMCP server implementation for the Semantic Scholar API, providing comprehensive access to academic paper data, author information, and citation networks. 6 | 7 | ## Project Structure 8 | 9 | The project has been refactored into a modular structure for better maintainability: 10 | 11 | ``` 12 | semantic-scholar-server/ 13 | ├── semantic_scholar/ # Main package 14 | │ ├── __init__.py # Package initialization 15 | │ ├── server.py # Server setup and main functionality 16 | │ ├── mcp.py # Centralized FastMCP instance definition 17 | │ ├── config.py # Configuration classes 18 | │ ├── utils/ # Utility modules 19 | │ │ ├── __init__.py 20 | │ │ ├── errors.py # Error handling 21 | │ │ └── http.py # HTTP client and rate limiting 22 | │ ├── api/ # API endpoints 23 | │ ├── __init__.py 24 | │ ├── papers.py # Paper-related endpoints 25 | │ ├── authors.py # Author-related endpoints 26 | │ └── recommendations.py # Recommendation endpoints 27 | ├── run.py # Entry point script 28 | ``` 29 | 30 | This structure: 31 | 32 | - Separates concerns into logical modules 33 | - Makes the codebase easier to understand and maintain 34 | - Allows for better testing and future extensions 35 | - Keeps related functionality grouped together 36 | - Centralizes the FastMCP instance 
to avoid circular imports 37 | 38 | ## Features 39 | 40 | - **Paper Search & Discovery** 41 | 42 | - Full-text search with advanced filtering 43 | - Title-based paper matching 44 | - Paper recommendations (single and multi-paper) 45 | - Batch paper details retrieval 46 | - Advanced search with ranking strategies 47 | 48 | - **Citation Analysis** 49 | 50 | - Citation network exploration 51 | - Reference tracking 52 | - Citation context and influence analysis 53 | 54 | - **Author Information** 55 | 56 | - Author search and profile details 57 | - Publication history 58 | - Batch author details retrieval 59 | 60 | - **Advanced Features** 61 | - Complex search with multiple ranking strategies 62 | - Customizable field selection 63 | - Efficient batch operations 64 | - Rate limiting compliance 65 | - Support for both authenticated and unauthenticated access 66 | - Graceful shutdown and error handling 67 | - Connection pooling and resource management 68 | 69 | ## System Requirements 70 | 71 | - Python 3.8+ 72 | - FastMCP framework 73 | - Environment variable for API key (optional) 74 | 75 | ## Installation 76 | 77 | ### Installing via Smithery 78 | 79 | To install Semantic Scholar MCP Server for Claude Desktop automatically via [Smithery](https://smithery.ai/server/semantic-scholar-fastmcp-mcp-server): 80 | 81 | ```bash 82 | npx -y @smithery/cli install semantic-scholar-fastmcp-mcp-server --client claude 83 | ``` 84 | 85 | ### Manual Installation 86 | 87 | 1. Clone the repository: 88 | 89 | ```bash 90 | git clone https://github.com/YUZongmin/semantic-scholar-fastmcp-mcp-server.git 91 | cd semantic-scholar-server 92 | ``` 93 | 94 | 2. Install FastMCP and other dependencies following: https://github.com/jlowin/fastmcp 95 | 96 | 3. Configure FastMCP: 97 | 98 | For Claude Desktop users, you'll need to configure the server in your FastMCP configuration file. 
Add the following to your configuration (typically in `~/.config/claude-desktop/config.json`): 99 | 100 | ```json 101 | { 102 | "mcps": { 103 | "Semantic Scholar Server": { 104 | "command": "/path/to/your/venv/bin/fastmcp", 105 | "args": [ 106 | "run", 107 | "/path/to/your/semantic-scholar-server/run.py" 108 | ], 109 | "env": { 110 | "SEMANTIC_SCHOLAR_API_KEY": "your-api-key-here" 111 | } 112 | } 113 | } 114 | } 115 | ``` 116 | 117 | Make sure to: 118 | 119 | - Replace `/path/to/your/venv/bin/fastmcp` with the actual path to your FastMCP installation 120 | - Replace `/path/to/your/semantic-scholar-server/run.py` with the actual path to run.py on your machine 121 | - If you have a Semantic Scholar API key, add it to the `env` section (it is optional). If not, you can remove the `env` section entirely 122 | 123 | 4. Start using the server: 124 | 125 | The server will now be available to your Claude Desktop instance. No need to manually run any commands - Claude will automatically start and manage the server process when needed. 126 | 127 | ### API Key (Optional) 128 | 129 | To get higher rate limits and better performance: 130 | 131 | 1. Get an API key from [Semantic Scholar API](https://www.semanticscholar.org/product/api) 132 | 2. Add it to your FastMCP configuration as shown above in the `env` section 133 | 134 | If no API key is provided, the server will use unauthenticated access with lower rate limits.
135 | 136 | ## Configuration 137 | 138 | ### Environment Variables 139 | 140 | - `SEMANTIC_SCHOLAR_API_KEY`: Your Semantic Scholar API key (optional) 141 | - Get your key from [Semantic Scholar API](https://www.semanticscholar.org/product/api) 142 | - If not provided, the server will use unauthenticated access 143 | 144 | ### Rate Limits 145 | 146 | The server automatically adjusts to the appropriate rate limits: 147 | 148 | **With API Key**: 149 | 150 | - Search, batch and recommendation endpoints: 1 request per second 151 | - Other endpoints: 10 requests per second 152 | 153 | **Without API Key**: 154 | 155 | - All endpoints: 100 requests per 5 minutes 156 | - Longer timeouts for requests 157 | 158 | ## Available MCP Tools 159 | 160 | > Note: All tools are aligned with the official [Semantic Scholar API documentation](https://api.semanticscholar.org/api-docs/). Please refer to the official documentation for detailed field specifications and the latest updates. 161 | 162 | ### Paper Search Tools 163 | 164 | - `paper_relevance_search`: Search for papers using relevance ranking 165 | 166 | - Supports comprehensive query parameters including year range and citation count filters 167 | - Returns paginated results with customizable fields 168 | 169 | - `paper_bulk_search`: Bulk paper search with sorting options 170 | 171 | - Similar to relevance search but optimized for larger result sets 172 | - Supports sorting by citation count, publication date, etc. 173 | 174 | - `paper_title_search`: Find papers by exact title match 175 | 176 | - Useful for finding specific papers when you know the title 177 | - Returns detailed paper information with customizable fields 178 | 179 | - `paper_details`: Get comprehensive details about a specific paper 180 | 181 | - Accepts various paper ID formats (S2 ID, DOI, ArXiv, etc.) 
182 | - Returns detailed paper metadata with nested field support 183 | 184 | - `paper_batch_details`: Efficiently retrieve details for multiple papers 185 | - Accepts up to 1000 paper IDs per request 186 | - Supports the same ID formats and fields as single paper details 187 | 188 | ### Citation Tools 189 | 190 | - `paper_citations`: Get papers that cite a specific paper 191 | 192 | - Returns paginated list of citing papers 193 | - Includes citation context when available 194 | - Supports field customization and sorting 195 | 196 | - `paper_references`: Get papers referenced by a specific paper 197 | - Returns paginated list of referenced papers 198 | - Includes reference context when available 199 | - Supports field customization and sorting 200 | 201 | ### Author Tools 202 | 203 | - `author_search`: Search for authors by name 204 | 205 | - Returns paginated results with customizable fields 206 | - Includes affiliations and publication counts 207 | 208 | - `author_details`: Get detailed information about an author 209 | 210 | - Returns comprehensive author metadata 211 | - Includes metrics like h-index and citation counts 212 | 213 | - `author_papers`: Get papers written by an author 214 | 215 | - Returns paginated list of author's publications 216 | - Supports field customization and sorting 217 | 218 | - `author_batch_details`: Get details for multiple authors 219 | - Efficiently retrieve information for up to 1000 authors 220 | - Returns the same fields as single author details 221 | 222 | ### Recommendation Tools 223 | 224 | - `paper_recommendations_single`: Get recommendations based on a single paper 225 | 226 | - Returns similar papers based on content and citation patterns 227 | - Supports field customization for recommended papers 228 | 229 | - `paper_recommendations_multi`: Get recommendations based on multiple papers 230 | - Accepts positive and negative example papers 231 | - Returns papers similar to positive examples and dissimilar to negative ones 
232 | 233 | ## Usage Examples 234 | 235 | ### Basic Paper Search 236 | 237 | ```python 238 | results = await paper_relevance_search( 239 | context, 240 | query="machine learning", 241 | year="2020-2024", 242 | min_citation_count=50, 243 | fields=["title", "abstract", "authors"] 244 | ) 245 | ``` 246 | 247 | ### Paper Recommendations 248 | 249 | ```python 250 | # Single paper recommendation 251 | recommendations = await paper_recommendations_single( 252 | context, 253 | paper_id="649def34f8be52c8b66281af98ae884c09aef38b", 254 | fields="title,authors,year" 255 | ) 256 | 257 | # Multi-paper recommendation 258 | recommendations = await paper_recommendations_multi( 259 | context, 260 | positive_paper_ids=["649def34f8be52c8b66281af98ae884c09aef38b", "ARXIV:2106.15928"], 261 | negative_paper_ids=["ArXiv:1805.02262"], 262 | fields="title,abstract,authors" 263 | ) 264 | ``` 265 | 266 | ### Batch Operations 267 | 268 | ```python 269 | # Get details for multiple papers 270 | papers = await paper_batch_details( 271 | context, 272 | paper_ids=["649def34f8be52c8b66281af98ae884c09aef38b", "ARXIV:2106.15928"], 273 | fields="title,authors,year,citations" 274 | ) 275 | 276 | # Get details for multiple authors 277 | authors = await author_batch_details( 278 | context, 279 | author_ids=["1741101", "1780531"], 280 | fields="name,hIndex,citationCount,paperCount" 281 | ) 282 | ``` 283 | 284 | ## Error Handling 285 | 286 | The server provides standardized error responses: 287 | 288 | ```python 289 | { 290 | "error": { 291 | "type": "error_type", # rate_limit, api_error, validation, timeout 292 | "message": "Error description", 293 | "details": { 294 | # Additional context 295 | "authenticated": true/false # Indicates if request was authenticated 296 | } 297 | } 298 | } 299 | ``` 300 | ``` -------------------------------------------------------------------------------- /test/__init__.py: -------------------------------------------------------------------------------- ```python 1 | """Test 
package for semantic-scholar-server""" ``` -------------------------------------------------------------------------------- /semantic_scholar/utils/__init__.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | Utility modules for the Semantic Scholar API Server. 3 | """ ``` -------------------------------------------------------------------------------- /semantic_scholar/__init__.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | Semantic Scholar API Server Package 3 | 4 | A FastMCP-based server for accessing the Semantic Scholar Academic Graph API. 5 | """ 6 | 7 | __version__ = "0.1.0" ``` -------------------------------------------------------------------------------- /semantic_scholar/mcp.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | Central definition of the FastMCP instance. 3 | """ 4 | 5 | from fastmcp import FastMCP 6 | 7 | # Create FastMCP instance 8 | mcp = FastMCP("Semantic Scholar Server") ``` -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- ``` 1 | # HTTP client 2 | httpx>=0.24.0 3 | 4 | # Testing 5 | pytest>=7.3.1 6 | pytest-asyncio>=0.21.0 7 | 8 | # Environment 9 | python-dotenv>=1.0.0 10 | 11 | # Server dependencies 12 | uvicorn>=0.27.1 13 | fastmcp>=0.1.0 ``` -------------------------------------------------------------------------------- /semantic_scholar/api/__init__.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | API endpoints for the Semantic Scholar API Server. 
3 | """ 4 | 5 | # Import all endpoints to make them available when importing the package 6 | from .papers import ( 7 | paper_relevance_search, 8 | paper_bulk_search, 9 | paper_title_search, 10 | paper_details, 11 | paper_batch_details, 12 | paper_authors, 13 | paper_citations, 14 | paper_references 15 | ) 16 | 17 | from .authors import ( 18 | author_search, 19 | author_details, 20 | author_papers, 21 | author_batch_details 22 | ) 23 | 24 | from .recommendations import ( 25 | get_paper_recommendations_single, 26 | get_paper_recommendations_multi 27 | ) ``` -------------------------------------------------------------------------------- /smithery.yaml: -------------------------------------------------------------------------------- ```yaml 1 | # Smithery configuration file: https://smithery.ai/docs/config#smitheryyaml 2 | 3 | startCommand: 4 | type: stdio 5 | configSchema: 6 | # JSON Schema defining the configuration options for the MCP. 7 | type: object 8 | required: [] 9 | properties: 10 | semanticScholarApiKey: 11 | type: string 12 | description: The API key for the Semantic Scholar server. Optional for 13 | authenticated access. 14 | commandFunction: 15 | # A function that produces the CLI command to start the MCP on stdio. 16 | |- 17 | (config) => ({command:'python',args:['semantic_scholar_server.py'],env:{SEMANTIC_SCHOLAR_API_KEY:config.semanticScholarApiKey || ''}}) ``` -------------------------------------------------------------------------------- /semantic_scholar/utils/errors.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | Error handling utilities for the Semantic Scholar API Server. 3 | """ 4 | 5 | from typing import Dict, Optional 6 | from ..config import ErrorType 7 | 8 | def create_error_response( 9 | error_type: ErrorType, 10 | message: str, 11 | details: Optional[Dict] = None 12 | ) -> Dict: 13 | """ 14 | Create a standardized error response. 
15 | 16 | Args: 17 | error_type: The type of error that occurred. 18 | message: A human-readable message describing the error. 19 | details: Optional additional details about the error. 20 | 21 | Returns: 22 | A dictionary with the error information. 23 | """ 24 | return { 25 | "error": { 26 | "type": error_type.value, 27 | "message": message, 28 | "details": details or {} 29 | } 30 | } ``` -------------------------------------------------------------------------------- /run.py: -------------------------------------------------------------------------------- ```python 1 | #!/usr/bin/env python3 2 | """ 3 | Entry point script for the Semantic Scholar API Server. 4 | 5 | Available tools: 6 | - paper_relevance_search 7 | - paper_bulk_search 8 | - paper_title_search 9 | - paper_details 10 | - paper_batch_details 11 | - paper_authors 12 | - paper_citations 13 | - paper_references 14 | - author_search 15 | - author_details 16 | - author_papers 17 | - author_batch_details 18 | - get_paper_recommendations_single 19 | - get_paper_recommendations_multi 20 | """ 21 | 22 | # Import the mcp instance from centralized location 23 | from semantic_scholar.mcp import mcp 24 | # Import the main function from server 25 | from semantic_scholar.server import main 26 | 27 | # Import all API modules to ensure tools are registered 28 | from semantic_scholar.api import papers, authors, recommendations 29 | 30 | if __name__ == "__main__": 31 | main() ``` -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- ```dockerfile 1 | # Start from a base Python image 2 | FROM python:3.8-slim 3 | 4 | # Set the working directory 5 | WORKDIR /app 6 | 7 | # Copy the requirements file first to leverage Docker cache 8 | COPY requirements.txt /app/requirements.txt 9 | RUN pip install --no-cache-dir -r requirements.txt 10 | 11 | # Copy the rest of the application code 12 | # This includes 
the 'semantic_scholar' package and 'run.py' 13 | COPY . /app 14 | # Alternatively, be more specific: 15 | # COPY semantic_scholar /app/semantic_scholar 16 | # COPY run.py /app/run.py 17 | 18 | # Expose the port that the MCP server will run on 19 | EXPOSE 8000 20 | 21 | # Set the environment variable for the API key (placeholder) 22 | # Glama or the user should provide the actual key at runtime 23 | ENV SEMANTIC_SCHOLAR_API_KEY="" 24 | 25 | # Command to run the server using the refactored entry point 26 | CMD ["python", "run.py"] ``` -------------------------------------------------------------------------------- /test/test_recommend.py: -------------------------------------------------------------------------------- ```python 1 | import unittest 2 | import asyncio 3 | import os 4 | from typing import Optional, List, Dict 5 | 6 | from .test_utils import make_request, create_error_response, ErrorType, Config 7 | 8 | class TestRecommendationTools(unittest.TestCase): 9 | def setUp(self): 10 | """Set up test environment""" 11 | # API key is required for recommendations 12 | api_key = os.getenv("SEMANTIC_SCHOLAR_API_KEY") 13 | if not api_key: 14 | raise ValueError("SEMANTIC_SCHOLAR_API_KEY environment variable is required for recommendation tests") 15 | 16 | # Create event loop for async tests 17 | self.loop = asyncio.new_event_loop() 18 | asyncio.set_event_loop(self.loop) 19 | 20 | # Sample paper IDs for testing (using full IDs) 21 | self.sample_paper_id = "204e3073870fae3d05bcbc2f6a8e263d9b72e776" # "Attention is All You Need" 22 | self.positive_paper_ids = [ 23 | self.sample_paper_id, 24 | "df2b0e26d0599ce3e70df8a9da02e51594e0e992" # BERT 25 | ] 26 | self.negative_paper_ids = [ 27 | "649def34f8be52c8b66281af98ae884c09aef38b" # Different topic 28 | ] 29 | 30 | def tearDown(self): 31 | """Clean up after tests""" 32 | self.loop.close() 33 | 34 | def run_async(self, coro): 35 | """Helper to run async functions in tests""" 36 | return self.loop.run_until_complete(coro) 37 
| 38 | async def async_test_with_delay(self, coro): 39 | """Helper to run async tests with delay to handle rate limiting""" 40 | await asyncio.sleep(1) # Add 1 second delay between tests 41 | return await coro 42 | 43 | def test_paper_recommendations_single(self): 44 | """Test single paper recommendations functionality""" 45 | result = self.run_async(self.async_test_with_delay(make_request( 46 | f"papers/forpaper/{self.sample_paper_id}", # Using full paper ID 47 | params={ 48 | "fields": "title,year" # Minimal fields 49 | } 50 | ))) 51 | self.assertIn("recommendedPapers", result) 52 | self.assertTrue(isinstance(result["recommendedPapers"], list)) 53 | 54 | def test_paper_recommendations_multi(self): 55 | """Test multi-paper recommendations functionality""" 56 | result = self.run_async(self.async_test_with_delay(make_request( 57 | "papers", # No leading slash 58 | method="POST", 59 | params={"fields": "title,year"}, # Minimal fields 60 | json={ 61 | "positivePaperIds": self.positive_paper_ids, # Changed key name to match API 62 | "negativePaperIds": self.negative_paper_ids 63 | } 64 | ))) 65 | self.assertIn("recommendedPapers", result) 66 | self.assertTrue(isinstance(result["recommendedPapers"], list)) 67 | 68 | if __name__ == '__main__': 69 | unittest.main() 70 | ``` -------------------------------------------------------------------------------- /semantic_scholar/server.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | Main server module for the Semantic Scholar API Server. 
3 | """ 4 | 5 | import logging 6 | import asyncio 7 | import signal 8 | 9 | # Import mcp from centralized location 10 | from .mcp import mcp 11 | from .utils.http import initialize_client, cleanup_client 12 | 13 | # Configure logging 14 | logging.basicConfig(level=logging.INFO) 15 | logger = logging.getLogger(__name__) 16 | 17 | # Import API modules to register tools 18 | # Note: This must come AFTER mcp is initialized 19 | from .api import papers, authors, recommendations 20 | 21 | async def handle_exception(loop, context): 22 | """Global exception handler for the event loop.""" 23 | msg = context.get("exception", context["message"]) 24 | logger.error(f"Caught exception: {msg}") 25 | asyncio.create_task(shutdown()) 26 | 27 | async def shutdown(): 28 | """Gracefully shut down the server.""" 29 | logger.info("Initiating graceful shutdown...") 30 | 31 | # Cancel all tasks 32 | tasks = [t for t in asyncio.all_tasks() if t is not asyncio.current_task()] 33 | for task in tasks: 34 | task.cancel() 35 | try: 36 | await task 37 | except asyncio.CancelledError: 38 | pass 39 | 40 | # Cleanup resources 41 | await cleanup_client() 42 | await mcp.cleanup() 43 | 44 | logger.info(f"Cancelled {len(tasks)} tasks") 45 | logger.info("Shutdown complete") 46 | 47 | def init_signal_handlers(loop): 48 | """Initialize signal handlers for graceful shutdown.""" 49 | for sig in (signal.SIGTERM, signal.SIGINT): 50 | loop.add_signal_handler(sig, lambda: asyncio.create_task(shutdown())) 51 | logger.info("Signal handlers initialized") 52 | 53 | async def run_server(): 54 | """Run the server with proper async context management.""" 55 | async with mcp: 56 | try: 57 | # Initialize HTTP client 58 | await initialize_client() 59 | 60 | # Start the server 61 | logger.info("Starting Semantic Scholar Server") 62 | await mcp.run_async() 63 | except Exception as e: 64 | logger.error(f"Server error: {e}") 65 | raise 66 | finally: 67 | await shutdown() 68 | 69 | def main(): 70 | """Main entry point for the 
server.""" 71 | try: 72 | # Set up event loop with exception handler 73 | loop = asyncio.new_event_loop() 74 | asyncio.set_event_loop(loop) 75 | loop.set_exception_handler(handle_exception) 76 | 77 | # Initialize signal handlers 78 | init_signal_handlers(loop) 79 | 80 | # Run the server 81 | loop.run_until_complete(run_server()) 82 | except KeyboardInterrupt: 83 | logger.info("Received keyboard interrupt, shutting down...") 84 | except Exception as e: 85 | logger.error(f"Fatal error: {str(e)}") 86 | finally: 87 | try: 88 | loop.run_until_complete(asyncio.sleep(0)) # Let pending tasks complete 89 | loop.close() 90 | except Exception as e: 91 | logger.error(f"Error during final cleanup: {str(e)}") 92 | logger.info("Server stopped") 93 | 94 | if __name__ == "__main__": 95 | main() ``` -------------------------------------------------------------------------------- /test/test_author.py: -------------------------------------------------------------------------------- ```python 1 | import unittest 2 | import asyncio 3 | import os 4 | from typing import Optional, List, Dict 5 | 6 | from .test_utils import make_request, create_error_response, ErrorType, Config 7 | 8 | class TestAuthorTools(unittest.TestCase): 9 | def setUp(self): 10 | """Set up test environment""" 11 | # You can set your API key here for testing 12 | os.environ["SEMANTIC_SCHOLAR_API_KEY"] = "" # Optional 13 | 14 | # Create event loop for async tests 15 | self.loop = asyncio.new_event_loop() 16 | asyncio.set_event_loop(self.loop) 17 | 18 | # Sample author IDs for testing 19 | self.sample_author_id = "1741101" # Andrew Ng 20 | self.sample_author_ids = [ 21 | self.sample_author_id, 22 | "2061296" # Yann LeCun 23 | ] 24 | 25 | def tearDown(self): 26 | """Clean up after tests""" 27 | self.loop.close() 28 | 29 | def run_async(self, coro): 30 | """Helper to run async functions in tests""" 31 | return self.loop.run_until_complete(coro) 32 | 33 | async def async_test_with_delay(self, coro): 34 | """Helper to run 
async tests with delay to handle rate limiting""" 35 | await asyncio.sleep(1) # Add 1 second delay between tests 36 | return await coro 37 | 38 | def test_author_search(self): 39 | """Test author search functionality""" 40 | result = self.run_async(self.async_test_with_delay(make_request( 41 | "/author/search", 42 | params={ 43 | "query": "Andrew Ng", 44 | "fields": "name,affiliations,paperCount" 45 | } 46 | ))) 47 | self.assertIn("data", result) 48 | self.assertIn("total", result) 49 | 50 | def test_author_details(self): 51 | """Test author details functionality""" 52 | result = self.run_async(self.async_test_with_delay(make_request( 53 | f"/author/{self.sample_author_id}", 54 | params={ 55 | "fields": "name,affiliations,paperCount,citationCount,hIndex" 56 | } 57 | ))) 58 | self.assertIn("authorId", result) 59 | self.assertIn("name", result) 60 | 61 | def test_author_papers(self): 62 | """Test author papers functionality""" 63 | result = self.run_async(self.async_test_with_delay(make_request( 64 | f"/author/{self.sample_author_id}/papers", 65 | params={ 66 | "fields": "title,year,citationCount", 67 | "limit": 10 68 | } 69 | ))) 70 | self.assertIn("data", result) 71 | self.assertIn("next", result) 72 | self.assertIn("offset", result) 73 | self.assertTrue(isinstance(result["data"], list)) 74 | 75 | def test_author_batch_details(self): 76 | """Test batch author details functionality""" 77 | result = self.run_async(self.async_test_with_delay(make_request( 78 | "/author/batch", 79 | method="POST", 80 | params={"fields": "name,affiliations,paperCount"}, 81 | json={"ids": self.sample_author_ids} 82 | ))) 83 | self.assertTrue(isinstance(result, list)) 84 | self.assertEqual(len(result), len(self.sample_author_ids)) 85 | 86 | if __name__ == '__main__': 87 | unittest.main() 88 | ``` -------------------------------------------------------------------------------- /TOOLS.md: -------------------------------------------------------------------------------- ```markdown 1 | # 
Semantic Scholar Server Tools 2 | 3 | This document lists all the tools available in the Semantic Scholar API Server. 4 | 5 | ## Paper-related Tools 6 | 7 | ### `paper_relevance_search` 8 | 9 | Search for papers on Semantic Scholar using relevance-based ranking. 10 | 11 | ```json 12 | { 13 | "query": "quantum computing", 14 | "fields": ["title", "abstract", "year", "authors"], 15 | "limit": 10 16 | } 17 | ``` 18 | 19 | ### `paper_bulk_search` 20 | 21 | Bulk search for papers with advanced filtering and sorting options. 22 | 23 | ```json 24 | { 25 | "query": "machine learning", 26 | "fields": ["title", "abstract", "authors"], 27 | "sort": "citationCount:desc" 28 | } 29 | ``` 30 | 31 | ### `paper_title_search` 32 | 33 | Find a specific paper by matching its title. 34 | 35 | ```json 36 | { 37 | "query": "Attention Is All You Need", 38 | "fields": ["title", "abstract", "authors", "year"] 39 | } 40 | ``` 41 | 42 | ### `paper_details` 43 | 44 | Get detailed information about a specific paper by ID. 45 | 46 | ```json 47 | { 48 | "paper_id": "649def34f8be52c8b66281af98ae884c09aef38b", 49 | "fields": ["title", "abstract", "authors", "citations"] 50 | } 51 | ``` 52 | 53 | ### `paper_batch_details` 54 | 55 | Get details for multiple papers in one request. 56 | 57 | ```json 58 | { 59 | "paper_ids": ["649def34f8be52c8b66281af98ae884c09aef38b", "ARXIV:2106.15928"], 60 | "fields": "title,abstract,authors" 61 | } 62 | ``` 63 | 64 | ### `paper_authors` 65 | 66 | Get the authors of a specific paper. 67 | 68 | ```json 69 | { 70 | "paper_id": "649def34f8be52c8b66281af98ae884c09aef38b", 71 | "fields": ["name", "affiliations"] 72 | } 73 | ``` 74 | 75 | ### `paper_citations` 76 | 77 | Get papers that cite a specific paper. 78 | 79 | ```json 80 | { 81 | "paper_id": "649def34f8be52c8b66281af98ae884c09aef38b", 82 | "fields": ["title", "year", "authors"], 83 | "limit": 50 84 | } 85 | ``` 86 | 87 | ### `paper_references` 88 | 89 | Get papers referenced by a specific paper. 
90 | 91 | ```json 92 | { 93 | "paper_id": "649def34f8be52c8b66281af98ae884c09aef38b", 94 | "fields": ["title", "year", "authors"], 95 | "limit": 50 96 | } 97 | ``` 98 | 99 | ## Author-related Tools 100 | 101 | ### `author_search` 102 | 103 | Search for authors by name. 104 | 105 | ```json 106 | { 107 | "query": "Albert Einstein", 108 | "fields": ["name", "affiliations", "paperCount"] 109 | } 110 | ``` 111 | 112 | ### `author_details` 113 | 114 | Get detailed information about a specific author. 115 | 116 | ```json 117 | { 118 | "author_id": "1741101", 119 | "fields": ["name", "affiliations", "papers", "citationCount"] 120 | } 121 | ``` 122 | 123 | ### `author_papers` 124 | 125 | Get papers written by a specific author. 126 | 127 | ```json 128 | { 129 | "author_id": "1741101", 130 | "fields": ["title", "year", "venue"], 131 | "limit": 50 132 | } 133 | ``` 134 | 135 | ### `author_batch_details` 136 | 137 | Get details for multiple authors at once. 138 | 139 | ```json 140 | { 141 | "author_ids": ["1741101", "1741102"], 142 | "fields": "name,affiliations,paperCount,citationCount" 143 | } 144 | ``` 145 | 146 | ## Recommendation Tools 147 | 148 | ### `get_paper_recommendations_single` 149 | 150 | Get paper recommendations based on a single paper. 151 | 152 | ```json 153 | { 154 | "paper_id": "649def34f8be52c8b66281af98ae884c09aef38b", 155 | "fields": "title,authors,year,abstract", 156 | "limit": 20 157 | } 158 | ``` 159 | 160 | ### `get_paper_recommendations_multi` 161 | 162 | Get paper recommendations based on multiple papers. 
163 | 164 | ```json 165 | { 166 | "positive_paper_ids": [ 167 | "649def34f8be52c8b66281af98ae884c09aef38b", 168 | "ARXIV:2106.15928" 169 | ], 170 | "negative_paper_ids": ["ARXIV:1805.02262"], 171 | "fields": "title,authors,year", 172 | "limit": 20 173 | } 174 | ``` 175 | 176 | ## Note 177 | 178 | - The tool name in the error message (`read_paper`) does not exist in this server 179 | - Use one of the tools listed above instead 180 | - Always include the required parameters for each tool 181 | ``` -------------------------------------------------------------------------------- /REFACTORING.md: -------------------------------------------------------------------------------- ```markdown 1 | # Semantic Scholar Server Refactoring 2 | 3 | This document describes the refactoring of the Semantic Scholar server from a single monolithic file to a modular package structure. 4 | 5 | ## Motivation 6 | 7 | The original implementation consisted of a single 2,200+ line Python file (`semantic_scholar_server.py`), which made it difficult to: 8 | 9 | - Understand the overall structure 10 | - Locate specific functionality 11 | - Debug issues 12 | - Make focused changes 13 | - Test individual components 14 | 15 | ## Refactoring Approach 16 | 17 | We used a modular package approach, separating concerns into logical components: 18 | 19 | ``` 20 | semantic-scholar-server/ 21 | ├── semantic_scholar/ # Main package 22 | │ ├── __init__.py # Package initialization 23 | │ ├── server.py # Server setup and main functionality 24 | │ ├── mcp.py # Centralized FastMCP instance definition 25 | │ ├── config.py # Configuration classes 26 | │ ├── utils/ # Utility modules 27 | │ │ ├── __init__.py 28 | │ │ ├── errors.py # Error handling 29 | │ │ └── http.py # HTTP client and rate limiting 30 | │ ├── api/ # API endpoints 31 | │ ├── __init__.py 32 | │ ├── papers.py # Paper-related endpoints 33 | │ ├── authors.py # Author-related endpoints 34 | │ └── recommendations.py # Recommendation endpoints 35 | ├── 
run.py # Entry point script 36 | ``` 37 | 38 | ## Key Improvements 39 | 40 | 1. **Separation of Concerns** 41 | 42 | - Config classes in their own module 43 | - Utilities separated from business logic 44 | - API endpoints grouped by domain (papers, authors, recommendations) 45 | - Server infrastructure code isolated 46 | - FastMCP instance centralized in its own module 47 | 48 | 2. **Improved Maintainability** 49 | 50 | - Each file has a single responsibility 51 | - Files are much smaller and easier to understand 52 | - Clear imports show dependencies between modules 53 | - Better docstrings and code organization 54 | - No circular dependencies between modules 55 | 56 | 3. **Enhanced Extensibility** 57 | 58 | - Adding new endpoints only requires changes to the relevant module 59 | - Utilities can be reused across the codebase 60 | - Configuration is centralized 61 | - Testing individual components is much easier 62 | - Each module imports the FastMCP instance from a central location 63 | 64 | 4. **Clearer Entry Point** 65 | - `run.py` provides a simple way to start the server 66 | - Server initialization is separated from the API logic 67 | - All modules consistently import the FastMCP instance from mcp.py 68 | 69 | ## Migration Guide 70 | 71 | The refactored code maintains the same functionality and API as the original implementation. To migrate: 72 | 73 | 1. Replace the original `semantic_scholar_server.py` with the new package structure 74 | 2. Update any import statements that referenced the original file 75 | 3. Use `run.py` as the new entry point 76 | 77 | No changes to API usage are required - all tool functions maintain the same signatures and behavior. 78 | 79 | ## Future Improvements 80 | 81 | The modular structure enables several future improvements: 82 | 83 | 1. **Testing**: Add unit tests for individual components 84 | 2. **Caching**: Implement caching layer for improved performance 85 | 3. 
**Logging**: Enhanced logging throughout the application 86 | 4. **Metrics**: Add performance monitoring 87 | 5. **Documentation**: Generate API documentation from docstrings 88 | ``` -------------------------------------------------------------------------------- /test/test_utils.py: -------------------------------------------------------------------------------- ```python 1 | """Test utilities and core functionality without MCP dependencies""" 2 | 3 | import httpx 4 | import logging 5 | import os 6 | from typing import Dict, Optional 7 | import asyncio 8 | from enum import Enum 9 | from dotenv import load_dotenv 10 | 11 | # Load environment variables from .env file 12 | load_dotenv() 13 | 14 | # Basic setup 15 | logging.basicConfig(level=logging.INFO) 16 | logger = logging.getLogger(__name__) 17 | 18 | class ErrorType(Enum): 19 | RATE_LIMIT = "rate_limit" 20 | API_ERROR = "api_error" 21 | VALIDATION = "validation" 22 | TIMEOUT = "timeout" 23 | 24 | class Config: 25 | API_VERSION = "v1" 26 | GRAPH_BASE_URL = f"https://api.semanticscholar.org/graph/{API_VERSION}" 27 | RECOMMENDATIONS_BASE_URL = "https://api.semanticscholar.org/recommendations/v1" 28 | TIMEOUT = 30 # seconds 29 | 30 | def create_error_response( 31 | error_type: ErrorType, 32 | message: str, 33 | details: Optional[Dict] = None 34 | ) -> Dict: 35 | return { 36 | "error": { 37 | "type": error_type.value, 38 | "message": message, 39 | "details": details or {} 40 | } 41 | } 42 | 43 | def get_api_key() -> Optional[str]: 44 | """Get the Semantic Scholar API key from environment variables.""" 45 | api_key = os.getenv("SEMANTIC_SCHOLAR_API_KEY") 46 | logger.info(f"API Key found: {'Yes' if api_key else 'No'}") 47 | return api_key 48 | 49 | async def make_request(endpoint: str, params: Dict = None, method: str = "GET", json: Dict = None) -> Dict: 50 | """Make a request to the Semantic Scholar API.""" 51 | try: 52 | api_key = get_api_key() 53 | headers = {"x-api-key": api_key} if api_key else {} 54 | params = 
params or {} 55 | 56 | # Choose base URL based on endpoint 57 | is_recommendations = endpoint.startswith("recommendations") or endpoint.startswith("papers/forpaper") 58 | base_url = Config.RECOMMENDATIONS_BASE_URL if is_recommendations else Config.GRAPH_BASE_URL 59 | 60 | # Clean up endpoint 61 | if endpoint.startswith("/"): 62 | endpoint = endpoint[1:] 63 | if is_recommendations and endpoint.startswith("recommendations/"): 64 | endpoint = endpoint[15:] # Remove "recommendations/" prefix 65 | 66 | url = f"{base_url}/{endpoint}" 67 | logger.info(f"Making {method} request to {url}") 68 | logger.info(f"Headers: {headers}") 69 | logger.info(f"Params: {params}") 70 | if json: 71 | logger.info(f"JSON body: {json}") 72 | 73 | async with httpx.AsyncClient(timeout=Config.TIMEOUT, follow_redirects=True) as client: 74 | if method == "GET": 75 | response = await client.get(url, params=params, headers=headers) 76 | else: # POST 77 | response = await client.post(url, params=params, json=json, headers=headers) 78 | 79 | logger.info(f"Response status: {response.status_code}") 80 | logger.info(f"Response body: {response.text}") 81 | 82 | response.raise_for_status() 83 | return response.json() 84 | 85 | except httpx.HTTPStatusError as e: 86 | if e.response.status_code == 429: 87 | return create_error_response( 88 | ErrorType.RATE_LIMIT, 89 | "Rate limit exceeded", 90 | {"retry_after": e.response.headers.get("retry-after")} 91 | ) 92 | return create_error_response( 93 | ErrorType.API_ERROR, 94 | f"HTTP error: {e.response.status_code}", 95 | {"response": e.response.text} 96 | ) 97 | except httpx.TimeoutException: 98 | return create_error_response( 99 | ErrorType.TIMEOUT, 100 | f"Request timed out after {Config.TIMEOUT} seconds" 101 | ) 102 | except Exception as e: 103 | return create_error_response( 104 | ErrorType.API_ERROR, 105 | str(e) 106 | ) ``` -------------------------------------------------------------------------------- /test/test_paper.py: 
-------------------------------------------------------------------------------- ```python 1 | import unittest 2 | import asyncio 3 | import os 4 | from typing import Optional, List, Dict 5 | import random 6 | 7 | from .test_utils import make_request, create_error_response, ErrorType, Config 8 | 9 | class TestPaperTools(unittest.TestCase): 10 | def setUp(self): 11 | """Set up test environment""" 12 | # You can set your API key here for testing 13 | os.environ["SEMANTIC_SCHOLAR_API_KEY"] = "" # Optional 14 | 15 | # Create event loop for async tests 16 | self.loop = asyncio.new_event_loop() 17 | asyncio.set_event_loop(self.loop) 18 | 19 | # Sample paper IDs for testing 20 | self.sample_paper_id = "649def34f8be52c8b66281af98ae884c09aef38b" 21 | self.sample_paper_ids = [ 22 | self.sample_paper_id, 23 | "ARXIV:2106.15928" 24 | ] 25 | 26 | def tearDown(self): 27 | """Clean up after tests""" 28 | self.loop.close() 29 | 30 | def run_async(self, coro): 31 | """Helper to run async functions in tests""" 32 | return self.loop.run_until_complete(coro) 33 | 34 | async def async_test_with_delay(self, endpoint: str, **kwargs): 35 | """Helper to run async tests with delay to handle rate limiting""" 36 | await asyncio.sleep(random.uniform(5, 8)) # Random initial delay 37 | 38 | max_retries = 5 39 | base_delay = 8 40 | 41 | for attempt in range(max_retries): 42 | result = await make_request(endpoint, **kwargs) 43 | if not isinstance(result, dict) or "error" not in result: 44 | return result 45 | 46 | if result["error"]["type"] == "rate_limit": 47 | delay = base_delay * (2 ** attempt) + random.uniform(0, 2) # Add jitter 48 | await asyncio.sleep(delay) 49 | continue 50 | else: 51 | return result 52 | 53 | return result # Return last result if all retries failed 54 | 55 | @classmethod 56 | def setUpClass(cls): 57 | """Set up class-level test environment""" 58 | # Add initial delay before any tests run 59 | asyncio.get_event_loop().run_until_complete(asyncio.sleep(10)) 60 | 61 | def 
test_paper_relevance_search(self): 62 | """Test paper relevance search functionality""" 63 | # Test basic search 64 | result = self.run_async(self.async_test_with_delay( 65 | "paper/search", # Remove leading slash 66 | params={ 67 | "query": "quantum computing", 68 | "fields": "title,abstract,year" 69 | } 70 | )) 71 | self.assertNotIn("error", result) 72 | self.assertIn("data", result) 73 | self.assertIn("total", result) 74 | 75 | # Test with filters 76 | result = self.run_async(self.async_test_with_delay( 77 | "paper/search", 78 | params={ 79 | "query": "machine learning", 80 | "fields": "title,year", 81 | "minCitationCount": 100, 82 | "year": "2020-2023" 83 | } 84 | )) 85 | self.assertNotIn("error", result) 86 | self.assertIn("data", result) 87 | 88 | def test_paper_bulk_search(self): 89 | """Test paper bulk search functionality""" 90 | result = self.run_async(self.async_test_with_delay( 91 | "paper/search/bulk", # Remove leading slash 92 | params={ 93 | "query": "neural networks", 94 | "fields": "title,year,authors", 95 | "sort": "citationCount:desc" 96 | } 97 | )) 98 | self.assertNotIn("error", result) 99 | self.assertIn("data", result) 100 | 101 | def test_paper_details(self): 102 | """Test paper details functionality""" 103 | result = self.run_async(self.async_test_with_delay( 104 | f"paper/{self.sample_paper_id}", # Remove leading slash 105 | params={ 106 | "fields": "title,abstract,year,authors" 107 | } 108 | )) 109 | self.assertNotIn("error", result) 110 | self.assertIn("paperId", result) 111 | self.assertIn("title", result) 112 | 113 | def test_paper_batch_details(self): 114 | """Test batch paper details functionality""" 115 | result = self.run_async(self.async_test_with_delay( 116 | "paper/batch", # Remove leading slash 117 | method="POST", 118 | params={"fields": "title,year,authors"}, 119 | json={"ids": self.sample_paper_ids} 120 | )) 121 | self.assertNotIn("error", result) 122 | self.assertTrue(isinstance(result, list)) 123 | 
self.assertEqual(len(result), len(self.sample_paper_ids)) 124 | 125 | if __name__ == '__main__': 126 | unittest.main() 127 | ``` -------------------------------------------------------------------------------- /semantic_scholar/utils/http.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | HTTP client utilities for the Semantic Scholar API Server. 3 | """ 4 | 5 | import os 6 | import logging 7 | import httpx 8 | import asyncio 9 | import time 10 | from typing import Dict, Optional, Tuple, Any 11 | 12 | from ..config import Config, ErrorType, RateLimitConfig 13 | from .errors import create_error_response 14 | 15 | logger = logging.getLogger(__name__) 16 | 17 | # Global HTTP client for connection pooling 18 | http_client = None 19 | 20 | class RateLimiter: 21 | """ 22 | Rate limiter for API requests to prevent exceeding API limits. 23 | """ 24 | def __init__(self): 25 | self._last_call_time = {} 26 | self._locks = {} 27 | 28 | def _get_rate_limit(self, endpoint: str) -> Tuple[int, int]: 29 | """Get the appropriate rate limit for an endpoint.""" 30 | if any(restricted in endpoint for restricted in RateLimitConfig.RESTRICTED_ENDPOINTS): 31 | if "batch" in endpoint: 32 | return RateLimitConfig.BATCH_LIMIT 33 | if "search" in endpoint: 34 | return RateLimitConfig.SEARCH_LIMIT 35 | return RateLimitConfig.DEFAULT_LIMIT 36 | return RateLimitConfig.DEFAULT_LIMIT 37 | 38 | async def acquire(self, endpoint: str): 39 | """ 40 | Acquire permission to make a request, waiting if necessary to respect rate limits. 41 | 42 | Args: 43 | endpoint: The API endpoint being accessed. 
44 | """ 45 | if endpoint not in self._locks: 46 | self._locks[endpoint] = asyncio.Lock() 47 | self._last_call_time[endpoint] = 0 48 | 49 | async with self._locks[endpoint]: 50 | rate_limit = self._get_rate_limit(endpoint) 51 | current_time = time.time() 52 | time_since_last_call = current_time - self._last_call_time[endpoint] 53 | 54 | if time_since_last_call < rate_limit[1]: 55 | delay = rate_limit[1] - time_since_last_call 56 | await asyncio.sleep(delay) 57 | 58 | self._last_call_time[endpoint] = time.time() 59 | 60 | # Create global rate limiter instance 61 | rate_limiter = RateLimiter() 62 | 63 | def get_api_key() -> Optional[str]: 64 | """ 65 | Get the Semantic Scholar API key from environment variables. 66 | Returns None if no API key is set, enabling unauthenticated access. 67 | """ 68 | api_key = os.getenv("SEMANTIC_SCHOLAR_API_KEY") 69 | if not api_key: 70 | logger.warning("No SEMANTIC_SCHOLAR_API_KEY set. Using unauthenticated access with lower rate limits.") 71 | return api_key 72 | 73 | async def initialize_client(): 74 | """Initialize the global HTTP client.""" 75 | global http_client 76 | if http_client is None: 77 | http_client = httpx.AsyncClient( 78 | timeout=Config.TIMEOUT, 79 | limits=httpx.Limits(max_keepalive_connections=10) 80 | ) 81 | return http_client 82 | 83 | async def cleanup_client(): 84 | """Clean up the global HTTP client.""" 85 | global http_client 86 | if http_client is not None: 87 | await http_client.aclose() 88 | http_client = None 89 | 90 | async def make_request(endpoint: str, params: Dict = None) -> Dict: 91 | """ 92 | Make a rate-limited request to the Semantic Scholar API. 93 | 94 | Args: 95 | endpoint: The API endpoint to call. 96 | params: Optional query parameters. 97 | 98 | Returns: 99 | The JSON response or an error response dictionary. 
100 | """ 101 | try: 102 | # Apply rate limiting 103 | await rate_limiter.acquire(endpoint) 104 | 105 | # Get API key if available 106 | api_key = get_api_key() 107 | headers = {"x-api-key": api_key} if api_key else {} 108 | url = f"{Config.BASE_URL}{endpoint}" 109 | 110 | # Use global client 111 | client = await initialize_client() 112 | response = await client.get(url, params=params, headers=headers) 113 | response.raise_for_status() 114 | return response.json() 115 | except httpx.HTTPStatusError as e: 116 | logger.error(f"HTTP error {e.response.status_code} for {endpoint}: {e.response.text}") 117 | if e.response.status_code == 429: 118 | return create_error_response( 119 | ErrorType.RATE_LIMIT, 120 | "Rate limit exceeded. Consider using an API key for higher limits.", 121 | { 122 | "retry_after": e.response.headers.get("retry-after"), 123 | "authenticated": bool(get_api_key()) 124 | } 125 | ) 126 | return create_error_response( 127 | ErrorType.API_ERROR, 128 | f"HTTP error: {e.response.status_code}", 129 | {"response": e.response.text} 130 | ) 131 | except httpx.TimeoutException as e: 132 | logger.error(f"Request timeout for {endpoint}: {str(e)}") 133 | return create_error_response( 134 | ErrorType.TIMEOUT, 135 | f"Request timed out after {Config.TIMEOUT} seconds" 136 | ) 137 | except Exception as e: 138 | logger.error(f"Unexpected error for {endpoint}: {str(e)}") 139 | return create_error_response( 140 | ErrorType.API_ERROR, 141 | str(e) 142 | ) ``` -------------------------------------------------------------------------------- /semantic_scholar/config.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | Configuration for the Semantic Scholar API Server. 
3 | """ 4 | 5 | from dataclasses import dataclass 6 | from enum import Enum 7 | from typing import Dict, List, Tuple, Any 8 | 9 | # Rate Limiting Configuration 10 | @dataclass 11 | class RateLimitConfig: 12 | # Define rate limits (requests, seconds) 13 | SEARCH_LIMIT = (1, 1) # 1 request per 1 second 14 | BATCH_LIMIT = (1, 1) # 1 request per 1 second 15 | DEFAULT_LIMIT = (10, 1) # 10 requests per 1 second 16 | 17 | # Endpoints categorization 18 | # These endpoints have stricter rate limits due to their computational intensity 19 | # and to prevent abuse of the recommendation system 20 | RESTRICTED_ENDPOINTS = [ 21 | "/paper/batch", # Batch operations are expensive 22 | "/paper/search", # Search operations are computationally intensive 23 | "/recommendations" # Recommendation generation is resource-intensive 24 | ] 25 | 26 | # Error Types 27 | class ErrorType(Enum): 28 | RATE_LIMIT = "rate_limit" 29 | API_ERROR = "api_error" 30 | VALIDATION = "validation" 31 | TIMEOUT = "timeout" 32 | 33 | # Field Constants 34 | class PaperFields: 35 | DEFAULT = ["title", "abstract", "year", "citationCount", "authors", "url"] 36 | DETAILED = DEFAULT + ["references", "citations", "venue", "influentialCitationCount"] 37 | MINIMAL = ["title", "year", "authors"] 38 | SEARCH = ["paperId", "title", "year", "citationCount"] 39 | 40 | # Valid fields from API documentation 41 | VALID_FIELDS = { 42 | "abstract", 43 | "authors", 44 | "citationCount", 45 | "citations", 46 | "corpusId", 47 | "embedding", 48 | "externalIds", 49 | "fieldsOfStudy", 50 | "influentialCitationCount", 51 | "isOpenAccess", 52 | "openAccessPdf", 53 | "paperId", 54 | "publicationDate", 55 | "publicationTypes", 56 | "publicationVenue", 57 | "references", 58 | "s2FieldsOfStudy", 59 | "title", 60 | "tldr", 61 | "url", 62 | "venue", 63 | "year" 64 | } 65 | 66 | class AuthorDetailFields: 67 | """Common field combinations for author details""" 68 | 69 | # Basic author information 70 | BASIC = ["name", "url", "affiliations"] 71 
| 72 | # Author's papers information 73 | PAPERS_BASIC = ["papers"] # Returns paperId and title 74 | PAPERS_DETAILED = [ 75 | "papers.year", 76 | "papers.authors", 77 | "papers.abstract", 78 | "papers.venue", 79 | "papers.url" 80 | ] 81 | 82 | # Complete author profile 83 | COMPLETE = BASIC + ["papers", "papers.year", "papers.authors", "papers.venue"] 84 | 85 | # Citation metrics 86 | METRICS = ["citationCount", "hIndex", "paperCount"] 87 | 88 | # Valid fields for author details 89 | VALID_FIELDS = { 90 | "authorId", 91 | "name", 92 | "url", 93 | "affiliations", 94 | "papers", 95 | "papers.year", 96 | "papers.authors", 97 | "papers.abstract", 98 | "papers.venue", 99 | "papers.url", 100 | "citationCount", 101 | "hIndex", 102 | "paperCount" 103 | } 104 | 105 | class PaperDetailFields: 106 | """Common field combinations for paper details""" 107 | 108 | # Basic paper information 109 | BASIC = ["title", "abstract", "year", "venue"] 110 | 111 | # Author information 112 | AUTHOR_BASIC = ["authors"] 113 | AUTHOR_DETAILED = ["authors.url", "authors.paperCount", "authors.citationCount"] 114 | 115 | # Citation information 116 | CITATION_BASIC = ["citations", "references"] 117 | CITATION_DETAILED = ["citations.title", "citations.abstract", "citations.year", 118 | "references.title", "references.abstract", "references.year"] 119 | 120 | # Full paper details 121 | COMPLETE = BASIC + AUTHOR_BASIC + CITATION_BASIC + ["url", "fieldsOfStudy", 122 | "publicationVenue", "publicationTypes"] 123 | 124 | class CitationReferenceFields: 125 | """Common field combinations for citation and reference queries""" 126 | 127 | # Basic information 128 | BASIC = ["title"] 129 | 130 | # Citation/Reference context 131 | CONTEXT = ["contexts", "intents", "isInfluential"] 132 | 133 | # Paper details 134 | DETAILED = ["title", "abstract", "authors", "year", "venue"] 135 | 136 | # Full information 137 | COMPLETE = CONTEXT + DETAILED 138 | 139 | # Valid fields for citation/reference queries 140 | 
VALID_FIELDS = { 141 | "contexts", 142 | "intents", 143 | "isInfluential", 144 | "title", 145 | "abstract", 146 | "authors", 147 | "year", 148 | "venue", 149 | "paperId", 150 | "url", 151 | "citationCount", 152 | "influentialCitationCount" 153 | } 154 | 155 | # Configuration 156 | class Config: 157 | # API Configuration 158 | API_VERSION = "v1" 159 | BASE_URL = f"https://api.semanticscholar.org/graph/{API_VERSION}" 160 | TIMEOUT = 30 # seconds 161 | 162 | # Request Limits 163 | MAX_BATCH_SIZE = 100 164 | MAX_RESULTS_PER_PAGE = 100 165 | DEFAULT_PAGE_SIZE = 10 166 | MAX_BATCHES = 5 167 | 168 | # Fields Configuration 169 | DEFAULT_FIELDS = PaperFields.DEFAULT 170 | 171 | # Feature Flags 172 | ENABLE_CACHING = False 173 | DEBUG_MODE = False 174 | 175 | # Search Configuration 176 | SEARCH_TYPES = { 177 | "comprehensive": { 178 | "description": "Balanced search considering relevance and impact", 179 | "min_citations": None, 180 | "ranking_strategy": "balanced" 181 | }, 182 | "influential": { 183 | "description": "Focus on highly-cited and influential papers", 184 | "min_citations": 50, 185 | "ranking_strategy": "citations" 186 | }, 187 | "latest": { 188 | "description": "Focus on recent papers with impact", 189 | "min_citations": None, 190 | "ranking_strategy": "recency" 191 | } 192 | } ``` -------------------------------------------------------------------------------- /semantic_scholar/api/recommendations.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | Recommendation-related API endpoints for the Semantic Scholar API. 
3 | """ 4 | 5 | from typing import Dict, List, Optional 6 | from fastmcp import Context 7 | import httpx 8 | 9 | # Import mcp from centralized location instead of server 10 | from ..mcp import mcp 11 | from ..config import Config, ErrorType 12 | from ..utils.http import rate_limiter, get_api_key 13 | from ..utils.errors import create_error_response 14 | 15 | @mcp.tool() 16 | async def get_paper_recommendations_single( 17 | context: Context, 18 | paper_id: str, 19 | fields: Optional[str] = None, 20 | limit: int = 100, 21 | from_pool: str = "recent" 22 | ) -> Dict: 23 | """ 24 | Get paper recommendations based on a single seed paper. 25 | This endpoint is optimized for finding papers similar to a specific paper. 26 | 27 | Args: 28 | paper_id (str): Paper identifier in one of the following formats: 29 | - Semantic Scholar ID (e.g., "649def34f8be52c8b66281af98ae884c09aef38b") 30 | - CorpusId:<id> (e.g., "CorpusId:215416146") 31 | - DOI:<doi> (e.g., "DOI:10.18653/v1/N18-3011") 32 | - ARXIV:<id> (e.g., "ARXIV:2106.15928") 33 | - MAG:<id> (e.g., "MAG:112218234") 34 | - ACL:<id> (e.g., "ACL:W12-3903") 35 | - PMID:<id> (e.g., "PMID:19872477") 36 | - PMCID:<id> (e.g., "PMCID:2323736") 37 | - URL:<url> (e.g., "URL:https://arxiv.org/abs/2106.15928v1") 38 | 39 | fields (Optional[str]): Comma-separated list of fields to return for each paper. 40 | paperId is always returned. 41 | 42 | limit (int): Maximum number of recommendations to return. 43 | Default: 100 44 | Maximum: 500 45 | 46 | from_pool (str): Which pool of papers to recommend from. 
47 | Options: 48 | - "recent": Recent papers (default) 49 | - "all-cs": All computer science papers 50 | Default: "recent" 51 | 52 | Returns: 53 | Dict: { 54 | "recommendedPapers": List[Dict] # List of recommended papers with requested fields 55 | } 56 | """ 57 | try: 58 | # Apply rate limiting 59 | endpoint = "/recommendations" 60 | await rate_limiter.acquire(endpoint) 61 | 62 | # Validate limit 63 | if limit > 500: 64 | return create_error_response( 65 | ErrorType.VALIDATION, 66 | "Cannot request more than 500 recommendations", 67 | {"max_limit": 500, "requested": limit} 68 | ) 69 | 70 | # Validate pool 71 | if from_pool not in ["recent", "all-cs"]: 72 | return create_error_response( 73 | ErrorType.VALIDATION, 74 | "Invalid paper pool specified", 75 | {"valid_pools": ["recent", "all-cs"]} 76 | ) 77 | 78 | # Build request parameters 79 | params = { 80 | "limit": limit, 81 | "from": from_pool 82 | } 83 | if fields: 84 | params["fields"] = fields 85 | 86 | # Make the API request 87 | async with httpx.AsyncClient(timeout=Config.TIMEOUT) as client: 88 | api_key = get_api_key() 89 | headers = {"x-api-key": api_key} if api_key else {} 90 | 91 | url = f"https://api.semanticscholar.org/recommendations/v1/papers/forpaper/{paper_id}" 92 | response = await client.get(url, params=params, headers=headers) 93 | 94 | # Handle specific error cases 95 | if response.status_code == 404: 96 | return create_error_response( 97 | ErrorType.VALIDATION, 98 | "Paper not found", 99 | {"paper_id": paper_id} 100 | ) 101 | 102 | response.raise_for_status() 103 | return response.json() 104 | 105 | except httpx.HTTPStatusError as e: 106 | if e.response.status_code == 429: 107 | return create_error_response( 108 | ErrorType.RATE_LIMIT, 109 | "Rate limit exceeded. 
Consider using an API key for higher limits.", 110 | { 111 | "retry_after": e.response.headers.get("retry-after"), 112 | "authenticated": bool(get_api_key()) 113 | } 114 | ) 115 | return create_error_response( 116 | ErrorType.API_ERROR, 117 | f"HTTP error {e.response.status_code}", 118 | {"response": e.response.text} 119 | ) 120 | except httpx.TimeoutException: 121 | return create_error_response( 122 | ErrorType.TIMEOUT, 123 | f"Request timed out after {Config.TIMEOUT} seconds" 124 | ) 125 | except Exception as e: 126 | import logging 127 | logger = logging.getLogger(__name__) 128 | logger.error(f"Unexpected error in recommendations: {str(e)}") 129 | return create_error_response( 130 | ErrorType.API_ERROR, 131 | "Failed to get recommendations", 132 | {"error": str(e)} 133 | ) 134 | 135 | @mcp.tool() 136 | async def get_paper_recommendations_multi( 137 | context: Context, 138 | positive_paper_ids: List[str], 139 | negative_paper_ids: Optional[List[str]] = None, 140 | fields: Optional[str] = None, 141 | limit: int = 100 142 | ) -> Dict: 143 | """ 144 | Get paper recommendations based on multiple positive and optional negative examples. 145 | This endpoint is optimized for finding papers similar to a set of papers while 146 | avoiding papers similar to the negative examples. 147 | 148 | Args: 149 | positive_paper_ids (List[str]): List of paper IDs to use as positive examples. 150 | Papers similar to these will be recommended. 
151 | Each ID can be in any of these formats: 152 | - Semantic Scholar ID (e.g., "649def34f8be52c8b66281af98ae884c09aef38b") 153 | - CorpusId:<id> (e.g., "CorpusId:215416146") 154 | - DOI:<doi> (e.g., "DOI:10.18653/v1/N18-3011") 155 | - ARXIV:<id> (e.g., "ARXIV:2106.15928") 156 | - MAG:<id> (e.g., "MAG:112218234") 157 | - ACL:<id> (e.g., "ACL:W12-3903") 158 | - PMID:<id> (e.g., "PMID:19872477") 159 | - PMCID:<id> (e.g., "PMCID:2323736") 160 | - URL:<url> (e.g., "URL:https://arxiv.org/abs/2106.15928v1") 161 | 162 | negative_paper_ids (Optional[List[str]]): List of paper IDs to use as negative examples. 163 | Papers similar to these will be avoided in recommendations. 164 | Uses same ID formats as positive_paper_ids. 165 | 166 | fields (Optional[str]): Comma-separated list of fields to return for each paper. 167 | paperId is always returned. 168 | 169 | limit (int): Maximum number of recommendations to return. 170 | Default: 100 171 | Maximum: 500 172 | 173 | Returns: 174 | Dict: { 175 | "recommendedPapers": List[Dict] # List of recommended papers with requested fields 176 | } 177 | """ 178 | try: 179 | # Apply rate limiting 180 | endpoint = "/recommendations" 181 | await rate_limiter.acquire(endpoint) 182 | 183 | # Validate inputs 184 | if not positive_paper_ids: 185 | return create_error_response( 186 | ErrorType.VALIDATION, 187 | "Must provide at least one positive paper ID" 188 | ) 189 | 190 | if limit > 500: 191 | return create_error_response( 192 | ErrorType.VALIDATION, 193 | "Cannot request more than 500 recommendations", 194 | {"max_limit": 500, "requested": limit} 195 | ) 196 | 197 | # Build request parameters 198 | params = {"limit": limit} 199 | if fields: 200 | params["fields"] = fields 201 | 202 | request_body = { 203 | "positivePaperIds": positive_paper_ids, 204 | "negativePaperIds": negative_paper_ids or [] 205 | } 206 | 207 | # Make the API request 208 | async with httpx.AsyncClient(timeout=Config.TIMEOUT) as client: 209 | api_key = get_api_key() 210 
| headers = {"x-api-key": api_key} if api_key else {} 211 | 212 | url = "https://api.semanticscholar.org/recommendations/v1/papers" 213 | response = await client.post(url, params=params, json=request_body, headers=headers) 214 | 215 | # Handle specific error cases 216 | if response.status_code == 404: 217 | return create_error_response( 218 | ErrorType.VALIDATION, 219 | "One or more input papers not found", 220 | { 221 | "positive_ids": positive_paper_ids, 222 | "negative_ids": negative_paper_ids 223 | } 224 | ) 225 | 226 | response.raise_for_status() 227 | return response.json() 228 | 229 | except httpx.HTTPStatusError as e: 230 | if e.response.status_code == 429: 231 | return create_error_response( 232 | ErrorType.RATE_LIMIT, 233 | "Rate limit exceeded. Consider using an API key for higher limits.", 234 | { 235 | "retry_after": e.response.headers.get("retry-after"), 236 | "authenticated": bool(get_api_key()) 237 | } 238 | ) 239 | return create_error_response( 240 | ErrorType.API_ERROR, 241 | f"HTTP error {e.response.status_code}", 242 | {"response": e.response.text} 243 | ) 244 | except httpx.TimeoutException: 245 | return create_error_response( 246 | ErrorType.TIMEOUT, 247 | f"Request timed out after {Config.TIMEOUT} seconds" 248 | ) 249 | except Exception as e: 250 | import logging 251 | logger = logging.getLogger(__name__) 252 | logger.error(f"Unexpected error in recommendations: {str(e)}") 253 | return create_error_response( 254 | ErrorType.API_ERROR, 255 | "Failed to get recommendations", 256 | {"error": str(e)} 257 | ) ``` -------------------------------------------------------------------------------- /semantic_scholar/api/authors.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | Author-related API endpoints for the Semantic Scholar API. 
3 | """ 4 | 5 | from typing import Dict, List, Optional 6 | from fastmcp import Context 7 | 8 | # Import mcp from centralized location instead of server 9 | from ..mcp import mcp 10 | from ..config import AuthorDetailFields, ErrorType 11 | from ..utils.http import make_request 12 | from ..utils.errors import create_error_response 13 | 14 | @mcp.tool() 15 | async def author_search( 16 | context: Context, 17 | query: str, 18 | fields: Optional[List[str]] = None, 19 | offset: int = 0, 20 | limit: int = 100 21 | ) -> Dict: 22 | """ 23 | Search for authors by name on Semantic Scholar. 24 | This endpoint is optimized for finding authors based on their name. 25 | Results are sorted by relevance to the query. 26 | 27 | Args: 28 | query (str): The name text to search for. The query will be matched against author names 29 | and their known aliases. 30 | 31 | fields (Optional[List[str]]): List of fields to return for each author. 32 | authorId is always returned. 33 | 34 | offset (int): Number of authors to skip for pagination. 35 | Default: 0 36 | 37 | limit (int): Maximum number of authors to return. 
38 | Default: 100 39 | Maximum: 1000 40 | 41 | Returns: 42 | Dict: { 43 | "total": int, # Total number of authors matching the query 44 | "offset": int, # Current offset in the results 45 | "next": int, # Next offset (if more results available) 46 | "data": List[Dict] # List of authors with requested fields 47 | } 48 | """ 49 | if not query.strip(): 50 | return create_error_response( 51 | ErrorType.VALIDATION, 52 | "Query string cannot be empty" 53 | ) 54 | 55 | # Validate limit 56 | if limit > 1000: 57 | return create_error_response( 58 | ErrorType.VALIDATION, 59 | "Limit cannot exceed 1000", 60 | {"max_limit": 1000} 61 | ) 62 | 63 | # Validate fields 64 | if fields: 65 | invalid_fields = set(fields) - AuthorDetailFields.VALID_FIELDS 66 | if invalid_fields: 67 | return create_error_response( 68 | ErrorType.VALIDATION, 69 | f"Invalid fields: {', '.join(invalid_fields)}", 70 | {"valid_fields": list(AuthorDetailFields.VALID_FIELDS)} 71 | ) 72 | 73 | # Build request parameters 74 | params = { 75 | "query": query, 76 | "offset": offset, 77 | "limit": limit 78 | } 79 | if fields: 80 | params["fields"] = ",".join(fields) 81 | 82 | # Make the API request 83 | return await make_request("/author/search", params) 84 | 85 | @mcp.tool() 86 | async def author_details( 87 | context: Context, 88 | author_id: str, 89 | fields: Optional[List[str]] = None 90 | ) -> Dict: 91 | """ 92 | Get detailed information about an author by their ID. 93 | This endpoint provides comprehensive metadata about an author. 94 | 95 | Args: 96 | author_id (str): Semantic Scholar author ID. 97 | This is a unique identifier assigned by Semantic Scholar. 98 | Example: "1741101" (Albert Einstein) 99 | 100 | fields (Optional[List[str]]): List of fields to return. 101 | authorId is always returned. 102 | Available fields include name, papers, citationCount, etc. 103 | 104 | Returns: 105 | Dict: Author details with requested fields. 106 | Always includes authorId. 
107 | Returns error response if author not found. 108 | """ 109 | if not author_id.strip(): 110 | return create_error_response( 111 | ErrorType.VALIDATION, 112 | "Author ID cannot be empty" 113 | ) 114 | 115 | # Validate fields 116 | if fields: 117 | invalid_fields = set(fields) - AuthorDetailFields.VALID_FIELDS 118 | if invalid_fields: 119 | return create_error_response( 120 | ErrorType.VALIDATION, 121 | f"Invalid fields: {', '.join(invalid_fields)}", 122 | {"valid_fields": list(AuthorDetailFields.VALID_FIELDS)} 123 | ) 124 | 125 | # Build request parameters 126 | params = {} 127 | if fields: 128 | params["fields"] = ",".join(fields) 129 | 130 | # Make the API request 131 | result = await make_request(f"/author/{author_id}", params) 132 | 133 | if isinstance(result, Dict) and "error" in result: 134 | error_msg = result["error"].get("message", "") 135 | if "404" in error_msg: 136 | return create_error_response( 137 | ErrorType.VALIDATION, 138 | "Author not found", 139 | {"author_id": author_id} 140 | ) 141 | return result 142 | 143 | return result 144 | 145 | @mcp.tool() 146 | async def author_papers( 147 | context: Context, 148 | author_id: str, 149 | fields: Optional[List[str]] = None, 150 | offset: int = 0, 151 | limit: int = 100 152 | ) -> Dict: 153 | """ 154 | Get papers written by an author with pagination support. 155 | This endpoint provides detailed information about an author's publications. 156 | 157 | Args: 158 | author_id (str): Semantic Scholar author ID. 159 | This is a unique identifier assigned by Semantic Scholar. 160 | Example: "1741101" (Albert Einstein) 161 | 162 | fields (Optional[List[str]]): List of fields to return for each paper. 163 | paperId is always returned. 164 | 165 | offset (int): Number of papers to skip for pagination. 166 | Default: 0 167 | 168 | limit (int): Maximum number of papers to return. 
169 | Default: 100 170 | Maximum: 1000 171 | 172 | Returns: 173 | Dict: { 174 | "offset": int, # Current offset in the results 175 | "next": int, # Next offset (if more results available) 176 | "data": List[Dict] # List of papers with requested fields 177 | } 178 | """ 179 | if not author_id.strip(): 180 | return create_error_response( 181 | ErrorType.VALIDATION, 182 | "Author ID cannot be empty" 183 | ) 184 | 185 | # Validate limit 186 | if limit > 1000: 187 | return create_error_response( 188 | ErrorType.VALIDATION, 189 | "Limit cannot exceed 1000", 190 | {"max_limit": 1000} 191 | ) 192 | 193 | # Build request parameters 194 | params = { 195 | "offset": offset, 196 | "limit": limit 197 | } 198 | if fields: 199 | params["fields"] = ",".join(fields) 200 | 201 | # Make the API request 202 | result = await make_request(f"/author/{author_id}/papers", params) 203 | 204 | if isinstance(result, Dict) and "error" in result: 205 | error_msg = result["error"].get("message", "") 206 | if "404" in error_msg: 207 | return create_error_response( 208 | ErrorType.VALIDATION, 209 | "Author not found", 210 | {"author_id": author_id} 211 | ) 212 | return result 213 | 214 | return result 215 | 216 | @mcp.tool() 217 | async def author_batch_details( 218 | context: Context, 219 | author_ids: List[str], 220 | fields: Optional[str] = None 221 | ) -> Dict: 222 | """ 223 | Get details for multiple authors in a single batch request. 224 | This endpoint is optimized for efficiently retrieving details about known authors. 225 | 226 | Args: 227 | author_ids (List[str]): List of Semantic Scholar author IDs. 228 | These are unique identifiers assigned by Semantic Scholar. 229 | Example: ["1741101", "1741102"] 230 | Maximum: 1000 IDs per request 231 | 232 | fields (Optional[str]): Comma-separated list of fields to return for each author. 233 | authorId is always returned. 234 | 235 | Returns: 236 | List[Dict]: List of author details with requested fields. 
237 | - Results maintain the same order as input author_ids 238 | - Invalid or not found author IDs return null in the results 239 | - Each author object contains the requested fields 240 | - authorId is always included in each author object 241 | """ 242 | # Validate inputs 243 | if not author_ids: 244 | return create_error_response( 245 | ErrorType.VALIDATION, 246 | "Author IDs list cannot be empty" 247 | ) 248 | 249 | if len(author_ids) > 1000: 250 | return create_error_response( 251 | ErrorType.VALIDATION, 252 | "Cannot process more than 1000 author IDs at once", 253 | {"max_authors": 1000, "received": len(author_ids)} 254 | ) 255 | 256 | # Validate fields if provided 257 | if fields: 258 | field_list = fields.split(",") 259 | invalid_fields = set(field_list) - AuthorDetailFields.VALID_FIELDS 260 | if invalid_fields: 261 | return create_error_response( 262 | ErrorType.VALIDATION, 263 | f"Invalid fields: {', '.join(invalid_fields)}", 264 | {"valid_fields": list(AuthorDetailFields.VALID_FIELDS)} 265 | ) 266 | 267 | # Build request parameters 268 | params = {} 269 | if fields: 270 | params["fields"] = fields 271 | 272 | # Make POST request with proper structure 273 | try: 274 | import httpx 275 | from ..config import Config 276 | 277 | async with httpx.AsyncClient(timeout=Config.TIMEOUT) as client: 278 | from ..utils.http import get_api_key 279 | api_key = get_api_key() 280 | headers = {"x-api-key": api_key} if api_key else {} 281 | 282 | response = await client.post( 283 | f"{Config.BASE_URL}/author/batch", 284 | params=params, 285 | json={"ids": author_ids}, 286 | headers=headers 287 | ) 288 | response.raise_for_status() 289 | return response.json() 290 | 291 | except httpx.HTTPStatusError as e: 292 | if e.response.status_code == 429: 293 | return create_error_response( 294 | ErrorType.RATE_LIMIT, 295 | "Rate limit exceeded", 296 | {"retry_after": e.response.headers.get("retry-after")} 297 | ) 298 | return create_error_response( 299 | ErrorType.API_ERROR, 300 | 
f"HTTP error: {e.response.status_code}", 301 | {"response": e.response.text} 302 | ) 303 | except httpx.TimeoutException: 304 | return create_error_response( 305 | ErrorType.TIMEOUT, 306 | f"Request timed out after {Config.TIMEOUT} seconds" 307 | ) 308 | except Exception as e: 309 | return create_error_response( 310 | ErrorType.API_ERROR, 311 | str(e) 312 | ) ``` -------------------------------------------------------------------------------- /semantic_scholar/api/papers.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | Paper-related API endpoints for the Semantic Scholar API. 3 | """ 4 | 5 | from typing import Dict, List, Optional 6 | from fastmcp import Context 7 | import httpx 8 | 9 | # Import mcp from centralized location instead of server 10 | from ..mcp import mcp 11 | from ..config import PaperFields, CitationReferenceFields, AuthorDetailFields, Config, ErrorType 12 | from ..utils.http import make_request, get_api_key 13 | from ..utils.errors import create_error_response 14 | 15 | @mcp.tool() 16 | async def paper_relevance_search( 17 | context: Context, 18 | query: str, 19 | fields: Optional[List[str]] = None, 20 | publication_types: Optional[List[str]] = None, 21 | open_access_pdf: bool = False, 22 | min_citation_count: Optional[int] = None, 23 | year: Optional[str] = None, 24 | venue: Optional[List[str]] = None, 25 | fields_of_study: Optional[List[str]] = None, 26 | offset: int = 0, 27 | limit: int = 10 28 | ) -> Dict: 29 | """ 30 | Search for papers on Semantic Scholar using relevance-based ranking. 31 | This endpoint is optimized for finding the most relevant papers matching a text query. 32 | Results are sorted by relevance score. 33 | 34 | Args: 35 | query (str): A text query to search for. The query will be matched against paper titles, 36 | abstracts, venue names, and author names. 37 | 38 | fields (Optional[List[str]]): List of fields to return for each paper. 
39 | paperId and title are always returned. 40 | 41 | publication_types (Optional[List[str]]): Filter by publication types. 42 | 43 | open_access_pdf (bool): If True, only include papers with a public PDF. 44 | Default: False 45 | 46 | min_citation_count (Optional[int]): Minimum number of citations required. 47 | 48 | year (Optional[str]): Filter by publication year. Supports several formats: 49 | - Single year: "2019" 50 | - Year range: "2016-2020" 51 | - Since year: "2010-" 52 | - Until year: "-2015" 53 | 54 | venue (Optional[List[str]]): Filter by publication venues. 55 | Accepts full venue names or ISO4 abbreviations. 56 | 57 | fields_of_study (Optional[List[str]]): Filter by fields of study. 58 | 59 | offset (int): Number of results to skip for pagination. 60 | Default: 0 61 | 62 | limit (int): Maximum number of results to return. 63 | Default: 10 64 | Maximum: 100 65 | 66 | Returns: 67 | Dict: { 68 | "total": int, # Total number of papers matching the query 69 | "offset": int, # Current offset in the results 70 | "next": int, # Offset for the next page of results (if available) 71 | "data": List[Dict] # List of papers with requested fields 72 | } 73 | """ 74 | if not query.strip(): 75 | return create_error_response( 76 | ErrorType.VALIDATION, 77 | "Query string cannot be empty" 78 | ) 79 | 80 | # Validate and prepare fields 81 | if fields is None: 82 | fields = PaperFields.DEFAULT 83 | else: 84 | invalid_fields = set(fields) - PaperFields.VALID_FIELDS 85 | if invalid_fields: 86 | return create_error_response( 87 | ErrorType.VALIDATION, 88 | f"Invalid fields: {', '.join(invalid_fields)}", 89 | {"valid_fields": list(PaperFields.VALID_FIELDS)} 90 | ) 91 | 92 | # Validate and prepare parameters 93 | limit = min(limit, 100) 94 | params = { 95 | "query": query, 96 | "offset": offset, 97 | "limit": limit, 98 | "fields": ",".join(fields) 99 | } 100 | 101 | # Add optional filters 102 | if publication_types: 103 | params["publicationTypes"] = 
",".join(publication_types) 104 | if open_access_pdf: 105 | params["openAccessPdf"] = "true" 106 | if min_citation_count is not None: 107 | params["minCitationCount"] = min_citation_count 108 | if year: 109 | params["year"] = year 110 | if venue: 111 | params["venue"] = ",".join(venue) 112 | if fields_of_study: 113 | params["fieldsOfStudy"] = ",".join(fields_of_study) 114 | 115 | return await make_request("/paper/search", params) 116 | 117 | @mcp.tool() 118 | async def paper_bulk_search( 119 | context: Context, 120 | query: Optional[str] = None, 121 | token: Optional[str] = None, 122 | fields: Optional[List[str]] = None, 123 | sort: Optional[str] = None, 124 | publication_types: Optional[List[str]] = None, 125 | open_access_pdf: bool = False, 126 | min_citation_count: Optional[int] = None, 127 | publication_date_or_year: Optional[str] = None, 128 | year: Optional[str] = None, 129 | venue: Optional[List[str]] = None, 130 | fields_of_study: Optional[List[str]] = None 131 | ) -> Dict: 132 | """ 133 | Bulk search for papers with advanced filtering and sorting options. 134 | Intended for retrieving large sets of papers efficiently. 135 | 136 | Args: 137 | query (Optional[str]): Text query to match against paper title and abstract. 138 | Supports boolean logic with +, |, -, ", *, (), and ~N. 
139 | 140 | token (Optional[str]): Continuation token for pagination 141 | 142 | fields (Optional[List[str]]): Fields to return for each paper 143 | paperId is always returned 144 | Default: paperId and title only 145 | 146 | sort (Optional[str]): Sort order in format 'field:order' 147 | Fields: paperId, publicationDate, citationCount 148 | Order: asc (default), desc 149 | Default: 'paperId:asc' 150 | 151 | publication_types (Optional[List[str]]): Filter by publication types 152 | 153 | open_access_pdf (bool): Only include papers with public PDF 154 | 155 | min_citation_count (Optional[int]): Minimum citation threshold 156 | 157 | publication_date_or_year (Optional[str]): Date/year range filter 158 | Format: <startDate>:<endDate> in YYYY-MM-DD 159 | 160 | year (Optional[str]): Publication year filter 161 | Examples: '2019', '2016-2020', '2010-', '-2015' 162 | 163 | venue (Optional[List[str]]): Filter by publication venues 164 | 165 | fields_of_study (Optional[List[str]]): Filter by fields of study 166 | 167 | Returns: 168 | Dict: { 169 | 'total': int, # Total matching papers 170 | 'token': str, # Continuation token for next batch 171 | 'data': List[Dict] # Papers with requested fields 172 | } 173 | """ 174 | # Build request parameters 175 | params = {} 176 | 177 | # Add query if provided 178 | if query: 179 | params["query"] = query.strip() 180 | 181 | # Add continuation token if provided 182 | if token: 183 | params["token"] = token 184 | 185 | # Add fields if provided 186 | if fields: 187 | # Validate fields 188 | invalid_fields = set(fields) - PaperFields.VALID_FIELDS 189 | if invalid_fields: 190 | return create_error_response( 191 | ErrorType.VALIDATION, 192 | f"Invalid fields: {', '.join(invalid_fields)}", 193 | {"valid_fields": list(PaperFields.VALID_FIELDS)} 194 | ) 195 | params["fields"] = ",".join(fields) 196 | 197 | # Add sort if provided 198 | if sort: 199 | # Validate sort format 200 | valid_sort_fields = ["paperId", "publicationDate", "citationCount"] 
201 | valid_sort_orders = ["asc", "desc"] 202 | 203 | try: 204 | field, order = sort.split(":") 205 | if field not in valid_sort_fields: 206 | return create_error_response( 207 | ErrorType.VALIDATION, 208 | f"Invalid sort field. Must be one of: {', '.join(valid_sort_fields)}" 209 | ) 210 | if order not in valid_sort_orders: 211 | return create_error_response( 212 | ErrorType.VALIDATION, 213 | f"Invalid sort order. Must be one of: {', '.join(valid_sort_orders)}" 214 | ) 215 | params["sort"] = sort 216 | except ValueError: 217 | return create_error_response( 218 | ErrorType.VALIDATION, 219 | "Sort must be in format 'field:order'" 220 | ) 221 | 222 | # Add publication types if provided 223 | if publication_types: 224 | valid_types = { 225 | "Review", "JournalArticle", "CaseReport", "ClinicalTrial", 226 | "Conference", "Dataset", "Editorial", "LettersAndComments", 227 | "MetaAnalysis", "News", "Study", "Book", "BookSection" 228 | } 229 | invalid_types = set(publication_types) - valid_types 230 | if invalid_types: 231 | return create_error_response( 232 | ErrorType.VALIDATION, 233 | f"Invalid publication types: {', '.join(invalid_types)}", 234 | {"valid_types": list(valid_types)} 235 | ) 236 | params["publicationTypes"] = ",".join(publication_types) 237 | 238 | # Add open access PDF filter 239 | if open_access_pdf: 240 | params["openAccessPdf"] = "true" 241 | 242 | # Add minimum citation count if provided 243 | if min_citation_count is not None: 244 | if min_citation_count < 0: 245 | return create_error_response( 246 | ErrorType.VALIDATION, 247 | "Minimum citation count cannot be negative" 248 | ) 249 | params["minCitationCount"] = str(min_citation_count) 250 | 251 | # Add publication date/year if provided 252 | if publication_date_or_year: 253 | params["publicationDateOrYear"] = publication_date_or_year 254 | elif year: 255 | params["year"] = year 256 | 257 | # Add venue filter if provided 258 | if venue: 259 | params["venue"] = ",".join(venue) 260 | 261 | # Add fields 
of study filter if provided 262 | if fields_of_study: 263 | valid_fields = { 264 | "Computer Science", "Medicine", "Chemistry", "Biology", 265 | "Materials Science", "Physics", "Geology", "Psychology", 266 | "Art", "History", "Geography", "Sociology", "Business", 267 | "Political Science", "Economics", "Philosophy", "Mathematics", 268 | "Engineering", "Environmental Science", "Agricultural and Food Sciences", 269 | "Education", "Law", "Linguistics" 270 | } 271 | invalid_fields = set(fields_of_study) - valid_fields 272 | if invalid_fields: 273 | return create_error_response( 274 | ErrorType.VALIDATION, 275 | f"Invalid fields of study: {', '.join(invalid_fields)}", 276 | {"valid_fields": list(valid_fields)} 277 | ) 278 | params["fieldsOfStudy"] = ",".join(fields_of_study) 279 | 280 | # Make the API request 281 | result = await make_request("/paper/search/bulk", params) 282 | 283 | # Handle potential errors 284 | if isinstance(result, Dict) and "error" in result: 285 | return result 286 | 287 | return result 288 | 289 | @mcp.tool() 290 | async def paper_title_search( 291 | context: Context, 292 | query: str, 293 | fields: Optional[List[str]] = None, 294 | publication_types: Optional[List[str]] = None, 295 | open_access_pdf: bool = False, 296 | min_citation_count: Optional[int] = None, 297 | year: Optional[str] = None, 298 | venue: Optional[List[str]] = None, 299 | fields_of_study: Optional[List[str]] = None 300 | ) -> Dict: 301 | """ 302 | Find a single paper by title match. This endpoint is optimized for finding a specific paper 303 | by its title and returns the best matching paper based on title similarity. 304 | 305 | Args: 306 | query (str): The title text to search for. The query will be matched against paper titles 307 | to find the closest match. 308 | 309 | fields (Optional[List[str]]): List of fields to return for the paper. 310 | paperId and title are always returned. 311 | 312 | publication_types (Optional[List[str]]): Filter by publication types. 
313 | 314 | open_access_pdf (bool): If True, only include papers with a public PDF. 315 | Default: False 316 | 317 | min_citation_count (Optional[int]): Minimum number of citations required. 318 | 319 | year (Optional[str]): Filter by publication year. Supports several formats: 320 | - Single year: "2019" 321 | - Year range: "2016-2020" 322 | - Since year: "2010-" 323 | - Until year: "-2015" 324 | 325 | venue (Optional[List[str]]): Filter by publication venues. 326 | Accepts full venue names or ISO4 abbreviations. 327 | 328 | fields_of_study (Optional[List[str]]): Filter by fields of study. 329 | 330 | Returns: 331 | Dict: { 332 | "paperId": str, # Semantic Scholar Paper ID 333 | "title": str, # Paper title 334 | "matchScore": float, # Similarity score between query and matched title 335 | ... # Additional requested fields 336 | } 337 | 338 | Returns error response if no matching paper is found. 339 | """ 340 | if not query.strip(): 341 | return create_error_response( 342 | ErrorType.VALIDATION, 343 | "Query string cannot be empty" 344 | ) 345 | 346 | # Validate and prepare fields 347 | if fields is None: 348 | fields = PaperFields.DEFAULT 349 | else: 350 | invalid_fields = set(fields) - PaperFields.VALID_FIELDS 351 | if invalid_fields: 352 | return create_error_response( 353 | ErrorType.VALIDATION, 354 | f"Invalid fields: {', '.join(invalid_fields)}", 355 | {"valid_fields": list(PaperFields.VALID_FIELDS)} 356 | ) 357 | 358 | # Build base parameters 359 | params = {"query": query} 360 | 361 | # Add optional parameters 362 | if fields: 363 | params["fields"] = ",".join(fields) 364 | if publication_types: 365 | params["publicationTypes"] = ",".join(publication_types) 366 | if open_access_pdf: 367 | params["openAccessPdf"] = "true" 368 | if min_citation_count is not None: 369 | params["minCitationCount"] = str(min_citation_count) 370 | if year: 371 | params["year"] = year 372 | if venue: 373 | params["venue"] = ",".join(venue) 374 | if fields_of_study: 375 | 
params["fieldsOfStudy"] = ",".join(fields_of_study) 376 | 377 | result = await make_request("/paper/search/match", params) 378 | 379 | # Handle specific error cases 380 | if isinstance(result, Dict): 381 | if "error" in result: 382 | error_msg = result["error"].get("message", "") 383 | if "404" in error_msg: 384 | return create_error_response( 385 | ErrorType.VALIDATION, 386 | "No matching paper found", 387 | {"original_query": query} 388 | ) 389 | return result 390 | 391 | return result 392 | 393 | @mcp.tool() 394 | async def paper_details( 395 | context: Context, 396 | paper_id: str, 397 | fields: Optional[List[str]] = None 398 | ) -> Dict: 399 | """ 400 | Get details about a paper using various types of identifiers. 401 | This endpoint provides comprehensive metadata about a paper. 402 | 403 | Args: 404 | paper_id (str): Paper identifier in one of the following formats: 405 | - Semantic Scholar ID (e.g., "649def34f8be52c8b66281af98ae884c09aef38b") 406 | - DOI:<doi> (e.g., "DOI:10.18653/v1/N18-3011") 407 | - ARXIV:<id> (e.g., "ARXIV:2106.15928") 408 | - MAG:<id> (e.g., "MAG:112218234") 409 | - ACL:<id> (e.g., "ACL:W12-3903") 410 | - PMID:<id> (e.g., "PMID:19872477") 411 | - PMCID:<id> (e.g., "PMCID:2323736") 412 | - URL:<url> (e.g., "URL:https://arxiv.org/abs/2106.15928v1") 413 | 414 | fields (Optional[List[str]]): List of fields to return. 415 | paperId is always returned. 416 | 417 | Returns: 418 | Dict: Paper details with requested fields. 419 | Always includes paperId. 420 | Returns error response if paper not found. 
421 | """ 422 | if not paper_id.strip(): 423 | return create_error_response( 424 | ErrorType.VALIDATION, 425 | "Paper ID cannot be empty" 426 | ) 427 | 428 | # Build request parameters 429 | params = {} 430 | if fields: 431 | params["fields"] = ",".join(fields) 432 | 433 | # Make the API request 434 | result = await make_request(f"/paper/{paper_id}", params) 435 | 436 | # Handle potential errors 437 | if isinstance(result, Dict) and "error" in result: 438 | error_msg = result["error"].get("message", "") 439 | if "404" in error_msg: 440 | return create_error_response( 441 | ErrorType.VALIDATION, 442 | "Paper not found", 443 | {"paper_id": paper_id} 444 | ) 445 | return result 446 | 447 | return result 448 | 449 | @mcp.tool() 450 | async def paper_batch_details( 451 | context: Context, 452 | paper_ids: List[str], 453 | fields: Optional[str] = None 454 | ) -> Dict: 455 | """ 456 | Get details for multiple papers in a single batch request. 457 | This endpoint is optimized for efficiently retrieving details about known papers. 458 | 459 | Args: 460 | paper_ids (List[str]): List of paper identifiers. Each ID can be in any of these formats: 461 | - Semantic Scholar ID (e.g., "649def34f8be52c8b66281af98ae884c09aef38b") 462 | - DOI:<doi> (e.g., "DOI:10.18653/v1/N18-3011") 463 | - ARXIV:<id> (e.g., "ARXIV:2106.15928") 464 | - MAG:<id> (e.g., "MAG:112218234") 465 | - ACL:<id> (e.g., "ACL:W12-3903") 466 | - PMID:<id> (e.g., "PMID:19872477") 467 | - PMCID:<id> (e.g., "PMCID:2323736") 468 | - URL:<url> (e.g., "URL:https://arxiv.org/abs/2106.15928v1") 469 | Maximum: 500 IDs per request 470 | 471 | fields (Optional[str]): Comma-separated list of fields to return for each paper. 472 | paperId is always returned. 473 | 474 | Returns: 475 | List[Dict]: List of paper details with requested fields. 
476 | - Results maintain the same order as input paper_ids 477 | - Invalid or not found paper IDs return null in the results 478 | - Each paper object contains the requested fields 479 | - paperId is always included in each paper object 480 | """ 481 | # Validate inputs 482 | if not paper_ids: 483 | return create_error_response( 484 | ErrorType.VALIDATION, 485 | "Paper IDs list cannot be empty" 486 | ) 487 | 488 | if len(paper_ids) > 500: 489 | return create_error_response( 490 | ErrorType.VALIDATION, 491 | "Cannot process more than 500 paper IDs at once", 492 | {"max_papers": 500, "received": len(paper_ids)} 493 | ) 494 | 495 | # Validate fields if provided 496 | if fields: 497 | field_list = fields.split(",") 498 | invalid_fields = set(field_list) - PaperFields.VALID_FIELDS 499 | if invalid_fields: 500 | return create_error_response( 501 | ErrorType.VALIDATION, 502 | f"Invalid fields: {', '.join(invalid_fields)}", 503 | {"valid_fields": list(PaperFields.VALID_FIELDS)} 504 | ) 505 | 506 | # Build request parameters 507 | params = {} 508 | if fields: 509 | params["fields"] = fields 510 | 511 | # Make POST request with proper structure 512 | try: 513 | async with httpx.AsyncClient(timeout=Config.TIMEOUT) as client: 514 | api_key = get_api_key() 515 | headers = {"x-api-key": api_key} if api_key else {} 516 | 517 | response = await client.post( 518 | f"{Config.BASE_URL}/paper/batch", 519 | params=params, 520 | json={"ids": paper_ids}, 521 | headers=headers 522 | ) 523 | response.raise_for_status() 524 | return response.json() 525 | 526 | except httpx.HTTPStatusError as e: 527 | if e.response.status_code == 429: 528 | return create_error_response( 529 | ErrorType.RATE_LIMIT, 530 | "Rate limit exceeded", 531 | {"retry_after": e.response.headers.get("retry-after")} 532 | ) 533 | return create_error_response( 534 | ErrorType.API_ERROR, 535 | f"HTTP error: {e.response.status_code}", 536 | {"response": e.response.text} 537 | ) 538 | except httpx.TimeoutException: 539 | 
return create_error_response( 540 | ErrorType.TIMEOUT, 541 | f"Request timed out after {Config.TIMEOUT} seconds" 542 | ) 543 | except Exception as e: 544 | return create_error_response( 545 | ErrorType.API_ERROR, 546 | str(e) 547 | ) 548 | 549 | @mcp.tool() 550 | async def paper_authors( 551 | context: Context, 552 | paper_id: str, 553 | fields: Optional[List[str]] = None, 554 | offset: int = 0, 555 | limit: int = 100 556 | ) -> Dict: 557 | """ 558 | Get details about the authors of a paper with pagination support. 559 | This endpoint provides author information and their contributions. 560 | 561 | Args: 562 | paper_id (str): Paper identifier in one of the following formats: 563 | - Semantic Scholar ID (e.g., "649def34f8be52c8b66281af98ae884c09aef38b") 564 | - DOI:<doi> (e.g., "DOI:10.18653/v1/N18-3011") 565 | - ARXIV:<id> (e.g., "ARXIV:2106.15928") 566 | - MAG:<id> (e.g., "MAG:112218234") 567 | - ACL:<id> (e.g., "ACL:W12-3903") 568 | - PMID:<id> (e.g., "PMID:19872477") 569 | - PMCID:<id> (e.g., "PMCID:2323736") 570 | - URL:<url> (e.g., "URL:https://arxiv.org/abs/2106.15928v1") 571 | 572 | fields (Optional[List[str]]): List of fields to return for each author. 573 | authorId is always returned. 574 | 575 | offset (int): Number of authors to skip for pagination. 576 | Default: 0 577 | 578 | limit (int): Maximum number of authors to return. 
579 | Default: 100 580 | Maximum: 1000 581 | 582 | Returns: 583 | Dict: { 584 | "offset": int, # Current offset in the results 585 | "next": int, # Next offset (if more results available) 586 | "data": List[Dict] # List of authors with requested fields 587 | } 588 | """ 589 | if not paper_id.strip(): 590 | return create_error_response( 591 | ErrorType.VALIDATION, 592 | "Paper ID cannot be empty" 593 | ) 594 | 595 | # Validate limit 596 | if limit > 1000: 597 | return create_error_response( 598 | ErrorType.VALIDATION, 599 | "Limit cannot exceed 1000", 600 | {"max_limit": 1000} 601 | ) 602 | 603 | # Validate fields 604 | if fields: 605 | invalid_fields = set(fields) - AuthorDetailFields.VALID_FIELDS 606 | if invalid_fields: 607 | return create_error_response( 608 | ErrorType.VALIDATION, 609 | f"Invalid fields: {', '.join(invalid_fields)}", 610 | {"valid_fields": list(AuthorDetailFields.VALID_FIELDS)} 611 | ) 612 | 613 | # Build request parameters 614 | params = { 615 | "offset": offset, 616 | "limit": limit 617 | } 618 | if fields: 619 | params["fields"] = ",".join(fields) 620 | 621 | # Make the API request 622 | result = await make_request(f"/paper/{paper_id}/authors", params) 623 | 624 | # Handle potential errors 625 | if isinstance(result, Dict) and "error" in result: 626 | error_msg = result["error"].get("message", "") 627 | if "404" in error_msg: 628 | return create_error_response( 629 | ErrorType.VALIDATION, 630 | "Paper not found", 631 | {"paper_id": paper_id} 632 | ) 633 | return result 634 | 635 | return result 636 | 637 | @mcp.tool() 638 | async def paper_citations( 639 | context: Context, 640 | paper_id: str, 641 | fields: Optional[List[str]] = None, 642 | offset: int = 0, 643 | limit: int = 100 644 | ) -> Dict: 645 | """ 646 | Get papers that cite the specified paper (papers where this paper appears in their bibliography). 647 | This endpoint provides detailed citation information including citation contexts. 
648 | 649 | Args: 650 | paper_id (str): Paper identifier in one of the following formats: 651 | - Semantic Scholar ID (e.g., "649def34f8be52c8b66281af98ae884c09aef38b") 652 | - DOI:<doi> (e.g., "DOI:10.18653/v1/N18-3011") 653 | - ARXIV:<id> (e.g., "ARXIV:2106.15928") 654 | - MAG:<id> (e.g., "MAG:112218234") 655 | - ACL:<id> (e.g., "ACL:W12-3903") 656 | - PMID:<id> (e.g., "PMID:19872477") 657 | - PMCID:<id> (e.g., "PMCID:2323736") 658 | - URL:<url> (e.g., "URL:https://arxiv.org/abs/2106.15928v1") 659 | 660 | fields (Optional[List[str]]): List of fields to return for each citing paper. 661 | paperId is always returned. 662 | 663 | offset (int): Number of citations to skip for pagination. 664 | Default: 0 665 | 666 | limit (int): Maximum number of citations to return. 667 | Default: 100 668 | Maximum: 1000 669 | 670 | Returns: 671 | Dict: { 672 | "offset": int, # Current offset in the results 673 | "next": int, # Next offset (if more results available) 674 | "data": List[Dict] # List of citing papers with requested fields 675 | } 676 | """ 677 | if not paper_id.strip(): 678 | return create_error_response( 679 | ErrorType.VALIDATION, 680 | "Paper ID cannot be empty" 681 | ) 682 | 683 | # Validate limit 684 | if limit > 1000: 685 | return create_error_response( 686 | ErrorType.VALIDATION, 687 | "Limit cannot exceed 1000", 688 | {"max_limit": 1000} 689 | ) 690 | 691 | # Validate fields 692 | if fields: 693 | invalid_fields = set(fields) - CitationReferenceFields.VALID_FIELDS 694 | if invalid_fields: 695 | return create_error_response( 696 | ErrorType.VALIDATION, 697 | f"Invalid fields: {', '.join(invalid_fields)}", 698 | {"valid_fields": list(CitationReferenceFields.VALID_FIELDS)} 699 | ) 700 | 701 | # Build request parameters 702 | params = { 703 | "offset": offset, 704 | "limit": limit 705 | } 706 | if fields: 707 | params["fields"] = ",".join(fields) 708 | 709 | # Make the API request 710 | result = await make_request(f"/paper/{paper_id}/citations", params) 711 | 712 
| # Handle potential errors 713 | if isinstance(result, Dict) and "error" in result: 714 | error_msg = result["error"].get("message", "") 715 | if "404" in error_msg: 716 | return create_error_response( 717 | ErrorType.VALIDATION, 718 | "Paper not found", 719 | {"paper_id": paper_id} 720 | ) 721 | return result 722 | 723 | return result 724 | 725 | @mcp.tool() 726 | async def paper_references( 727 | context: Context, 728 | paper_id: str, 729 | fields: Optional[List[str]] = None, 730 | offset: int = 0, 731 | limit: int = 100 732 | ) -> Dict: 733 | """ 734 | Get papers cited by the specified paper (papers appearing in this paper's bibliography). 735 | This endpoint provides detailed reference information including citation contexts. 736 | 737 | Args: 738 | paper_id (str): Paper identifier in one of the following formats: 739 | - Semantic Scholar ID (e.g., "649def34f8be52c8b66281af98ae884c09aef38b") 740 | - DOI:<doi> (e.g., "DOI:10.18653/v1/N18-3011") 741 | - ARXIV:<id> (e.g., "ARXIV:2106.15928") 742 | - MAG:<id> (e.g., "MAG:112218234") 743 | - ACL:<id> (e.g., "ACL:W12-3903") 744 | - PMID:<id> (e.g., "PMID:19872477") 745 | - PMCID:<id> (e.g., "PMCID:2323736") 746 | - URL:<url> (e.g., "URL:https://arxiv.org/abs/2106.15928v1") 747 | 748 | fields (Optional[List[str]]): List of fields to return for each referenced paper. 749 | paperId is always returned. 750 | 751 | offset (int): Number of references to skip for pagination. 752 | Default: 0 753 | 754 | limit (int): Maximum number of references to return. 
755 | Default: 100 756 | Maximum: 1000 757 | 758 | Returns: 759 | Dict: { 760 | "offset": int, # Current offset in the results 761 | "next": int, # Next offset (if more results available) 762 | "data": List[Dict] # List of referenced papers with requested fields 763 | } 764 | """ 765 | if not paper_id.strip(): 766 | return create_error_response( 767 | ErrorType.VALIDATION, 768 | "Paper ID cannot be empty" 769 | ) 770 | 771 | # Validate limit 772 | if limit > 1000: 773 | return create_error_response( 774 | ErrorType.VALIDATION, 775 | "Limit cannot exceed 1000", 776 | {"max_limit": 1000} 777 | ) 778 | 779 | # Validate fields 780 | if fields: 781 | invalid_fields = set(fields) - CitationReferenceFields.VALID_FIELDS 782 | if invalid_fields: 783 | return create_error_response( 784 | ErrorType.VALIDATION, 785 | f"Invalid fields: {', '.join(invalid_fields)}", 786 | {"valid_fields": list(CitationReferenceFields.VALID_FIELDS)} 787 | ) 788 | 789 | # Build request parameters 790 | params = { 791 | "offset": offset, 792 | "limit": limit 793 | } 794 | if fields: 795 | params["fields"] = ",".join(fields) 796 | 797 | # Make the API request 798 | result = await make_request(f"/paper/{paper_id}/references", params) 799 | 800 | # Handle potential errors 801 | if isinstance(result, Dict) and "error" in result: 802 | error_msg = result["error"].get("message", "") 803 | if "404" in error_msg: 804 | return create_error_response( 805 | ErrorType.VALIDATION, 806 | "Paper not found", 807 | {"paper_id": paper_id} 808 | ) 809 | return result 810 | 811 | return result ```