# Directory Structure
```
├── .gitignore
├── .python-version
├── Dockerfile
├── LICENSE
├── pyproject.toml
├── README.md
├── smithery.yaml
├── src
│   └── huggingface
│       ├── __init__.py
│       └── server.py
└── uv.lock
```
# Files
--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------
```
1 | 3.13
2 |
```
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
```
1 | # Python-generated files
2 | __pycache__/
3 | *.py[oc]
4 | build/
5 | dist/
6 | wheels/
7 | *.egg-info
8 |
9 | # Virtual environments
10 | .venv
11 |
```
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
```markdown
1 | # 🤗 Hugging Face MCP Server 🤗
2 |
3 | [![smithery badge](https://smithery.ai/badge/@shreyaskarnik/huggingface-mcp-server)](https://smithery.ai/server/@shreyaskarnik/huggingface-mcp-server)
4 |
5 | A Model Context Protocol (MCP) server that provides read-only access to the Hugging Face Hub APIs. This server allows LLMs like Claude to interact with Hugging Face's models, datasets, spaces, papers, and collections.
6 |
7 | ## Components
8 |
9 | ### Resources
10 |
11 | The server exposes popular Hugging Face resources (see the client sketch after this list):
12 |
13 | - Custom `hf://` URI scheme for accessing resources
14 | - Models with `hf://model/{model_id}` URIs
15 | - Datasets with `hf://dataset/{dataset_id}` URIs
16 | - Spaces with `hf://space/{space_id}` URIs
17 | - All resources have descriptive names and a JSON content type
18 |
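Resources can be read by any MCP client. Here is a minimal client-side sketch using the official `mcp` Python SDK (the `--directory` path is a placeholder for your checkout, and the result shape may vary slightly across SDK versions):

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from pydantic import AnyUrl


async def read_model_resource() -> None:
    # Launch the server over stdio, the same way Claude Desktop would
    params = StdioServerParameters(
        command="uv",
        args=[
            "--directory",
            "/absolute/path/to/huggingface-mcp-server",
            "run",
            "huggingface",
        ],
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Read one of the popular-model resources the server lists
            result = await session.read_resource(
                AnyUrl("hf://model/mistralai/Mistral-7B-Instruct-v0.2")
            )
            print(result.contents[0].text)


asyncio.run(read_model_resource())
```
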
19 | ### Prompts
20 |
21 | The server provides two prompt templates (see the sketch after this list):
22 |
23 | - `compare-models`: Generates a comparison between multiple Hugging Face models
24 | - Required `model_ids` argument (comma-separated model IDs)
25 | - Retrieves model details and formats them for comparison
26 |
27 | - `summarize-paper`: Summarizes a research paper from Hugging Face
28 | - Required `arxiv_id` argument for paper identification
29 | - Optional `detail_level` argument (brief/detailed) to control summary depth
30 | - Combines paper metadata with implementation details
31 |
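With an initialized `ClientSession` (as in the resource sketch above), a client could render the `summarize-paper` template like this; the arXiv ID is only an example:

```python
from mcp import ClientSession


async def render_paper_summary_prompt(session: ClientSession) -> str:
    # Ask the server to assemble the summarize-paper template
    result = await session.get_prompt(
        "summarize-paper",
        arguments={"arxiv_id": "2307.09288", "detail_level": "brief"},
    )
    # The server returns one user message wrapping TextContent
    return result.messages[0].content.text
```
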
32 | ### Tools
33 |
34 | The server implements several tool categories (see the example call after this list):
35 |
36 | - **Model Tools**
37 | - `search-models`: Search models with filters for query, author, tags, and limit
38 | - `get-model-info`: Get detailed information about a specific model
39 |
40 | - **Dataset Tools**
41 | - `search-datasets`: Search datasets with filters
42 | - `get-dataset-info`: Get detailed information about a specific dataset
43 |
44 | - **Space Tools**
45 | - `search-spaces`: Search Spaces with filters including SDK type
46 | - `get-space-info`: Get detailed information about a specific Space
47 |
48 | - **Paper Tools**
49 | - `get-paper-info`: Get information about a paper and its implementations
50 | - `get-daily-papers`: Get the list of curated daily papers
51 |
52 | - **Collection Tools**
53 | - `search-collections`: Search collections with various filters
54 | - `get-collection-info`: Get detailed information about a specific collection
55 |
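For example, a client with an initialized `ClientSession` might invoke `search-models` like this (a sketch; the filter values are illustrative):

```python
import json

from mcp import ClientSession


async def search_bert_models(session: ClientSession) -> list[dict]:
    # All filters are optional; the server defaults limit to 10
    result = await session.call_tool(
        "search-models",
        arguments={"query": "bert", "author": "google", "limit": 5},
    )
    # The server replies with a single TextContent item holding a JSON array
    return json.loads(result.content[0].text)
```
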
56 | ## Configuration
57 |
58 | The server does not require configuration, but supports optional Hugging Face authentication (see the sketch after this list):
59 |
60 | - Set the `HF_TOKEN` environment variable with your Hugging Face API token for:
61 | - Higher API rate limits
62 | - Access to private repositories (if authorized)
63 | - Improved reliability for high-volume requests
64 |
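As a sketch, a client can forward the token when spawning the server over stdio (same placeholder path as the resource example above):

```python
import os

from mcp import StdioServerParameters

# Pass HF_TOKEN from the client's environment through to the server process
params = StdioServerParameters(
    command="uv",
    args=["--directory", "/absolute/path/to/huggingface-mcp-server", "run", "huggingface"],
    env={"HF_TOKEN": os.environ.get("HF_TOKEN", "")},
)
```
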
65 | ## Quickstart
66 |
67 | ### Install
68 |
69 | #### Installing via Smithery
70 |
71 | To install huggingface-mcp-server for Claude Desktop automatically via [Smithery](https://smithery.ai/server/@shreyaskarnik/huggingface-mcp-server):
72 |
73 | ```bash
74 | npx -y @smithery/cli install @shreyaskarnik/huggingface-mcp-server --client claude
75 | ```
76 |
77 | #### Claude Desktop
78 |
79 | - On macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
80 | - On Windows: `%APPDATA%/Claude/claude_desktop_config.json`
81 |
82 | <details>
83 | <summary>Development/Unpublished Servers Configuration</summary>
84 |
85 | ```json
86 | "mcpServers": {
87 | "huggingface": {
88 | "command": "uv",
89 | "args": [
90 | "--directory",
91 | "/absolute/path/to/huggingface-mcp-server",
92 | "run",
93 |         "huggingface"
94 | ],
95 | "env": {
96 |         "HF_TOKEN": "your_token_here"
97 |       }
98 |     }
99 |   }
100 | ```
101 | 
102 | The `HF_TOKEN` entry is optional and can be omitted.
103 | 
102 | </details>
103 |
104 | ## Development
105 |
106 | ### Building and Publishing
107 |
108 | To prepare the package for distribution:
109 |
110 | 1. Sync dependencies and update lockfile:
111 |
112 | ```bash
113 | uv sync
114 | ```
115 |
116 | 2. Build package distributions:
117 |
118 | ```bash
119 | uv build
120 | ```
121 |
122 | This will create source and wheel distributions in the `dist/` directory.
123 |
124 | 3. Publish to PyPI:
125 |
126 | ```bash
127 | uv publish
128 | ```
129 |
130 | Note: You'll need to set PyPI credentials via environment variables or command flags:
131 |
132 | - Token: `--token` or `UV_PUBLISH_TOKEN`
133 | - Or username/password: `--username`/`UV_PUBLISH_USERNAME` and `--password`/`UV_PUBLISH_PASSWORD`
134 |
135 | ### Debugging
136 |
137 | Since MCP servers run over stdio, debugging can be challenging. For the best debugging
138 | experience, we strongly recommend using the [MCP Inspector](https://github.com/modelcontextprotocol/inspector).
139 |
140 | You can launch the MCP Inspector via [`npm`](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm) with this command:
141 |
142 | ```bash
143 | npx @modelcontextprotocol/inspector uv --directory /path/to/huggingface-mcp-server run huggingface
144 | ```
145 |
146 | Upon launching, the Inspector will display a URL that you can access in your browser to begin debugging.
147 |
148 | ## Example Prompts for Claude
149 |
150 | When using this server with Claude, try these example prompts:
151 |
152 | - "Search for BERT models on Hugging Face with less than 100 million parameters"
153 | - "Find the most popular datasets for text classification on Hugging Face"
154 | - "What are today's featured AI research papers on Hugging Face?"
155 | - "Summarize the paper with arXiv ID 2307.09288 using the Hugging Face MCP server"
156 | - "Compare the Llama-3-8B and Mistral-7B models from Hugging Face"
157 | - "Show me the most popular Gradio spaces for image generation"
158 | - "Find collections created by TheBloke that include Mixtral models"
159 |
160 | ## Troubleshooting
161 |
162 | If you encounter issues with the server:
163 |
164 | 1. Check server logs in Claude Desktop:
165 | - macOS: `~/Library/Logs/Claude/mcp-server-huggingface.log`
166 | - Windows: `%APPDATA%\Claude\logs\mcp-server-huggingface.log`
167 |
168 | 2. For API rate limiting errors, consider adding a Hugging Face API token
169 |
170 | 3. Make sure your machine has internet connectivity to reach the Hugging Face API
171 |
172 | 4. If a particular tool is failing, try accessing the same data through the Hugging Face website to verify it exists
173 |
```
--------------------------------------------------------------------------------
/src/huggingface/__init__.py:
--------------------------------------------------------------------------------
```python
1 | from . import server
2 | import asyncio
3 |
4 | def main():
5 | """Main entry point for the package."""
6 | asyncio.run(server.main())
7 |
8 | # Optionally expose other important items at package level
9 | __all__ = ['main', 'server']
```
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
```toml
1 | [project]
2 | name = "huggingface"
3 | version = "0.1.0"
4 | description = "Hugging Face MCP Server"
5 | readme = "README.md"
6 | requires-python = ">=3.13"
7 | dependencies = ["huggingface-hub>=0.29.3", "mcp>=1.4.1"]
8 | [[project.authors]]
9 | name = "Shreyas Karnik"
10 | email = "[email protected]"
11 |
12 | [build-system]
13 | requires = ["hatchling"]
14 | build-backend = "hatchling.build"
15 |
16 | [project.scripts]
17 | huggingface = "huggingface:main"
18 |
```
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
```dockerfile
1 | # Generated by https://smithery.ai. See: https://smithery.ai/docs/config#dockerfile
2 | FROM python:3.11-slim
3 |
4 | # Set working directory
5 | WORKDIR /app
6 |
7 | # Copy necessary files
8 | COPY pyproject.toml ./
9 | COPY README.md ./
10 | COPY src ./src
11 | COPY uv.lock ./
12 |
13 | # Upgrade pip and install build tools and the package
14 | RUN pip install --upgrade pip \
15 | && pip install hatchling \
16 | && pip install . --ignore-requires-python --no-build-isolation
17 |
18 | CMD ["huggingface"]
19 |
```
--------------------------------------------------------------------------------
/smithery.yaml:
--------------------------------------------------------------------------------
```yaml
1 | # Smithery configuration file: https://smithery.ai/docs/config#smitheryyaml
2 |
3 | startCommand:
4 | type: stdio
5 | configSchema:
6 | # JSON Schema defining the configuration options for the MCP.
7 | type: object
8 | properties:
9 | hfToken:
10 | type: string
11 | default: ""
12 | description: Optional Hugging Face API Token. Leave empty if not provided.
13 | commandFunction:
14 | # A JS function that produces the CLI command based on the given config to start the MCP on stdio.
15 | |-
16 | (config) => ({
17 | command: 'huggingface',
18 | args: [],
19 | env: {
20 | HF_TOKEN: config.hfToken || ""
21 | }
22 | })
23 | exampleConfig:
24 | hfToken: your_hf_token_here
25 |
```
--------------------------------------------------------------------------------
/src/huggingface/server.py:
--------------------------------------------------------------------------------
```python
1 | """
2 | 🤗 Hugging Face MCP Server 🤗
3 |
4 | This server provides Model Context Protocol (MCP) access to the Hugging Face API,
5 | allowing models like Claude to interact with models, datasets, spaces, and other
6 | Hugging Face resources in a read-only manner.
7 | """
8 |
9 | import asyncio
10 | import json
11 | import os
12 | from typing import Any, Dict, Optional
13 | from urllib.parse import quote
13 |
14 | import httpx
15 | import mcp.server.stdio
16 | import mcp.types as types
17 | from huggingface_hub import HfApi
18 | from mcp.server import NotificationOptions, Server
19 | from mcp.server.models import InitializationOptions
20 | from pydantic import AnyUrl
21 |
22 | # Initialize server
23 | server = Server("huggingface")
24 |
25 | # Initialize Hugging Face API client
26 | hf_api = HfApi()
27 |
28 | # Base URL for the Hugging Face API
29 | HF_API_BASE = "https://huggingface.co/api"
30 |
31 | # Initialize HTTP client for making requests; if HF_TOKEN is set in the
32 | # environment (see README), send it as a bearer token for higher rate limits
33 | _hf_token = os.environ.get("HF_TOKEN")
34 | http_client = httpx.AsyncClient(
35 |     timeout=30.0,
36 |     headers={"Authorization": f"Bearer {_hf_token}"} if _hf_token else None,
37 | )
33 |
34 |
35 | # Helper Functions
36 | async def make_hf_request(
37 | endpoint: str, params: Optional[Dict[str, Any]] = None
38 | ) -> Any:
39 |     """Make a request to the Hugging Face API with proper error handling.
40 | 
41 |     Returns the parsed JSON (a dict, or a list for search endpoints);
42 |     on failure, returns a dict with an "error" key.
43 |     """
40 | url = f"{HF_API_BASE}/{endpoint}"
41 | try:
42 | response = await http_client.get(url, params=params)
43 | response.raise_for_status()
44 | return response.json()
45 | except Exception as e:
46 | return {"error": str(e)}
47 |
48 |
49 | # Tool Handlers
50 | @server.list_tools()
51 | async def handle_list_tools() -> list[types.Tool]:
52 | """
53 | List available tools for interacting with the Hugging Face Hub.
54 | Each tool specifies its arguments using JSON Schema validation.
55 | """
56 | return [
57 | # Model Tools
58 | types.Tool(
59 | name="search-models",
60 | description="Search for models on Hugging Face Hub",
61 | inputSchema={
62 | "type": "object",
63 | "properties": {
64 | "query": {
65 | "type": "string",
66 | "description": "Search term (e.g., 'bert', 'gpt')",
67 | },
68 | "author": {
69 | "type": "string",
70 | "description": "Filter by author/organization (e.g., 'huggingface', 'google')",
71 | },
72 | "tags": {
73 | "type": "string",
74 | "description": "Filter by tags (e.g., 'text-classification', 'translation')",
75 | },
76 | "limit": {
77 | "type": "integer",
78 | "description": "Maximum number of results to return",
79 | },
80 | },
81 | },
82 | ),
83 | types.Tool(
84 | name="get-model-info",
85 | description="Get detailed information about a specific model",
86 | inputSchema={
87 | "type": "object",
88 | "properties": {
89 | "model_id": {
90 | "type": "string",
91 | "description": "The ID of the model (e.g., 'google/bert-base-uncased')",
92 | },
93 | },
94 | "required": ["model_id"],
95 | },
96 | ),
97 | # Dataset Tools
98 | types.Tool(
99 | name="search-datasets",
100 | description="Search for datasets on Hugging Face Hub",
101 | inputSchema={
102 | "type": "object",
103 | "properties": {
104 | "query": {"type": "string", "description": "Search term"},
105 | "author": {
106 | "type": "string",
107 | "description": "Filter by author/organization",
108 | },
109 | "tags": {"type": "string", "description": "Filter by tags"},
110 | "limit": {
111 | "type": "integer",
112 | "description": "Maximum number of results to return",
113 | },
114 | },
115 | },
116 | ),
117 | types.Tool(
118 | name="get-dataset-info",
119 | description="Get detailed information about a specific dataset",
120 | inputSchema={
121 | "type": "object",
122 | "properties": {
123 | "dataset_id": {
124 | "type": "string",
125 | "description": "The ID of the dataset (e.g., 'squad')",
126 | },
127 | },
128 | "required": ["dataset_id"],
129 | },
130 | ),
131 | # Space Tools
132 | types.Tool(
133 | name="search-spaces",
134 | description="Search for Spaces on Hugging Face Hub",
135 | inputSchema={
136 | "type": "object",
137 | "properties": {
138 | "query": {"type": "string", "description": "Search term"},
139 | "author": {
140 | "type": "string",
141 | "description": "Filter by author/organization",
142 | },
143 | "tags": {"type": "string", "description": "Filter by tags"},
144 | "sdk": {
145 | "type": "string",
146 | "description": "Filter by SDK (e.g., 'streamlit', 'gradio', 'docker')",
147 | },
148 | "limit": {
149 | "type": "integer",
150 | "description": "Maximum number of results to return",
151 | },
152 | },
153 | },
154 | ),
155 | types.Tool(
156 | name="get-space-info",
157 | description="Get detailed information about a specific Space",
158 | inputSchema={
159 | "type": "object",
160 | "properties": {
161 | "space_id": {
162 | "type": "string",
163 | "description": "The ID of the Space (e.g., 'huggingface/diffusers-demo')",
164 | },
165 | },
166 | "required": ["space_id"],
167 | },
168 | ),
169 | # Papers Tools
170 | types.Tool(
171 | name="get-paper-info",
172 | description="Get information about a specific paper on Hugging Face",
173 | inputSchema={
174 | "type": "object",
175 | "properties": {
176 | "arxiv_id": {
177 | "type": "string",
178 | "description": "The arXiv ID of the paper (e.g., '1810.04805')",
179 | },
180 | },
181 | "required": ["arxiv_id"],
182 | },
183 | ),
184 | types.Tool(
185 | name="get-daily-papers",
186 | description="Get the list of daily papers curated by Hugging Face",
187 | inputSchema={
188 | "type": "object",
189 | "properties": {},
190 | },
191 | ),
192 | # Collections Tools
193 | types.Tool(
194 | name="search-collections",
195 | description="Search for collections on Hugging Face Hub",
196 | inputSchema={
197 | "type": "object",
198 | "properties": {
199 | "owner": {"type": "string", "description": "Filter by owner"},
200 | "item": {
201 | "type": "string",
202 | "description": "Filter by item (e.g., 'models/teknium/OpenHermes-2.5-Mistral-7B')",
203 | },
204 | "query": {
205 | "type": "string",
206 | "description": "Search term for titles and descriptions",
207 | },
208 | "limit": {
209 | "type": "integer",
210 | "description": "Maximum number of results to return",
211 | },
212 | },
213 | },
214 | ),
215 | types.Tool(
216 | name="get-collection-info",
217 | description="Get detailed information about a specific collection",
218 | inputSchema={
219 | "type": "object",
220 | "properties": {
221 | "namespace": {
222 | "type": "string",
223 | "description": "The namespace of the collection (user or organization)",
224 | },
225 | "collection_id": {
226 | "type": "string",
227 | "description": "The ID part of the collection",
228 | },
229 | },
230 | "required": ["namespace", "collection_id"],
231 | },
232 | ),
233 | ]
234 |
235 |
236 | @server.call_tool()
237 | async def handle_call_tool(
238 | name: str, arguments: dict | None
239 | ) -> list[types.TextContent | types.ImageContent | types.EmbeddedResource]:
240 | """
241 | Handle tool execution requests for Hugging Face API.
242 | """
243 | if not arguments:
244 | arguments = {}
245 |
246 | if name == "search-models":
247 | query = arguments.get("query")
248 | author = arguments.get("author")
249 | tags = arguments.get("tags")
250 | limit = arguments.get("limit", 10)
251 |
252 | params = {"limit": limit}
253 | if query:
254 | params["search"] = query
255 | if author:
256 | params["author"] = author
257 | if tags:
258 | params["filter"] = tags
259 |
260 | data = await make_hf_request("models", params)
261 |
262 | if "error" in data:
263 | return [
264 | types.TextContent(
265 | type="text", text=f"Error searching models: {data['error']}"
266 | )
267 | ]
268 |
269 | # Format the results
270 | results = []
271 | for model in data:
272 | model_info = {
273 | "id": model.get("id", ""),
274 | "name": model.get("modelId", ""),
275 | "author": model.get("author", ""),
276 | "tags": model.get("tags", []),
277 | "downloads": model.get("downloads", 0),
278 | "likes": model.get("likes", 0),
279 | "lastModified": model.get("lastModified", ""),
280 | }
281 | results.append(model_info)
282 |
283 | return [types.TextContent(type="text", text=json.dumps(results, indent=2))]
284 |
285 | elif name == "get-model-info":
286 | model_id = arguments.get("model_id")
287 | if not model_id:
288 | return [types.TextContent(type="text", text="Error: model_id is required")]
289 |
290 |         data = await make_hf_request(f"models/{quote(model_id)}")
291 |
292 | if "error" in data:
293 | return [
294 | types.TextContent(
295 | type="text",
296 | text=f"Error retrieving model information: {data['error']}",
297 | )
298 | ]
299 |
300 | # Format the result
301 | model_info = {
302 | "id": data.get("id", ""),
303 | "name": data.get("modelId", ""),
304 | "author": data.get("author", ""),
305 | "tags": data.get("tags", []),
306 | "pipeline_tag": data.get("pipeline_tag", ""),
307 | "downloads": data.get("downloads", 0),
308 | "likes": data.get("likes", 0),
309 | "lastModified": data.get("lastModified", ""),
310 | "description": data.get("description", "No description available"),
311 | }
312 |
313 | # Add model card if available
314 | if "card" in data and data["card"]:
315 | model_info["model_card"] = (
316 | data["card"].get("data", {}).get("text", "No model card available")
317 | )
318 |
319 | return [types.TextContent(type="text", text=json.dumps(model_info, indent=2))]
320 |
321 | elif name == "search-datasets":
322 | query = arguments.get("query")
323 | author = arguments.get("author")
324 | tags = arguments.get("tags")
325 | limit = arguments.get("limit", 10)
326 |
327 | params = {"limit": limit}
328 | if query:
329 | params["search"] = query
330 | if author:
331 | params["author"] = author
332 | if tags:
333 | params["filter"] = tags
334 |
335 | data = await make_hf_request("datasets", params)
336 |
337 | if "error" in data:
338 | return [
339 | types.TextContent(
340 | type="text", text=f"Error searching datasets: {data['error']}"
341 | )
342 | ]
343 |
344 | # Format the results
345 | results = []
346 | for dataset in data:
347 | dataset_info = {
348 | "id": dataset.get("id", ""),
349 | "name": dataset.get("datasetId", ""),
350 | "author": dataset.get("author", ""),
351 | "tags": dataset.get("tags", []),
352 | "downloads": dataset.get("downloads", 0),
353 | "likes": dataset.get("likes", 0),
354 | "lastModified": dataset.get("lastModified", ""),
355 | }
356 | results.append(dataset_info)
357 |
358 | return [types.TextContent(type="text", text=json.dumps(results, indent=2))]
359 |
360 | elif name == "get-dataset-info":
361 | dataset_id = arguments.get("dataset_id")
362 | if not dataset_id:
363 | return [
364 | types.TextContent(type="text", text="Error: dataset_id is required")
365 | ]
366 |
367 |         data = await make_hf_request(f"datasets/{quote(dataset_id)}")
368 |
369 | if "error" in data:
370 | return [
371 | types.TextContent(
372 | type="text",
373 | text=f"Error retrieving dataset information: {data['error']}",
374 | )
375 | ]
376 |
377 | # Format the result
378 | dataset_info = {
379 | "id": data.get("id", ""),
380 | "name": data.get("datasetId", ""),
381 | "author": data.get("author", ""),
382 | "tags": data.get("tags", []),
383 | "downloads": data.get("downloads", 0),
384 | "likes": data.get("likes", 0),
385 | "lastModified": data.get("lastModified", ""),
386 | "description": data.get("description", "No description available"),
387 | }
388 |
389 | # Add dataset card if available
390 | if "card" in data and data["card"]:
391 | dataset_info["dataset_card"] = (
392 | data["card"].get("data", {}).get("text", "No dataset card available")
393 | )
394 |
395 | return [types.TextContent(type="text", text=json.dumps(dataset_info, indent=2))]
396 |
397 | elif name == "search-spaces":
398 | query = arguments.get("query")
399 | author = arguments.get("author")
400 | tags = arguments.get("tags")
401 | sdk = arguments.get("sdk")
402 | limit = arguments.get("limit", 10)
403 |
404 | params = {"limit": limit}
405 | if query:
406 | params["search"] = query
407 | if author:
408 | params["author"] = author
409 | if tags:
410 | params["filter"] = tags
411 |         if sdk:
412 |             # Append the SDK constraint to any existing tag filter
413 |             sdk_filter = f"sdk:{sdk}"
414 |             params["filter"] = (
415 |                 f"{params['filter']} {sdk_filter}" if "filter" in params else sdk_filter
416 |             )
413 |
414 | data = await make_hf_request("spaces", params)
415 |
416 | if "error" in data:
417 | return [
418 | types.TextContent(
419 | type="text", text=f"Error searching spaces: {data['error']}"
420 | )
421 | ]
422 |
423 | # Format the results
424 | results = []
425 | for space in data:
426 | space_info = {
427 | "id": space.get("id", ""),
428 | "name": space.get("spaceId", ""),
429 | "author": space.get("author", ""),
430 | "sdk": space.get("sdk", ""),
431 | "tags": space.get("tags", []),
432 | "likes": space.get("likes", 0),
433 | "lastModified": space.get("lastModified", ""),
434 | }
435 | results.append(space_info)
436 |
437 | return [types.TextContent(type="text", text=json.dumps(results, indent=2))]
438 |
439 | elif name == "get-space-info":
440 | space_id = arguments.get("space_id")
441 | if not space_id:
442 | return [types.TextContent(type="text", text="Error: space_id is required")]
443 |
444 |         data = await make_hf_request(f"spaces/{quote(space_id)}")
445 |
446 | if "error" in data:
447 | return [
448 | types.TextContent(
449 | type="text",
450 | text=f"Error retrieving space information: {data['error']}",
451 | )
452 | ]
453 |
454 | # Format the result
455 | space_info = {
456 | "id": data.get("id", ""),
457 | "name": data.get("spaceId", ""),
458 | "author": data.get("author", ""),
459 | "sdk": data.get("sdk", ""),
460 | "tags": data.get("tags", []),
461 | "likes": data.get("likes", 0),
462 | "lastModified": data.get("lastModified", ""),
463 | "description": data.get("description", "No description available"),
464 | "url": f"https://huggingface.co/spaces/{space_id}",
465 | }
466 |
467 | return [types.TextContent(type="text", text=json.dumps(space_info, indent=2))]
468 |
469 | elif name == "get-paper-info":
470 | arxiv_id = arguments.get("arxiv_id")
471 | if not arxiv_id:
472 | return [types.TextContent(type="text", text="Error: arxiv_id is required")]
473 |
474 | data = await make_hf_request(f"papers/{arxiv_id}")
475 |
476 | if "error" in data:
477 | return [
478 | types.TextContent(
479 | type="text",
480 | text=f"Error retrieving paper information: {data['error']}",
481 | )
482 | ]
483 |
484 | # Format the result
485 | paper_info = {
486 | "arxiv_id": data.get("arxivId", ""),
487 | "title": data.get("title", ""),
488 | "authors": data.get("authors", []),
489 | "summary": data.get("summary", "No summary available"),
490 | "url": f"https://huggingface.co/papers/{arxiv_id}",
491 | }
492 |
493 | # Get implementations
494 | implementations = await make_hf_request(f"arxiv/{arxiv_id}/repos")
495 | if "error" not in implementations:
496 | paper_info["implementations"] = implementations
497 |
498 | return [types.TextContent(type="text", text=json.dumps(paper_info, indent=2))]
499 |
500 | elif name == "get-daily-papers":
501 | data = await make_hf_request("daily_papers")
502 |
503 | if "error" in data:
504 | return [
505 | types.TextContent(
506 | type="text", text=f"Error retrieving daily papers: {data['error']}"
507 | )
508 | ]
509 |
510 | # Format the results
511 | results = []
512 | for paper in data:
513 | paper_info = {
514 | "arxiv_id": paper.get("paper", {}).get("arxivId", ""),
515 | "title": paper.get("paper", {}).get("title", ""),
516 | "authors": paper.get("paper", {}).get("authors", []),
517 | "summary": paper.get("paper", {}).get("summary", "")[:200] + "..."
518 | if len(paper.get("paper", {}).get("summary", "")) > 200
519 | else paper.get("paper", {}).get("summary", ""),
520 | }
521 | results.append(paper_info)
522 |
523 | return [types.TextContent(type="text", text=json.dumps(results, indent=2))]
524 |
525 | elif name == "search-collections":
526 | owner = arguments.get("owner")
527 | item = arguments.get("item")
528 | query = arguments.get("query")
529 | limit = arguments.get("limit", 10)
530 |
531 | params = {"limit": limit}
532 | if owner:
533 | params["owner"] = owner
534 | if item:
535 | params["item"] = item
536 | if query:
537 | params["q"] = query
538 |
539 | data = await make_hf_request("collections", params)
540 |
541 | if "error" in data:
542 | return [
543 | types.TextContent(
544 | type="text", text=f"Error searching collections: {data['error']}"
545 | )
546 | ]
547 |
548 | # Format the results
549 | results = []
550 | for collection in data:
551 | collection_info = {
552 | "id": collection.get("id", ""),
553 | "title": collection.get("title", ""),
554 | "owner": collection.get("owner", {}).get("name", ""),
555 | "description": collection.get(
556 | "description", "No description available"
557 | ),
558 | "items_count": collection.get("itemsCount", 0),
559 | "upvotes": collection.get("upvotes", 0),
560 | "last_modified": collection.get("lastModified", ""),
561 | }
562 | results.append(collection_info)
563 |
564 | return [types.TextContent(type="text", text=json.dumps(results, indent=2))]
565 |
566 | elif name == "get-collection-info":
567 | namespace = arguments.get("namespace")
568 | collection_id = arguments.get("collection_id")
569 |
570 | if not namespace or not collection_id:
571 | return [
572 | types.TextContent(
573 | type="text", text="Error: namespace and collection_id are required"
574 | )
575 | ]
576 |
577 |         # The API path is "collections/{namespace}/{slug}-{id}"; the
578 |         # collection_id argument may already be the full "slug-id" string,
579 |         # so pass it through unchanged instead of re-prefixing the slug
580 |         endpoint = f"collections/{namespace}/{collection_id}"
580 |
581 | data = await make_hf_request(endpoint)
582 |
583 | if "error" in data:
584 | return [
585 | types.TextContent(
586 | type="text",
587 | text=f"Error retrieving collection information: {data['error']}",
588 | )
589 | ]
590 |
591 | # Format the result
592 | collection_info = {
593 | "id": data.get("id", ""),
594 | "title": data.get("title", ""),
595 | "owner": data.get("owner", {}).get("name", ""),
596 | "description": data.get("description", "No description available"),
597 | "upvotes": data.get("upvotes", 0),
598 | "last_modified": data.get("lastModified", ""),
599 | "items": [],
600 | }
601 |
602 | # Add items
603 | for item in data.get("items", []):
604 | item_info = {
605 | "type": item.get("item", {}).get("type", ""),
606 | "id": item.get("item", {}).get("id", ""),
607 | "note": item.get("note", ""),
608 | }
609 | collection_info["items"].append(item_info)
610 |
611 | return [
612 | types.TextContent(type="text", text=json.dumps(collection_info, indent=2))
613 | ]
614 |
615 | else:
616 | return [types.TextContent(type="text", text=f"Unknown tool: {name}")]
617 |
618 |
619 | # Resource Handlers - Define popular models, datasets, and spaces as resources
620 | @server.list_resources()
621 | async def handle_list_resources() -> list[types.Resource]:
622 | """
623 | List available Hugging Face resources.
624 | This provides direct access to popular models, datasets, and spaces.
625 | """
626 | resources = []
627 |
628 | # Popular models
629 | popular_models = [
630 | (
631 | "meta-llama/Llama-3-8B-Instruct",
632 | "Llama 3 8B Instruct",
633 | "Meta's Llama 3 8B Instruct model",
634 | ),
635 | (
636 | "mistralai/Mistral-7B-Instruct-v0.2",
637 | "Mistral 7B Instruct v0.2",
638 | "Mistral AI's 7B instruction-following model",
639 | ),
640 | (
641 | "openchat/openchat-3.5-0106",
642 | "OpenChat 3.5",
643 | "Open-source chatbot based on Mistral 7B",
644 | ),
645 | (
646 | "stabilityai/stable-diffusion-xl-base-1.0",
647 | "Stable Diffusion XL 1.0",
648 | "SDXL text-to-image model",
649 | ),
650 | ]
651 |
652 | for model_id, name, description in popular_models:
653 | resources.append(
654 | types.Resource(
655 | uri=AnyUrl(f"hf://model/{model_id}"),
656 | name=name,
657 | description=description,
658 | mimeType="application/json",
659 | )
660 | )
661 |
662 | # Popular datasets
663 | popular_datasets = [
664 | (
665 | "databricks/databricks-dolly-15k",
666 | "Databricks Dolly 15k",
667 | "15k instruction-following examples",
668 | ),
669 | ("squad", "SQuAD", "Stanford Question Answering Dataset"),
670 | ("glue", "GLUE", "General Language Understanding Evaluation benchmark"),
671 | (
672 | "openai/summarize_from_feedback",
673 | "Summarize From Feedback",
674 | "OpenAI summarization dataset",
675 | ),
676 | ]
677 |
678 | for dataset_id, name, description in popular_datasets:
679 | resources.append(
680 | types.Resource(
681 | uri=AnyUrl(f"hf://dataset/{dataset_id}"),
682 | name=name,
683 | description=description,
684 | mimeType="application/json",
685 | )
686 | )
687 |
688 | # Popular spaces
689 | popular_spaces = [
690 | (
691 | "huggingface/diffusers-demo",
692 | "Diffusers Demo",
693 | "Demo of Stable Diffusion models",
694 | ),
695 | ("gradio/chatbot-demo", "Chatbot Demo", "Demo of a Gradio chatbot interface"),
696 | (
697 | "prompthero/midjourney-v4-diffusion",
698 | "Midjourney v4 Diffusion",
699 | "Replica of Midjourney v4",
700 | ),
701 | ("stabilityai/stablevicuna", "StableVicuna", "Fine-tuned Vicuna with RLHF"),
702 | ]
703 |
704 | for space_id, name, description in popular_spaces:
705 | resources.append(
706 | types.Resource(
707 | uri=AnyUrl(f"hf://space/{space_id}"),
708 | name=name,
709 | description=description,
710 | mimeType="application/json",
711 | )
712 | )
713 |
714 | return resources
715 |
716 |
717 | @server.read_resource()
718 | async def handle_read_resource(uri: AnyUrl) -> str:
719 | """
720 | Read a specific Hugging Face resource by its URI.
721 | """
722 | if uri.scheme != "hf":
723 | raise ValueError(f"Unsupported URI scheme: {uri.scheme}")
724 |
725 |     # AnyUrl parses "hf://model/{model_id}" with "model" as the URL host
726 |     # and the repo ID as the path, so recover the two pieces from those
727 |     # fields rather than by splitting the path alone.
728 |     resource_type = uri.host
729 |     resource_id = uri.path.lstrip("/") if uri.path else ""
730 | 
731 |     if not resource_type or not resource_id:
732 |         raise ValueError("Invalid Hugging Face resource URI format")
733 |
734 |     if resource_type == "model":
735 |         data = await make_hf_request(f"models/{quote(resource_id)}")
736 |     elif resource_type == "dataset":
737 |         data = await make_hf_request(f"datasets/{quote(resource_id)}")
738 |     elif resource_type == "space":
739 |         data = await make_hf_request(f"spaces/{quote(resource_id)}")
740 |     else:
741 |         raise ValueError(f"Unsupported resource type: {resource_type}")
742 |
743 | if "error" in data:
744 | raise ValueError(f"Error retrieving resource: {data['error']}")
745 |
746 | return json.dumps(data, indent=2)
747 |
748 |
749 | # Prompt Handlers
750 | @server.list_prompts()
751 | async def handle_list_prompts() -> list[types.Prompt]:
752 | """
753 | List available prompts for Hugging Face integration.
754 | """
755 | return [
756 | types.Prompt(
757 | name="compare-models",
758 | description="Compare multiple Hugging Face models",
759 | arguments=[
760 | types.PromptArgument(
761 | name="model_ids",
762 | description="Comma-separated list of model IDs to compare",
763 | required=True,
764 | )
765 | ],
766 | ),
767 | types.Prompt(
768 | name="summarize-paper",
769 | description="Summarize an AI research paper from arXiv",
770 | arguments=[
771 | types.PromptArgument(
772 | name="arxiv_id",
773 | description="arXiv ID of the paper to summarize",
774 | required=True,
775 | ),
776 | types.PromptArgument(
777 | name="detail_level",
778 | description="Level of detail in the summary (brief/detailed/eli5)",
779 | required=False,
780 | ),
781 | ],
782 | ),
783 | ]
784 |
785 |
786 | @server.get_prompt()
787 | async def handle_get_prompt(
788 | name: str, arguments: dict[str, str] | None
789 | ) -> types.GetPromptResult:
790 | """
791 | Generate a prompt related to Hugging Face resources.
792 | """
793 | if not arguments:
794 | arguments = {}
795 |
796 | if name == "compare-models":
797 | model_ids = arguments.get("model_ids", "")
798 | if not model_ids:
799 | raise ValueError("model_ids argument is required")
800 |
801 | model_list = [model_id.strip() for model_id in model_ids.split(",")]
802 | models_data = []
803 |
804 | for model_id in model_list:
805 |             data = await make_hf_request(f"models/{quote(model_id)}")
806 | if "error" not in data:
807 | models_data.append(data)
808 |
809 | model_details = []
810 | for data in models_data:
811 | details = {
812 | "id": data.get("id", ""),
813 | "author": data.get("author", ""),
814 | "downloads": data.get("downloads", 0),
815 | "tags": data.get("tags", []),
816 | "description": data.get("description", "No description available"),
817 | }
818 | model_details.append(details)
819 |
820 | return types.GetPromptResult(
821 | description=f"Comparing models: {model_ids}",
822 | messages=[
823 | types.PromptMessage(
824 | role="user",
825 | content=types.TextContent(
826 | type="text",
827 | text="I'd like you to compare these Hugging Face models and help me understand their differences, strengths, and suitable use cases:\n\n"
828 | + json.dumps(model_details, indent=2)
829 | + "\n\nPlease structure your comparison with sections on architecture, performance, use cases, and limitations.",
830 | ),
831 | )
832 | ],
833 | )
834 |
835 | elif name == "summarize-paper":
836 | arxiv_id = arguments.get("arxiv_id", "")
837 | if not arxiv_id:
838 | raise ValueError("arxiv_id argument is required")
839 |
840 | detail_level = arguments.get("detail_level", "detailed")
841 |
842 | paper_data = await make_hf_request(f"papers/{arxiv_id}")
843 | if "error" in paper_data:
844 | raise ValueError(f"Error retrieving paper: {paper_data['error']}")
845 |
846 | # Get implementations
847 | implementations = await make_hf_request(f"arxiv/{arxiv_id}/repos")
848 |
849 | return types.GetPromptResult(
850 | description=f"Summarizing paper: {paper_data.get('title', arxiv_id)}",
851 | messages=[
852 | types.PromptMessage(
853 | role="user",
854 | content=types.TextContent(
855 | type="text",
856 | text=f"Please provide a {'detailed' if detail_level == 'detailed' else 'brief'} summary of this AI research paper:\n\n"
857 | + f"Title: {paper_data.get('title', 'Unknown')}\n"
858 |                     # authors may be plain strings or {"name": ...} dicts
859 |                     + "Authors: "
860 |                     + ", ".join(
861 |                         a.get("name", "") if isinstance(a, dict) else str(a)
862 |                         for a in paper_data.get("authors", [])
863 |                     )
864 |                     + "\n"
859 | + f"Abstract: {paper_data.get('summary', 'No abstract available')}\n\n"
860 | + (
861 | f"Implementations on Hugging Face: {json.dumps(implementations, indent=2)}\n\n"
862 | if "error" not in implementations
863 | else ""
864 | )
865 | + f"Please {'cover all key aspects including methodology, results, and implications' if detail_level == 'detailed' else 'provide a concise overview of the main contributions'}.",
866 | ),
867 | )
868 | ],
869 | )
870 |
871 | else:
872 | raise ValueError(f"Unknown prompt: {name}")
873 |
874 |
875 | async def main():
876 | # Run the server using stdin/stdout streams
877 | async with mcp.server.stdio.stdio_server() as (read_stream, write_stream):
878 | await server.run(
879 | read_stream,
880 | write_stream,
881 | InitializationOptions(
882 | server_name="huggingface",
883 | server_version="0.1.0",
884 | capabilities=server.get_capabilities(
885 | notification_options=NotificationOptions(),
886 | experimental_capabilities={},
887 | ),
888 | ),
889 | )
890 |
891 |
892 | if __name__ == "__main__":
893 | asyncio.run(main())
894 |
```