zzaebok/mcp-wikidata # codebase.md

# Directory Structure

```
├── .gitignore
├── .python-version
├── Dockerfile
├── LICENSE
├── pyproject.toml
├── README.md
├── smithery.yaml
├── src
│   ├── client.py
│   └── server.py
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------

```
1 | 3.11
2 | 
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
 1 | # Python-generated files
 2 | __pycache__/
 3 | *.py[oc]
 4 | build/
 5 | dist/
 6 | wheels/
 7 | *.egg-info
 8 | 
 9 | # Virtual environments
10 | .venv
11 | 
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
  1 | ## Wikidata MCP Server
  2 | 
  3 | [![smithery badge](https://smithery.ai/badge/@zzaebok/mcp-wikidata)](https://smithery.ai/server/@zzaebok/mcp-wikidata)
  4 | 
  5 | A server implementation for Wikidata API using the Model Context Protocol (MCP).
  6 | This project provides tools to interact with Wikidata, such as **searching identifiers** (entity and property), **extracting metadata** (label and description), and **executing SPARQL queries**.
  7 | 
  8 | ---
  9 | 
 10 | ### Installation
 11 | 
 12 | #### Installing via Smithery
 13 | 
 14 | To install Wikidata MCP Server for Claude Desktop automatically via [Smithery](https://smithery.ai/server/@zzaebok/mcp-wikidata):
 15 | 
 16 | ```bash
 17 | npx -y @smithery/cli install @zzaebok/mcp-wikidata --client claude
 18 | ```
 19 | 
 20 | #### Installing Manually
 21 | Install `uv` if it is not installed yet.
 22 | 
 23 | ```bash
 24 | $ curl -LsSf https://astral.sh/uv/install.sh | sh
 25 | ```
 26 | 
 27 | Then, install dependencies.
 28 | 
 29 | ```bash
 30 | $ git clone https://github.com/zzaebok/mcp-wikidata.git
 31 | $ cd mcp-wikidata
 32 | $ uv sync
 33 | # if you also want to run the client example
 34 | $ uv sync --extra example
 35 | ```
 36 | 
 37 | ---
 38 | 
 39 | ### Run
 40 | 
 41 | Run the server with:
 42 | 
 43 | ```bash
 44 | $ uv run src/server.py
 45 | ```
 46 | 
 47 | If you want to test it with a simple client (using `langchain-mcp-adapters`), run the client with:
 48 | 
 49 | ```bash
 50 | # in another shell
 51 | $ uv run src/client.py
 52 | ```
 53 | 
 54 | The LLM extracts valid entity and property identifiers, executes a SPARQL query, and finally recommends a movie directed by Bong Joon-ho.
 55 | 
 56 | <details>
 57 |   <summary><i>See the execution output</i></summary>
 58 | 
 59 | ```python
 60 | {
 61 |   "messages": [
 62 |       HumanMessage(
 63 |           content="Can you recommend me a movie directed by Bong Joonho?",
 64 |       ),
 65 |       AIMessage(
 66 |           tool_calls=[
 67 |               {
 68 |                   "name": "search_entity",
 69 |                   "args": {"query": "Bong Joon-ho"},
 70 |               }
 71 |           ],
 72 |       ),
 73 |       ToolMessage(
 74 |           content="Q495980",
 75 |           name="search_entity",
 76 |       ),
 77 |       AIMessage(
 78 |           tool_calls=[
 79 |               {
 80 |                   "name": "get_properties",
 81 |                   "args": {"entity_id": "Q495980"},
 82 |               }
 83 |           ],
 84 |       ),
 85 |       ToolMessage(
 86 |           content='["P345", "P244", "P214", "P227", ...]',
 87 |           name="get_properties",
 88 |       ),
 89 |       AIMessage(
 90 |           tool_calls=[
 91 |               {
 92 |                   "name": "search_property",
 93 |                   "args": {"query": "director"},
 94 |               }
 95 |           ],
 96 |       ),
 97 |       ToolMessage(
 98 |           content="P57",
 99 |           name="search_property",
100 |       ),
101 |       AIMessage(
102 |           tool_calls=[
103 |               {
104 |                   "name": "execute_sparql",
105 |                   "args": {
106 |                       "sparql_query": 'SELECT ?film ?filmLabel WHERE {\n  ?film wdt:P57 wd:Q495980.\n  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n} LIMIT 1'
107 |                   },
108 |               }
109 |           ],
110 |       ),
111 |       ToolMessage(
112 |           content='[{"film": {"type": "uri", "value": "http://www.wikidata.org/entity/Q483761"}, "filmLabel": {"xml:lang": "en", "type": "literal", "value": "Mother"}}]',
113 |           name="execute_sparql",
114 |       ),
115 |       AIMessage(
116 |           content='I recommend the movie "Mother," which was directed by Bong Joon-ho.',
117 |       ),
118 |   ]
119 | }
120 | ```
121 | 
122 | </details>
123 | 
124 | ---
125 | 
126 | ### Wikidata MCP Tools
127 | 
128 | The following tools are implemented in the server:
129 | 
130 | | Tool                                                 | Description                                                                |
131 | | ---------------------------------------------------- | -------------------------------------------------------------------------- |
132 | | `search_entity(query: str)`                          | Search for a Wikidata entity ID by its query.                              |
133 | | `search_property(query: str)`                        | Search for a Wikidata property ID by its query.                            |
134 | | `get_properties(entity_id: str)`                     | Get the properties associated with a given Wikidata entity ID.             |
135 | | `execute_sparql(sparql_query: str)`                  | Execute a SPARQL query on Wikidata.                                        |
136 | | `get_metadata(entity_id: str, language: str = "en")` | Retrieve the label and description for a given Wikidata entity ID.         |
137 | 
138 | ---
139 | 
140 | #### License
141 | 
142 | MIT License
143 | 
```
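The execution output in the README shows the SPARQL string the LLM sent to `execute_sparql` and the raw JSON bindings it got back. The sketch below rebuilds that same query from an entity ID and a property ID, and flattens the bindings into plain `{variable: value}` dictionaries. The helper names `build_films_by_director_query` and `flatten_bindings` are hypothetical and not part of the repository; the sample payload is copied from the README output.

```python
import json
from typing import Dict, List


def build_films_by_director_query(
    director_id: str, property_id: str = "P57", limit: int = 1
) -> str:
    """Build the query shape shown in the README (hypothetical helper)."""
    return (
        "SELECT ?film ?filmLabel WHERE {\n"
        f"  ?film wdt:{property_id} wd:{director_id}.\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        f"}} LIMIT {limit}"
    )


def flatten_bindings(raw: str) -> List[Dict[str, str]]:
    """Reduce each SPARQL binding row to {variable: value} (hypothetical helper)."""
    rows = json.loads(raw)
    return [{var: cell["value"] for var, cell in row.items()} for row in rows]


if __name__ == "__main__":
    # Sample tool output copied from the README's execution log.
    sample = (
        '[{"film": {"type": "uri", "value": "http://www.wikidata.org/entity/Q483761"}, '
        '"filmLabel": {"xml:lang": "en", "type": "literal", "value": "Mother"}}]'
    )
    print(build_films_by_director_query("Q495980"))
    print(flatten_bindings(sample))
```

Flattening drops the `type` and `xml:lang` fields, which is usually fine for display but loses information if a caller needs to tell URIs from literals.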

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
 1 | [project]
 2 | name = "mcp-wikidata"
 3 | version = "0.1.0"
 4 | description = "MCP Wikidata Server"
 5 | readme = "README.md"
 6 | requires-python = ">=3.11"
 7 | dependencies = [
 8 |     "mcp[cli]>=1.4.1",
 9 |     "httpx>=0.28.1",
10 | ]
11 | authors = [{ name = "Jaebok Lee" }]
12 | 
13 | [project.optional-dependencies]
14 | example = [
15 |     "black>=25.1.0",
16 |     "langchain-openai>=0.3.11",
17 |     "langgraph>=0.3.21",
18 |     "langchain-core>=0.3.49",
19 |     "langchain-mcp-adapters>=0.0.6",
20 | ]
21 | 
```

--------------------------------------------------------------------------------
/smithery.yaml:
--------------------------------------------------------------------------------

```yaml
 1 | # Smithery configuration file: https://smithery.ai/docs/config#smitheryyaml
 2 | 
 3 | startCommand:
 4 |   type: stdio
 5 |   configSchema:
 6 |     # JSON Schema defining the configuration options for the MCP.
 7 |     type: object
 8 |     properties: {}
 9 |     default: {}
10 |   commandFunction:
11 |     # A JS function that produces the CLI command based on the given config to start the MCP on stdio.
12 |     |-
13 |     (config) => ({
14 |       command: 'uv',
15 |       args: ['run', 'src/server.py'],
16 |       env: {}
17 |     })
18 |   exampleConfig: {}
19 | 
```

--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------

```dockerfile
 1 | # Generated by https://smithery.ai. See: https://smithery.ai/docs/config#dockerfile
 2 | FROM python:3.11-slim
 3 | 
 4 | # Install curl for installation of uv
 5 | RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
 6 | 
 7 | # Install uv - see https://astral.sh/uv
 8 | RUN curl -LsSf https://astral.sh/uv/install.sh | sh
 9 | 
10 | # Add local binary directory to PATH
11 | ENV PATH="/root/.local/bin:$PATH"
12 | 
13 | # Set working directory
14 | WORKDIR /app
15 | 
16 | # Copy requirements and project files
17 | COPY pyproject.toml ./
18 | COPY README.md ./
19 | COPY src/ ./src/
20 | COPY uv.lock ./
21 | 
22 | # Install project dependencies using pip
23 | RUN pip install --upgrade pip && \
24 |     pip install . --no-cache-dir
25 | 
26 | # Command to run the MCP server
27 | CMD ["uv", "run", "src/server.py"]
28 | 
```

--------------------------------------------------------------------------------
/src/client.py:
--------------------------------------------------------------------------------

```python
 1 | import os
 2 | 
 3 | from mcp import ClientSession, StdioServerParameters
 4 | from mcp.client.stdio import stdio_client
 5 | from langchain_mcp_adapters.tools import load_mcp_tools
 6 | from langgraph.prebuilt import create_react_agent
 7 | from langchain_openai import ChatOpenAI
 8 | 
 9 | os.environ["OPENAI_API_KEY"] = "your-api-key"
10 | 
11 | model = ChatOpenAI(model="gpt-4o")
12 | 
13 | server_py = os.path.join(os.path.dirname(os.path.abspath(__file__)), "server.py")
14 | server_params = StdioServerParameters(
15 |     command="python",
16 |     args=[server_py],
17 | )
18 | 
19 | 
20 | async def main():
21 |     async with stdio_client(server_params) as (read, write):
22 |         async with ClientSession(read, write) as session:
23 |             # Initialize the connection
24 |             await session.initialize()
25 | 
26 |             # Get tools
27 |             tools = await load_mcp_tools(session)
28 | 
29 |             # Create and run the agent
30 |             agent = create_react_agent(
31 |                 model,
32 |                 tools,
33 |                 prompt="You are a helpful assistant. Answer the user's questions based on Wikidata.",
34 |             )
35 |             agent_response = await agent.ainvoke(
36 |                 {
37 |                     "messages": "Can you recommend me a movie directed by Bong Joonho?",
38 |                 }
39 |             )
40 |             print(agent_response)
41 | 
42 | 
43 | if __name__ == "__main__":
44 |     import asyncio
45 | 
46 |     asyncio.run(main())
47 | 
```
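The client prints the whole `agent_response`, which, judging by the execution output in the README, is a dictionary whose `"messages"` list ends with the model's final answer. A minimal sketch for pulling out just that answer follows. The helper `final_answer` is hypothetical, and it assumes only that each message exposes a `.content` attribute; the demo uses `SimpleNamespace` stand-ins rather than real LangChain message objects.

```python
from types import SimpleNamespace
from typing import Any, Dict


def final_answer(agent_response: Dict[str, Any]) -> str:
    """Return the text of the last message, or "" if there are none."""
    messages = agent_response.get("messages", [])
    if not messages:
        return ""
    # The last message is assumed to be the model's final reply.
    return str(messages[-1].content)


if __name__ == "__main__":
    # Stand-in objects; a real response would hold LangChain message instances.
    demo = {
        "messages": [
            SimpleNamespace(content="Can you recommend me a movie directed by Bong Joonho?"),
            SimpleNamespace(content='I recommend the movie "Mother," which was directed by Bong Joon-ho.'),
        ]
    }
    print(final_answer(demo))
```

In `main()`, this would replace `print(agent_response)` with `print(final_answer(agent_response))`.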

--------------------------------------------------------------------------------
/src/server.py:
--------------------------------------------------------------------------------

```python
  1 | # reference: https://github.com/langchain-ai/langchain/blob/master/cookbook/wikibase_agent.ipynb
  2 | import httpx
  3 | import json
  4 | from mcp.server.fastmcp import FastMCP
  5 | from typing import List, Dict
  6 | 
  7 | server = FastMCP("Wikidata MCP Server")
  8 | 
  9 | WIKIDATA_URL = "https://www.wikidata.org/w/api.php"
 10 | HEADER = {"Accept": "application/json", "User-Agent": "foobar"}
 11 | 
 12 | 
 13 | async def search_wikidata(query: str, is_entity: bool = True) -> str:
 14 |     """
 15 |     Search for a Wikidata item or property ID by its query.
 16 |     """
 17 |     params = {
 18 |         "action": "query",
 19 |         "list": "search",
 20 |         "srsearch": query,
 21 |         "srnamespace": 0 if is_entity else 120,
 22 |         "srlimit": 1,  # TODO: add a parameter to limit the number of results?
 23 |         "srqiprofile": "classic_noboostlinks" if is_entity else "classic",
 24 |         "srwhat": "text",
 25 |         "format": "json",
 26 |     }
 27 |     async with httpx.AsyncClient() as client:
 28 |         response = await client.get(WIKIDATA_URL, headers=HEADER, params=params)
 29 |     response.raise_for_status()
 30 |     try:
 31 |         title = response.json()["query"]["search"][0]["title"]
 32 |         title = title.split(":")[-1]
 33 |         return title
 34 |     except (KeyError, IndexError):
 35 |         return "No results found. Consider changing the search term."
 36 | 
 37 | 
 38 | @server.tool()
 39 | async def search_entity(query: str) -> str:
 40 |     """
 41 |     Search for a Wikidata entity ID by its query.
 42 | 
 43 |     Args:
 44 |         query (str): The query to search for. The query should be unambiguous enough to uniquely identify the entity.
 45 | 
 46 |     Returns:
 47 |         str: The Wikidata entity ID corresponding to the given query.
 48 |     """
 49 |     return await search_wikidata(query, is_entity=True)
 50 | 
 51 | 
 52 | @server.tool()
 53 | async def search_property(query: str) -> str:
 54 |     """
 55 |     Search for a Wikidata property ID by its query.
 56 | 
 57 |     Args:
 58 |         query (str): The query to search for. The query should be unambiguous enough to uniquely identify the property.
 59 | 
 60 |     Returns:
 61 |         str: The Wikidata property ID corresponding to the given query.
 62 |     """
 63 |     return await search_wikidata(query, is_entity=False)
 64 | 
 65 | 
 66 | @server.tool()
 67 | async def get_properties(entity_id: str) -> List[str]:
 68 |     """
 69 |     Get the properties associated with a given Wikidata entity ID.
 70 | 
 71 |     Args:
 72 |         entity_id (str): The entity ID to retrieve properties for. This should be a valid Wikidata entity ID.
 73 | 
 74 |     Returns:
 75 |         list: A list of property IDs associated with the given entity ID. If no properties are found, an empty list is returned.
 76 |     """
 77 |     params = {
 78 |         "action": "wbgetentities",
 79 |         "ids": entity_id,
 80 |         "props": "claims",
 81 |         "format": "json",
 82 |     }
 83 |     async with httpx.AsyncClient() as client:
 84 |         response = await client.get(WIKIDATA_URL, headers=HEADER, params=params)
 85 |     response.raise_for_status()
 86 |     data = response.json()
 87 |     return list(data.get("entities", {}).get(entity_id, {}).get("claims", {}).keys())
 88 | 
 89 | 
 90 | @server.tool()
 91 | async def execute_sparql(sparql_query: str) -> str:
 92 |     """
 93 |     Execute a SPARQL query on Wikidata.
 94 | 
 95 |     You may assume the following prefixes:
 96 |     PREFIX wd: <http://www.wikidata.org/entity/>
 97 |     PREFIX wdt: <http://www.wikidata.org/prop/direct/>
 98 |     PREFIX p: <http://www.wikidata.org/prop/>
 99 |     PREFIX ps: <http://www.wikidata.org/prop/statement/>
100 | 
101 |     Args:
102 |         sparql_query (str): The SPARQL query to execute.
103 | 
104 |     Returns:
105 |         str: The JSON-formatted result of the SPARQL query execution. If there are no results, an empty JSON array is returned.
106 |     """
107 |     url = "https://query.wikidata.org/sparql"
108 |     async with httpx.AsyncClient() as client:
109 |         response = await client.get(
110 |             url, headers=HEADER, params={"query": sparql_query, "format": "json"}
111 |         )
112 |     response.raise_for_status()
113 |     result = response.json()["results"]["bindings"]
114 |     return json.dumps(result)
115 | 
116 | 
117 | @server.tool()
118 | async def get_metadata(entity_id: str, language: str = "en") -> Dict[str, str]:
119 |     """
120 |     Retrieve the label and description for a given Wikidata entity ID in the requested language.
121 | 
122 |     Args:
123 |         entity_id (str): The entity ID to retrieve metadata for.
124 |         language (str): The language code for the label and description (default is "en"). Use ISO 639-1 codes.
125 | 
126 |     Returns:
127 |         dict: A dictionary containing the label and description of the entity, if available.
128 |     """
129 |     params = {
130 |         "action": "wbgetentities",
131 |         "ids": entity_id,
132 |         "props": "labels|descriptions",
133 |         "languages": language,  # specify the desired language
134 |         "format": "json",
135 |     }
136 |     async with httpx.AsyncClient() as client:
137 |         response = await client.get(WIKIDATA_URL, headers=HEADER, params=params)
138 |     response.raise_for_status()
139 |     data = response.json()
140 |     entity_data = data.get("entities", {}).get(entity_id, {})
141 |     label = (
142 |         entity_data.get("labels", {}).get(language, {}).get("value", "No label found")
143 |     )
144 |     descriptions = (
145 |         entity_data.get("descriptions", {})
146 |         .get(language, {})
147 |         .get("value", "No description found")
148 |     )
149 |     return {"Label": label, "Descriptions": descriptions}
150 | 
151 | 
152 | if __name__ == "__main__":
153 |     server.run()
154 | 
```
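`search_wikidata` reads the first hit from `response.json()["query"]["search"]` and strips any namespace prefix from its title. When a search has no hits, that list is empty, so indexing `[0]` raises `IndexError` rather than `KeyError`. The sketch below isolates the parsing step as a pure function so both failure modes can be checked without a network call. The name `first_hit_id` and the sample payloads are hypothetical; they contain only the keys the server code reads.

```python
from typing import Any, Dict

NO_RESULT = "No results found. Consider changing the search term."


def first_hit_id(payload: Dict[str, Any]) -> str:
    """Return the ID of the first search hit, or a fallback message."""
    try:
        title = payload["query"]["search"][0]["title"]
    except (KeyError, IndexError):
        # KeyError: "query" or "search" is missing; IndexError: the hit list is empty.
        return NO_RESULT
    # Property titles are namespaced ("Property:P57"); entity titles are bare ("Q495980").
    return title.split(":")[-1]


if __name__ == "__main__":
    print(first_hit_id({"query": {"search": [{"title": "Property:P57"}]}}))
    print(first_hit_id({"query": {"search": [{"title": "Q495980"}]}}))
    print(first_hit_id({"query": {"search": []}}))
```

Keeping the parsing separate from the HTTP request also makes it easier to return several hits later, as the `srlimit` TODO in the server suggests.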