# Directory Structure
```
├── .github
│   └── workflows
│       └── python-package.yml
├── .gitignore
├── .python-version
├── Dockerfile
├── LICENSE
├── Makefile
├── pyproject.toml
├── README.md
├── src
│   └── zotero_mcp
│       ├── __init__.py
│       ├── cli.py
│       └── client.py
├── tests
│   ├── __init__.py
│   ├── conftest.py
│   ├── test_client.py
│   ├── test_item_operations.py
│   └── test_search.py
└── uv.lock
```
# Files
--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------
```
3.13
```
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
```
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# UV
# Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
#uv.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
# Ruff stuff:
.ruff_cache/
# PyPI configuration file
.pypirc
```
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
```markdown
# Model Context Protocol server for Zotero
[GitHub Actions](https://github.com/kujenga/zotero-mcp/actions)
[PyPI](https://pypi.org/project/zotero-mcp/)
This project is a Python server that implements the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/introduction) for [Zotero](https://www.zotero.org/), giving you access to your Zotero library within AI assistants. It is intended to implement a small but maximally useful set of interactions with Zotero for use with [MCP clients](https://modelcontextprotocol.io/clients).
<a href="https://glama.ai/mcp/servers/jknz38ntu4">
<img width="380" height="200" src="https://glama.ai/mcp/servers/jknz38ntu4/badge" alt="Zotero Server MCP server" />
</a>
## Features
This MCP server provides the following tools:
- `zotero_search_items`: Search for items in your Zotero library using a text query
- `zotero_item_metadata`: Get detailed metadata information about a specific Zotero item
- `zotero_item_fulltext`: Get the full text of a specific Zotero item (i.e. PDF contents)
These can be discovered and accessed through any MCP client or through the [MCP Inspector](https://modelcontextprotocol.io/docs/tools/inspector).
Each tool returns formatted text containing relevant information from your Zotero items, and AI assistants such as Claude can use them sequentially, searching for items then retrieving their metadata or text content.
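As a rough illustration of that sequential flow, the sketch below pulls item keys out of search output so they can be passed on to `zotero_item_metadata` or `zotero_item_fulltext`. It assumes the `**Key**: ...` formatting that the search tool emits for each result; the helper name `extract_item_keys` and the sample text are hypothetical and not part of this package.

```python
import re

# Hypothetical helper, not part of zotero-mcp: collect item keys from the
# Markdown text returned by the zotero_search_items tool.
KEY_PATTERN = re.compile(r"\*\*Key\*\*: `([^`]+)`")


def extract_item_keys(search_output: str) -> list[str]:
    """Return item keys in order of appearance, without duplicates."""
    seen: dict[str, None] = {}
    for key in KEY_PATTERN.findall(search_output):
        seen.setdefault(key, None)
    return list(seen)


# Sample text shaped like the search tool's output (made up for illustration).
sample = (
    "## 1. Test Article\n"
    "**Type**: journalArticle | **Date**: 2024 | **Key**: `ABCD1234`\n"
    "## 2. Some note\n"
    "**Type**: Note | **Key**: `WXYZ5678`\n"
)
print(extract_item_keys(sample))
```

Each extracted key could then be supplied as the `item_key` argument of the metadata or full-text tool.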
## Installation
This server can run against either a [local API offered by the Zotero desktop application](https://groups.google.com/g/zotero-dev/c/ElvHhIFAXrY/m/fA7SKKwsAgAJ) or the [Zotero Web API](https://www.zotero.org/support/dev/web_api/v3/start). The local API can be a bit more responsive, but requires that the Zotero app be running on the same computer with the API enabled. To enable the local API, follow these steps:
1. Open Zotero and open "Zotero Settings"
1. Under the "Advanced" tab, check the box that says "Allow other applications on this computer to communicate with Zotero".
> [!IMPORTANT]
> For access to the `/fulltext` endpoint on the local API which allows retrieving the full content of items in your library, you'll need to install a [Zotero Beta Build](https://www.zotero.org/support/beta_builds) (as of 2025-03-30). Once 7.1 is released this will no longer be the case. See https://github.com/zotero/zotero/pull/5004 for more information. If you do not want to do this, use the Web API instead.
To use the Zotero Web API, you'll need to create an API key and find your Library ID (usually your User ID) in your Zotero account settings here: <https://www.zotero.org/settings/keys>
These are the available configuration options:
- `ZOTERO_LOCAL=true`: Use the local Zotero API (default: false, see note below)
- `ZOTERO_API_KEY`: Your Zotero API key (not required for the local API)
- `ZOTERO_LIBRARY_ID`: Your Zotero library ID (your user ID for user libraries, not required for the local API)
- `ZOTERO_LIBRARY_TYPE`: The type of library (user or group, default: user)
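How these options combine can be sketched as follows. This is a simplified, standalone illustration rather than the server's actual code path: `resolve_settings` is a hypothetical name, and the fallback library ID of `"0"` for the local API and the accepted truthy values are taken from `client.py` in this repository.

```python
def resolve_settings(env: dict[str, str]) -> dict[str, object]:
    """Hypothetical sketch of how the configuration options interact."""
    local = env.get("ZOTERO_LOCAL", "").lower() in ("true", "yes", "1")
    library_id = env.get("ZOTERO_LIBRARY_ID") or None
    api_key = env.get("ZOTERO_API_KEY") or None
    library_type = env.get("ZOTERO_LIBRARY_TYPE", "user")

    if local:
        # The local API needs no key; "0" stands for the current user.
        library_id = library_id or "0"
    elif not (library_id and api_key):
        raise ValueError(
            "ZOTERO_LIBRARY_ID and ZOTERO_API_KEY are required for the Web API"
        )

    return {
        "library_id": library_id,
        "library_type": library_type,
        "api_key": api_key,
        "local": local,
    }


print(resolve_settings({"ZOTERO_LOCAL": "true"}))
print(resolve_settings({"ZOTERO_API_KEY": "key", "ZOTERO_LIBRARY_ID": "1234567"}))
```

Passing a plain dictionary instead of reading `os.environ` directly keeps the sketch easy to try out with different combinations.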
### [`uvx`](https://docs.astral.sh/uv/getting-started/installation/) with Local Zotero API
To use this with Claude Desktop and a direct python install with [`uvx`](https://docs.astral.sh/uv/getting-started/installation/), add the following to the `mcpServers` configuration:
```json
{
  "mcpServers": {
    "zotero": {
      "command": "uvx",
      "args": ["--upgrade", "zotero-mcp"],
      "env": {
        "ZOTERO_LOCAL": "true",
        "ZOTERO_API_KEY": "",
        "ZOTERO_LIBRARY_ID": ""
      }
    }
  }
}
```
The `--upgrade` flag is optional and will pull the latest version when new ones are available. If you don't have `uvx` installed you can use `pipx run` instead, or clone this repository locally and use the instructions in [Development](#development) below.
### Docker with Zotero Web API
If you want to run this MCP server in a Docker container, you can use the following configuration, inserting your API key and library ID:
```json
{
  "mcpServers": {
    "zotero": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "-e", "ZOTERO_API_KEY=PLACEHOLDER",
        "-e", "ZOTERO_LIBRARY_ID=PLACEHOLDER",
        "ghcr.io/kujenga/zotero-mcp:main"
      ]
    }
  }
}
```
To update to a newer version, run `docker pull ghcr.io/kujenga/zotero-mcp:main`. It is also possible to use the docker-based installation to talk to the local Zotero API, but you'll need to modify the above command to ensure that there is network connectivity to the Zotero application's local API interface.
## Development
Information on making changes and contributing to the project.
1. Clone this repository
1. Install dependencies with [uv](https://docs.astral.sh/uv/) by running: `uv sync`
1. Create a `.env` file in the project root with the environment variables above
Start the [MCP Inspector](https://modelcontextprotocol.io/docs/tools/inspector) for local development:
```bash
npx @modelcontextprotocol/inspector uv run zotero-mcp
```
To test the local repository against Claude Desktop, run `echo $PWD/.venv/bin/zotero-mcp` in your shell within this directory, then set the following within your Claude Desktop configuration:
```json
{
  "mcpServers": {
    "zotero": {
      "command": "/path/to/zotero-mcp/.venv/bin/zotero-mcp",
      "env": {
        // Whatever configuration is desired.
      }
    }
  }
}
```
### Running Tests
To run the test suite:
```bash
uv run pytest
```
### Docker Development
Build the container image with this command:
```sh
docker build . -t zotero-mcp:local
```
To test the container with the MCP inspector, run the following command:
```sh
npx @modelcontextprotocol/inspector \
-e ZOTERO_API_KEY=$ZOTERO_API_KEY \
-e ZOTERO_LIBRARY_ID=$ZOTERO_LIBRARY_ID \
docker run --rm -i \
--env ZOTERO_API_KEY \
--env ZOTERO_LIBRARY_ID \
zotero-mcp:local
```
## Relevant Documentation
- https://modelcontextprotocol.io/tutorials/building-mcp-with-llms
- https://github.com/modelcontextprotocol/python-sdk
- https://pyzotero.readthedocs.io/en/latest/
- https://www.zotero.org/support/dev/web_api/v3/start
- https://modelcontextprotocol.io/llms-full.txt can be utilized by LLMs
```
--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------
```python
"""Test suite for zotero-mcp"""
```
--------------------------------------------------------------------------------
/src/zotero_mcp/cli.py:
--------------------------------------------------------------------------------
```python
import argparse

from zotero_mcp import mcp


def main():
    parser = argparse.ArgumentParser(description="Zotero Model Context Protocol Server")
    parser.add_argument(
        "--transport",
        choices=["stdio", "sse"],
        default="stdio",
        help="Transport to use",
    )
    args = parser.parse_args()
    mcp.run(args.transport)


if __name__ == "__main__":
    main()
```
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
```dockerfile
FROM python:3.13-slim-bookworm
# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/
# Install application
ADD README.md LICENSE pyproject.toml uv.lock src /app/
WORKDIR /app
ENV UV_FROZEN=true
RUN uv sync
# Check basic functionality
RUN uv run zotero-mcp --help
LABEL org.opencontainers.image.title="zotero-mcp"
LABEL org.opencontainers.image.description="Model Context Protocol Server for Zotero"
LABEL org.opencontainers.image.url="https://github.com/kujenga/zotero-mcp"
LABEL org.opencontainers.image.source="https://github.com/kujenga/zotero-mcp"
LABEL org.opencontainers.image.licenses="MIT"
# Command to run the server
ENTRYPOINT ["uv", "run", "--quiet", "zotero-mcp"]
```
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
```toml
[project]
name = "zotero-mcp"
version = "0.1.6"
description = "Model Context Protocol server for Zotero"
authors = [{ name = "Aaron Taylor", email = "[email protected]" }]
readme = "README.md"
license = { file = "LICENSE" }
requires-python = ">=3.11"
keywords = ["mcp", "zotero"]
classifiers = [
"Intended Audience :: Developers",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Programming Language :: Python :: 3 :: Only",
"Operating System :: OS Independent",
]
dependencies = [
"mcp[cli]>=1.2.1",
"pydantic>=2.10.6",
"python-dotenv>=1.0.1",
"pyzotero>=1.6.8",
]
[project.scripts]
zotero-mcp = "zotero_mcp.cli:main"
[project.urls]
Repository = "https://github.com/kujenga/zotero-mcp"
Issues = "https://github.com/kujenga/zotero-mcp/issues"
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[dependency-groups]
dev = ["pytest>=8.3.4", "ruff>=0.9.4"]
```
--------------------------------------------------------------------------------
/tests/test_search.py:
--------------------------------------------------------------------------------
```python
"""Tests for search functionality"""

from typing import Any

from zotero_mcp import search_items


def test_search_items_basic(mock_zotero: Any, sample_item: dict[str, Any]) -> None:
    """Test basic search functionality"""
    mock_zotero.items.return_value = [sample_item]
    result = search_items("test")
    assert "Test Article" in result
    assert "**Key**: `ABCD1234`" in result
    assert "**Authors**: Doe, John; Smith, Jane" in result
    assert "This is a test abstract" in result
    # Verify search parameters
    mock_zotero.add_parameters.assert_called_once_with(
        q="test", qmode="titleCreatorYear", limit=10
    )


def test_search_items_no_results(mock_zotero: Any) -> None:
    """Test search with no results"""
    mock_zotero.items.return_value = []
    result = search_items("nonexistent")
    assert "No items found" in result


def test_search_items_custom_params(
    mock_zotero: Any, sample_item: dict[str, Any]
) -> None:
    """Test search with custom parameters"""
    mock_zotero.items.return_value = [sample_item]
    search_items("test", qmode="everything", limit=5)
    mock_zotero.add_parameters.assert_called_once_with(
        q="test", qmode="everything", limit=5
    )
```
--------------------------------------------------------------------------------
/tests/conftest.py:
--------------------------------------------------------------------------------
```python
"""Pytest fixtures for zotero-mcp tests"""

from typing import Any
from unittest.mock import MagicMock

import pytest
from pyzotero import zotero


@pytest.fixture
def mock_zotero(monkeypatch) -> MagicMock:
    """Fixture that returns a mocked Zotero client"""
    mock = MagicMock(spec=zotero.Zotero)

    def mock_get_zotero_client():
        return mock

    monkeypatch.setattr("zotero_mcp.get_zotero_client", mock_get_zotero_client)
    return mock


@pytest.fixture
def sample_item() -> dict[str, Any]:
    """Fixture that returns a sample Zotero item"""
    return {
        "key": "ABCD1234",
        "data": {
            "key": "ABCD1234",
            "itemType": "journalArticle",
            "title": "Test Article",
            "date": "2024",
            "creators": [
                {"firstName": "John", "lastName": "Doe"},
                {"firstName": "Jane", "lastName": "Smith"},
            ],
            "abstractNote": "This is a test abstract",
            "tags": [{"tag": "test"}, {"tag": "article"}],
            "url": "https://example.com",
            "DOI": "10.1234/test",
        },
        "meta": {"numChildren": 2},
    }


@pytest.fixture
def sample_attachment() -> dict[str, Any]:
    """Fixture that returns a sample Zotero attachment item"""
    return {
        "key": "XYZ789",
        "data": {
            "key": "XYZ789",
            "itemType": "attachment",
            "contentType": "application/pdf",
            "md5": "123456789",
        },
    }
```
--------------------------------------------------------------------------------
/tests/test_item_operations.py:
--------------------------------------------------------------------------------
```python
"""Tests for item metadata and fulltext operations"""

from typing import Any

from zotero_mcp import get_item_metadata, get_item_fulltext


def test_get_item_metadata(mock_zotero: Any, sample_item: dict[str, Any]) -> None:
    """Test retrieving item metadata"""
    mock_zotero.item.return_value = sample_item
    result = get_item_metadata("ABCD1234")
    assert "## Test Article" in result
    assert "Item Key: `ABCD1234`" in result
    assert "Type: journalArticle" in result
    assert "Date: 2024" in result
    assert "Doe, John; Smith, Jane" in result
    assert "### Abstract" in result
    assert "This is a test abstract" in result
    assert "### Tags" in result
    assert "`test`" in result and "`article`" in result
    assert "URL: https://example.com" in result
    assert "DOI: 10.1234/test" in result
    assert "Number of notes/attachments: 2" in result


def test_get_item_metadata_not_found(mock_zotero: Any) -> None:
    """Test retrieving metadata for nonexistent item"""
    mock_zotero.item.return_value = None
    result = get_item_metadata("NONEXISTENT")
    assert "No item found" in result


def test_get_item_fulltext(
    mock_zotero: Any, sample_item: dict[str, Any], sample_attachment: dict[str, Any]
) -> None:
    """Test retrieving item fulltext"""
    mock_zotero.item.return_value = sample_item
    mock_zotero.children.return_value = [sample_attachment]
    mock_zotero.fulltext_item.return_value = {"content": "Sample full text content"}
    result = get_item_fulltext("ABCD1234")
    assert "Test Article" in result
    assert "Sample full text content" in result
    assert "XYZ789" in result  # Attachment key


def test_get_item_fulltext_no_attachment(
    mock_zotero: Any, sample_item: dict[str, Any]
) -> None:
    """Test retrieving fulltext when no attachment is available"""
    mock_zotero.item.return_value = sample_item
    mock_zotero.children.return_value = []
    result = get_item_fulltext("ABCD1234")
    assert "No suitable attachment found" in result
```
--------------------------------------------------------------------------------
/.github/workflows/python-package.yml:
--------------------------------------------------------------------------------
```yaml
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python

name: Python package

env:
  IMAGE_REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

on:
  push:
    branches: ['main']
  pull_request:
    branches: ['main']

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ['3.11', '3.12', '3.13']
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install uv
        uses: astral-sh/setup-uv@v5
      - name: Install dependencies
        run: uv sync --all-groups
      - name: Check format with ruff
        run: uv run ruff format --check
      - name: Check lints with ruff
        run: uv run ruff check
      - name: Test with pytest
        run: uv run pytest

  docker:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      attestations: write
      id-token: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Log in to the Container registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.IMAGE_REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.IMAGE_REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=branch,suffix=-{{sha}}
            type=ref,event=pr,suffix=-{{sha}}
            type=ref,event=branch
            type=ref,event=pr
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=semver,pattern={{major}}
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build and push Docker image
        id: push
        uses: docker/build-push-action@v6
        with:
          platforms: linux/amd64,linux/arm64
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
      - name: Generate artifact attestation
        uses: actions/attest-build-provenance@v2
        with:
          subject-name: ${{ env.IMAGE_REGISTRY }}/${{ env.IMAGE_NAME }}
          subject-digest: ${{ steps.push.outputs.digest }}
          push-to-registry: true
```
--------------------------------------------------------------------------------
/tests/test_client.py:
--------------------------------------------------------------------------------
```python
"""Tests for Zotero client module"""

import os
from unittest.mock import patch

import pytest

from zotero_mcp.client import get_zotero_client


@pytest.fixture
def mock_env_vars():
    """Mock environment variables for testing"""
    with patch.dict(
        os.environ,
        {
            "ZOTERO_LIBRARY_ID": "1234567",
            "ZOTERO_LIBRARY_TYPE": "user",
            "ZOTERO_API_KEY": "abcdef123456",
            "ZOTERO_LOCAL": "",
        },
        clear=True,
    ):
        yield


@pytest.fixture
def mock_env_vars_local():
    """Mock environment variables for local mode"""
    with patch.dict(
        os.environ,
        {
            "ZOTERO_LIBRARY_ID": "",
            "ZOTERO_LIBRARY_TYPE": "user",
            "ZOTERO_API_KEY": "",
            "ZOTERO_LOCAL": "true",
        },
        clear=True,
    ):
        yield


def test_get_zotero_client_with_api_key(mock_env_vars):
    """Test client initialization with API key"""
    with patch("zotero_mcp.client.zotero.Zotero") as mock_zotero:
        get_zotero_client()
        mock_zotero.assert_called_once_with(
            library_id="1234567",
            library_type="user",
            api_key="abcdef123456",
            local=False,
        )


def test_get_zotero_client_missing_api_key():
    """Test client initialization with missing API key"""
    with patch.dict(
        os.environ,
        {
            "ZOTERO_LIBRARY_ID": "1234567",
            "ZOTERO_LIBRARY_TYPE": "user",
            "ZOTERO_API_KEY": "",
            "ZOTERO_LOCAL": "",
        },
        clear=True,
    ):
        with pytest.raises(ValueError) as excinfo:
            get_zotero_client()
        assert "Missing required environment variables" in str(excinfo.value)


def test_get_zotero_client_local_mode(mock_env_vars_local):
    """Test client initialization in local mode"""
    with patch("zotero_mcp.client.zotero.Zotero") as mock_zotero:
        get_zotero_client()
        mock_zotero.assert_called_once_with(
            library_id="0",
            library_type="user",
            api_key=None,
            local=True,
        )


def test_get_zotero_client_local_mode_with_library_id():
    """Test client initialization in local mode with custom library ID"""
    with patch.dict(
        os.environ,
        {
            "ZOTERO_LIBRARY_ID": "custom_id",
            "ZOTERO_LIBRARY_TYPE": "user",
            "ZOTERO_API_KEY": "",
            "ZOTERO_LOCAL": "true",
        },
        clear=True,
    ):
        with patch("zotero_mcp.client.zotero.Zotero") as mock_zotero:
            get_zotero_client()
            mock_zotero.assert_called_once_with(
                library_id="custom_id",
                library_type="user",
                api_key=None,
                local=True,
            )
```
--------------------------------------------------------------------------------
/src/zotero_mcp/client.py:
--------------------------------------------------------------------------------
```python
import os
from typing import Any

from dotenv import load_dotenv
from pydantic import BaseModel
from pyzotero import zotero

# Load environment variables
load_dotenv()


# Initialize Zotero client
def get_zotero_client() -> zotero.Zotero:
    """Get authenticated Zotero client using environment variables"""
    library_id = os.getenv("ZOTERO_LIBRARY_ID")
    library_type = os.getenv("ZOTERO_LIBRARY_TYPE", "user")
    api_key = os.getenv("ZOTERO_API_KEY") or None
    local = os.getenv("ZOTERO_LOCAL", "").lower() in ["true", "yes", "1"]
    if local:
        if not library_id:
            # Indicates "current user" for the local API
            library_id = "0"
    elif not all([library_id, api_key]):
        raise ValueError(
            "Missing required environment variables. Please set ZOTERO_LIBRARY_ID and ZOTERO_API_KEY"
        )
    return zotero.Zotero(
        library_id=library_id,
        library_type=library_type,
        api_key=api_key,
        local=local,
    )


class AttachmentDetails(BaseModel):
    key: str
    content_type: str


def get_attachment_details(
    zot: zotero.Zotero,
    item: dict[str, Any],
) -> AttachmentDetails | None:
    """Get attachment ID and content type for a Zotero item"""
    data = item.get("data", {})
    item_type = data.get("itemType")
    # Direct attachment - check if it's a PDF or other supported type
    if item_type == "attachment":
        content_type = data.get("contentType")
        return AttachmentDetails(
            key=data.get("key"),
            content_type=content_type,
        )
    # For regular items, look for child attachments
    try:
        children: Any = zot.children(data.get("key", ""))
        # Group attachments by content type
        pdfs = []
        htmls = []
        others = []
        for child in children:
            child_data = child.get("data", {})
            if child_data.get("itemType") == "attachment":
                content_type = child_data.get("contentType")
                # NOTE: md5 is a content hash, not a file size; it only serves
                # as a sort key so that the choice below is repeatable.
                file_size = child_data.get("md5", "")
                if content_type == "application/pdf":
                    pdfs.append((child_data.get("key"), content_type, file_size))
                elif content_type == "text/html":
                    htmls.append((child_data.get("key"), content_type, file_size))
                else:
                    others.append((child_data.get("key"), content_type, file_size))
        # Return first match in priority order: PDF, then HTML, then anything else
        if pdfs:
            pdfs.sort(key=lambda x: x[2], reverse=True)
            return AttachmentDetails(
                key=pdfs[0][0],
                content_type=pdfs[0][1],
            )
        if htmls:
            htmls.sort(key=lambda x: x[2], reverse=True)
            return AttachmentDetails(
                key=htmls[0][0],
                content_type=htmls[0][1],
            )
        if others:
            others.sort(key=lambda x: x[2], reverse=True)
            return AttachmentDetails(
                key=others[0][0],
                content_type=others[0][1],
            )
    except Exception:
        pass
    return None
```
--------------------------------------------------------------------------------
/src/zotero_mcp/__init__.py:
--------------------------------------------------------------------------------
```python
from typing import Any, Literal

from mcp.server.fastmcp import FastMCP

from zotero_mcp.client import get_attachment_details, get_zotero_client

# Create an MCP server
mcp = FastMCP("Zotero")


def format_item(item: dict[str, Any]) -> str:
    """Format a Zotero item's metadata as a readable string optimized for LLM consumption"""
    data = item["data"]
    item_key = item["key"]
    item_type = data.get("itemType", "unknown")
    # Special handling for notes
    if item_type == "note":
        # Get note content
        note_content = data.get("note", "")
        # Strip HTML tags for cleaner text (simple approach)
        note_content = (
            note_content.replace("<p>", "").replace("</p>", "\n").replace("<br>", "\n")
        )
        note_content = note_content.replace("<strong>", "**").replace("</strong>", "**")
        note_content = note_content.replace("<em>", "*").replace("</em>", "*")
        # Format note with clear sections
        formatted = [
            "## 📝 Note",
            f"Item Key: `{item_key}`",
        ]
        # Add parent item reference if available
        if parent_item := data.get("parentItem"):
            formatted.append(f"Parent Item: `{parent_item}`")
        # Add date if available
        if date := data.get("dateModified"):
            formatted.append(f"Last Modified: {date}")
        # Add tags with formatting for better visibility
        if tags := data.get("tags"):
            tag_list = [f"`{tag['tag']}`" for tag in tags]
            formatted.append(f"\n### Tags\n{', '.join(tag_list)}")
        # Add note content
        formatted.append(f"\n### Note Content\n{note_content}")
        return "\n".join(formatted)
    # Regular item handling (non-notes)
    # Basic metadata with key for easy reference
    formatted = [
        f"## {data.get('title', 'Untitled')}",
        f"Item Key: `{item_key}`",
        f"Type: {item_type}",
        f"Date: {data.get('date', 'No date')}",
    ]
    # Creators with role differentiation
    creators_by_role = {}
    for creator in data.get("creators", []):
        role = creator.get("creatorType", "contributor")
        name = ""
        if "firstName" in creator and "lastName" in creator:
            name = f"{creator['lastName']}, {creator['firstName']}"
        elif "name" in creator:
            name = creator["name"]
        if name:
            if role not in creators_by_role:
                creators_by_role[role] = []
            creators_by_role[role].append(name)
    for role, names in creators_by_role.items():
        role_display = role.capitalize() + ("s" if len(names) > 1 else "")
        formatted.append(f"{role_display}: {'; '.join(names)}")
    # Publication details
    if publication := data.get("publicationTitle"):
        formatted.append(f"Publication: {publication}")
    if volume := data.get("volume"):
        volume_info = f"Volume: {volume}"
        if issue := data.get("issue"):
            volume_info += f", Issue: {issue}"
        if pages := data.get("pages"):
            volume_info += f", Pages: {pages}"
        formatted.append(volume_info)
    # Abstract with clear section header
    if abstract := data.get("abstractNote"):
        formatted.append(f"\n### Abstract\n{abstract}")
    # Tags with formatting for better visibility
    if tags := data.get("tags"):
        tag_list = [f"`{tag['tag']}`" for tag in tags]
        formatted.append(f"\n### Tags\n{', '.join(tag_list)}")
    # URLs, DOIs, and identifiers grouped together
    identifiers = []
    if url := data.get("url"):
        identifiers.append(f"URL: {url}")
    if doi := data.get("DOI"):
        identifiers.append(f"DOI: {doi}")
    if isbn := data.get("ISBN"):
        identifiers.append(f"ISBN: {isbn}")
    if issn := data.get("ISSN"):
        identifiers.append(f"ISSN: {issn}")
    if identifiers:
        formatted.append("\n### Identifiers\n" + "\n".join(identifiers))
    # Notes and attachments
    if notes := item.get("meta", {}).get("numChildren", 0):
        formatted.append(
            f"\n### Additional Information\nNumber of notes/attachments: {notes}"
        )
    return "\n".join(formatted)


@mcp.tool(
    name="zotero_item_metadata",
    description="Get metadata information about a specific Zotero item, given the item key.",
)
def get_item_metadata(item_key: str) -> str:
    """Get metadata information about a specific Zotero item"""
    zot = get_zotero_client()
    try:
        item: Any = zot.item(item_key)
        if not item:
            return f"No item found with key: {item_key}"
        return format_item(item)
    except Exception as e:
        return f"Error retrieving item metadata: {str(e)}"


@mcp.tool(
    name="zotero_item_fulltext",
    description="Get the full text content of a Zotero item, given the item key of a parent item or specific attachment.",
)
def get_item_fulltext(item_key: str) -> str:
    """Get the full text content of a specific Zotero item"""
    zot = get_zotero_client()
    try:
        item: Any = zot.item(item_key)
        if not item:
            return f"No item found with key: {item_key}"
        # Fetch full-text content
        attachment = get_attachment_details(zot, item)
        # Prepare header with metadata
        header = format_item(item)
        # Add attachment information
        if attachment is not None:
            attachment_info = f"\n## Attachment Information\n- **Key**: `{attachment.key}`\n- **Type**: {attachment.content_type}"
            # Get the full text
            full_text_data: Any = zot.fulltext_item(attachment.key)
            if full_text_data and "content" in full_text_data:
                item_text = full_text_data["content"]
                # Calculate approximate word count
                word_count = len(item_text.split())
                attachment_info += f"\n- **Word Count**: ~{word_count}"
                # Format the content with markdown for structure
                full_text = f"\n\n## Document Content\n\n{item_text}"
            else:
                # Clear error message when text extraction isn't possible
                full_text = "\n\n## Document Content\n\n[⚠️ Attachment is available but text extraction is not possible. The document may be scanned as images or have other restrictions that prevent text extraction.]"
        else:
            attachment_info = "\n\n## Attachment Information\n[❌ No suitable attachment found for full text extraction. This item may not have any attached files or they may not be in a supported format.]"
            full_text = ""
        # Combine all sections
        return f"{header}{attachment_info}{full_text}"
    except Exception as e:
        return f"Error retrieving item full text: {str(e)}"


@mcp.tool(
    name="zotero_search_items",
    # More detail can be added if useful: https://www.zotero.org/support/dev/web_api/v3/basics#searching
    description="Search for items in your Zotero library, given a query string, query mode (titleCreatorYear or everything), and optional tag search (supports boolean searches). Returned results can be looked up with zotero_item_fulltext or zotero_item_metadata.",
)
def search_items(
    query: str,
    qmode: Literal["titleCreatorYear", "everything"] | None = "titleCreatorYear",
    tag: str | None = None,
    limit: int | None = 10,
) -> str:
    """Search for items in your Zotero library"""
    zot = get_zotero_client()
    # Search using the q parameter
    params = {"q": query, "qmode": qmode, "limit": limit}
    if tag:
        params["tag"] = tag
    zot.add_parameters(**params)
    # n.b. types for this return do not work, it's a parsed JSON object
    results: Any = zot.items()
    if not results:
        return "No items found matching your query."
    # Header with search info
    header = [
        f"# Search Results for: '{query}'",
        f"Found {len(results)} items." + (f" Using tag filter: {tag}" if tag else ""),
        "Use item keys with zotero_item_metadata or zotero_item_fulltext for more details.\n",
    ]
    # Format results
    formatted_results = []
    for i, item in enumerate(results):
        data = item["data"]
        item_key = item.get("key", "")
        item_type = data.get("itemType", "unknown")
        # Special handling for notes
        if item_type == "note":
            # Get note content
            note_content = data.get("note", "")
            # Strip HTML tags for cleaner text (simple approach)
            note_content = (
                note_content.replace("<p>", "")
                .replace("</p>", "\n")
                .replace("<br>", "\n")
            )
            note_content = note_content.replace("<strong>", "**").replace(
                "</strong>", "**"
            )
            note_content = note_content.replace("<em>", "*").replace("</em>", "*")
            # Extract a title from the first line if possible, otherwise use first few words
            title_preview = ""
            if note_content:
                lines = note_content.strip().split("\n")
                first_line = lines[0].strip()
                if first_line:
                    # Use first line if it's reasonably short, otherwise use first few words
                    if len(first_line) <= 50:
                        title_preview = first_line
                    else:
                        words = first_line.split()
                        title_preview = " ".join(words[:5]) + "..."
            # Create a good title for the note
            note_title = title_preview if title_preview else "Note"
            # Get a preview of the note content (truncated)
            preview = note_content.strip()
            if len(preview) > 150:
                preview = preview[:147] + "..."
            # Format the note entry
            entry = [
                f"## {i + 1}. 📝 {note_title}",
                f"**Type**: Note | **Key**: `{item_key}`",
                f"\n{preview}",
            ]
            # Add parent item reference if available
            if parent_item := data.get("parentItem"):
                entry.insert(2, f"**Parent Item**: `{parent_item}`")
            # Add tags if present (limited to first 5)
            if tags := data.get("tags"):
                tag_list = [f"`{tag['tag']}`" for tag in tags[:5]]
                if len(tags) > 5:
                    tag_list.append("...")
                entry.append(f"\n**Tags**: {' '.join(tag_list)}")
            formatted_results.append("\n".join(entry))
            continue
        # Regular item processing (non-notes)
        title = data.get("title", "Untitled")
        date = data.get("date", "")
        # Format primary creators (limited to first 3)
        creators = []
        for creator in data.get("creators", [])[:3]:
            if "firstName" in creator and "lastName" in creator:
                creators.append(f"{creator['lastName']}, {creator['firstName']}")
            elif "name" in creator:
                creators.append(creator["name"])
        if len(data.get("creators", [])) > 3:
            creators.append("et al.")
        creator_str = "; ".join(creators) if creators else "No authors"
        # Get publication or source info
        source = ""
        if pub := data.get("publicationTitle"):
            source = pub
        elif book := data.get("bookTitle"):
            source = f"In: {book}"
        elif publisher := data.get("publisher"):
            source = f"{publisher}"
        # Get a brief abstract (truncated if too long)
        abstract = data.get("abstractNote", "")
        if len(abstract) > 150:
            abstract = abstract[:147] + "..."
        # Build formatted entry with markdown for better structure
        entry = [
            f"## {i + 1}. {title}",
            f"**Type**: {item_type} | **Date**: {date} | **Key**: `{item_key}`",
            f"**Authors**: {creator_str}",
        ]
        if source:
            entry.append(f"**Source**: {source}")
        if abstract:
            entry.append(f"\n{abstract}")
        # Add tags if present (limited to first 5)
        if tags := data.get("tags"):
            tag_list = [f"`{tag['tag']}`" for tag in tags[:5]]
            if len(tags) > 5:
                tag_list.append("...")
            entry.append(f"\n**Tags**: {' '.join(tag_list)}")
        formatted_results.append("\n".join(entry))
    return "\n\n".join(header + formatted_results)
```
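The note clean-up in `format_item` and `search_items` repeats the same chain of `str.replace` calls. One possible way to share it, shown purely as a sketch, is a small table-driven function. The name `clean_note_html` is hypothetical and not part of the package; the substitutions are the ones already used in `__init__.py`.

```python
# Hypothetical helper: the tag substitutions from __init__.py, applied in order.
_NOTE_REPLACEMENTS = (
    ("<p>", ""),
    ("</p>", "\n"),
    ("<br>", "\n"),
    ("<strong>", "**"),
    ("</strong>", "**"),
    ("<em>", "*"),
    ("</em>", "*"),
)


def clean_note_html(note: str) -> str:
    """Convert the handful of HTML tags handled above into plain text or Markdown."""
    for tag, replacement in _NOTE_REPLACEMENTS:
        note = note.replace(tag, replacement)
    return note


print(clean_note_html("<p>Read <strong>chapter 2</strong><br>and <em>cite</em> it</p>"))
```

Like the original code, this only handles a fixed set of tags; any other markup in a note is passed through unchanged.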