reeeeemo/ancestry-mcp # codebase.md

# Directory Structure

```
├── .gitignore
├── .python-version
├── LICENSE
├── pyproject.toml
├── README.md
├── src
│   └── mcp_server_ancestry
│       ├── __init__.py
│       └── server.py
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------

```
3.10

```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Visual Studio extensions
.vs

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
#   For a library or package, you might want to ignore these files since the code is
#   intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
#   However, in case of collaboration, if having platform-specific dependencies or dependencies
#   having no cross-platform support, pipenv may install dependencies that don't work, or not
#   install all needed dependencies.
#Pipfile.lock

# poetry
#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
#   This is especially recommended for binary packages to ensure reproducibility, and is more
#   commonly ignored for libraries.
#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
#   in version control.
#   https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
#  and can be added to the global gitignore or merged into this file.  For a more nuclear
#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
# Ancestry MCP Server
[![smithery badge](https://smithery.ai/badge/mcp-server-ancestry)](https://smithery.ai/server/mcp-server-ancestry)
[![MIT licensed][mit-badge]][mit-url]
[![Python Version][python-badge]][python-url]
[![PyPI version][pypi-badge]][pypi-url]

[mit-badge]: https://img.shields.io/pypi/l/mcp.svg
[mit-url]: https://github.com/reeeeemo/ancestry-mcp/blob/main/LICENSE
[python-badge]: https://img.shields.io/pypi/pyversions/mcp.svg
[python-url]: https://www.python.org/downloads/
[pypi-badge]: https://badge.fury.io/py/mcp-server-ancestry.svg
[pypi-url]: https://pypi.org/project/mcp-server-ancestry

Built on top of the [Model Context Protocol Python SDK](https://modelcontextprotocol.io)

<a href="https://glama.ai/mcp/servers/pk5j4bp5nv"><img width="380" height="200" src="https://glama.ai/mcp/servers/pk5j4bp5nv/badge" alt="Ancestry MCP server" /></a>

## Overview

A Python server implementing the Model Context Protocol (MCP) for interacting with `.ged` files *(GEDCOM files, commonly seen on Ancestry.com)*.

## Features
    
- Read and parse `.ged` files
- Rename `.ged` files
- Search within `.ged` files for specific individuals, families, etc.

**Note:** The server will only allow operations within the directory specified via `args`

## Resources

- `gedcom://{file_name}`: `.ged` operations interface

## Tools

- **list_files**
    - Lists one or more `.ged` files within the directory
    - Input: `name` (string)

- **rename_file**
    - Renames one or more `.ged` files within the directory
    - Inputs:
        - `file_name` (string): Old file name
        - `new_name` (string)
 
- **view_file**
    - Parses and reads full contents of a `.ged` file
    - Can also parse and read multiple files
    - Can get specific information out of file(s), such as date of birth, marriage, etc.
    - Input: `name` (string)


## Usage with Claude Desktop

### Installing via Smithery

To install Ancestry GEDCOM Server for Claude Desktop automatically via [Smithery](https://smithery.ai/server/mcp-server-ancestry):

```bash
npx -y @smithery/cli install mcp-server-ancestry --client claude
```

### Installing Manually
1. First, install the package:
```pip install mcp-server-ancestry```


2. Add this to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
     "ancestry": {
       "command": "mcp-server-ancestry",
       "args": ["--gedcom-path", "path/to/your/gedcom/files"]
     }
  }
}
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.

```
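The README's note says operations are confined to the directory passed via `args`. A minimal sketch of how such a containment check could be written with `pathlib` follows. `is_within_directory` is a hypothetical helper for illustration and is not part of this repository, and the directory name is an assumed example value.

```python
from pathlib import Path


def is_within_directory(base: Path, file_name: str) -> bool:
    """Return True if base/file_name resolves to a location inside base.

    Hypothetical helper: not part of mcp-server-ancestry.
    """
    base_resolved = base.resolve()
    # resolve() collapses ".." segments, so traversal attempts become visible
    candidate = (base_resolved / file_name).resolve()
    # Path.is_relative_to is available on Python 3.9+
    return candidate.is_relative_to(base_resolved)


if __name__ == "__main__":
    gedcom_dir = Path("gedcom_files")  # assumed example directory
    print(is_within_directory(gedcom_dir, "family.ged"))
    print(is_within_directory(gedcom_dir, "../secrets.ged"))
```

Resolving both paths before comparing them means a name such as `../secrets.ged` is rejected even though it starts from the allowed directory.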

--------------------------------------------------------------------------------
/src/mcp_server_ancestry/__init__.py:
--------------------------------------------------------------------------------

```python
from .server import serve


def main():
    """MCP Ancestry Server - Takes GEDCOM files and provides functionality"""
    import asyncio
    import argparse

    parser = argparse.ArgumentParser(
        description='give a model the ability to use GEDCOM files'
        )
    parser.add_argument(
        '--gedcom-path',
        type=str,
        required=True,
        help='Path to directory containing GEDCOM files'
        )
    
    args = parser.parse_args()
    
    asyncio.run(serve(args.gedcom_path))

if __name__ == "__main__":
    main()
```
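As a quick illustration of the command-line handling in `main()` above, the sketch below rebuilds the same single required option and parses an explicit argument list. `build_parser` is a hypothetical name used only here, and the path is a placeholder value.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Mirrors the single required option defined in main()
    parser = argparse.ArgumentParser(
        description="give a model the ability to use GEDCOM files"
    )
    parser.add_argument(
        "--gedcom-path",
        type=str,
        required=True,
        help="Path to directory containing GEDCOM files",
    )
    return parser


if __name__ == "__main__":
    # Passing an explicit list avoids reading sys.argv; the path is an example value
    args = build_parser().parse_args(["--gedcom-path", "path/to/your/gedcom/files"])
    # argparse exposes "--gedcom-path" as the attribute "gedcom_path"
    print(args.gedcom_path)
```

Because the option is declared with `required=True`, calling `parse_args([])` would exit with a usage error instead of returning.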

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[project]
name = "mcp-server-ancestry"
version = "0.1.1"
description = "A Model Context Protocol server providing functionality to GEDCOM files via LLM usage"
readme = "README.md"
requires-python = ">=3.10"
authors = [{ name = "Robert Oxley" }]
maintainers = [{ name = "Robert Oxley", email = "[email protected]" }]
keywords = ["mcp", "llm", "automation"]
license = { text = "MIT" }
classifiers = [
    "Development Status :: 4 - Beta",
    "Intended Audience :: Developers",
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.10",
]
dependencies = [
    "mcp>=1.0.0",
    "pydantic>=2.0.0",
    "requests>=2.32.3",
    "chardet>=5.2.0",
]

[project.scripts]
mcp-server-ancestry = "mcp_server_ancestry:main"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.uv]
dev-dependencies = ["pyright>=1.1.389", "ruff>=0.7.3"]
```
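The `[project.scripts]` table above maps the `mcp-server-ancestry` command to `mcp_server_ancestry:main`, using the `module:attribute` form. The sketch below shows roughly how such a reference can be resolved by hand. `resolve_entry_point` is a hypothetical helper, and the demonstration uses the standard-library reference `json:dumps` so that it runs without the package installed. Installers generate their own wrapper script, so this is an approximation of the idea rather than what they execute.

```python
import importlib
from typing import Any


def resolve_entry_point(reference: str) -> Any:
    """Resolve a 'module:attribute' string to the object it names.

    Hypothetical helper for illustration only.
    """
    module_name, _, attribute = reference.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


if __name__ == "__main__":
    # "json:dumps" stands in for "mcp_server_ancestry:main"
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"tool": "list_files"}))
```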

--------------------------------------------------------------------------------
/src/mcp_server_ancestry/server.py:
--------------------------------------------------------------------------------

```python
from typing import Dict
import asyncio
import logging 
import json
import os 
from pathlib import Path
import mcp.types as types
from mcp.server import Server
from mcp.server.stdio import stdio_server
from enum import Enum
from pydantic import BaseModel
import chardet

ged_level_1_tags = ['BIRT', 'DEAT', 'MARR', 'BURI', 'DIV', 'OCCU', 'RESI', 'CHR']

# Tools schemas 
class ListFiles(BaseModel):
    name: str
    
class RenameFiles(BaseModel):
    file_name: str
    new_name: str
    
class ViewFiles(BaseModel):
    name: str

# Tool names
class AncestryTools(str, Enum):
    LIST_FILES = "list_files"
    RENAME_FILE = "rename_file"
    VIEW_FILES = "view_file"

# Tool helper functions
def find_files_with_name(name: str | None = None, path: Path | None = None) -> list[Path]:
    pattern = f"{name}.ged" if name is not None else "*.ged"
    return list(path.glob(pattern))

def rename_files(new_name: str | None = None, files: list[Path] | None = None) -> tuple[list[Path], str]:
    try:
        renamed_files = []
        for file in files:
            try:
                new_path = file.parent / f"{new_name.removesuffix('.ged')}.ged"
                if new_path.exists():
                    return [], f"Cannot rename, {new_path.name} already exists"
                file.rename(new_path)
                renamed_files.append(new_path)
            except PermissionError:
                return [], f'Permission denied: Cannot rename {file.name}. Check write perms'
            except OSError as e:
                return [], f'Error renaming {file.name}: {str(e)}'
    except Exception as e:
        return [], f'An unexpected error occurred: {str(e)}. Please try again later or contact support.'
    
    return renamed_files, ""

def parse_ged_file(files: list[Path] | None = None) -> tuple[dict[str, list[dict]], str]:
    try:
        parsed_geds = {}
        for file in files:
            if not file.exists() or file.suffix.lower() != '.ged':
                continue
            
            parsed_geds[file.name] = []
            
            # determine encoding 
            raw_bytes = file.read_bytes()
            result = chardet.detect(raw_bytes)
            # open file, and parse ged data
            try:
                with file.open(encoding=result['encoding']) as ged:
                    ged_obj = {}
                    cur_lvl1_tag = None
                    
                    for line in ged:
                        # Level 0: root records
                        # Level 1: main info about records
                        # Level 2: details about level 1 info
                        parts = line.strip().split(' ', 2)
                        # skip blank or malformed lines (need a numeric level and a tag)
                        if len(parts) < 2 or not parts[0].isdigit():
                            continue
                        level = int(parts[0])
                        tag = parts[1]
                        value = parts[2] if len(parts) > 2 else ''

                        if level == 0: 
                            # save prev obj if exists
                            if ged_obj and 'type' in ged_obj:
                                parsed_geds[file.name].append(ged_obj)
                                
                            ged_obj = {}
                            if '@' in tag: # ID
                                ged_obj['id'] = tag
                                ged_obj['type'] = value
                        elif level == 1:
                            cur_lvl1_tag = tag
                            if tag in ged_level_1_tags:
                                ged_obj[tag] = {}
                            else:
                                ged_obj[tag] = value
                        elif level == 2 and cur_lvl1_tag:
                            # If parent is an event
                            if cur_lvl1_tag in ged_level_1_tags:
                                if cur_lvl1_tag not in ged_obj:
                                    ged_obj[cur_lvl1_tag] = {}
                                ged_obj[cur_lvl1_tag][tag] = value
                            elif cur_lvl1_tag == 'NAME':
                                ged_obj[f'NAME_{tag}'] = value
                            else:
                                ged_obj[tag] = value
                                
                    if ged_obj and 'type' in ged_obj:
                        parsed_geds[file.name].append(ged_obj)
            except UnicodeDecodeError:
                return {}, 'File could not be decoded, please check encoding on the .ged'
    except Exception as e:
        return {}, f'An unexpected error occurred: {str(e)}. Please try again later or contact support.'
    return parsed_geds, ""

# logging config
logging.basicConfig(
    filename='mcp_ancestry.log',
    level=logging.DEBUG,
    format='%(asctime)s - %(levelname)s - %(message)s'
    
    )

# server main code
async def serve(gedcom_path: str | None = None) -> None:
    app = Server("ancestry")
    
    # Verification of GEDCOM path
    path = Path(gedcom_path)
    if not path.exists():
        raise ValueError(f'Invalid path: {gedcom_path}')
    if not path.is_dir():
        raise ValueError(f'GEDCOM path is not a directory: {gedcom_path}')

    # renaming needs write access, so check both read and write permissions
    if not os.access(path, os.R_OK | os.W_OK):
        raise ValueError(f'GEDCOM path does not have read / write permissions: {gedcom_path}')
    
    # debug stuff ! 
    logging.debug(f'Path exists and is valid: {path.absolute()}')
    logging.debug(f'Contents of directory: {list(path.iterdir())}')

    # makes GEDCOM files visible to Claude
    @app.list_resources()
    async def list_resources() -> list[types.Resource]:
        gedcom_files = list(path.glob("*.ged"))
        # scan gedcom path dir for .ged files
        return [
            types.Resource(
                uri=f"gedcom://{file.name}",
                name=file.name,
                mimeType="application/x-gedcom"
            )
            for file in gedcom_files
        ]
    

    @app.list_tools()
    async def list_tools() -> list[types.Tool]:
        return [
            types.Tool(
                name=AncestryTools.LIST_FILES,
                description="List GEDCOM files",
                inputSchema=ListFiles.model_json_schema()
            ),
            types.Tool(
                name=AncestryTools.RENAME_FILE,
                description="Rename a GEDCOM file",
                inputSchema=RenameFiles.model_json_schema()
            ),
            types.Tool(
                name=AncestryTools.VIEW_FILES,
                description="View a GEDCOM file in plaintext format",
                inputSchema=ViewFiles.model_json_schema()
            )
        ]
    
    @app.call_tool()
    async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
        match name:
            case AncestryTools.LIST_FILES:
                gedcom_files = find_files_with_name(arguments["name"].removesuffix('.ged'), path)
                return [
                    types.TextContent(
                        type="text",
                        text=f"File: {file.name}\nSize: {file.stat().st_size} bytes\nURI: gedcom://{file.name}"
                    )
                    for file in gedcom_files
                ]
            case AncestryTools.RENAME_FILE:
                # get files, if none found tell server that
                gedcom_files = find_files_with_name(arguments["file_name"].removesuffix('.ged'), path)
                if not gedcom_files:
                    return [
                        types.TextContent(
                            type="text",
                            text=f'No files found matching {arguments["file_name"]}'
                        )    
                    ]
                # rename files, if error message tell server
                renamed_files, message = rename_files(arguments["new_name"].removesuffix('.ged'), gedcom_files)
                if message:
                    return [
                        types.TextContent(
                            type="text",
                            text=message
                        )
                    ]
                
                return [
                    types.TextContent(
                        type="text",
                        text=f"{file.name}\nURI: gedcom://{file.name}"
                    )
                    for file in renamed_files
                ]
            case AncestryTools.VIEW_FILES:
                # get files, if none found tell server that
                gedcom_files = find_files_with_name(arguments["name"].removesuffix('.ged'), path)
                if not gedcom_files:
                    return [
                        types.TextContent(
                            type="text",
                            text=f'No files found matching {arguments["name"]}'
                        )    
                    ]
                
                # show file, if error message tell server
                parsed_geds, message = parse_ged_file(gedcom_files)
                
                if message:
                    return [
                        types.TextContent(
                            type="text",
                            text=message
                        )
                    ]
                
                return [
                    types.TextContent(
                        type="text",
                        text=json.dumps({filename: data}, indent=2)
                    )
                    for filename, data in parsed_geds.items()
                ]
            case _:
                raise ValueError(f"Unknown Tool: {name}")
        
    
    async with stdio_server() as streams:
        await app.run(
            streams[0],
            streams[1],
            app.create_initialization_options()
        )

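if __name__ == "__main__":
    import sys

    # serve() needs a GEDCOM directory; take it from the first CLI argument
    if len(sys.argv) != 2:
        raise SystemExit("usage: python server.py <gedcom-directory>")
    asyncio.run(serve(sys.argv[1]))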
```
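To make the record structure produced by `parse_ged_file` easier to picture, here is a simplified, self-contained sketch of the same level-based approach applied to an in-memory sample. It is not the server's code: `parse_gedcom_lines`, `EVENT_TAGS`, and `SAMPLE` are names introduced for illustration, encoding detection is left out, and the sample data is made up.

```python
EVENT_TAGS = {"BIRT", "DEAT", "MARR", "BURI", "DIV", "OCCU", "RESI", "CHR"}


def parse_gedcom_lines(lines: list[str]) -> list[dict]:
    """Group GEDCOM lines into one dict per level-0 record that has an ID."""
    records: list[dict] = []
    record: dict = {}
    current_level1 = None

    for line in lines:
        parts = line.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        level, tag = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""

        if level == 0:
            # A new root record starts: store the previous one if it had an ID
            if "type" in record:
                records.append(record)
            record, current_level1 = {}, None
            if "@" in tag:  # e.g. "0 @I1@ INDI"
                record = {"id": tag, "type": value}
        elif level == 1:
            current_level1 = tag
            # Event tags hold nested details; other tags hold a plain value
            record[tag] = {} if tag in EVENT_TAGS else value
        elif level == 2 and current_level1:
            if current_level1 in EVENT_TAGS:
                record[current_level1][tag] = value
            elif current_level1 == "NAME":
                record[f"NAME_{tag}"] = value
            else:
                record[tag] = value

    if "type" in record:
        records.append(record)
    return records


# Made-up sample data for illustration
SAMPLE = """0 HEAD
1 CHAR UTF-8
0 @I1@ INDI
1 NAME John /Doe/
2 GIVN John
1 BIRT
2 DATE 1 JAN 1900
0 TRLR
""".splitlines()

if __name__ == "__main__":
    import json

    print(json.dumps(parse_gedcom_lines(SAMPLE), indent=2))
```

In this sample the `HEAD` and `TRLR` records carry no `@...@` identifier, so only the individual `@I1@` ends up in the result, with the birth date nested under `BIRT`.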