# Directory Structure

```
├── .gitignore
├── docs
│   ├── fastmcp.md
│   └── langextract.md
├── LICENSE
├── pyproject.toml
├── README.md
├── SETUP.md
├── src
│   └── langextract_mcp
│       ├── __init__.py
│       ├── resources
│       │   ├── __init__.py
│       │   ├── README.md
│       │   └── supported-models.md
│       └── server.py
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# IDE
.vscode/
.idea/
*.swp
*.swo
*~

# macOS
.DS_Store

# Windows
Thumbs.db
ehthumbs.db
Desktop.ini

# Project specific
results/
*.jsonl
*.html
temp/
test_output/

# API keys and sensitive info
.env.local
.env.production
api_keys.txt

# Logs
*.log
logs/

# Claude
.mcp.json
.CLAUDE.md
.claude/
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
# LangExtract MCP Server

A FastMCP server for Google's [langextract](https://github.com/google/langextract) library. This server enables AI assistants like Claude Code to extract structured information from unstructured text using Large Language Models through an MCP interface.

<a href="https://glama.ai/mcp/servers/@larsenweigle/langextract-mcp">
  <img width="380" height="200" src="https://glama.ai/mcp/servers/@larsenweigle/langextract-mcp/badge" alt="LangExtract Server MCP server" />
</a>

## Overview

LangExtract is a Python library that uses LLMs to extract structured information from text documents while maintaining precise source grounding. This MCP server exposes langextract's capabilities through the Model Context Protocol. The server includes intelligent caching, persistent connections, and server-side credential management to provide optimal performance in long-running environments like Claude Code.

## Quick Setup for Claude Code

### Prerequisites

- Claude Code installed and configured
- Google Gemini API key ([Get one here](https://aistudio.google.com/app/apikey))
- Python 3.10 or higher

### Installation

Install directly into Claude Code using the built-in MCP management:

```bash
claude mcp add langextract-mcp -e LANGEXTRACT_API_KEY=your-gemini-api-key -- uv run --with fastmcp fastmcp run src/langextract_mcp/server.py
```

The server will automatically start and integrate with Claude Code. No additional configuration is required.

### Verification

After installation, verify the integration by entering the following in Claude Code:

```
/mcp
```

You should see output indicating the server is running; you can then select the server to inspect its available tools.

## Available Tools

The server provides the following tools for text extraction workflows:

**Core Extraction**
- `extract_from_text` - Extract structured information from provided text
- `extract_from_url` - Extract information from web content
- `save_extraction_results` - Save results to JSONL format
- `generate_visualization` - Create interactive HTML visualizations

For more information, check out the resources available to the client under `src/langextract_mcp/resources`.

## Usage Examples

I am currently adding the ability for MCP clients to pass file paths to unstructured text.

### Basic Text Extraction

Ask Claude Code to extract information using natural language:

```
Extract medication information from this text: "Patient prescribed 500mg amoxicillin twice daily for infection"

Use these examples to guide the extraction:
- Text: "Take 250mg ibuprofen every 4 hours"
- Expected: medication=ibuprofen, dosage=250mg, frequency=every 4 hours
```

### Advanced Configuration

For complex extractions, specify configuration parameters:

```
Extract character emotions from Shakespeare using:
- Model: gemini-2.5-pro for better literary analysis
- Multiple passes: 3 for comprehensive extraction
- Temperature: 0.2 for consistent results
```

### URL Processing

Extract information directly from web content:

```
Extract key findings from this research paper: https://arxiv.org/abs/example
Focus on methodology, results, and conclusions
```

## Supported Models

This server currently supports **Google Gemini models only**, optimized for reliable structured extraction with advanced schema constraints:

- `gemini-2.5-flash` - **Recommended default** - Optimal balance of speed, cost, and quality
- `gemini-2.5-pro` - Best for complex reasoning and analysis tasks requiring highest accuracy

The server uses persistent connections, schema caching, and connection pooling for optimal performance with Gemini models. Support for additional providers may be added in future versions.

## Configuration Reference

### Environment Variables

Set during installation or in server environment:

```bash
LANGEXTRACT_API_KEY=your-gemini-api-key  # Required
```

### Tool Parameters

Configure extraction behavior through tool parameters:

```python
{
    "model_id": "gemini-2.5-flash",     # Language model selection
    "max_char_buffer": 1000,            # Text chunk size
    "temperature": 0.5,                 # Sampling temperature (0.0-1.0)  
    "extraction_passes": 1,             # Number of extraction attempts
    "max_workers": 10                   # Parallel processing threads
}
```

### Output Format

All extractions return consistent structured data:

```python
{
    "document_id": "doc_123",
    "total_extractions": 5,
    "extractions": [
        {
            "extraction_class": "medication", 
            "extraction_text": "amoxicillin",
            "attributes": {"type": "antibiotic"},
            "start_char": 25,
            "end_char": 35
        }
    ],
    "metadata": {
        "model_id": "gemini-2.5-flash",
        "extraction_passes": 1,
        "temperature": 0.5
    }
}
```

## Use Cases

LangExtract MCP Server supports a wide range of use cases across multiple domains:

- **Healthcare and life sciences**: extract medications, dosages, and treatment protocols from clinical notes; structure radiology and pathology reports; process research papers and clinical trial data.
- **Legal and compliance**: extract contract terms, parties, and obligations; analyze regulatory documents, compliance reports, and case law.
- **Research and academia**: extract methodologies, findings, and citations from papers; analyze survey responses and interview transcripts; process historical and archival materials.
- **Business intelligence**: extract insights from customer feedback and reviews; analyze news articles and market reports; process financial documents and earnings reports.

## Support and Documentation

**Primary Resources:**
- [LangExtract Documentation](https://github.com/google/langextract) - Core library reference
- [FastMCP Documentation](https://gofastmcp.com/) - MCP server framework
- [Model Context Protocol](https://modelcontextprotocol.io/) - Protocol specification

```

--------------------------------------------------------------------------------
/src/langextract_mcp/resources/README.md:
--------------------------------------------------------------------------------

```markdown
# LangExtract MCP Server - Client Guide

A Model Context Protocol (MCP) server that provides structured information extraction from unstructured text using Google's LangExtract library and Gemini models.

## Overview

This MCP server enables AI assistants to extract structured information from text documents while maintaining precise source grounding. Each extraction is mapped to its exact location in the source text, enabling visual highlighting and verification.

## Available Tools

### Core Extraction Tools

#### `extract_from_text`
Extract structured information from provided text using Large Language Models.

**Parameters:**
- `text` (string): The text to extract information from
- `prompt_description` (string): Clear instructions for what to extract
- `examples` (array): List of example extractions to guide the model
- `config` (object, optional): Configuration parameters

#### `extract_from_url`
Extract structured information from web content by downloading and processing the text.

**Parameters:**
- `url` (string): URL to download text from (must start with http:// or https://)
- `prompt_description` (string): Clear instructions for what to extract
- `examples` (array): List of example extractions to guide the model
- `config` (object, optional): Configuration parameters

#### `save_extraction_results`
Save extraction results to a JSONL file for later use or visualization.

**Parameters:**
- `extraction_results` (object): Results from extract_from_text or extract_from_url
- `output_name` (string): Name for the output file (without .jsonl extension)
- `output_dir` (string, optional): Directory to save the file (default: current directory)

#### `generate_visualization`
Generate interactive HTML visualization from extraction results.

**Parameters:**
- `jsonl_file_path` (string): Path to the JSONL file containing extraction results
- `output_html_path` (string, optional): Optional path for the HTML output

## How to Structure Examples

Examples are critical for guiding the extraction model. Each example should follow this structure:

```json
{
  "text": "Example input text",
  "extractions": [
    {
      "extraction_class": "category_name",
      "extraction_text": "exact text from input",
      "attributes": {
        "key1": "value1",
        "key2": "value2"
      }
    }
  ]
}
```

### Key Principles for Examples:

1. **Use exact text**: `extraction_text` should be verbatim from the input text
2. **Don't paraphrase**: Extract the actual words, not interpretations
3. **Provide meaningful attributes**: Add context through the attributes dictionary
4. **Cover all extraction classes**: Include examples for each type you want to extract
5. **Show variety**: Demonstrate different patterns and edge cases
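
To make principles 1 and 2 concrete, compare a verbatim extraction with a paraphrased one (a minimal illustration using the structure above):

```python
source = "Take 250mg ibuprofen every 4 hours"

good = {"extraction_class": "dosage", "extraction_text": "250mg"}          # verbatim span from source
bad = {"extraction_class": "dosage", "extraction_text": "250 milligrams"}  # paraphrase - avoid
```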

## Configuration Options

The `config` parameter accepts these options:

- `model_id` (string): Gemini model to use (default: "gemini-2.5-flash")
- `max_char_buffer` (integer): Text chunk size (default: 1000)
- `temperature` (float): Sampling temperature 0.0-1.0 (default: 0.5)
- `extraction_passes` (integer): Number of extraction attempts for better recall (default: 1)
- `max_workers` (integer): Parallel processing threads (default: 10)

## Supported Models

This server only supports Google Gemini models:
- `gemini-2.5-flash` - **Recommended default** - Optimal balance of speed, cost, and quality
- `gemini-2.5-pro` - Best for complex reasoning and analysis tasks

## Complete Usage Examples

### Example 1: Medical Information Extraction

```json
{
  "tool": "extract_from_text",
  "parameters": {
    "text": "Patient prescribed 500mg amoxicillin twice daily for bacterial infection. Take with food to reduce stomach upset.",
    "prompt_description": "Extract medication information including drug names, dosages, frequencies, and administration instructions. Use exact text for extractions.",
    "examples": [
      {
        "text": "Take 250mg ibuprofen every 4 hours as needed for pain",
        "extractions": [
          {
            "extraction_class": "medication",
            "extraction_text": "ibuprofen",
            "attributes": {
              "type": "pain_reliever",
              "category": "NSAID"
            }
          },
          {
            "extraction_class": "dosage",
            "extraction_text": "250mg",
            "attributes": {
              "amount": "250",
              "unit": "mg"
            }
          },
          {
            "extraction_class": "frequency",
            "extraction_text": "every 4 hours",
            "attributes": {
              "interval": "4 hours",
              "schedule_type": "as_needed"
            }
          }
        ]
      }
    ],
    "config": {
      "model_id": "gemini-2.5-flash",
      "temperature": 0.2
    }
  }
}
```

### Example 2: Document Analysis from URL

```json
{
  "tool": "extract_from_url",
  "parameters": {
    "url": "https://example.com/research-paper.html",
    "prompt_description": "Extract research findings, methodologies, and key statistics from academic papers. Focus on quantitative results and experimental methods.",
    "examples": [
      {
        "text": "Our study of 500 participants showed a 23% improvement in accuracy using the new method compared to baseline.",
        "extractions": [
          {
            "extraction_class": "finding",
            "extraction_text": "23% improvement in accuracy",
            "attributes": {
              "metric": "accuracy",
              "change": "improvement",
              "magnitude": "23%"
            }
          },
          {
            "extraction_class": "methodology",
            "extraction_text": "study of 500 participants",
            "attributes": {
              "sample_size": "500",
              "study_type": "comparative"
            }
          }
        ]
      }
    ],
    "config": {
      "model_id": "gemini-2.5-pro",
      "extraction_passes": 2,
      "max_char_buffer": 1500
    }
  }
}
```

### Example 3: Literary Character Analysis

```json
{
  "tool": "extract_from_text",
  "parameters": {
    "text": "ROMEO: But soft! What light through yonder window breaks? It is the east, and Juliet is the sun.",
    "prompt_description": "Extract characters, emotions, and literary devices from Shakespeare. Capture the emotional context and relationships between characters.",
    "examples": [
      {
        "text": "HAMLET: To be or not to be, that is the question.",
        "extractions": [
          {
            "extraction_class": "character",
            "extraction_text": "HAMLET",
            "attributes": {
              "play": "Hamlet",
              "emotional_state": "contemplative"
            }
          },
          {
            "extraction_class": "philosophical_statement",
            "extraction_text": "To be or not to be, that is the question",
            "attributes": {
              "theme": "existential",
              "type": "soliloquy"
            }
          }
        ]
      }
    ]
  }
}
```

### Example 4: Business Intelligence from Customer Feedback

```json
{
  "tool": "extract_from_text",
  "parameters": {
    "text": "The new software update is fantastic! Loading times are 50% faster and the interface is much more intuitive. However, the mobile app still crashes occasionally.",
    "prompt_description": "Extract customer sentiments, specific feedback points, and performance metrics from reviews. Identify both positive and negative aspects.",
    "examples": [
      {
        "text": "Love the new design but the checkout process takes too long - about 3 minutes.",
        "extractions": [
          {
            "extraction_class": "positive_feedback",
            "extraction_text": "Love the new design",
            "attributes": {
              "aspect": "design",
              "sentiment": "positive"
            }
          },
          {
            "extraction_class": "negative_feedback",
            "extraction_text": "checkout process takes too long",
            "attributes": {
              "aspect": "checkout",
              "sentiment": "negative"
            }
          },
          {
            "extraction_class": "metric",
            "extraction_text": "about 3 minutes",
            "attributes": {
              "measurement": "time",
              "value": "3",
              "unit": "minutes"
            }
          }
        ]
      }
    ]
  }
}
```

## Working with Results

### Saving and Visualizing Extractions

After running an extraction, you can save the results and create an interactive visualization:

```json
{
  "tool": "save_extraction_results",
  "parameters": {
    "extraction_results": {...}, // Results from previous extraction
    "output_name": "medical_extractions",
    "output_dir": "./results"
  }
}
```

```json
{
  "tool": "generate_visualization",
  "parameters": {
    "jsonl_file_path": "./results/medical_extractions.jsonl",
    "output_html_path": "./results/medical_visualization.html"
  }
}
```

### Expected Output Format

All extractions return this structured format:

```json
{
  "document_id": "doc_123",
  "total_extractions": 5,
  "extractions": [
    {
      "extraction_class": "medication",
      "extraction_text": "amoxicillin",
      "attributes": {
        "type": "antibiotic"
      },
      "start_char": 25,
      "end_char": 35
    }
  ],
  "metadata": {
    "model_id": "gemini-2.5-flash",
    "extraction_passes": 1,
    "temperature": 0.5
  }
}
```

## Best Practices

### Creating Effective Examples

1. **Quality over quantity**: 1-3 high-quality examples are better than many poor ones
2. **Representative patterns**: Cover the main patterns you expect to see
3. **Exact text matching**: Always use verbatim text from the input
4. **Rich attributes**: Use attributes to provide context and categorization
5. **Edge cases**: Include examples of challenging or ambiguous cases

### Optimizing Performance

- Use `gemini-2.5-flash` for most tasks (faster, cost-effective)
- Use `gemini-2.5-pro` for complex reasoning or analysis
- Increase `extraction_passes` for higher recall on long documents
- Decrease `max_char_buffer` for better accuracy on dense text
- Lower `temperature` (0.1-0.3) for consistent, factual extractions
- Higher `temperature` (0.7-0.9) for creative or interpretive tasks

### Error Handling

Common issues and solutions:

- **"At least one example is required"**: Always provide examples array
- **"Only Gemini models are supported"**: Use `gemini-2.5-flash` or `gemini-2.5-pro`
- **"API key required"**: Server administrator must set LANGEXTRACT_API_KEY
- **"Input text cannot be empty"**: Ensure text parameter has content
- **"URL must start with http://"**: Use full URLs for extract_from_url

## Advanced Features

### Multi-pass Extraction
For comprehensive extraction from long documents:

```json
{
  "config": {
    "extraction_passes": 3,
    "max_workers": 20,
    "max_char_buffer": 800
  }
}
```

### Precision vs. Recall Tuning
- **High precision**: Lower temperature (0.1-0.3), single pass
- **High recall**: Multiple passes (2-3), higher temperature (0.5-0.7)
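
As a rough sketch, the two profiles differ only in their `config` values (the numbers below are indicative, not prescriptive):

```python
# High precision: near-deterministic sampling, a single pass
high_precision = {"temperature": 0.2, "extraction_passes": 1}

# High recall: more sampling diversity, several passes over smaller chunks
high_recall = {"temperature": 0.6, "extraction_passes": 3, "max_char_buffer": 800}
```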

### Domain-Specific Configurations
- **Medical texts**: Use `gemini-2.5-pro`, low temperature, multiple passes
- **Legal documents**: Smaller chunks (500-800 chars), precise examples
- **Literary analysis**: Higher temperature, rich attribute examples
- **Technical documentation**: Structured examples, consistent terminology

This MCP server provides a powerful interface to Google's LangExtract library, enabling precise structured information extraction with source grounding and interactive visualization capabilities.
```

--------------------------------------------------------------------------------
/src/langextract_mcp/resources/__init__.py:
--------------------------------------------------------------------------------

```python

```

--------------------------------------------------------------------------------
/src/langextract_mcp/__init__.py:
--------------------------------------------------------------------------------

```python
"""LangExtract MCP Server - FastMCP server for Google's langextract library."""

from .server import mcp, main

__version__ = "0.1.0"
__all__ = ["mcp", "main"]
```

--------------------------------------------------------------------------------
/SETUP.md:
--------------------------------------------------------------------------------

```markdown
# LangExtract MCP Server Setup Guide

## Quick Setup (No Config Files Needed!)

This MCP server doesn't use separate configuration files. Everything is handled through environment variables and tool parameters.

### Step 1: Get Your API Key
1. Go to [Google AI Studio](https://aistudio.google.com/app/apikey)
2. Create a new API key
3. Copy the key (keep it secure!)

### Step 2: Install with Claude Code
```bash
# Single command installation - no config files needed!
claude mcp add langextract-mcp -e LANGEXTRACT_API_KEY=your-gemini-api-key -- uv run --with fastmcp fastmcp run src/langextract_mcp/server.py
```

That's it! The server will start automatically when Claude Code needs it.

## Configuration Details

### Environment Variables (Set Once)
```bash
# Required
LANGEXTRACT_API_KEY=your-gemini-api-key

# Optional
LANGEXTRACT_DEFAULT_MODEL=gemini-2.5-flash
LANGEXTRACT_MAX_WORKERS=10
```

### Per-Request Configuration (In Tool Calls)
When using tools, you can configure behavior per request:

```python
{
    "text": "Your text to extract from",
    "config": {
        "model_id": "gemini-2.5-flash",     # Which model to use
        "temperature": 0.5,                 # Randomness (0.0-1.0)
        "extraction_passes": 1,             # How many extraction attempts
        "max_workers": 10                   # Parallel processing
    }
}
```

## Verification
After installation, ask Claude Code:
```
Use the get_server_info tool to show the LangExtract server status
```

You should see:
- Server running: ✅
- API key configured: ✅
- Optimization features enabled: ✅

## Troubleshooting

**"Server not found"**
```bash
# Check if registered
claude mcp list

# Re-add if missing
claude mcp add langextract-mcp -e LANGEXTRACT_API_KEY=your-key -- uv run --with fastmcp fastmcp run src/langextract_mcp/server.py
```

**"API key not set"**
```bash
# Check environment
echo $LANGEXTRACT_API_KEY

# Set if missing (permanent)
echo 'export LANGEXTRACT_API_KEY=your-key' >> ~/.bashrc
source ~/.bashrc
```

**"Tools not working"**
- Verify API key is valid at [Google AI Studio](https://aistudio.google.com/app/apikey)
- Check network connectivity
- Try with different model (e.g., "gemini-2.5-pro")

```

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "langextract-mcp"
version = "0.1.0"
description = "FastMCP server for Google's langextract library - extract structured information from unstructured text using LLMs"
readme = "README.md"
license = { text = "Apache-2.0" }
authors = [
    { name = "Larsen Weigle" }
]
classifiers = [
    "Development Status :: 3 - Alpha",
    "Intended Audience :: Developers",
    "License :: OSI Approved :: Apache Software License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
    "Topic :: Software Development :: Libraries :: Python Modules",
    "Topic :: Scientific/Engineering :: Artificial Intelligence",
    "Topic :: Text Processing :: Linguistic",
]
keywords = ["mcp", "fastmcp", "langextract", "llm", "text-extraction", "nlp", "ai"]

requires-python = ">=3.10"
dependencies = [
    "fastmcp>=0.1.0",
    "langextract>=0.1.0",
    "pydantic>=2.0.0",
    "python-dotenv>=1.0.0",
    "httpx>=0.25.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=7.0.0",
    "pytest-asyncio>=0.21.0",
    "black>=23.0.0",
    "isort>=5.12.0",
    "mypy>=1.5.0",
    "pre-commit>=3.0.0",
]

[project.urls]
Homepage = "https://github.com/your-org/langextract-mcp"
Repository = "https://github.com/your-org/langextract-mcp"
Documentation = "https://github.com/your-org/langextract-mcp/blob/main/README.md"
Issues = "https://github.com/your-org/langextract-mcp/issues"

[project.scripts]
langextract-mcp = "langextract_mcp.server:main"

[tool.hatch.build.targets.wheel]
packages = ["src/langextract_mcp"]

[tool.hatch.build.targets.sdist]
include = [
    "/src",
    "/docs",
    "/examples",
    "/README.md",
    "/LICENSE",
]

[tool.black]
line-length = 88
target-version = ['py310']
include = '\.pyi?$'

[tool.isort]
profile = "black"
line_length = 88

[tool.mypy]
python_version = "3.10"
warn_return_any = true
warn_unused_configs = true
disallow_untyped_defs = true
disallow_incomplete_defs = true

[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
python_classes = ["Test*"]
python_functions = ["test_*"]
addopts = "-v --tb=short"
asyncio_mode = "auto"
```

--------------------------------------------------------------------------------
/src/langextract_mcp/resources/supported-models.md:
--------------------------------------------------------------------------------

```markdown
# Supported Language Models

This document provides comprehensive information about the language models supported by the langextract-mcp server.

## Currently Supported Models

The langextract-mcp server currently supports **Google Gemini models only**, which are optimized for reliable structured extraction with schema constraints.

### Gemini 2.5 Flash
- **Provider**: Google
- **Model ID**: `gemini-2.5-flash`
- **Description**: Fast, cost-effective model with excellent quality
- **Schema Constraints**: ✅ Supported
- **Recommended For**:
  - General extraction tasks
  - Fast processing requirements
  - Cost-sensitive applications
- **Notes**: Recommended default choice - optimal balance of speed, cost, and quality

### Gemini 2.5 Pro
- **Provider**: Google
- **Model ID**: `gemini-2.5-pro`
- **Description**: Advanced model for complex reasoning tasks
- **Schema Constraints**: ✅ Supported
- **Recommended For**:
  - Complex extractions
  - High accuracy requirements
  - Sophisticated reasoning tasks
- **Notes**: Best quality for complex tasks but higher cost

## Model Recommendations

| Use Case | Recommended Model | Reason |
|----------|------------------|---------|
| **Default/General** | `gemini-2.5-flash` | Best balance of speed, cost, and quality |
| **High Quality** | `gemini-2.5-pro` | Superior accuracy and reasoning capabilities |
| **Cost Optimized** | `gemini-2.5-flash` | Most cost-effective option |
| **Complex Reasoning** | `gemini-2.5-pro` | Advanced reasoning for complex extraction tasks |

## Configuration Parameters

When using any supported model, you can configure the following parameters:

- **`model_id`**: The model identifier (e.g., "gemini-2.5-flash")
- **`max_char_buffer`**: Maximum characters per chunk (default: 1000)
- **`temperature`**: Sampling temperature 0.0-1.0 (default: 0.5)
- **`extraction_passes`**: Number of extraction passes for better recall (default: 1)
- **`max_workers`**: Maximum parallel workers (default: 10)

## Limitations

- **Provider Support**: Currently supports Google Gemini models only
- **Future Support**: OpenAI and local model support may be added in future versions
- **API Dependencies**: Requires active internet connection and valid API keys

## Schema Constraints

All supported Gemini models include schema constraint capabilities, which means:

- **Structured Output**: Guaranteed JSON structure based on your examples
- **Type Safety**: Consistent field types across extractions
- **Validation**: Automatic validation of extracted data against schema
- **Reliability**: Reduced hallucination and improved consistency
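
As a simplified sketch of what the server does internally (mirroring `LangExtractClient._get_schema` in `server.py`), the schema is derived directly from the few-shot examples:

```python
import langextract as lx

examples = [
    lx.data.ExampleData(
        text="Take 250mg ibuprofen",
        extractions=[
            lx.data.Extraction(extraction_class="medication", extraction_text="ibuprofen")
        ],
    )
]

# Build a structured prompt template from the examples, then derive the schema
# that constrains Gemini's structured output.
prompt_template = lx.prompting.PromptTemplateStructured(description="Schema generation")
prompt_template.examples.extend(examples)
schema = lx.schema.GeminiSchema.from_examples(prompt_template.examples)
```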

This makes the langextract-mcp server particularly reliable for production applications requiring consistent structured data extraction.

```

--------------------------------------------------------------------------------
/docs/fastmcp.md:
--------------------------------------------------------------------------------

```markdown
# FastMCP Framework Study Notes - Deep Analysis

## Overview
FastMCP is a Python framework for building Model Context Protocol (MCP) servers and clients, designed to enable sophisticated interactions between AI systems and various services. It provides a "fast, Pythonic way" to build MCP servers with comprehensive functionality and enterprise-grade features.

## Core Architecture

### 1. Servers
- **Primary Function**: Expose tools as executable capabilities
- **Authentication**: Support multiple authentication mechanisms
- **Middleware**: Enable cross-cutting functionality for request/response processing
- **Resource Management**: Allow resource and prompt management
- **Monitoring**: Support progress reporting and logging

### 2. Clients
- **Purpose**: Provide programmatic interaction with MCP servers
- **Authentication**: Support multiple methods (Bearer Token, OAuth)
- **Processing**: Handle message processing, logging, and progress monitoring

## Key Features

### Tool Operations
- Define tools as executable functions
- Structured user input handling
- Comprehensive tool management

### Resource Management
- Create and manage resources
- Prompt templating capabilities
- Resource organization and access

### Authentication & Security
- Flexible authentication strategies
- Bearer Token support
- OAuth integration
- Authorization provider compatibility

### Middleware System
- Request/response processing
- Cross-cutting concerns handling
- Extensible middleware chain

### Monitoring & Logging
- Progress tracking
- Comprehensive logging
- User interaction context

## Integration Capabilities

### Supported Platforms
- OpenAI API
- Anthropic
- Google Gemini
- FastAPI
- Starlette/ASGI

### Authorization Providers
- Various authorization providers supported
- Flexible configuration options

## Server Development Guidelines

### 1. Tool Definition
- Define tools as executable functions
- Implement clear input/output schemas
- Handle errors gracefully

### 2. Authentication Setup
- Choose appropriate authentication strategy
- Configure security mechanisms
- Implement user context handling

### 3. Context Configuration
- Set up logging context
- Configure user interactions
- Implement progress tracking

### 4. Middleware Implementation
- Use middleware for common functionality
- Process requests and responses
- Handle cross-cutting concerns

### 5. Resource Creation
- Define resources and prompt templates
- Organize resource access patterns
- Implement resource management

## Unique Selling Points

1. **Pythonic Interface**: Natural Python API design
2. **Flexible Composition**: Modular server composition
3. **Structured Input**: Sophisticated user input handling
4. **Comprehensive SDK**: Extensive documentation and tooling
5. **Standardized Protocol**: Uses MCP for consistent interactions

## FastMCP Implementation Patterns

### 1. Server Instantiation
```python
from fastmcp import FastMCP

# Basic server
mcp = FastMCP("Demo 🚀")

# Server with configuration
mcp = FastMCP(
    name="LangExtractServer",
    instructions="Extract structured information from text using LLMs",
    include_tags={"public"},
    exclude_tags={"internal"}
)
```

### 2. Tool Definition Patterns
```python
@mcp.tool
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

# Complex parameters with validation
@mcp.tool
def process_data(
    query: str,
    max_results: int = 10,
    sort_by: str = "relevance",
    category: str | None = None
) -> dict:
    """Process data with parameters"""
    return {"results": []}
```

### 3. Error Handling
```python
from fastmcp.exceptions import ToolError

@mcp.tool
def divide(a: float, b: float) -> float:
    if b == 0:
        raise ToolError("Cannot divide by zero")
    return a / b
```

### 4. Authentication Patterns
```python
from fastmcp.server.auth.providers.jwt import JWTVerifier

auth = JWTVerifier(
    jwks_uri="https://your-auth-system.com/.well-known/jwks.json",
    issuer="https://your-auth-system.com",
    audience="your-mcp-server"
)

mcp = FastMCP(name="Protected Server", auth=auth)
```

### 5. Server Execution
```python
# STDIO transport (default for MCP clients)
mcp.run()

# HTTP transport
mcp.run(transport="http", host="0.0.0.0", port=9000)
```

### 6. Server Composition
```python
main = FastMCP(name="MainServer")
sub = FastMCP(name="SubServer")
main.mount(sub, prefix="sub")
```
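
### 7. Resource Definition
A sketch of exposing read-only data as resources (the URI and path are illustrative; `FileResource` is the class the langextract server imports):

```python
from pathlib import Path

from fastmcp import FastMCP
from fastmcp.resources import FileResource

mcp = FastMCP(name="LangExtractServer")

# Dynamic resource backed by a function
@mcp.resource("resource://supported-models")
def supported_models() -> str:
    """List the supported model IDs."""
    return "gemini-2.5-flash\ngemini-2.5-pro"

# Static resource backed by a file on disk (illustrative path)
guide = Path("src/langextract_mcp/resources/README.md").resolve()
mcp.add_resource(FileResource(uri=f"file://{guide}", path=guide))
```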

## Key Insights for LangExtract MCP Server

1. **Simple Decorator Pattern**: Use `@mcp.tool` for all langextract functions
2. **Type Safety**: Leverage Python type hints for automatic validation
3. **Proper Error Handling**: Use `ToolError` for controlled error messaging
4. **Clean Architecture**: Keep tools simple and focused
5. **Context Management**: Use FastMCP's built-in context for logging/progress
6. **Transport Flexibility**: Support both STDIO and HTTP transports
7. **Authentication Ready**: Design with auth in mind for production use

## Implementation Strategy for langextract

Based on deeper FastMCP understanding:

1. **Clean Tool Interface**: Each langextract function as a simple `@mcp.tool`
2. **Type-Safe Parameters**: Use Pydantic models for complex inputs
3. **Structured Outputs**: Return proper dictionaries/models
4. **Error Management**: Comprehensive error handling with `ToolError`
5. **Context Integration**: Use FastMCP context for progress/logging
6. **Resource Management**: Expose example templates as MCP resources
7. **Production Ready**: Authentication and deployment configuration
```

--------------------------------------------------------------------------------
/docs/langextract.md:
--------------------------------------------------------------------------------

```markdown
# LangExtract Library Study Notes

## Overview
LangExtract is a Python library developed by Google that uses Large Language Models (LLMs) to extract structured information from unstructured text documents based on user-defined instructions. It's designed to process materials like clinical notes, reports, and other documents while maintaining precise source grounding.

## Key Features & Differentiators

### 1. Precise Source Grounding
- **Capability**: Maps every extraction to its exact location in the source text
- **Benefit**: Enables visual highlighting for easy traceability and verification
- **Implementation**: Through annotation system that tracks character positions

### 2. Reliable Structured Outputs
- **Schema Enforcement**: Consistent output schema based on few-shot examples
- **Controlled Generation**: Leverages structured output capabilities in supported models (Gemini)
- **Format Support**: JSON and YAML output formats

### 3. Long Document Optimization
- **Challenge Addressed**: "Needle-in-a-haystack" problem in large documents
- **Strategy**: Text chunking + parallel processing + multiple extraction passes
- **Benefit**: Higher recall on complex documents

### 4. Interactive Visualization
- **Output**: Self-contained HTML files for reviewing extractions
- **Scalability**: Handles thousands of extracted entities
- **Context**: Shows entities in their original document context

### 5. Flexible LLM Support
- **Cloud Models**: Google Gemini family, OpenAI models
- **Local Models**: Built-in Ollama interface
- **Extensibility**: Can be extended to other APIs

### 6. Domain Adaptability
- **No Fine-tuning**: Uses few-shot examples instead of model training
- **Flexibility**: Works across any domain with proper examples
- **Customization**: Leverages LLM world knowledge through prompt engineering

## Core Architecture

### Main Components

#### 1. Data Models (`data.py`)
- **ExampleData**: Defines extraction examples with text and expected extractions
- **Extraction**: Individual extracted entity with class, text, and attributes
- **Document**: Input document container
- **AnnotatedDocument**: Result container with extractions and metadata

#### 2. Inference Engine (`inference.py`)
- **GeminiLanguageModel**: Google Gemini API integration
- **OpenAILanguageModel**: OpenAI API integration
- **BaseLanguageModel**: Abstract base for language model implementations
- **Schema Support**: Structured output generation for supported models

#### 3. Annotation System (`annotation.py`)
- **Annotator**: Core extraction orchestrator
- **Text Processing**: Handles chunking and parallel processing
- **Progress Tracking**: Monitors extraction progress

#### 4. Resolver System (`resolver.py`)
- **Purpose**: Parses raw LLM output into structured Extraction objects
- **Fence Handling**: Extracts content from markdown code blocks
- **Format Parsing**: Handles JSON/YAML parsing and validation

#### 5. Chunking Engine (`chunking.py`)
- **Text Segmentation**: Breaks long documents into processable chunks
- **Buffer Management**: Handles max_char_buffer limits
- **Overlap Strategy**: Maintains context across chunk boundaries

#### 6. Visualization (`visualization.py`)
- **HTML Generation**: Creates interactive visualization files
- **Entity Highlighting**: Shows extractions in original context
- **Scalable Interface**: Handles large result sets efficiently

#### 7. I/O Operations (`io.py`)
- **URL Download**: Fetches text from web URLs
- **File Operations**: Saves results to JSONL format
- **Document Loading**: Handles various input formats

### Key API Functions

#### Primary Interface
```python
lx.extract(
    text_or_documents,      # Input text, URL, or Document objects
    prompt_description,     # Extraction instructions
    examples,              # Few-shot examples
    model_id="gemini-2.5-flash",
    # Configuration options...
)
```
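
A minimal end-to-end sketch (illustrative extraction classes; assumes `LANGEXTRACT_API_KEY` is set in the environment):

```python
import langextract as lx

examples = [
    lx.data.ExampleData(
        text="Take 250mg ibuprofen every 4 hours",
        extractions=[
            lx.data.Extraction(
                extraction_class="medication",
                extraction_text="ibuprofen",
                attributes={"dosage": "250mg", "frequency": "every 4 hours"},
            )
        ],
    )
]

result = lx.extract(
    text_or_documents="Patient prescribed 500mg amoxicillin twice daily.",
    prompt_description="Extract medications with dosage and frequency. Use exact text.",
    examples=examples,
    model_id="gemini-2.5-flash",
)

for extraction in result.extractions:
    print(extraction.extraction_class, extraction.extraction_text, extraction.attributes)
```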

#### Visualization
```python
lx.visualize(jsonl_file_path)  # Generate HTML visualization
```

#### I/O Operations
```python
lx.io.save_annotated_documents(results, output_name, output_dir)
```

## Configuration Parameters

### Core Parameters
- **model_id**: LLM model selection
- **api_key**: Authentication for cloud models
- **temperature**: Sampling temperature (0.5 recommended)
- **max_char_buffer**: Chunk size limit (1000 default)

### Performance Parameters
- **max_workers**: Parallel processing workers (10 default)
- **batch_length**: Chunks per batch (10 default)
- **extraction_passes**: Multiple extraction attempts (1 default)

### Output Control
- **format_type**: JSON or YAML output
- **fence_output**: Code fence expectations
- **use_schema_constraints**: Structured output enforcement
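
Putting these together in one call (a sketch reusing `examples` from the earlier snippet; the values mirror the listed defaults):

```python
result = lx.extract(
    text_or_documents="Patient prescribed 500mg amoxicillin twice daily.",
    prompt_description="Extract medications with dosage and frequency.",
    examples=examples,
    model_id="gemini-2.5-flash",
    temperature=0.5,
    max_char_buffer=1000,
    max_workers=10,
    batch_length=10,
    extraction_passes=1,
)
```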

## Supported Models

### Google Gemini
- **gemini-2.5-flash**: Recommended default (speed/cost/quality balance)
- **gemini-2.5-pro**: For complex reasoning tasks
- **Schema Support**: Full structured output support
- **Rate Limits**: Tier 2 quota recommended for production

### OpenAI
- **gpt-4o**: Supported with limitations
- **Requirements**: `fence_output=True`, `use_schema_constraints=False`
- **Note**: Schema constraints not yet implemented for OpenAI
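
Per the requirements above, an OpenAI call is configured roughly as follows (a sketch reusing `examples` from the earlier snippet; explicit key handling is an assumption):

```python
import os

result = lx.extract(
    text_or_documents="Patient prescribed 500mg amoxicillin twice daily.",
    prompt_description="Extract medications with dosage and frequency.",
    examples=examples,
    language_model_type=lx.inference.OpenAILanguageModel,
    model_id="gpt-4o",
    api_key=os.environ.get("OPENAI_API_KEY"),
    fence_output=True,               # OpenAI output arrives wrapped in code fences
    use_schema_constraints=False,    # schema constraints not yet implemented for OpenAI
)
```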

### Local Models
- **Ollama**: Built-in support
- **Extension**: Can be extended to other local APIs

## Use Cases & Examples

### 1. Literary Analysis
- **Characters**: Extract character names and emotional states
- **Relationships**: Identify character interactions and metaphors
- **Context**: Track narrative elements across long texts

### 2. Medical Document Processing
- **Medications**: Extract drug names, dosages, routes, frequencies
- **Clinical Notes**: Structure unstructured medical reports
- **Compliance**: Maintain source grounding for medical accuracy

### 3. Radiology Reports
- **Structured Data**: Convert free-text reports to structured findings
- **Demo Available**: RadExtract on HuggingFace Spaces

### 4. Long Document Processing
- **Full Novels**: Process complete books (e.g., Romeo & Juliet - 147k chars)
- **Performance**: Parallel processing with multiple passes
- **Visualization**: Handle hundreds of entities in context

## Technical Implementation Details

### Text Processing Pipeline
1. **Input Validation**: Validate text/documents and examples
2. **URL Handling**: Download content if URL provided
3. **Chunking**: Break long texts into manageable pieces
4. **Parallel Processing**: Distribute chunks across workers
5. **Multiple Passes**: Optional additional extraction rounds
6. **Resolution**: Parse LLM outputs into structured data
7. **Annotation**: Create AnnotatedDocument with source grounding
8. **Visualization**: Generate interactive HTML output

### Error Handling
- **API Failures**: Graceful handling of LLM API errors
- **Parsing Errors**: Robust JSON/YAML parsing with fallbacks
- **Validation**: Schema validation for structured outputs

### Performance Optimization
- **Concurrent Processing**: Parallel chunk processing
- **Efficient Chunking**: Smart text segmentation
- **Progressive Enhancement**: Multiple passes for better recall
- **Memory Management**: Efficient handling of large documents

## MCP Server Design Implications

Based on langextract's architecture, a FastMCP server should expose:

### Core Tools
1. **extract_text**: Main extraction function
2. **extract_from_url**: URL-based extraction
3. **visualize_results**: Generate HTML visualization
4. **validate_examples**: Validate extraction examples

### Configuration Management
1. **set_model**: Configure LLM model
2. **set_api_key**: Set authentication
3. **configure_extraction**: Set extraction parameters

### File Operations
1. **save_results**: Save to JSONL format
2. **load_results**: Load previous results
3. **export_visualization**: Generate and save HTML

### Advanced Features
1. **batch_extract**: Process multiple documents
2. **progressive_extract**: Multi-pass extraction
3. **compare_results**: Compare extraction results

### Resource Management
- **Model Configurations**: Manage different model setups
- **Example Templates**: Store reusable extraction examples
- **Result Archives**: Access previous extraction results

## Dependencies & Installation
- **Core**: Python 3.10+, requests, dotenv
- **LLM APIs**: google-generativeai, openai
- **Processing**: concurrent.futures for parallelization
- **Visualization**: HTML/CSS/JS generation
- **Format Support**: JSON, YAML parsing

## Licensing & Usage
- **License**: Apache 2.0
- **Disclaimer**: Not officially supported Google product
- **Health Applications**: Subject to Health AI Developer Foundations Terms
- **Citation**: Recommended for production/publication use
```

--------------------------------------------------------------------------------
/src/langextract_mcp/server.py:
--------------------------------------------------------------------------------

```python
"""FastMCP server for langextract - optimized for Claude Code integration."""

import os
from typing import Any
from pathlib import Path
import hashlib
import json

import langextract as lx
from fastmcp import FastMCP
from fastmcp.resources import FileResource
from fastmcp.exceptions import ToolError
from pydantic import BaseModel, Field


# Simple dictionary types for easier LLM usage
# ExtractionItem: {"extraction_class": str, "extraction_text": str, "attributes": dict}
# ExtractionExample: {"text": str, "extractions": list[ExtractionItem]}


class ExtractionConfig(BaseModel):
    """Configuration for extraction parameters."""
    model_id: str = Field(default="gemini-2.5-flash", description="LLM model to use")
    max_char_buffer: int = Field(default=1000, description="Max characters per chunk")
    temperature: float = Field(default=0.5, description="Sampling temperature (0.0-1.0)")
    extraction_passes: int = Field(default=1, description="Number of extraction passes for better recall")
    max_workers: int = Field(default=10, description="Max parallel workers")


# Initialize FastMCP server with Claude Code compatibility
mcp = FastMCP(
    name="langextract-mcp",
    instructions="Extract structured information from unstructured text using Google Gemini models. "
                "Provides precise source grounding, interactive visualizations, and optimized caching for performance."
)


class LangExtractClient:
    """Optimized langextract client for MCP server usage.
    
    This client maintains persistent connections and caches expensive operations
    like schema generation and prompt templates for better performance in a
    long-running MCP server context.
    """
    
    def __init__(self):
        self._language_models: dict[str, Any] = {}
        self._schema_cache: dict[str, Any] = {}
        self._prompt_template_cache: dict[str, Any] = {}
        self._resolver_cache: dict[str, Any] = {}
        
    def _get_examples_hash(self, examples: list[dict[str, Any]]) -> str:
        """Generate a hash for caching based on examples."""
        examples_str = json.dumps(examples, sort_keys=True)
        return hashlib.md5(examples_str.encode()).hexdigest()
    
    def _get_language_model(self, config: ExtractionConfig, api_key: str, schema: Any | None = None, schema_hash: str | None = None) -> Any:
        """Get or create a cached language model instance."""
        # Include schema hash in cache key to prevent schema mutation conflicts
        model_key = f"{config.model_id}_{config.temperature}_{config.max_workers}_{schema_hash or 'no_schema'}"
        
        if model_key not in self._language_models:
            # Validate that only Gemini models are supported
            if not config.model_id.startswith('gemini'):
                raise ValueError(f"Only Gemini models are supported. Got: {config.model_id}")
                
            language_model = lx.inference.GeminiLanguageModel(
                model_id=config.model_id,
                api_key=api_key,
                temperature=config.temperature,
                max_workers=config.max_workers,
                gemini_schema=schema
            )
            self._language_models[model_key] = language_model
            
        return self._language_models[model_key]
    
    def _get_schema(self, examples: list[dict[str, Any]], model_id: str) -> tuple[Any, str]:
        """Get or create a cached schema for the examples.
        
        Returns:
            Tuple of (schema, examples_hash) for use in caching language models
        """
        if not model_id.startswith('gemini'):
            return None, ""
            
        examples_hash = self._get_examples_hash(examples)
        schema_key = f"{model_id}_{examples_hash}"
        
        if schema_key not in self._schema_cache:
            # Convert examples to langextract format
            langextract_examples = self._create_langextract_examples(examples)
            
            # Create prompt template to generate schema
            prompt_template = lx.prompting.PromptTemplateStructured(description="Schema generation")
            prompt_template.examples.extend(langextract_examples)
            
            # Generate schema
            schema = lx.schema.GeminiSchema.from_examples(prompt_template.examples)
            self._schema_cache[schema_key] = schema
            
        return self._schema_cache[schema_key], examples_hash
    
    def _get_resolver(self, format_type: str = "JSON") -> Any:
        """Get or create a cached resolver."""
        if format_type not in self._resolver_cache:
            resolver = lx.resolver.Resolver(
                fence_output=False,
                format_type=lx.data.FormatType.JSON if format_type == "JSON" else lx.data.FormatType.YAML,
                extraction_attributes_suffix="_attributes",
                extraction_index_suffix=None,
            )
            self._resolver_cache[format_type] = resolver
            
        return self._resolver_cache[format_type]
    
    def _create_langextract_examples(self, examples: list[dict[str, Any]]) -> list[lx.data.ExampleData]:
        """Convert dictionary examples to langextract ExampleData objects."""
        langextract_examples = []
        
        for example in examples:
            extractions = []
            for extraction_data in example["extractions"]:
                extractions.append(
                    lx.data.Extraction(
                        extraction_class=extraction_data["extraction_class"],
                        extraction_text=extraction_data["extraction_text"],
                        attributes=extraction_data.get("attributes", {})
                    )
                )
            
            langextract_examples.append(
                lx.data.ExampleData(
                    text=example["text"],
                    extractions=extractions
                )
            )
        
        return langextract_examples
    
    def extract(
        self, 
        text_or_url: str,
        prompt_description: str,
        examples: list[dict[str, Any]],
        config: ExtractionConfig,
        api_key: str
    ) -> lx.data.AnnotatedDocument:
        """Optimized extraction using cached components."""
        # Get or generate schema first
        schema, examples_hash = self._get_schema(examples, config.model_id)
        
        # Get cached components with schema-aware caching
        language_model = self._get_language_model(config, api_key, schema, examples_hash)
        resolver = self._get_resolver("JSON")
        
        # Convert examples
        langextract_examples = self._create_langextract_examples(examples)
        
        # Create prompt template
        prompt_template = lx.prompting.PromptTemplateStructured(
            description=prompt_description
        )
        prompt_template.examples.extend(langextract_examples)
        
        # Create annotator
        annotator = lx.annotation.Annotator(
            language_model=language_model,
            prompt_template=prompt_template,
            format_type=lx.data.FormatType.JSON,
            fence_output=False,
        )
        
        # Perform extraction
        if text_or_url.startswith(('http://', 'https://')):
            # Download text first
            text = lx.io.download_text_from_url(text_or_url)
        else:
            text = text_or_url
            
        return annotator.annotate_text(
            text=text,
            resolver=resolver,
            max_char_buffer=config.max_char_buffer,
            batch_length=10,
            additional_context=None,
            debug=False,  # Disable debug for cleaner MCP output
            extraction_passes=config.extraction_passes,
        )


# Global client instance for the server lifecycle
_langextract_client = LangExtractClient()


def _get_api_key() -> str | None:
    """Get API key from environment (server-side only for security)."""
    return os.environ.get("LANGEXTRACT_API_KEY")


def _format_extraction_result(result: lx.data.AnnotatedDocument, config: ExtractionConfig, source_url: str | None = None) -> dict[str, Any]:
    """Format langextract result for MCP response."""
    extractions = []
    
    for extraction in result.extractions or []:
        extractions.append({
            "extraction_class": extraction.extraction_class,
            "extraction_text": extraction.extraction_text,
            "attributes": extraction.attributes,
            "start_char": getattr(extraction, 'start_char', None),
            "end_char": getattr(extraction, 'end_char', None),
        })
    
    response = {
        "document_id": result.document_id if result.document_id else "anonymous",
        "total_extractions": len(extractions),
        "extractions": extractions,
        "metadata": {
            "model_id": config.model_id,
            "extraction_passes": config.extraction_passes,
            "max_char_buffer": config.max_char_buffer,
            "temperature": config.temperature,
        }
    }
    
    if source_url:
        response["source_url"] = source_url
        
    return response

# ============================================================================
# Tools
# ============================================================================

@mcp.tool
def extract_from_text(
    text: str,
    prompt_description: str,
    examples: list[dict[str, Any]],
    model_id: str = "gemini-2.5-flash",
    max_char_buffer: int = 1000,
    temperature: float = 0.5,
    extraction_passes: int = 1,
    max_workers: int = 10
) -> dict[str, Any]:
    """
    Extract structured information from text using langextract.
    
    Uses Large Language Models to extract structured information from unstructured text
    based on user-defined instructions and examples. Each extraction is mapped to its
    exact location in the source text for precise source grounding.
    
    Args:
        text: The text to extract information from
        prompt_description: Clear instructions for what to extract
        examples: List of example extractions to guide the model
        model_id: LLM model to use (default: "gemini-2.5-flash")
        max_char_buffer: Max characters per chunk (default: 1000)
        temperature: Sampling temperature 0.0-1.0 (default: 0.5)
        extraction_passes: Number of extraction passes for better recall (default: 1)
        max_workers: Max parallel workers (default: 10)
        
    Returns:
        Dictionary containing extracted entities with source locations and metadata
        
    Raises:
        ToolError: If extraction fails due to invalid parameters or API issues
    """
    try:
        if not examples:
            raise ToolError("At least one example is required for reliable extraction")
        
        if not prompt_description.strip():
            raise ToolError("Prompt description cannot be empty")
            
        if not text.strip():
            raise ToolError("Input text cannot be empty")
        
        # Validate that only Gemini models are supported
        if not model_id.startswith('gemini'):
            raise ToolError(
                f"Only Google Gemini models are supported. Got: {model_id}. "
                f"Use 'list_supported_models' tool to see available options."
            )
        
        # Create config object from individual parameters
        config = ExtractionConfig(
            model_id=model_id,
            max_char_buffer=max_char_buffer,
            temperature=temperature,
            extraction_passes=extraction_passes,
            max_workers=max_workers
        )
        
        # Get API key (server-side only for security)
        api_key = _get_api_key()
        if not api_key:
            raise ToolError(
                "API key required. Server administrator must set LANGEXTRACT_API_KEY environment variable."
            )
        
        # Perform optimized extraction using cached client
        result = _langextract_client.extract(
            text_or_url=text,
            prompt_description=prompt_description,
            examples=examples,
            config=config,
            api_key=api_key
        )
        
        return _format_extraction_result(result, config)
        
    except ValueError as e:
        raise ToolError(f"Invalid parameters: {str(e)}")
    except Exception as e:
        raise ToolError(f"Extraction failed: {str(e)}")


@mcp.tool
def extract_from_url(
    url: str,
    prompt_description: str,
    examples: list[dict[str, Any]],
    model_id: str = "gemini-2.5-flash",
    max_char_buffer: int = 1000,
    temperature: float = 0.5,
    extraction_passes: int = 1,
    max_workers: int = 10
) -> dict[str, Any]:
    """
    Extract structured information from text content at a URL.
    
    Downloads text from the specified URL and extracts structured information
    using Large Language Models. Ideal for processing web articles, documents,
    or any text content accessible via HTTP/HTTPS.
    
    Args:
        url: URL to download text from (must start with http:// or https://)
        prompt_description: Clear instructions for what to extract
        examples: List of example extractions to guide the model
        model_id: LLM model to use (default: "gemini-2.5-flash")
        max_char_buffer: Max characters per chunk (default: 1000)
        temperature: Sampling temperature 0.0-1.0 (default: 0.5)
        extraction_passes: Number of extraction passes for better recall (default: 1)
        max_workers: Max parallel workers (default: 10)
        
    Returns:
        Dictionary containing extracted entities with source locations and metadata
        
    Raises:
        ToolError: If URL is invalid, download fails, or extraction fails
    """
    try:
        if not url.startswith(('http://', 'https://')):
            raise ToolError("URL must start with http:// or https://")
            
        if not examples:
            raise ToolError("At least one example is required for reliable extraction")
        
        if not prompt_description.strip():
            raise ToolError("Prompt description cannot be empty")
        
        # Validate that only Gemini models are supported
        if not model_id.startswith('gemini'):
            raise ToolError(
                f"Only Google Gemini models are supported. Got: {model_id}. "
                f"Use 'list_supported_models' tool to see available options."
            )
        
        # Create config object from individual parameters
        config = ExtractionConfig(
            model_id=model_id,
            max_char_buffer=max_char_buffer,
            temperature=temperature,
            extraction_passes=extraction_passes,
            max_workers=max_workers
        )
        
        # Get API key (server-side only for security)
        api_key = _get_api_key()
        if not api_key:
            raise ToolError(
                "API key required. Server administrator must set LANGEXTRACT_API_KEY environment variable."
            )
        
        # Perform optimized extraction using cached client
        result = _langextract_client.extract(
            text_or_url=url,
            prompt_description=prompt_description,
            examples=examples,
            config=config,
            api_key=api_key
        )
        
        return _format_extraction_result(result, config, source_url=url)
        
    except ToolError:
        raise
    except ValueError as e:
        raise ToolError(f"Invalid parameters: {str(e)}")
    except Exception as e:
        raise ToolError(f"URL extraction failed: {str(e)}")

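# Call sketch (illustrative; the URL is hypothetical and `examples` uses the
# same shape as extract_from_text):
#
#   extract_from_url(
#       url="https://example.com/article.html",
#       prompt_description="Extract people and organizations",
#       examples=[...],
#   )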

@mcp.tool
def save_extraction_results(
    extraction_results: dict[str, Any],
    output_name: str,
    output_dir: str = "."
) -> dict[str, str]:
    """
    Save extraction results to a JSONL file for later use or visualization.
    
    Saves the extraction results in JSONL (JSON Lines) format: the results
    dictionary is written as a single JSON line. JSONL files are easy to parse
    and can be loaded for visualization or further processing.
    
    Args:
        extraction_results: Results from extract_from_text or extract_from_url
        output_name: Name for the output file (without .jsonl extension)
        output_dir: Directory to save the file (default: current directory)
        
    Returns:
        Dictionary with file path and save confirmation
        
    Raises:
        ToolError: If save operation fails
    """
    try:
        # Create output directory if it doesn't exist
        output_path = Path(output_dir)
        output_path.mkdir(parents=True, exist_ok=True)
        
        # Create full file path
        file_path = output_path / f"{output_name}.jsonl"
        
        # Save results to JSONL format
        import json
        with open(file_path, 'w', encoding='utf-8') as f:
            json.dump(extraction_results, f, ensure_ascii=False)
            f.write('\n')
        
        return {
            "message": "Results saved successfully",
            "file_path": str(file_path.absolute()),
            "total_extractions": extraction_results.get("total_extractions", 0)
        }
        
    except Exception as e:
        raise ToolError(f"Failed to save results: {str(e)}")

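# Note: each call writes the results dict as one JSON object on a single line,
# so the output is a one-line JSONL file. Reading it back is simple
# (illustrative; the file name is hypothetical):
#
#   import json
#   with open("demo.jsonl", encoding="utf-8") as f:
#       results = [json.loads(line) for line in f if line.strip()]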

@mcp.tool
def generate_visualization(
    jsonl_file_path: str,
    output_html_path: str | None = None
) -> dict[str, str]:
    """
    Generate interactive HTML visualization from extraction results.
    
    Creates an interactive HTML file that shows extracted entities highlighted
    in their original text context. The visualization is self-contained and
    can handle thousands of entities with color coding and hover details.
    
    Args:
        jsonl_file_path: Path to the JSONL file containing extraction results
        output_html_path: Optional path for the HTML output (default: auto-generated)
        
    Returns:
        Dictionary with HTML file path and generation details
        
    Raises:
        ToolError: If visualization generation fails
    """
    try:
        # Validate input file exists
        input_path = Path(jsonl_file_path)
        if not input_path.exists():
            raise ToolError(f"Input file not found: {jsonl_file_path}")
        
        # Generate visualization using langextract. lx.visualize may return an
        # IPython HTML object in notebook environments, so unwrap .data if present
        html_content = lx.visualize(str(input_path))
        if hasattr(html_content, 'data'):
            html_content = html_content.data
        
        # Determine output path
        if output_html_path:
            output_path = Path(output_html_path)
        else:
            output_path = input_path.parent / f"{input_path.stem}_visualization.html"
        
        # Ensure output directory exists
        output_path.parent.mkdir(parents=True, exist_ok=True)
        
        # Write HTML file
        with open(output_path, 'w', encoding='utf-8') as f:
            f.write(html_content)
        
        return {
            "message": "Visualization generated successfully",
            "html_file_path": str(output_path.absolute()),
            "file_size_bytes": len(html_content.encode('utf-8'))
        }
        
    except ToolError:
        raise
    except Exception as e:
        raise ToolError(f"Failed to generate visualization: {str(e)}")

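# End-to-end sketch (illustrative sequence a client might drive; the names are
# the tools defined above, the output name is hypothetical):
#
#   results = extract_from_text(text=..., prompt_description=..., examples=[...])
#   saved = save_extraction_results(results, output_name="demo")
#   generate_visualization(saved["file_path"])
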
# ============================================================================
# Resources
# ============================================================================

# Get the directory containing this server.py file
server_dir = Path(__file__).parent

readme_path = (server_dir / "resources" / "README.md").resolve()
if readme_path.exists():
    print(f"Adding README resource: {readme_path}")
    # Use a file:// URI scheme
    readme_resource = FileResource(
        uri=f"file://{readme_path.as_posix()}",
        path=readme_path, # Path to the actual file
        name="README File",
        description="The README for the langextract-mcp server.",
        mime_type="text/markdown",
        tags={"documentation"}
    )
    mcp.add_resource(readme_resource)


supported_models_path = (server_dir / "resources" / "supported-models.md").resolve()
if supported_models_path.exists():
    print(f"Adding Supported Models resource: {supported_models_path}")
    supported_models_resource = FileResource(
        uri=f"file://{supported_models_path.as_posix()}",
        path=supported_models_path,
        name="Supported Models",
        description="The supported models for the langextract-mcp server.",
        mime_type="text/markdown",
        tags={"documentation"}
    )
    mcp.add_resource(supported_models_resource)
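
# Usage sketch (illustrative; assumes fastmcp's in-memory Client API for
# discovering the file resources registered above):
#
#   from fastmcp import Client
#
#   async with Client(mcp) as client:
#       resources = await client.list_resources()
#       content = await client.read_resource(resources[0].uri)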


def main():
    """Main function to run the FastMCP server."""
    mcp.run()


if __name__ == "__main__":
    main()

```