elevenlabs/elevenlabs-mcp # codebase.md

# Directory Structure

```
├── .env.example
├── .github
│   └── workflows
│       ├── publish.yml
│       └── test.yml
├── .gitignore
├── .pre-commit-config.yaml
├── Dockerfile
├── elevenlabs_mcp
│   ├── __init__.py
│   ├── __main__.py
│   ├── convai.py
│   ├── model.py
│   ├── server.py
│   └── utils.py
├── LICENSE
├── pyproject.toml
├── README.md
├── scripts
│   ├── build.sh
│   ├── deploy.sh
│   ├── dev.sh
│   ├── setup.sh
│   └── test.sh
├── server.json
├── setup.py
├── tests
│   ├── conftest.py
│   └── test_utils.py
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.pre-commit-config.yaml:
--------------------------------------------------------------------------------

```yaml
repos:
-   repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.3.0
    hooks:
    -   id: ruff
        args: [--fix, --exit-non-zero-on-fix]
    -   id: ruff-format
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
*.pyc
*.pyo
*.pyd
*.pyw
*.pyz
*.pywz

.env
.venv
.cursor
.cursorignore
dist/
elevenlabs_mcp.egg-info/
.coverage
coverage.xml
.mcpregistry_github_token
.mcpregistry_registry_token
```

--------------------------------------------------------------------------------
/.env.example:
--------------------------------------------------------------------------------

```
ELEVENLABS_API_KEY=PUT_YOUR_KEY_HERE
ELEVENLABS_MCP_BASE_PATH=~/Desktop # optional base path for output files
ELEVENLABS_API_RESIDENCY="us" # optional data residency location
ELEVENLABS_MCP_OUTPUT_MODE=files # output mode: files, resources, or both
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
  1 | ![export](https://github.com/user-attachments/assets/ee379feb-348d-48e7-899c-134f7f7cd74f)
  2 | 
  3 | <div class="title-block" style="text-align: center;" align="center">
  4 | 
  5 |   [![Discord Community](https://img.shields.io/badge/[email protected]?style=for-the-badge&logo=discord&labelColor=000)](https://discord.gg/elevenlabs)
  6 |   [![Twitter](https://img.shields.io/badge/[email protected]?style=for-the-badge&logo=twitter&labelColor=000)](https://x.com/ElevenLabsDevs)
  7 |   [![PyPI](https://img.shields.io/badge/PyPI-elevenlabs--mcp-000000.svg?style=for-the-badge&logo=pypi&labelColor=000)](https://pypi.org/project/elevenlabs-mcp)
  8 |   [![Tests](https://img.shields.io/badge/tests-passing-000000.svg?style=for-the-badge&logo=github&labelColor=000)](https://github.com/elevenlabs/elevenlabs-mcp-server/actions/workflows/test.yml)
  9 | 
 10 | </div>
 11 | 
 12 | 
 13 | <p align="center">
 14 |   Official ElevenLabs <a href="https://github.com/modelcontextprotocol">Model Context Protocol (MCP)</a> server that enables interaction with powerful Text to Speech and audio processing APIs. This server allows MCP clients like <a href="https://www.anthropic.com/claude">Claude Desktop</a>, <a href="https://www.cursor.so">Cursor</a>, <a href="https://codeium.com/windsurf">Windsurf</a>, <a href="https://github.com/openai/openai-agents-python">OpenAI Agents</a> and others to generate speech, clone voices, transcribe audio, and more.
 15 | </p>
 16 | 
 17 | <!--
 18 | mcp-name: io.github.elevenlabs/elevenlabs-mcp
 19 | -->
 20 | 
 21 | ## Quickstart with Claude Desktop
 22 | 
 23 | 1. Get your API key from [ElevenLabs](https://elevenlabs.io/app/settings/api-keys). There is a free tier with 10k credits per month.
 24 | 2. Install `uv` (a Python package manager) with `curl -LsSf https://astral.sh/uv/install.sh | sh`, or see the `uv` [repo](https://github.com/astral-sh/uv) for additional install methods.
 25 | 3. Go to Claude > Settings > Developer > Edit Config and edit `claude_desktop_config.json` to include the following:
 26 | 
 27 | ```
 28 | {
 29 |   "mcpServers": {
 30 |     "ElevenLabs": {
 31 |       "command": "uvx",
 32 |       "args": ["elevenlabs-mcp"],
 33 |       "env": {
 34 |         "ELEVENLABS_API_KEY": "<insert-your-api-key-here>"
 35 |       }
 36 |     }
 37 |   }
 38 | }
 39 | 
 40 | ```
 41 | 
 42 | If you're using Windows, you will have to enable "Developer Mode" in Claude Desktop to use the MCP server. Click "Help" in the hamburger menu at the top left and select "Enable Developer Mode".
 43 | 
 44 | ## Other MCP clients
 45 | 
 46 | For other clients like Cursor and Windsurf, run:
 47 | 1. `pip install elevenlabs-mcp`
 48 | 2. `python -m elevenlabs_mcp --api-key={{PUT_YOUR_API_KEY_HERE}} --print` to get the configuration. Paste it into the configuration file used by your MCP client.
 49 | 
 50 | That's it. Your MCP client can now interact with ElevenLabs.
 51 | 
 52 | ## Example usage
 53 | 
 54 | ⚠️ Warning: ElevenLabs credits are needed to use these tools.
 55 | 
 56 | Try asking Claude:
 57 | 
 58 | - "Create an AI agent that speaks like a film noir detective and can answer questions about classic movies"
 59 | - "Generate three voice variations for a wise, ancient dragon character, then I will choose my favorite voice to add to my voice library"
 60 | - "Convert this recording of my voice to sound like a medieval knight"
 61 | - "Create a soundscape of a thunderstorm in a dense jungle with animals reacting to the weather"
 62 | - "Turn this speech into text, identify different speakers, then convert it back using unique voices for each person"
 63 | 
 64 | ## Optional features
 65 | 
 66 | ### File Output Configuration
 67 | 
 68 | You can configure how the MCP server handles file outputs using these environment variables in your `claude_desktop_config.json`:
 69 | 
 70 | - **`ELEVENLABS_MCP_BASE_PATH`**: Specify the base path for file operations with relative paths (default: `~/Desktop`)
 71 | - **`ELEVENLABS_MCP_OUTPUT_MODE`**: Control how generated files are returned (default: `files`)
 72 | 
 73 | #### Output Modes
 74 | 
 75 | The `ELEVENLABS_MCP_OUTPUT_MODE` environment variable supports three modes:
 76 | 
 77 | 1. **`files`** (default): Save files to disk and return file paths
 78 |    ```json
 79 |    "env": {
 80 |      "ELEVENLABS_API_KEY": "your-api-key",
 81 |      "ELEVENLABS_MCP_OUTPUT_MODE": "files"
 82 |    }
 83 |    ```
 84 | 
 85 | 2. **`resources`**: Return files as MCP resources; binary data is base64-encoded, text is returned as UTF-8 text
 86 |    ```json
 87 |    "env": {
 88 |      "ELEVENLABS_API_KEY": "your-api-key",
 89 |      "ELEVENLABS_MCP_OUTPUT_MODE": "resources"
 90 |    }
 91 |    ```
 92 | 
 93 | 3. **`both`**: Save files to disk AND return as MCP resources
 94 |    ```json
 95 |    "env": {
 96 |      "ELEVENLABS_API_KEY": "your-api-key",
 97 |      "ELEVENLABS_MCP_OUTPUT_MODE": "both"
 98 |    }
 99 |    ```
100 | 
101 | **Resource Mode Benefits:**
102 | - Files are returned directly in the MCP response as base64-encoded data
103 | - No disk I/O required - useful for containerized or serverless environments
104 | - MCP clients can access file content immediately without file system access
105 | - In `both` mode, resources can be fetched later using the `elevenlabs://filename` URI pattern
106 | 
107 | **Use Cases:**
108 | - `files`: Traditional file-based workflows, local development
109 | - `resources`: Cloud environments, MCP clients without file system access
110 | - `both`: Maximum flexibility, caching, and resource sharing scenarios
111 | 
112 | ### Data residency keys
113 | 
114 | You can specify the data residency region with the `ELEVENLABS_API_RESIDENCY` environment variable. Defaults to `"us"`.
115 | 
116 | **Note:** Data residency is an enterprise only feature. See [the docs](https://elevenlabs.io/docs/product-guides/administration/data-residency#overview) for more details.
117 | 
118 | ## Contributing
119 | 
120 | If you want to contribute or run from source:
121 | 
122 | 1. Clone the repository:
123 | 
124 | ```bash
125 | git clone https://github.com/elevenlabs/elevenlabs-mcp
126 | cd elevenlabs-mcp
127 | ```
128 | 
129 | 2. Create a virtual environment and install dependencies [using uv](https://github.com/astral-sh/uv):
130 | 
131 | ```bash
132 | uv venv
133 | source .venv/bin/activate
134 | uv pip install -e ".[dev]"
135 | ```
136 | 
137 | 3. Copy `.env.example` to `.env` and add your ElevenLabs API key:
138 | 
139 | ```bash
140 | cp .env.example .env
141 | # Edit .env and add your API key
142 | ```
143 | 
144 | 4. Run the tests to make sure everything is working:
145 | 
146 | ```bash
147 | ./scripts/test.sh
148 | # Or with options
149 | ./scripts/test.sh --verbose --fail-fast
150 | ```
151 | 
152 | 5. Install the server in Claude Desktop: `mcp install elevenlabs_mcp/server.py`
153 | 
154 | 6. Debug and test locally with MCP Inspector: `mcp dev elevenlabs_mcp/server.py`
155 | 
156 | ## Troubleshooting
157 | 
158 | Logs when running with Claude Desktop can be found at:
159 | 
160 | - **Windows**: `%APPDATA%\Claude\logs\mcp-server-elevenlabs.log`
161 | - **macOS**: `~/Library/Logs/Claude/mcp-server-elevenlabs.log`
162 | 
163 | ### Timeouts when using certain tools
164 | 
165 | Certain ElevenLabs API operations, like voice design and audio isolation, can take a long time to resolve. When using the MCP inspector in dev mode, you might get timeout errors despite the tool completing its intended task.
166 | 
167 | This shouldn't occur when using a client like Claude.
168 | 
169 | ### MCP ElevenLabs: spawn uvx ENOENT
170 | 
171 | If you encounter the error "MCP ElevenLabs: spawn uvx ENOENT", find the absolute path of `uvx` by running this command in your terminal:
172 | 
173 | ```bash
174 | which uvx
175 | ```
176 | 
177 | Once you obtain the absolute path (e.g., `/usr/local/bin/uvx`), update your configuration to use that path (e.g., `"command": "/usr/local/bin/uvx"`). This ensures that the correct executable is referenced.
178 | 
179 | 
180 | 
181 | 
```
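
The three `ELEVENLABS_MCP_OUTPUT_MODE` values described in the README above can be illustrated with a small sketch. `OutputMode`, `resolve_output_mode`, `writes_to_disk` and `returns_resource` are hypothetical names chosen for illustration; the server's actual handling is not shown in this listing. Defaulting to `files` follows the README, while raising on an unknown value is an assumption.

```python
import os
from enum import Enum


class OutputMode(str, Enum):
    """Hypothetical enum mirroring the three documented modes."""

    FILES = "files"
    RESOURCES = "resources"
    BOTH = "both"


def resolve_output_mode(env=None) -> OutputMode:
    """Read ELEVENLABS_MCP_OUTPUT_MODE, defaulting to 'files' as documented."""
    env = os.environ if env is None else env
    raw = env.get("ELEVENLABS_MCP_OUTPUT_MODE", "files").strip().lower()
    try:
        return OutputMode(raw)
    except ValueError:
        # Assumption: reject unknown values instead of silently falling back.
        valid = ", ".join(mode.value for mode in OutputMode)
        raise ValueError(
            f"Invalid output mode {raw!r}; expected one of: {valid}"
        ) from None


def writes_to_disk(mode: OutputMode) -> bool:
    """'files' and 'both' save the generated file to disk."""
    return mode in (OutputMode.FILES, OutputMode.BOTH)


def returns_resource(mode: OutputMode) -> bool:
    """'resources' and 'both' return the content as an MCP resource."""
    return mode in (OutputMode.RESOURCES, OutputMode.BOTH)
```

Keeping the two predicates separate makes the `both` mode a plain combination of the other two rather than a special case.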

--------------------------------------------------------------------------------
/scripts/build.sh:
--------------------------------------------------------------------------------

```bash
#!/bin/bash
rm -rf dist/ build/ *.egg-info/
uv build
```

--------------------------------------------------------------------------------
/elevenlabs_mcp/__init__.py:
--------------------------------------------------------------------------------

```python
"""ElevenLabs MCP Server package."""

__version__ = "0.2.1"
```

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------

```python
from setuptools import setup, find_packages

setup(
    packages=find_packages(),
    include_package_data=True,
)
```

--------------------------------------------------------------------------------
/scripts/dev.sh:
--------------------------------------------------------------------------------

```bash
#!/bin/bash
uv run fastmcp dev elevenlabs_mcp/server.py --with python-dotenv --with elevenlabs --with fuzzywuzzy --with python-Levenshtein --with sounddevice --with soundfile --with-editable .
```

--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------

```dockerfile
FROM python:3.11-slim

# Install system dependencies
RUN apt-get update && apt-get install -y gcc && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Copy the application code to the container
COPY . .

# Upgrade pip and install the package
RUN pip install --upgrade pip \
    && pip install --no-cache-dir .

# Command to run the MCP server
CMD ["elevenlabs-mcp"]
```

--------------------------------------------------------------------------------
/tests/conftest.py:
--------------------------------------------------------------------------------

```python
import pytest
from pathlib import Path
import tempfile


@pytest.fixture
def temp_dir():
    with tempfile.TemporaryDirectory() as temp_dir:
        yield Path(temp_dir)


@pytest.fixture
def sample_audio_file(temp_dir):
    audio_file = temp_dir / "test.mp3"
    audio_file.touch()
    return audio_file


@pytest.fixture
def sample_video_file(temp_dir):
    video_file = temp_dir / "test.mp4"
    video_file.touch()
    return video_file
```

--------------------------------------------------------------------------------
/scripts/deploy.sh:
--------------------------------------------------------------------------------

```bash
#!/bin/bash
# Abort on the first failing command so a failed build is never uploaded
set -e

# Check that a valid environment argument is provided before building
if [[ "$1" != "test" && "$1" != "prod" ]]; then
    echo "Usage: $0 [test|prod]"
    exit 1
fi

# Clean previous builds
rm -rf dist/ build/ *.egg-info/

# Build the package
uv build

if [ "$1" = "test" ]; then
    uv run twine upload --repository testpypi dist/* --verbose
else
    uv run twine upload --repository pypi dist/*
fi
```

--------------------------------------------------------------------------------
/elevenlabs_mcp/model.py:
--------------------------------------------------------------------------------

```python
from pydantic import BaseModel


class McpVoice(BaseModel):
    id: str
    name: str
    category: str
    fine_tuning_status: dict | None = None


class ConvAiAgentListItem(BaseModel):
    name: str
    agent_id: str


class ConvaiAgent(BaseModel):
    name: str
    agent_id: str
    system_prompt: str
    voice_id: str | None
    language: str
    llm: str


class McpLanguage(BaseModel):
    language_id: str
    name: str


class McpModel(BaseModel):
    id: str
    name: str
    languages: list[McpLanguage]
```

--------------------------------------------------------------------------------
/.github/workflows/test.yml:
--------------------------------------------------------------------------------

```yaml
name: Run Tests

on:
  push:
    branches:
      - main
  pull_request:
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Python 3.11
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"

      - name: Install uv
        run: python -m pip install uv

      - name: Install dependencies
        run: |
          uv pip install --system -e ".[dev]"

      - name: Run tests
        run: |
          uv run pytest --cov=elevenlabs_mcp --cov-report=xml

      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml
          fail_ci_if_error: false
          verbose: true
```

--------------------------------------------------------------------------------
/scripts/test.sh:
--------------------------------------------------------------------------------

```bash
#!/bin/bash

# Set default variables
COVERAGE=true
VERBOSE=false
FAIL_FAST=false

# Process command-line arguments
while [[ $# -gt 0 ]]; do
  case $1 in
    --no-coverage)
      COVERAGE=false
      shift
      ;;
    --verbose|-v)
      VERBOSE=true
      shift
      ;;
    --fail-fast|-f)
      FAIL_FAST=true
      shift
      ;;
    *)
      echo "Unknown option: $1"
      echo "Usage: $0 [--no-coverage] [--verbose|-v] [--fail-fast|-f]"
      exit 1
      ;;
  esac
done

# Build the command as an array so each argument stays a separate word
CMD=(python -m pytest)

if [ "$COVERAGE" = true ]; then
  CMD+=(--cov=elevenlabs_mcp)
else
  # addopts in pyproject.toml already enables coverage, so turn it off explicitly
  CMD+=(--no-cov)
fi

if [ "$VERBOSE" = true ]; then
  CMD+=(-v)
fi

if [ "$FAIL_FAST" = true ]; then
  CMD+=(-x)
fi

# Run the tests
echo "Running tests with command: ${CMD[*]}"
"${CMD[@]}"
```

--------------------------------------------------------------------------------
/scripts/setup.sh:
--------------------------------------------------------------------------------

```bash
#!/bin/bash

# Ensure uv is available
if ! command -v uv &> /dev/null; then
    echo "Error: uv is not installed. Please install it first:"
    echo "pip install uv"
    exit 1
fi

# Create or update virtual environment
echo "Creating/updating virtual environment..."
uv venv .venv

# Activate the virtual environment for the rest of this script.
# The script always runs under bash (see the shebang), so activation does
# not depend on the user's login shell.
source .venv/bin/activate

# Install dependencies
echo "Installing dependencies with uv..."
uv pip install -e ".[dev]"

# Install pre-commit hooks
echo "Setting up pre-commit hooks..."
pre-commit install

echo "Setup complete! Virtual environment is ready."
echo "Activate it in your own shell with: source .venv/bin/activate"
```

--------------------------------------------------------------------------------
/.github/workflows/publish.yml:
--------------------------------------------------------------------------------

```yaml
name: Publish Python Package

on:
  push:
    tags:
      - "v*"

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"

      - name: Install uv
        run: pip install uv

      - name: Verify tag matches package version
        run: |
          # Extract version from tag (remove 'v' prefix)
          TAG_VERSION=${GITHUB_REF#refs/tags/v}

          # Extract version from pyproject.toml
          PACKAGE_VERSION=$(grep -o 'version = "[^"]*"' pyproject.toml | cut -d'"' -f2)

          echo "Tag version: $TAG_VERSION"
          echo "Package version: $PACKAGE_VERSION"

          # Verify versions match
          if [ "$TAG_VERSION" != "$PACKAGE_VERSION" ]; then
            echo "Error: Tag version ($TAG_VERSION) does not match package version ($PACKAGE_VERSION)"
            exit 1
          fi

      - name: Install dependencies
        run: |
          uv pip install --system -e ".[dev]"

      - name: Run tests
        run: |
          uv run pytest --cov=elevenlabs_mcp

      - name: Build package
        run: |
          uv build

      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          user: __token__
          password: ${{ secrets.PYPI_API_TOKEN }}
          skip-existing: true
```

--------------------------------------------------------------------------------
/server.json:
--------------------------------------------------------------------------------

```json
{
  "$schema": "https://static.modelcontextprotocol.io/schemas/2025-07-09/server.schema.json",
  "name": "io.github.elevenlabs/elevenlabs-mcp",
  "description": "MCP server that enables interaction with Text to Speech, Voice Agents and audio processing APIs",
  "status": "active",
  "repository": {
    "url": "https://github.com/elevenlabs/elevenlabs-mcp",
    "source": "github"
  },
  "version": "0.9.0",
  "packages": [
    {
      "registry_type": "pypi",
      "registry_base_url": "https://pypi.org",
      "identifier": "elevenlabs-mcp",
      "version": "0.8.1",
      "transport": {
        "type": "stdio"
      },
      "environment_variables": [
        {
          "description": "Your ElevenLabs API key",
          "is_required": true,
          "format": "string",
          "is_secret": true,
          "name": "ELEVENLABS_API_KEY"
        },
        {
          "description": "The base path for the MCP server. Defaults to $HOME/Desktop if not provided.",
          "is_required": false,
          "format": "string",
          "is_secret": false,
          "name": "ELEVENLABS_MCP_BASE_PATH"
        },
        {
          "description": "The optional data residency region. Defaults to 'us' if not provided.",
          "is_required": false,
          "format": "string",
          "is_secret": false,
          "name": "ELEVENLABS_API_RESIDENCY"
        },
        {
          "description": "The output mode for the MCP server. Defaults to 'files' if not provided.",
          "is_required": false,
          "format": "string",
          "is_secret": false,
          "name": "ELEVENLABS_MCP_OUTPUT_MODE"
        }
      ]
    }
  ]
}
```
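
The `environment_variables` entries in `server.json` above mark each variable as required or optional. As an illustration of how a launcher might use that metadata, the sketch below lists required variables that are unset or empty. `missing_required_env` is a hypothetical helper, not part of this repository or of any MCP registry tooling.

```python
import json


def missing_required_env(server_json_text: str, env: dict) -> list[str]:
    """Return names of required variables that are unset or empty in env."""
    manifest = json.loads(server_json_text)
    missing = []
    for package in manifest.get("packages", []):
        for var in package.get("environment_variables", []):
            # Treat an empty string the same as an unset variable.
            if var.get("is_required") and not env.get(var["name"]):
                missing.append(var["name"])
    return missing
```

With the manifest shown above and an empty environment, only `ELEVENLABS_API_KEY` would be reported, since the other three variables are optional.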

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[project]
name = "elevenlabs-mcp"
version = "0.9.0"
description = "ElevenLabs MCP Server"
authors = [
    { name = "Jacek Duszenko", email = "[email protected]" },
    { name = "Paul Asjes", email = "[email protected]" },
    { name = "Louis Jordan", email = "[email protected]" },
    { name = "Luke Harries", email = "[email protected]" },
]
readme = "README.md"
license = { file = "LICENSE" }
classifiers = [
    "Development Status :: 4 - Beta",
    "Intended Audience :: Developers",
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
]
keywords = [
    "elevenlabs",
    "mcp",
    "text-to-speech",
    "speech-to-text",
    "voice-cloning",
]
requires-python = ">=3.11"
dependencies = [
    "mcp[cli]>=1.6.0",
    "fastapi==0.109.2",
    "uvicorn==0.27.1",
    "python-dotenv==1.0.1",
    "pydantic>=2.6.1",
    "httpx==0.28.1",
    "elevenlabs>=2.13.0",
    "fuzzywuzzy==0.18.0",
    "python-Levenshtein>=0.25.0",
    "sounddevice==0.5.1",
    "soundfile==0.13.1",
]

[project.scripts]
elevenlabs-mcp = "elevenlabs_mcp.server:main"

[project.optional-dependencies]
dev = [
    "pre-commit==3.6.2",
    "ruff==0.3.0",
    "fastmcp==0.4.1",
    "pytest==8.0.0",
    "pytest-cov==4.1.0",
    "twine==6.1.0",
    "build>=1.0.3",
]

[build-system]
requires = ["setuptools>=45", "wheel"]
build-backend = "setuptools.build_meta"

[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
addopts = "-v --cov=elevenlabs_mcp --cov-report=term-missing"

[dependency-groups]
dev = [
    "build>=1.2.2.post1",
    "fastmcp>=0.4.1",
    "pre-commit>=3.6.2",
    "pytest>=8.0.0",
    "pytest-cov>=4.1.0",
    "ruff>=0.3.0",
    "twine>=6.1.0",
]
```

--------------------------------------------------------------------------------
/tests/test_utils.py:
--------------------------------------------------------------------------------

```python
import pytest
from pathlib import Path
import tempfile
from elevenlabs_mcp.utils import (
    ElevenLabsMcpError,
    make_error,
    is_file_writeable,
    make_output_file,
    make_output_path,
    find_similar_filenames,
    try_find_similar_files,
    handle_input_file,
)


def test_make_error():
    with pytest.raises(ElevenLabsMcpError):
        make_error("Test error")


def test_is_file_writeable():
    with tempfile.TemporaryDirectory() as temp_dir:
        temp_path = Path(temp_dir)
        assert is_file_writeable(temp_path) is True
        assert is_file_writeable(temp_path / "nonexistent.txt") is True


def test_make_output_file():
    tool = "test"
    text = "hello world"
    result = make_output_file(tool, text, "mp3")
    assert result.name.startswith("test_hello")
    assert result.suffix == ".mp3"


def test_make_output_path():
    with tempfile.TemporaryDirectory() as temp_dir:
        result = make_output_path(temp_dir)
        assert result == Path(temp_dir)
        assert result.exists()
        assert result.is_dir()


def test_find_similar_filenames():
    with tempfile.TemporaryDirectory() as temp_dir:
        temp_path = Path(temp_dir)
        test_file = temp_path / "test_file.txt"
        similar_file = temp_path / "test_file_2.txt"
        different_file = temp_path / "different.txt"

        test_file.touch()
        similar_file.touch()
        different_file.touch()

        results = find_similar_filenames(str(test_file), temp_path)
        assert len(results) > 0
        assert any(str(similar_file) in str(r[0]) for r in results)


def test_try_find_similar_files():
    with tempfile.TemporaryDirectory() as temp_dir:
        temp_path = Path(temp_dir)
        test_file = temp_path / "test_file.mp3"
        similar_file = temp_path / "test_file_2.mp3"
        different_file = temp_path / "different.txt"

        test_file.touch()
        similar_file.touch()
        different_file.touch()

        results = try_find_similar_files(str(test_file), temp_path)
        assert len(results) > 0
        assert any(str(similar_file) in str(r) for r in results)


def test_handle_input_file():
    with tempfile.TemporaryDirectory() as temp_dir:
        temp_path = Path(temp_dir)
        test_file = temp_path / "test.mp3"

        with open(test_file, "wb") as f:
            f.write(b"\xff\xfb\x90\x64\x00")

        result = handle_input_file(str(test_file))
        assert result == test_file

        with pytest.raises(ElevenLabsMcpError):
            handle_input_file(str(temp_path / "nonexistent.mp3"))
```
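
`test_find_similar_filenames` above expects a list whose items carry a path in position 0, and the project depends on `fuzzywuzzy`, which suggests fuzzy name matching. Since `utils.py` is not shown in this listing, the following is only a stand-in sketch of that idea using the standard library's `difflib` instead of `fuzzywuzzy`. The name `similar_filenames_sketch`, the 0.7 threshold, and the choice to skip files whose name equals the target are all assumptions.

```python
from difflib import SequenceMatcher
from pathlib import Path


def similar_filenames_sketch(
    target: str, directory: Path, threshold: float = 0.7
) -> list[tuple[Path, float]]:
    """Return (path, score) pairs for files whose names resemble the target."""
    target_name = Path(target).name
    matches = []
    for candidate in directory.rglob("*"):
        # Assumption: skip directories and exact name matches.
        if not candidate.is_file() or candidate.name == target_name:
            continue
        score = SequenceMatcher(None, target_name, candidate.name).ratio()
        if score >= threshold:
            matches.append((candidate, score))
    # Best matches first
    return sorted(matches, key=lambda match: match[1], reverse=True)
```

For the files created in the test (`test_file.txt`, `test_file_2.txt`, `different.txt`), this sketch would keep `test_file_2.txt` and drop `different.txt`, whose similarity to the target falls below the threshold.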

--------------------------------------------------------------------------------
/elevenlabs_mcp/__main__.py:
--------------------------------------------------------------------------------

```python
import os
import json
from pathlib import Path
import sys
from dotenv import load_dotenv
import argparse

load_dotenv()


def get_claude_config_path() -> Path | None:
    """Get the Claude config directory based on platform."""
    if sys.platform == "win32":
        path = Path(Path.home(), "AppData", "Roaming", "Claude")
    elif sys.platform == "darwin":
        path = Path(Path.home(), "Library", "Application Support", "Claude")
    elif sys.platform.startswith("linux"):
        path = Path(
            os.environ.get("XDG_CONFIG_HOME", Path.home() / ".config"), "Claude"
        )
    else:
        return None

    if path.exists():
        return path
    return None


def get_python_path():
    return sys.executable


def generate_config(api_key: str | None = None):
    module_dir = Path(__file__).resolve().parent
    server_path = module_dir / "server.py"
    python_path = get_python_path()

    final_api_key = api_key or os.environ.get("ELEVENLABS_API_KEY")
    if not final_api_key:
        print("Error: ElevenLabs API key is required.")
        print("Please either:")
        print("  1. Pass the API key using --api-key argument, or")
        print("  2. Set the ELEVENLABS_API_KEY environment variable, or")
        print("  3. Add ELEVENLABS_API_KEY to your .env file")
        sys.exit(1)

    config = {
        "mcpServers": {
            "ElevenLabs": {
                "command": python_path,
                "args": [
                    str(server_path),
                ],
                "env": {"ELEVENLABS_API_KEY": final_api_key},
            }
        }
    }

    return config


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--print",
        action="store_true",
        help="Print config to screen instead of writing to file",
    )
    parser.add_argument(
        "--api-key",
        help="ElevenLabs API key (alternatively, set ELEVENLABS_API_KEY environment variable)",
    )
    parser.add_argument(
        "--config-path",
        type=Path,
        help="Custom path to Claude config directory",
    )
    args = parser.parse_args()

    config = generate_config(args.api_key)

    if args.print:
        print(json.dumps(config, indent=2))
    else:
        claude_path = args.config_path if args.config_path else get_claude_config_path()
        if claude_path is None:
            # --config-path is used as a directory below, so ask for the directory
            print(
                "Could not find Claude config path automatically. Please specify it using --config-path argument. The argument should be an absolute path of the directory that contains claude_desktop_config.json."
            )
            sys.exit(1)

        claude_path.mkdir(parents=True, exist_ok=True)
        print("Writing config to", claude_path / "claude_desktop_config.json")
        with open(claude_path / "claude_desktop_config.json", "w") as f:
            json.dump(config, f, indent=2)
```
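
The script above opens `claude_desktop_config.json` in write mode, so any servers already listed in that file are replaced by the generated config. One possible alternative, sketched here as a hypothetical `merge_mcp_config` helper that is not part of the package, is to load the existing file and update only the `mcpServers` entries being added.

```python
import json
from pathlib import Path


def merge_mcp_config(config_file: Path, new_config: dict) -> dict:
    """Merge new_config's mcpServers into config_file, keeping other entries."""
    existing = {}
    if config_file.exists():
        text = config_file.read_text(encoding="utf-8").strip()
        if text:
            existing = json.loads(text)

    # Add or overwrite only the servers named in new_config.
    servers = existing.setdefault("mcpServers", {})
    servers.update(new_config.get("mcpServers", {}))

    config_file.parent.mkdir(parents=True, exist_ok=True)
    config_file.write_text(json.dumps(existing, indent=2), encoding="utf-8")
    return existing
```

A call such as `merge_mcp_config(claude_path / "claude_desktop_config.json", config)` would then leave other configured servers and top-level settings untouched. The sketch assumes the existing file, if present, holds a JSON object.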

--------------------------------------------------------------------------------
/elevenlabs_mcp/convai.py:
--------------------------------------------------------------------------------

```python
 1 | def create_conversation_config(
 2 |     language: str,
 3 |     system_prompt: str,
 4 |     llm: str,
 5 |     first_message: str | None,
 6 |     temperature: float,
 7 |     max_tokens: int | None,
 8 |     asr_quality: str,
 9 |     voice_id: str | None,
10 |     model_id: str,
11 |     optimize_streaming_latency: int,
12 |     stability: float,
13 |     similarity_boost: float,
14 |     turn_timeout: int,
15 |     max_duration_seconds: int,
16 | ) -> dict:
17 |     return {
18 |         "agent": {
19 |             "language": language,
20 |             "prompt": {
21 |                 "prompt": system_prompt,
22 |                 "llm": llm,
23 |                 "tools": [{"type": "system", "name": "end_call", "description": ""}],
24 |                 "knowledge_base": [],
25 |                 "temperature": temperature,
26 |                 **({"max_tokens": max_tokens} if max_tokens else {}),
27 |             },
28 |             **({"first_message": first_message} if first_message else {}),
29 |             "dynamic_variables": {"dynamic_variable_placeholders": {}},
30 |         },
31 |         "asr": {
32 |             "quality": asr_quality,
33 |             "provider": "elevenlabs",
34 |             "user_input_audio_format": "pcm_16000",
35 |             "keywords": [],
36 |         },
37 |         "tts": {
38 |             **({"voice_id": voice_id} if voice_id else {}),
39 |             "model_id": model_id,
40 |             "agent_output_audio_format": "pcm_16000",
41 |             "optimize_streaming_latency": optimize_streaming_latency,
42 |             "stability": stability,
43 |             "similarity_boost": similarity_boost,
44 |         },
45 |         "turn": {"turn_timeout": turn_timeout},
46 |         "conversation": {
47 |             "max_duration_seconds": max_duration_seconds,
48 |             "client_events": [
49 |                 "audio",
50 |                 "interruption",
51 |                 "user_transcript",
52 |                 "agent_response",
53 |                 "agent_response_correction",
54 |             ],
55 |         },
56 |         "language_presets": {},
57 |         "is_blocked_ivc": False,
58 |         "is_blocked_non_ivc": False,
59 |     }
60 | 
61 | 
62 | def create_platform_settings(
63 |     record_voice: bool,
64 |     retention_days: int,
65 | ) -> dict:
66 |     return {
67 |         "widget": {
68 |             "variant": "full",
69 |             "avatar": {"type": "orb", "color_1": "#6DB035", "color_2": "#F5CABB"},
70 |             "feedback_mode": "during",
71 |             "terms_text": '#### Terms and conditions\n\nBy clicking "Agree," and each time I interact with this AI agent, I consent to the recording, storage, and sharing of my communications with third-party service providers, and as described in the Privacy Policy.\nIf you do not wish to have your conversations recorded, please refrain from using this service.',
72 |             "show_avatar_when_collapsed": True,
73 |         },
74 |         "evaluation": {},
75 |         "auth": {"allowlist": []},
76 |         "overrides": {},
77 |         "call_limits": {"agent_concurrency_limit": -1, "daily_limit": 100000},
78 |         "privacy": {
79 |             "record_voice": record_voice,
80 |             "retention_days": retention_days,
81 |             "delete_transcript_and_pii": True,
82 |             "delete_audio": True,
83 |             "apply_to_existing_conversations": False,
84 |         },
85 |         "data_collection": {},
86 |     }
87 | 
```
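
The optional keys in `create_conversation_config` (`max_tokens`, `first_message`, `voice_id`) rely on conditional dict unpacking, so a missing value leaves the key out of the payload entirely instead of sending `None`. Below is a minimal standalone sketch of that idiom. `build_prompt_block` is a hypothetical helper written for illustration and is not part of the package. Note that the truthiness test also drops falsy values such as `0` or an empty string, not only `None`.

```python
from __future__ import annotations


def build_prompt_block(
    system_prompt: str, temperature: float, max_tokens: int | None
) -> dict:
    """Hypothetical helper mirroring the optional-key idiom used in convai.py."""
    return {
        "prompt": system_prompt,
        "temperature": temperature,
        # Unpack a one-item dict when max_tokens is truthy, else an empty dict,
        # so the key is omitted rather than set to None.
        **({"max_tokens": max_tokens} if max_tokens else {}),
    }


with_limit = build_prompt_block("You are helpful.", 0.5, 256)
without_limit = build_prompt_block("You are helpful.", 0.5, None)
zero_limit = build_prompt_block("You are helpful.", 0.5, 0)

print("max_tokens" in with_limit)     # True
print("max_tokens" in without_limit)  # False
print("max_tokens" in zero_limit)     # False: 0 is falsy, so it is dropped too
```

If a caller ever needs to send an explicit `0`, an `is not None` check would be the safer condition; the sketch keeps the truthiness test to match the code above.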

--------------------------------------------------------------------------------
/elevenlabs_mcp/utils.py:
--------------------------------------------------------------------------------

```python
  1 | import os
  2 | import tempfile
  3 | import base64
  4 | from pathlib import Path
  5 | from datetime import datetime
  6 | from fuzzywuzzy import fuzz
  7 | from typing import Union
  8 | from mcp.types import (
  9 |     EmbeddedResource,
 10 |     TextResourceContents,
 11 |     BlobResourceContents,
 12 |     TextContent,
 13 | )
 14 | 
 15 | 
 16 | class ElevenLabsMcpError(Exception):
 17 |     pass
 18 | 
 19 | 
 20 | def make_error(error_text: str):
 21 |     raise ElevenLabsMcpError(error_text)
 22 | 
 23 | 
 24 | def is_file_writeable(path: Path) -> bool:
 25 |     if path.exists():
 26 |         return os.access(path, os.W_OK)
 27 |     parent_dir = path.parent
 28 |     return os.access(parent_dir, os.W_OK)
 29 | 
 30 | 
 31 | def make_output_file(
 32 |     tool: str, text: str, extension: str, full_id: bool = False
 33 | ) -> Path:
 34 |     id = text if full_id else text[:5]
 35 | 
 36 |     output_file_name = f"{tool}_{id.replace(' ', '_')}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.{extension}"
 37 |     return Path(output_file_name)
 38 | 
 39 | 
 40 | def make_output_path(
 41 |     output_directory: str | None, base_path: str | None = None
 42 | ) -> Path:
 43 |     output_path = None
 44 |     if output_directory is None:
 45 |         base = base_path
 46 |         if base and base.strip():
 47 |             output_path = Path(os.path.expanduser(base))
 48 |         else:
 49 |             output_path = Path.home() / "Desktop"
 50 |     elif not os.path.isabs(output_directory) and base_path:
 51 |         output_path = Path(os.path.expanduser(base_path)) / Path(output_directory)
 52 |     else:
 53 |         output_path = Path(os.path.expanduser(output_directory))
 54 |     if not is_file_writeable(output_path):
 55 |         make_error(f"Directory ({output_path}) is not writeable")
 56 |     output_path.mkdir(parents=True, exist_ok=True)
 57 |     return output_path
 58 | 
 59 | 
 60 | def find_similar_filenames(
 61 |     target_file: str, directory: Path, threshold: int = 70
 62 | ) -> list[tuple[Path, int]]:
 63 |     """
 64 |     Find files with names similar to the target file using fuzzy matching.
 65 | 
 66 |     Args:
 67 |         target_file (str): The reference filename to compare against
 68 |         directory (Path): Directory to search recursively
 69 |         threshold (int): Similarity threshold (0 to 100, where 100 is identical)
 70 | 
 71 |     Returns:
 72 |         list: (path, score) tuples for similar files, sorted by descending score
 73 |     """
 74 |     target_filename = os.path.basename(target_file)
 75 |     similar_files = []
 76 |     for root, _, files in os.walk(directory):
 77 |         for filename in files:
 78 |             if (
 79 |                 filename == target_filename
 80 |                 and os.path.join(root, filename) == target_file
 81 |             ):
 82 |                 continue
 83 |             similarity = fuzz.token_sort_ratio(target_filename, filename)
 84 | 
 85 |             if similarity >= threshold:
 86 |                 file_path = Path(root) / filename
 87 |                 similar_files.append((file_path, similarity))
 88 | 
 89 |     similar_files.sort(key=lambda x: x[1], reverse=True)
 90 | 
 91 |     return similar_files
 92 | 
 93 | 
 94 | def try_find_similar_files(
 95 |     filename: str, directory: Path, take_n: int = 5
 96 | ) -> list[Path]:
 97 |     similar_files = find_similar_filenames(filename, directory)
 98 |     if not similar_files:
 99 |         return []
100 | 
101 |     filtered_files = []
102 | 
103 |     for path, _ in similar_files[:take_n]:
104 |         if check_audio_file(path):
105 |             filtered_files.append(path)
106 | 
107 |     return filtered_files
108 | 
109 | 
110 | def check_audio_file(path: Path) -> bool:
111 |     audio_extensions = {
112 |         ".wav",
113 |         ".mp3",
114 |         ".m4a",
115 |         ".aac",
116 |         ".ogg",
117 |         ".flac",
118 |         ".mp4",
119 |         ".avi",
120 |         ".mov",
121 |         ".wmv",
122 |     }
123 |     return path.suffix.lower() in audio_extensions
124 | 
125 | 
126 | def handle_input_file(file_path: str, audio_content_check: bool = True) -> Path:
127 |     if not os.path.isabs(file_path) and not os.environ.get("ELEVENLABS_MCP_BASE_PATH"):
128 |         make_error(
129 |             "File path must be an absolute path if ELEVENLABS_MCP_BASE_PATH is not set"
130 |         )
131 |     path = Path(file_path)
132 |     if not path.exists() and path.parent.exists():
133 |         parent_directory = path.parent
134 |         similar_files = try_find_similar_files(path.name, parent_directory)
135 |         similar_files_formatted = ",".join([str(file) for file in similar_files])
136 |         if similar_files:
137 |             make_error(
138 |                 f"File ({path}) does not exist. Did you mean any of these files: {similar_files_formatted}?"
139 |             )
140 |         make_error(f"File ({path}) does not exist")
141 |     elif not path.exists():
142 |         make_error(f"File ({path}) does not exist")
143 |     elif not path.is_file():
144 |         make_error(f"File ({path}) is not a file")
145 | 
146 |     if audio_content_check and not check_audio_file(path):
147 |         make_error(f"File ({path}) is not an audio or video file")
148 |     return path
149 | 
150 | 
151 | def handle_large_text(
152 |     text: str, max_length: int = 10000, content_type: str = "content"
153 | ):
154 |     """
155 |     Handle large text content by saving to temporary file if it exceeds max_length.
156 | 
157 |     Args:
158 |         text: The text content to handle
159 |         max_length: Maximum character length before saving to temp file
160 |         content_type: Description of the content type for user messages
161 | 
162 |     Returns:
163 |         str: Either the original text or a message with temp file path
164 |     """
165 |     if len(text) > max_length:
166 |         with tempfile.NamedTemporaryFile(
167 |             mode="w", suffix=".txt", delete=False, encoding="utf-8"
168 |         ) as temp_file:
169 |             temp_file.write(text)
170 |             temp_path = temp_file.name
171 | 
172 |         return f"{content_type.capitalize()} saved to temporary file: {temp_path}\nUse the Read tool to access the full {content_type}."
173 | 
174 |     return text
175 | 
176 | 
177 | def parse_conversation_transcript(transcript_entries, max_length: int = 50000):
178 |     """
179 |     Parse conversation transcript entries into a formatted string.
180 |     If transcript is too long, save to temporary file and return file path.
181 | 
182 |     Args:
183 |         transcript_entries: List of transcript entries from conversation response
184 |         max_length: Maximum character length before saving to temp file
185 | 
186 |     Returns:
187 |         tuple: (transcript_text_or_path, is_temp_file)
188 |     """
189 |     transcript_lines = []
190 |     for entry in transcript_entries:
191 |         speaker = getattr(entry, "role", "Unknown")
192 |         text = getattr(entry, "message", getattr(entry, "text", ""))
193 |         timestamp = getattr(entry, "timestamp", None)
194 | 
195 |         if timestamp:
196 |             transcript_lines.append(f"[{timestamp}] {speaker}: {text}")
197 |         else:
198 |             transcript_lines.append(f"{speaker}: {text}")
199 | 
200 |     transcript = (
201 |         "\n".join(transcript_lines) if transcript_lines else "No transcript available"
202 |     )
203 | 
204 |     # Check if transcript is too long for LLM context window
205 |     if len(transcript) > max_length:
206 |         # Create temporary file
207 |         temp_file = tempfile.SpooledTemporaryFile(
208 |             mode="w+", max_size=0, encoding="utf-8"
209 |         )
210 |         temp_file.write(transcript)
211 |         temp_file.seek(0)
212 | 
213 |         # Get a persistent temporary file path
214 |         with tempfile.NamedTemporaryFile(
215 |             mode="w", suffix=".txt", delete=False, encoding="utf-8"
216 |         ) as persistent_temp:
217 |             persistent_temp.write(transcript)
218 |             temp_path = persistent_temp.name
219 | 
220 |         return (
221 |             f"Transcript saved to temporary file: {temp_path}\nUse the Read tool to access the full transcript.",
222 |             True,
223 |         )
224 | 
225 |     return transcript, False
226 | 
227 | 
228 | def parse_location(api_residency: str | None) -> str:
229 |     """
230 |     Parse the API residency and return the corresponding origin URL.
231 |     """
232 |     origin_map = {
233 |         "us": "https://api.elevenlabs.io",
234 |         "eu-residency": "https://api.eu.residency.elevenlabs.io",
235 |         "in-residency": "https://api.in.residency.elevenlabs.io",
236 |         "global": "https://api.elevenlabs.io",
237 |     }
238 | 
239 |     if not api_residency or not api_residency.strip():
240 |         return origin_map["us"]
241 | 
242 |     api_residency = api_residency.strip().lower()
243 | 
244 |     if api_residency not in origin_map:
245 |         valid_options = ", ".join(f"'{k}'" for k in origin_map.keys())
246 |         raise ValueError(f"ELEVENLABS_API_RESIDENCY must be one of {valid_options}")
247 | 
248 |     return origin_map[api_residency]
249 | def get_mime_type(file_extension: str) -> str:
250 |     """
251 |     Get MIME type for a given file extension.
252 | 
253 |     Args:
254 |         file_extension: File extension (with or without dot)
255 | 
256 |     Returns:
257 |         str: MIME type string
258 |     """
259 |     # Remove leading dot if present
260 |     ext = file_extension.lstrip(".")
261 | 
262 |     mime_types = {
263 |         "mp3": "audio/mpeg",
264 |         "wav": "audio/wav",
265 |         "ogg": "audio/ogg",
266 |         "flac": "audio/flac",
267 |         "m4a": "audio/mp4",
268 |         "aac": "audio/aac",
269 |         "opus": "audio/opus",
270 |         "txt": "text/plain",
271 |         "json": "application/json",
272 |         "xml": "application/xml",
273 |         "html": "text/html",
274 |         "csv": "text/csv",
275 |         "mp4": "video/mp4",
276 |         "avi": "video/x-msvideo",
277 |         "mov": "video/quicktime",
278 |         "wmv": "video/x-ms-wmv",
279 |     }
280 | 
281 |     return mime_types.get(ext.lower(), "application/octet-stream")
282 | 
283 | 
284 | def generate_resource_uri(filename: str) -> str:
285 |     """
286 |     Generate a resource URI for a given filename.
287 | 
288 |     Args:
289 |         filename: The filename to generate URI for
290 | 
291 |     Returns:
292 |         str: Resource URI in format elevenlabs://filename
293 |     """
294 |     return f"elevenlabs://{filename}"
295 | 
296 | 
297 | def create_resource_response(
298 |     file_data: bytes, filename: str, file_extension: str, directory: Path | None = None
299 | ) -> EmbeddedResource:
300 |     """
301 |     Create a proper MCP EmbeddedResource response.
302 | 
303 |     Args:
304 |         file_data: Raw file data as bytes
305 |         filename: Name of the file
306 |         file_extension: File extension for MIME type detection
307 |         directory: Optional directory where the file is or would be saved; used to embed path in URI
308 | 
309 |     Returns:
310 |         EmbeddedResource: Proper MCP resource object
311 |     """
312 |     mime_type = get_mime_type(file_extension)
313 |     if directory is not None:
314 |         full_path = (directory / filename)
315 |         resource_uri = f"elevenlabs://{full_path.as_posix()}"
316 |     else:
317 |         resource_uri = generate_resource_uri(filename)
318 | 
319 |     # For text files, use TextResourceContents
320 |     if mime_type.startswith("text/"):
321 |         try:
322 |             text_content = file_data.decode("utf-8")
323 |             return EmbeddedResource(
324 |                 type="resource",
325 |                 resource=TextResourceContents(
326 |                     uri=resource_uri, mimeType=mime_type, text=text_content
327 |                 ),
328 |             )
329 |         except UnicodeDecodeError:
330 |             # Fall back to binary if decode fails
331 |             pass
332 | 
333 |     # For binary files (audio, etc.), use BlobResourceContents
334 |     base64_data = base64.b64encode(file_data).decode("utf-8")
335 |     return EmbeddedResource(
336 |         type="resource",
337 |         resource=BlobResourceContents(
338 |             uri=resource_uri, mimeType=mime_type, blob=base64_data
339 |         ),
340 |     )
341 | 
342 | 
343 | def handle_output_mode(
344 |     file_data: bytes,
345 |     output_path: Path,
346 |     filename: str,
347 |     output_mode: str,
348 |     success_message: str | None = None,
349 | ) -> Union[TextContent, EmbeddedResource]:
350 |     """
351 |     Handle different output modes for file generation.
352 | 
353 |     Args:
354 |         file_data: Raw file data as bytes
355 |         output_path: Path where file should be saved
356 |         filename: Name of the file
357 |         output_mode: Output mode ('files', 'resources', or 'both')
358 |         success_message: Custom success message for files mode (optional)
359 | 
360 |     Returns:
361 |         Union[TextContent, EmbeddedResource]: TextContent for 'files' mode,
362 |                                             EmbeddedResource for 'resources' and 'both' modes
363 |     """
364 |     file_extension = Path(filename).suffix.lstrip(".")
365 |     full_file_path = output_path / filename
366 | 
367 |     if output_mode == "files":
368 |         # Save to disk and return TextContent with success message
369 |         output_path.mkdir(parents=True, exist_ok=True)
370 |         with open(full_file_path, "wb") as f:
371 |             f.write(file_data)
372 | 
373 |         if success_message and "{file_path}" in success_message:
374 |             message = success_message.replace("{file_path}", str(full_file_path))
375 |         else:
376 |             message = success_message or f"Success. File saved as: {full_file_path}"
377 |         return TextContent(type="text", text=message)
378 | 
379 |     elif output_mode == "resources":
380 |         # Return as EmbeddedResource without saving to disk
381 |         return create_resource_response(file_data, filename, file_extension, directory=output_path)
382 | 
383 |     elif output_mode == "both":
384 |         # Save to disk AND return as EmbeddedResource
385 |         output_path.mkdir(parents=True, exist_ok=True)
386 |         with open(full_file_path, "wb") as f:
387 |             f.write(file_data)
388 |         return create_resource_response(file_data, filename, file_extension, directory=output_path)
389 | 
390 |     else:
391 |         raise ValueError(
392 |             f"Invalid output mode: {output_mode}. Must be 'files', 'resources', or 'both'"
393 |         )
394 | 
395 | 
396 | def handle_multiple_files_output_mode(
397 |     results: list[Union[TextContent, EmbeddedResource]],
398 |     output_mode: str,
399 |     additional_info: str | None = None,
400 | ) -> Union[TextContent, list[EmbeddedResource]]:
401 |     """
402 |     Handle different output modes for multiple file generation.
403 | 
404 |     Args:
405 |         results: List of results from handle_output_mode calls
406 |         output_mode: Output mode ('files', 'resources', or 'both')
407 |         additional_info: Additional information to include in files mode message
408 | 
409 |     Returns:
410 |         Union[TextContent, list[EmbeddedResource]]: TextContent for 'files' mode,
411 |                                                    list of EmbeddedResource for 'resources' and 'both' modes
412 |     """
413 |     if output_mode == "files":
414 |         # Extract file paths from TextContent objects and create combined message
415 |         file_paths = []
416 |         for result in results:
417 |             if isinstance(result, TextContent):
418 |                 # Extract file path from the success message
419 |                 text = result.text
420 |                 if "File saved as: " in text:
421 |                     path = (
422 |                         text.split("File saved as: ")[1].split(".")[0]
423 |                         + "."
424 |                         + text.split(".")[-1].split(" ")[0]
425 |                     )
426 |                     file_paths.append(path)
427 | 
428 |         message = f"Success. Files saved at: {', '.join(file_paths)}"
429 |         if additional_info:
430 |             message += f". {additional_info}"
431 | 
432 |         return TextContent(type="text", text=message)
433 | 
434 |     elif output_mode in ["resources", "both"]:
435 |         # Return list of EmbeddedResource objects
436 |         embedded_resources = []
437 |         for result in results:
438 |             if isinstance(result, EmbeddedResource):
439 |                 embedded_resources.append(result)
440 | 
441 |         if not embedded_resources:
442 |             return TextContent(type="text", text="No files generated")
443 | 
444 |         return embedded_resources
445 | 
446 |     else:
447 |         raise ValueError(
448 |             f"Invalid output mode: {output_mode}. Must be 'files', 'resources', or 'both'"
449 |         )
450 | 
451 | 
452 | def get_output_mode_description(output_mode: str) -> str:
453 |     """
454 |     Generate a dynamic description for the current output mode.
455 | 
456 |     Args:
457 |         output_mode: The current output mode ('files', 'resources', or 'both')
458 | 
459 |     Returns:
460 |         str: Description of how the tool will behave based on the output mode
461 |     """
462 |     if output_mode == "files":
463 |         return "Saves output file to directory (default: $HOME/Desktop)"
464 |     elif output_mode == "resources":
465 |         return "Returns output as base64-encoded MCP resource"
466 |     elif output_mode == "both":
467 |         return "Saves file to directory (default: $HOME/Desktop) AND returns as base64-encoded MCP resource"
468 |     else:
469 |         return "Output behavior depends on ELEVENLABS_MCP_OUTPUT_MODE setting"
470 | 
```
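
In `resources` and `both` modes, the file bytes travel inside the MCP response as base64 text under an `elevenlabs://` URI. The sketch below reproduces that encoding with the standard library only. It returns a plain dict instead of the `EmbeddedResource` type so that it runs without the `mcp` package. `sketch_resource_payload` and its three-entry MIME table are illustrative assumptions, not the package API.

```python
from __future__ import annotations

import base64
from pathlib import PurePosixPath

# Small illustrative subset of the MIME table in utils.py.
MIME_TYPES = {"mp3": "audio/mpeg", "wav": "audio/wav", "txt": "text/plain"}


def sketch_resource_payload(
    file_data: bytes, filename: str, directory: str | None = None
) -> dict:
    """Hypothetical stand-in for create_resource_response, returning a dict."""
    ext = PurePosixPath(filename).suffix.lstrip(".").lower()
    mime_type = MIME_TYPES.get(ext, "application/octet-stream")

    # Embed the directory in the URI when one is given, as utils.py does.
    name = f"{directory.rstrip('/')}/{filename}" if directory else filename
    uri = f"elevenlabs://{name}"

    # Text MIME types are returned as decoded text when the bytes are valid UTF-8.
    if mime_type.startswith("text/"):
        try:
            return {"uri": uri, "mimeType": mime_type, "text": file_data.decode("utf-8")}
        except UnicodeDecodeError:
            pass  # fall through to the binary branch

    # Everything else is base64-encoded so it can be carried as a JSON string.
    blob = base64.b64encode(file_data).decode("utf-8")
    return {"uri": uri, "mimeType": mime_type, "blob": blob}


audio = b"\x00\x01fake-audio"
payload = sketch_resource_payload(audio, "tts_hello.mp3", "/tmp/out")
print(payload["uri"])  # elevenlabs:///tmp/out/tts_hello.mp3

# A client recovers the original bytes by decoding the blob.
restored = base64.b64decode(payload["blob"])
print(restored == audio)  # True
```

An absolute directory yields three slashes after the scheme, because the path's leading `/` follows `elevenlabs://` directly.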

--------------------------------------------------------------------------------
/elevenlabs_mcp/server.py:
--------------------------------------------------------------------------------

```python
   1 | """
   2 | ElevenLabs MCP Server
   3 | 
   4 | ⚠️ IMPORTANT: This server provides access to ElevenLabs API endpoints which may incur costs.
   5 | Each tool that makes an API call is marked with a cost warning. Please follow these guidelines:
   6 | 
   7 | 1. Only use tools when explicitly requested by the user
   8 | 2. For tools that generate audio, consider the length of the text as it affects costs
   9 | 3. Some operations like voice cloning or text-to-voice may have higher costs
  10 | 
  11 | Tools without cost warnings in their description are free to use as they only read existing data.
  12 | """
  13 | 
  14 | import httpx
  15 | import os
  16 | import base64
  17 | from datetime import datetime
  18 | from io import BytesIO
  19 | from typing import Literal, Union
  20 | from dotenv import load_dotenv
  21 | from mcp.server.fastmcp import FastMCP
  22 | from mcp.types import (
  23 |     TextContent,
  24 |     Resource,
  25 |     EmbeddedResource,
  26 | )
  27 | from elevenlabs.client import ElevenLabs
  28 | from elevenlabs.types import MusicPrompt
  29 | from elevenlabs_mcp.model import McpVoice, McpModel, McpLanguage
  30 | from elevenlabs_mcp.utils import (
  31 |     make_error,
  32 |     make_output_path,
  33 |     make_output_file,
  34 |     handle_input_file,
  35 |     parse_conversation_transcript,
  36 |     handle_large_text,
  37 |     parse_location,
  38 |     get_mime_type,
  39 |     handle_output_mode,
  40 |     handle_multiple_files_output_mode,
  41 |     get_output_mode_description,
  42 | )
  43 | 
  44 | from elevenlabs_mcp.convai import create_conversation_config, create_platform_settings
  45 | from elevenlabs.types.knowledge_base_locator import KnowledgeBaseLocator
  46 | 
  47 | from elevenlabs.play import play
  48 | from elevenlabs_mcp import __version__
  49 | from pathlib import Path
  50 | 
  51 | load_dotenv()
  52 | api_key = os.getenv("ELEVENLABS_API_KEY")
  53 | base_path = os.getenv("ELEVENLABS_MCP_BASE_PATH")
  54 | output_mode = os.getenv("ELEVENLABS_MCP_OUTPUT_MODE", "files").strip().lower()
  55 | DEFAULT_VOICE_ID = os.getenv("ELEVENLABS_DEFAULT_VOICE_ID", "cgSgspJ2msm6clMCkdW9")
  56 | 
  57 | if output_mode not in {"files", "resources", "both"}:
  58 |     raise ValueError("ELEVENLABS_MCP_OUTPUT_MODE must be one of: 'files', 'resources', 'both'")
  59 | if not api_key:
  60 |     raise ValueError("ELEVENLABS_API_KEY environment variable is required")
  61 | 
  62 | origin = parse_location(os.getenv("ELEVENLABS_API_RESIDENCY"))
  63 | 
  64 | # Add custom client to ElevenLabs to set User-Agent header
  65 | custom_client = httpx.Client(
  66 |     headers={
  67 |         "User-Agent": f"ElevenLabs-MCP/{__version__}",
  68 |     },
  69 | )
  70 | 
  71 | client = ElevenLabs(api_key=api_key, httpx_client=custom_client, base_url=origin)
  72 | mcp = FastMCP("ElevenLabs")
  73 | 
  74 | 
  75 | def format_diarized_transcript(transcription) -> str:
  76 |     """Format transcript with speaker labels from diarized response."""
  77 |     try:
  78 |         # Try to access words array - the exact attribute might vary
  79 |         words = None
  80 |         if hasattr(transcription, "words"):
  81 |             words = transcription.words
  82 |         elif hasattr(transcription, "__dict__"):
  83 |             # Try to find words in the response dict
  84 |             for key, value in transcription.__dict__.items():
  85 |                 if key == "words" or (
  86 |                     isinstance(value, list)
  87 |                     and len(value) > 0
  88 |                     and (
  89 |                         hasattr(value[0], "speaker_id")
  90 |                         if hasattr(value[0], "__dict__")
  91 |                         else (
  92 |                             "speaker_id" in value[0]
  93 |                             if isinstance(value[0], dict)
  94 |                             else False
  95 |                         )
  96 |                     )
  97 |                 ):
  98 |                     words = value
  99 |                     break
 100 | 
 101 |         if not words:
 102 |             return transcription.text
 103 | 
 104 |         formatted_lines = []
 105 |         current_speaker = None
 106 |         current_text = []
 107 | 
 108 |         for word in words:
 109 |             # Get speaker_id - might be an attribute or dict key
 110 |             word_speaker = None
 111 |             if hasattr(word, "speaker_id"):
 112 |                 word_speaker = word.speaker_id
 113 |             elif isinstance(word, dict) and "speaker_id" in word:
 114 |                 word_speaker = word["speaker_id"]
 115 | 
 116 |             # Get text - might be an attribute or dict key
 117 |             word_text = None
 118 |             if hasattr(word, "text"):
 119 |                 word_text = word.text
 120 |             elif isinstance(word, dict) and "text" in word:
 121 |                 word_text = word["text"]
 122 | 
 123 |             if not word_speaker or not word_text:
 124 |                 continue
 125 | 
 126 |             # Skip spacing/punctuation types if they exist
 127 |             if hasattr(word, "type") and word.type == "spacing":
 128 |                 continue
 129 |             elif isinstance(word, dict) and word.get("type") == "spacing":
 130 |                 continue
 131 | 
 132 |             if current_speaker != word_speaker:
 133 |                 # Save previous speaker's text
 134 |                 if current_speaker and current_text:
 135 |                     speaker_label = current_speaker.upper().replace("_", " ")
 136 |                     formatted_lines.append(f"{speaker_label}: {' '.join(current_text)}")
 137 | 
 138 |                 # Start new speaker
 139 |                 current_speaker = word_speaker
 140 |                 current_text = [word_text.strip()]
 141 |             else:
 142 |                 current_text.append(word_text.strip())
 143 | 
 144 |         # Add final speaker's text
 145 |         if current_speaker and current_text:
 146 |             speaker_label = current_speaker.upper().replace("_", " ")
 147 |             formatted_lines.append(f"{speaker_label}: {' '.join(current_text)}")
 148 | 
 149 |         return "\n\n".join(formatted_lines)
 150 | 
 151 |     except Exception:
 152 |         # Fallback to regular text if something goes wrong
 153 |         return transcription.text
 154 | @mcp.resource("elevenlabs://{filename}")
 155 | def get_elevenlabs_resource(filename: str) -> Resource:
 156 |     """
 157 |     Resource handler for ElevenLabs generated files.
 158 |     """
 159 |     candidate = Path(filename)
 160 |     base_dir = make_output_path(None, base_path)
 161 | 
 162 |     if candidate.is_absolute():
 163 |         file_path = candidate.resolve()
 164 |     else:
 165 |         base_dir_resolved = base_dir.resolve()
 166 |         resolved_file = (base_dir_resolved / candidate).resolve()
 167 |         try:
 168 |             resolved_file.relative_to(base_dir_resolved)
 169 |         except ValueError:
 170 |             make_error(
 171 |                 f"Resource path ({resolved_file}) is outside of allowed directory {base_dir_resolved}"
 172 |             )
 173 |         file_path = resolved_file
 174 | 
 175 |     if not file_path.exists():
 176 |         raise FileNotFoundError(f"Resource file not found: {filename}")
 177 | 
 178 |     # Read the file and determine MIME type
 179 |     try:
 180 |         with open(file_path, "rb") as f:
 181 |             file_data = f.read()
 182 |     except IOError as e:
 183 |         raise FileNotFoundError(f"Failed to read resource file {filename}: {e}")
 184 | 
 185 |     file_extension = file_path.suffix.lstrip(".")
 186 |     mime_type = get_mime_type(file_extension)
 187 | 
 188 |     # For text files, return text content
 189 |     if mime_type.startswith("text/"):
 190 |         try:
 191 |             text_content = file_data.decode("utf-8")
 192 |             return Resource(
 193 |                 uri=f"elevenlabs://{filename}", mimeType=mime_type, text=text_content
 194 |             )
 195 |         except UnicodeDecodeError:
 196 |             make_error(
 197 |                 f"Failed to decode text resource {filename} as UTF-8; MIME type {mime_type} may be incorrect or file is corrupt"
 198 |             )
 199 | 
 200 |     # For binary files, return base64 encoded data
 201 |     base64_data = base64.b64encode(file_data).decode("utf-8")
 202 |     return Resource(
 203 |         uri=f"elevenlabs://{filename}", mimeType=mime_type, data=base64_data
 204 |     )
 205 | 
 206 | 
 207 | @mcp.tool(
 208 |     description=f"""Convert text to speech with a given voice. {get_output_mode_description(output_mode)}.
 209 |     
 210 |     Only one of voice_id or voice_name can be provided. If neither is provided, the default voice will be used.
 211 | 
 212 |     ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.
 213 | 
 214 |     Args:
 215 |         text (str): The text to convert to speech.
 216 |         voice_name (str, optional): The name of the voice to use.
 217 |         model_id (str, optional): The model ID to use for speech synthesis. Options include:
 218 |             - eleven_multilingual_v2: High quality multilingual model (29 languages)
 219 |             - eleven_flash_v2_5: Fastest model with ultra-low latency (32 languages)
 220 |             - eleven_turbo_v2_5: Balanced quality and speed (32 languages)
 221 |             - eleven_flash_v2: Fast English-only model
 222 |             - eleven_turbo_v2: Balanced English-only model
 223 |             - eleven_monolingual_v1: Legacy English model
 224 |             Defaults to eleven_multilingual_v2 or environment variable ELEVENLABS_MODEL_ID.
 225 |         stability (float, optional): Stability of the generated audio. Determines how stable the voice is and the randomness between each generation. Lower values introduce broader emotional range for the voice. Higher values can result in a monotonous voice with limited emotion. Range is 0 to 1.
 226 |         similarity_boost (float, optional): Similarity boost of the generated audio. Determines how closely the AI should adhere to the original voice when attempting to replicate it. Range is 0 to 1.
 227 |         style (float, optional): Style of the generated audio. Determines the style exaggeration of the voice. This setting attempts to amplify the style of the original speaker. It does consume additional computational resources and might increase latency if set to anything other than 0. Range is 0 to 1.
 228 |         use_speaker_boost (bool, optional): Use speaker boost of the generated audio. This setting boosts the similarity to the original speaker. Using this setting requires a slightly higher computational load, which in turn increases latency.
 229 |         speed (float, optional): Speed of the generated audio. Controls the speed of the generated speech. Values range from 0.7 to 1.2, with 1.0 being the default speed. Lower values create slower, more deliberate speech while higher values produce faster-paced speech. Extreme values can impact the quality of the generated speech. Range is 0.7 to 1.2.
 230 |         output_directory (str, optional): Directory where files should be saved (only used when saving files).
 231 |             Defaults to $HOME/Desktop if not provided.
 232 |         language: ISO 639-1 language code for the voice.
 233 |         output_format (str, optional): Output format of the generated audio. Formatted as codec_sample_rate_bitrate. For example, an MP3 with a 22.05kHz sample rate at 32kbps is represented as mp3_22050_32. MP3 with 192kbps bitrate requires you to be subscribed to Creator tier or above. PCM with 44.1kHz sample rate requires you to be subscribed to Pro tier or above. Note that the μ-law format (sometimes written mu-law, often approximated as u-law) is commonly used for Twilio audio inputs.
 234 |             Defaults to "mp3_44100_128". Must be one of:
 235 |             mp3_22050_32
 236 |             mp3_44100_32
 237 |             mp3_44100_64
 238 |             mp3_44100_96
 239 |             mp3_44100_128
 240 |             mp3_44100_192
 241 |             pcm_8000
 242 |             pcm_16000
 243 |             pcm_22050
 244 |             pcm_24000
 245 |             pcm_44100
 246 |             ulaw_8000
 247 |             alaw_8000
 248 |             opus_48000_32
 249 |             opus_48000_64
 250 |             opus_48000_96
 251 |             opus_48000_128
 252 |             opus_48000_192
 253 | 
 254 |     Returns:
 255 |         Text content with file path or MCP resource with audio data, depending on output mode.
 256 |     """
 257 | )
 258 | def text_to_speech(
 259 |     text: str,
 260 |     voice_name: str | None = None,
 261 |     output_directory: str | None = None,
 262 |     voice_id: str | None = None,
 263 |     stability: float = 0.5,
 264 |     similarity_boost: float = 0.75,
 265 |     style: float = 0,
 266 |     use_speaker_boost: bool = True,
 267 |     speed: float = 1.0,
 268 |     language: str = "en",
 269 |     output_format: str = "mp3_44100_128",
 270 |     model_id: str | None = None,
 271 | ) -> Union[TextContent, EmbeddedResource]:
 272 |     if text == "":
 273 |         make_error("Text is required.")
 274 | 
 275 |     if voice_id is not None and voice_name is not None:
 276 |         make_error("voice_id and voice_name cannot both be provided.")
 277 | 
 278 |     voice = None
 279 |     if voice_id is not None:
 280 |         voice = client.voices.get(voice_id=voice_id)
 281 |     elif voice_name is not None:
 282 |         voices = client.voices.search(search=voice_name)
 283 |         if len(voices.voices) == 0:
 284 |             make_error("No voices found with that name.")
 285 |         voice = next((v for v in voices.voices if v.name == voice_name), None)
 286 |         if voice is None:
 287 |             make_error(f"Voice with name: {voice_name} does not exist.")
 288 | 
 289 |     voice_id = voice.voice_id if voice else DEFAULT_VOICE_ID
 290 | 
 291 |     output_path = make_output_path(output_directory, base_path)
 292 |     output_file_name = make_output_file("tts", text, "mp3")
 293 | 
 294 |     if model_id is None:
 295 |         model_id = (
 296 |             "eleven_flash_v2_5"
 297 |             if language in ["hu", "no", "vi"]
 298 |             else "eleven_multilingual_v2"
 299 |         )
 300 | 
 301 |     audio_data = client.text_to_speech.convert(
 302 |         text=text,
 303 |         voice_id=voice_id,
 304 |         model_id=model_id,
 305 |         output_format=output_format,
 306 |         voice_settings={
 307 |             "stability": stability,
 308 |             "similarity_boost": similarity_boost,
 309 |             "style": style,
 310 |             "use_speaker_boost": use_speaker_boost,
 311 |             "speed": speed,
 312 |         },
 313 |     )
 314 |     audio_bytes = b"".join(audio_data)
 315 | 
 316 |     # Handle different output modes
 317 |     success_message = f"Success. File saved as: {{file_path}}. Voice used: {voice.name if voice else DEFAULT_VOICE_ID}"
 318 |     return handle_output_mode(
 319 |         audio_bytes, output_path, output_file_name, output_mode, success_message
 320 |     )
 321 | 
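# Illustrative sketch, not part of the original module: the exact-name voice
# lookup used by text_to_speech above, reduced to a standalone helper.
# `FakeVoice` and `resolve_voice_by_name` are hypothetical names; the real code
# works on the objects returned by client.voices.search().
from dataclasses import dataclass


@dataclass
class FakeVoice:
    voice_id: str
    name: str


def resolve_voice_by_name(candidates, voice_name):
    """Return the first candidate whose name matches exactly, else None."""
    # A search may return partial matches, so only an exact,
    # case-sensitive name match is accepted.
    return next((v for v in candidates if v.name == voice_name), None)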
 322 | 
 323 | @mcp.tool(
 324 |     description=f"""Transcribe speech from an audio file. When save_transcript_to_file=True: {get_output_mode_description(output_mode)}. When return_transcript_to_client_directly=True, always returns text directly regardless of output mode.
 325 | 
 326 |     ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.
 327 | 
 328 |     Args:
 329 |         input_file_path: Path to the audio file to transcribe
 330 |         language_code: ISO 639-3 language code for transcription. If not provided, the language will be detected automatically.
 331 |         diarize: Whether to diarize the audio file. If True, which speaker is currently speaking will be annotated in the transcription.
 332 |         save_transcript_to_file: Whether to save the transcript to a file.
 333 |         return_transcript_to_client_directly: Whether to return the transcript to the client directly.
 334 |         output_directory: Directory where files should be saved (only used when saving files).
 335 |             Defaults to $HOME/Desktop if not provided.
 336 | 
 337 |     Returns:
 338 |         TextContent containing the transcription or MCP resource with transcript data.
 339 |     """
 340 | )
 341 | def speech_to_text(
 342 |     input_file_path: str,
 343 |     language_code: str | None = None,
 344 |     diarize: bool = False,
 345 |     save_transcript_to_file: bool = True,
 346 |     return_transcript_to_client_directly: bool = False,
 347 |     output_directory: str | None = None,
 348 | ) -> Union[TextContent, EmbeddedResource]:
 349 |     if not save_transcript_to_file and not return_transcript_to_client_directly:
 350 |         make_error("Must save transcript to file or return it to the client directly.")
 351 |     file_path = handle_input_file(input_file_path)
 352 |     if save_transcript_to_file:
 353 |         output_path = make_output_path(output_directory, base_path)
 354 |         output_file_name = make_output_file("stt", file_path.name, "txt")
 355 |     with file_path.open("rb") as f:
 356 |         audio_bytes = f.read()
 357 | 
 358 |     if language_code == "":  # treat an empty string as "not provided"
 359 |         language_code = None
 360 | 
 361 |     transcription = client.speech_to_text.convert(
 362 |         model_id="scribe_v1",
 363 |         file=audio_bytes,
 364 |         language_code=language_code,
 365 |         enable_logging=True,
 366 |         diarize=diarize,
 367 |         tag_audio_events=True,
 368 |     )
 369 | 
 370 |     # Format transcript with speaker identification if diarization was enabled
 371 |     if diarize:
 372 |         formatted_transcript = format_diarized_transcript(transcription)
 373 |     else:
 374 |         formatted_transcript = transcription.text
 375 | 
 376 |     if return_transcript_to_client_directly:
 377 |         return TextContent(type="text", text=formatted_transcript)
 378 | 
 379 |     if save_transcript_to_file:
 380 |         transcript_bytes = formatted_transcript.encode("utf-8")
 381 | 
 382 |         # Handle different output modes
 383 |         success_message = "Transcription saved to {file_path}"
 384 |         return handle_output_mode(
 385 |             transcript_bytes,
 386 |             output_path,
 387 |             output_file_name,
 388 |             output_mode,
 389 |             success_message,
 390 |         )
 391 | 
 392 |     # This should not be reached due to validation at the start of the function
 393 |     return TextContent(type="text", text="No output mode specified")
 394 | 
 395 | 
 396 | @mcp.tool(
 397 |     description=f"""Convert text description of a sound effect to sound effect with a given duration. {get_output_mode_description(output_mode)}.
 398 |     
 399 |     Duration must be between 0.5 and 5 seconds.
 400 | 
 401 |     ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.
 402 | 
 403 |     Args:
 404 |         text: Text description of the sound effect
 405 |         duration_seconds: Duration of the sound effect in seconds
 406 |         output_directory: Directory where files should be saved (only used when saving files).
 407 |             Defaults to $HOME/Desktop if not provided.
 408 |         loop: Whether to loop the sound effect. Defaults to False.
 409 |         output_format (str, optional): Output format of the generated audio. Formatted as codec_sample_rate_bitrate. For example, an MP3 with a 22.05kHz sample rate at 32kbps is represented as mp3_22050_32. MP3 with 192kbps bitrate requires you to be subscribed to Creator tier or above. PCM with 44.1kHz sample rate requires you to be subscribed to Pro tier or above. Note that the μ-law format (sometimes written mu-law, often approximated as u-law) is commonly used for Twilio audio inputs.
 410 |             Defaults to "mp3_44100_128". Must be one of:
 411 |             mp3_22050_32
 412 |             mp3_44100_32
 413 |             mp3_44100_64
 414 |             mp3_44100_96
 415 |             mp3_44100_128
 416 |             mp3_44100_192
 417 |             pcm_8000
 418 |             pcm_16000
 419 |             pcm_22050
 420 |             pcm_24000
 421 |             pcm_44100
 422 |             ulaw_8000
 423 |             alaw_8000
 424 |             opus_48000_32
 425 |             opus_48000_64
 426 |             opus_48000_96
 427 |             opus_48000_128
 428 |             opus_48000_192
 429 |     """
 430 | )
 431 | def text_to_sound_effects(
 432 |     text: str,
 433 |     duration_seconds: float = 2.0,
 434 |     output_directory: str | None = None,
 435 |     output_format: str = "mp3_44100_128",
 436 |     loop: bool = False,
 437 | ) -> Union[TextContent, EmbeddedResource]:
 438 |     if duration_seconds < 0.5 or duration_seconds > 5:
 439 |         make_error("Duration must be between 0.5 and 5 seconds")
 440 |     output_path = make_output_path(output_directory, base_path)
 441 |     output_file_name = make_output_file("sfx", text, "mp3")
 442 | 
 443 |     audio_data = client.text_to_sound_effects.convert(
 444 |         text=text,
 445 |         output_format=output_format,
 446 |         duration_seconds=duration_seconds,
 447 |         loop=loop,
 448 |     )
 449 |     audio_bytes = b"".join(audio_data)
 450 | 
 451 |     # Handle different output modes
 452 |     return handle_output_mode(audio_bytes, output_path, output_file_name, output_mode)
 453 | 
 454 | 
 455 | @mcp.tool(
 456 |     description="""
 457 |     Search for existing voices, i.e. voices that have already been added to the user's ElevenLabs voice library.
 458 |     Searches in name, description, labels and category.
 459 | 
 460 |     Args:
 461 |         search: Search term to filter voices by. Searches in name, description, labels and category.
 462 |         sort: Which field to sort by. `created_at_unix` might not be available for older voices.
 463 |         sort_direction: Sort order, either ascending or descending.
 464 | 
 465 |     Returns:
 466 |         List of voices that match the search criteria.
 467 |     """
 468 | )
 469 | def search_voices(
 470 |     search: str | None = None,
 471 |     sort: Literal["created_at_unix", "name"] = "name",
 472 |     sort_direction: Literal["asc", "desc"] = "desc",
 473 | ) -> list[McpVoice]:
 474 |     response = client.voices.search(
 475 |         search=search, sort=sort, sort_direction=sort_direction
 476 |     )
 477 |     return [
 478 |         McpVoice(id=voice.voice_id, name=voice.name, category=voice.category)
 479 |         for voice in response.voices
 480 |     ]
 481 | 
 482 | 
 483 | @mcp.tool(description="List all available models")
 484 | def list_models() -> list[McpModel]:
 485 |     response = client.models.list()
 486 |     return [
 487 |         McpModel(
 488 |             id=model.model_id,
 489 |             name=model.name,
 490 |             languages=[
 491 |                 McpLanguage(language_id=lang.language_id, name=lang.name)
 492 |                 for lang in model.languages
 493 |             ],
 494 |         )
 495 |         for model in response
 496 |     ]
 497 | 
 498 | 
 499 | @mcp.tool(description="Get details of a specific voice")
 500 | def get_voice(voice_id: str) -> McpVoice:
 501 |     """Get details of a specific voice."""
 502 |     response = client.voices.get(voice_id=voice_id)
 503 |     return McpVoice(
 504 |         id=response.voice_id,
 505 |         name=response.name,
 506 |         category=response.category,
 507 |         fine_tuning_status=response.fine_tuning.state,
 508 |     )
 509 | 
 510 | 
 511 | @mcp.tool(
 512 |     description="""Create an instant voice clone of a voice using provided audio files.
 513 | 
 514 |     ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.
 515 |     """
 516 | )
 517 | def voice_clone(
 518 |     name: str, files: list[str], description: str | None = None
 519 | ) -> TextContent:
 520 |     input_files = [str(handle_input_file(file).absolute()) for file in files]
 521 |     voice = client.voices.ivc.create(
 522 |         name=name, description=description, files=input_files
 523 |     )
 524 | 
 525 |     return TextContent(
 526 |         type="text",
 527 |         text=f"""Voice cloned successfully: Name: {voice.name}
 528 |         ID: {voice.voice_id}
 529 |         Category: {voice.category}
 530 |         Description: {voice.description or "N/A"}""",
 531 |     )
 532 | 
 533 | 
 534 | @mcp.tool(
 535 |     description=f"""Isolate audio from a file. {get_output_mode_description(output_mode)}.
 536 | 
 537 |     ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.
 538 |     """
 539 | )
 540 | def isolate_audio(
 541 |     input_file_path: str, output_directory: str | None = None
 542 | ) -> Union[TextContent, EmbeddedResource]:
 543 |     file_path = handle_input_file(input_file_path)
 544 |     output_path = make_output_path(output_directory, base_path)
 545 |     output_file_name = make_output_file("iso", file_path.name, "mp3")
 546 |     with file_path.open("rb") as f:
 547 |         audio_bytes = f.read()
 548 |     audio_data = client.audio_isolation.convert(
 549 |         audio=audio_bytes,
 550 |     )
 551 |     audio_bytes = b"".join(audio_data)
 552 | 
 553 |     # Handle different output modes
 554 |     return handle_output_mode(audio_bytes, output_path, output_file_name, output_mode)
 555 | 
 556 | 
 557 | @mcp.tool(
 558 |     description="Check the current subscription status. Could be used to measure the usage of the API."
 559 | )
 560 | def check_subscription() -> TextContent:
 561 |     subscription = client.user.subscription.get()
 562 |     return TextContent(type="text", text=f"{subscription.model_dump_json(indent=2)}")
 563 | 
 564 | 
 565 | @mcp.tool(
 566 |     description="""Create a conversational AI agent with custom configuration.
 567 | 
 568 |     ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.
 569 | 
 570 |     Args:
 571 |         name: Name of the agent
 572 |         first_message: First message the agent will say, e.g. "Hi, how can I help you today?"
 573 |         system_prompt: System prompt for the agent
 574 |         voice_id: ID of the voice to use for the agent
 575 |         language: ISO 639-1 language code for the agent
 576 |         llm: LLM to use for the agent
 577 |         temperature: Temperature for the agent. The lower the temperature, the more deterministic the agent's responses will be. Range is 0 to 1.
 578 |         max_tokens: Maximum number of tokens to generate.
 579 |         asr_quality: Quality of the ASR. `high` or `low`.
 580 |         model_id: ID of the ElevenLabs model to use for the agent.
 581 |         optimize_streaming_latency: Optimize streaming latency. Range is 0 to 4.
 582 |         stability: Stability for the agent. Range is 0 to 1.
 583 |         similarity_boost: Similarity boost for the agent. Range is 0 to 1.
 584 |         turn_timeout: Timeout for the agent to respond in seconds. Defaults to 7 seconds.
 585 |         max_duration_seconds: Maximum duration of a conversation in seconds. Defaults to 300 seconds (5 minutes).
 586 |         record_voice: Whether to record the agent's voice.
 587 |         retention_days: Number of days to retain the agent's data.
 588 |     """
 589 | )
 590 | def create_agent(
 591 |     name: str,
 592 |     first_message: str,
 593 |     system_prompt: str,
 594 |     voice_id: str | None = DEFAULT_VOICE_ID,
 595 |     language: str = "en",
 596 |     llm: str = "gemini-2.0-flash-001",
 597 |     temperature: float = 0.5,
 598 |     max_tokens: int | None = None,
 599 |     asr_quality: str = "high",
 600 |     model_id: str = "eleven_turbo_v2",
 601 |     optimize_streaming_latency: int = 3,
 602 |     stability: float = 0.5,
 603 |     similarity_boost: float = 0.8,
 604 |     turn_timeout: int = 7,
 605 |     max_duration_seconds: int = 300,
 606 |     record_voice: bool = True,
 607 |     retention_days: int = 730,
 608 | ) -> TextContent:
 609 |     conversation_config = create_conversation_config(
 610 |         language=language,
 611 |         system_prompt=system_prompt,
 612 |         llm=llm,
 613 |         first_message=first_message,
 614 |         temperature=temperature,
 615 |         max_tokens=max_tokens,
 616 |         asr_quality=asr_quality,
 617 |         voice_id=voice_id,
 618 |         model_id=model_id,
 619 |         optimize_streaming_latency=optimize_streaming_latency,
 620 |         stability=stability,
 621 |         similarity_boost=similarity_boost,
 622 |         turn_timeout=turn_timeout,
 623 |         max_duration_seconds=max_duration_seconds,
 624 |     )
 625 | 
 626 |     platform_settings = create_platform_settings(
 627 |         record_voice=record_voice,
 628 |         retention_days=retention_days,
 629 |     )
 630 | 
 631 |     response = client.conversational_ai.agents.create(
 632 |         name=name,
 633 |         conversation_config=conversation_config,
 634 |         platform_settings=platform_settings,
 635 |     )
 636 | 
 637 |     return TextContent(
 638 |         type="text",
 639 |         text=f"""Agent created successfully: Name: {name}, Agent ID: {response.agent_id}, System Prompt: {system_prompt}, Voice ID: {voice_id or "Default"}, Language: {language}, LLM: {llm}, You can use this agent ID for future interactions with the agent.""",
 640 |     )
 641 | 
 642 | 
 643 | @mcp.tool(
 644 |     description="""Add a knowledge base to ElevenLabs workspace. Allowed types are epub, pdf, docx, txt, html.
 645 | 
 646 |     ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.
 647 | 
 648 |     Args:
 649 |         agent_id: ID of the agent to add the knowledge base to.
 650 |         knowledge_base_name: Name of the knowledge base.
 651 |         url: URL of the knowledge base.
 652 |         input_file_path: Path to the file to add to the knowledge base.
 653 |         text: Text to add to the knowledge base.
 654 |     """
 655 | )
 656 | def add_knowledge_base_to_agent(
 657 |     agent_id: str,
 658 |     knowledge_base_name: str,
 659 |     url: str | None = None,
 660 |     input_file_path: str | None = None,
 661 |     text: str | None = None,
 662 | ) -> TextContent:
 663 |     provided_params = [
 664 |         param for param in [url, input_file_path, text] if param is not None
 665 |     ]
 666 |     if len(provided_params) == 0:
 667 |         make_error("Must provide either a URL, a file, or text")
 668 |     if len(provided_params) > 1:
 669 |         make_error("Must provide exactly one of: URL, file, or text")
 670 | 
 671 |     if url is not None:
 672 |         response = client.conversational_ai.knowledge_base.documents.create_from_url(
 673 |             name=knowledge_base_name,
 674 |             url=url,
 675 |         )
 676 |     else:
 677 |         if text is not None:
 678 |             text_bytes = text.encode("utf-8")
 679 |             text_io = BytesIO(text_bytes)
 680 |             text_io.name = "text.txt"
 681 |             text_io.content_type = "text/plain"
 682 |             file = text_io
 683 |         elif input_file_path is not None:
 684 |             path = handle_input_file(
 685 |                 file_path=input_file_path, audio_content_check=False
 686 |             )
 687 |             file = open(path, "rb")
 688 | 
 689 |         response = client.conversational_ai.knowledge_base.documents.create_from_file(
 690 |             name=knowledge_base_name,
 691 |             file=file,
 692 |         )
 693 | 
 694 |     agent = client.conversational_ai.agents.get(agent_id=agent_id)
 695 | 
 696 |     agent_config = agent.conversation_config.agent
 697 |     knowledge_base_list = (
 698 |         agent_config.get("prompt", {}).get("knowledge_base", []) if agent_config else []
 699 |     )
 700 |     knowledge_base_list.append(
 701 |         KnowledgeBaseLocator(
 702 |             type="url" if url is not None else "file",
 703 |             name=knowledge_base_name,
 704 |             id=response.id,
 705 |         )
 706 |     )
 707 | 
 708 |     if agent_config and "prompt" not in agent_config:
 709 |         agent_config["prompt"] = {}
 710 |     if agent_config:
 711 |         agent_config["prompt"]["knowledge_base"] = knowledge_base_list
 712 | 
 713 |     client.conversational_ai.agents.update(
 714 |         agent_id=agent_id, conversation_config=agent.conversation_config
 715 |     )
 716 |     return TextContent(
 717 |         type="text",
 718 |         text=f"""Knowledge base created with ID: {response.id} and added to agent {agent_id} successfully.""",
 719 |     )
 720 | 
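# Illustrative sketch, not part of the original module: the "exactly one of
# url, input_file_path, text" rule enforced by add_knowledge_base_to_agent,
# written as a standalone helper. The name `pick_knowledge_source` and the use
# of ValueError (instead of make_error) are assumptions made for illustration.
def pick_knowledge_source(url=None, input_file_path=None, text=None):
    """Return (kind, value) for the single source that was provided."""
    provided = {
        kind: value
        for kind, value in (("url", url), ("file", input_file_path), ("text", text))
        if value is not None
    }
    if len(provided) != 1:
        raise ValueError("Must provide exactly one of: URL, file, or text")
    # Exactly one entry remains, so return it as a (kind, value) pair.
    return next(iter(provided.items()))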
 721 | 
 722 | @mcp.tool(description="List all available conversational AI agents")
 723 | def list_agents() -> TextContent:
 724 |     """List all available conversational AI agents.
 725 | 
 726 |     Returns:
 727 |         TextContent with a formatted list of available agents
 728 |     """
 729 |     response = client.conversational_ai.agents.list()
 730 | 
 731 |     if not response.agents:
 732 |         return TextContent(type="text", text="No agents found.")
 733 | 
 734 |     agent_list = ", ".join(
 735 |         f"{agent.name} (ID: {agent.agent_id})" for agent in response.agents
 736 |     )
 737 | 
 738 |     return TextContent(type="text", text=f"Available agents: {agent_list}")
 739 | 
 740 | 
 741 | @mcp.tool(description="Get details about a specific conversational AI agent")
 742 | def get_agent(agent_id: str) -> TextContent:
 743 |     """Get details about a specific conversational AI agent.
 744 | 
 745 |     Args:
 746 |         agent_id: The ID of the agent to retrieve
 747 | 
 748 |     Returns:
 749 |         TextContent with detailed information about the agent
 750 |     """
 751 |     response = client.conversational_ai.agents.get(agent_id=agent_id)
 752 | 
 753 |     voice_info = "None"
 754 |     if response.conversation_config.tts:
 755 |         voice_info = f"Voice ID: {response.conversation_config.tts.voice_id}"
 756 | 
 757 |     return TextContent(
 758 |         type="text",
 759 |         text=f"Agent Details: Name: {response.name}, Agent ID: {response.agent_id}, Voice Configuration: {voice_info}, Created At: {datetime.fromtimestamp(response.metadata.created_at_unix_secs).strftime('%Y-%m-%d %H:%M:%S')}",
 760 |     )
 761 | 
 762 | 
 763 | @mcp.tool(
 764 |     description="""Gets conversation with transcript. Returns: conversation details and full transcript. Use when: analyzing completed agent conversations.
 765 | 
 766 |     Args:
 767 |         conversation_id: The unique identifier of the conversation to retrieve. IDs can be obtained from the list_conversations tool.
 768 |     """
 769 | )
 770 | def get_conversation(
 771 |     conversation_id: str,
 772 | ) -> TextContent:
 773 |     """Get conversation details with transcript"""
 774 |     try:
 775 |         response = client.conversational_ai.conversations.get(conversation_id)
 776 | 
 777 |         # Parse transcript using utility function
 778 |         transcript, _ = parse_conversation_transcript(response.transcript)
 779 | 
 780 |         response_text = f"""Conversation Details:
 781 | ID: {response.conversation_id}
 782 | Status: {response.status}
 783 | Agent ID: {response.agent_id}
 784 | Message Count: {len(response.transcript)}
 785 | 
 786 | Transcript:
 787 | {transcript}"""
 788 | 
 789 |         if response.metadata:
 790 |             metadata = response.metadata
 791 |             duration = getattr(
 792 |                 metadata,
 793 |                 "call_duration_secs",
 794 |                 getattr(metadata, "duration_seconds", "N/A"),
 795 |             )
 796 |             started_at = getattr(
 797 |                 metadata, "start_time_unix_secs", getattr(metadata, "started_at", "N/A")
 798 |             )
 799 |             response_text += (
 800 |                 f"\n\nMetadata:\nDuration: {duration} seconds\nStarted: {started_at}"
 801 |             )
 802 | 
 803 |         if response.analysis:
 804 |             analysis_summary = getattr(
 805 |                 response.analysis, "summary", "Analysis available but no summary"
 806 |             )
 807 |             response_text += f"\n\nAnalysis:\n{analysis_summary}"
 808 | 
 809 |         return TextContent(type="text", text=response_text)
 810 | 
 811 |     except Exception as e:
 812 |         make_error(f"Failed to fetch conversation: {str(e)}")
 813 |         # satisfies type checker
 814 |         return TextContent(type="text", text="")
 815 | 
 816 | 
 817 | @mcp.tool(
 818 |     description="""Lists agent conversations. Returns: conversation list with metadata. Use when: asked about conversation history.
 819 | 
 820 |     Args:
 821 |         agent_id (str, optional): Filter conversations by specific agent ID
 822 |         cursor (str, optional): Pagination cursor for retrieving next page of results
 823 |         call_start_before_unix (int, optional): Filter conversations that started before this Unix timestamp
 824 |         call_start_after_unix (int, optional): Filter conversations that started after this Unix timestamp
 825 |         page_size (int, optional): Number of conversations to return per page (1-100, defaults to 30)
 826 |         max_length (int, optional): Maximum character length of the response text (defaults to 10000)
 827 |     """
 828 | )
 829 | def list_conversations(
 830 |     agent_id: str | None = None,
 831 |     cursor: str | None = None,
 832 |     call_start_before_unix: int | None = None,
 833 |     call_start_after_unix: int | None = None,
 834 |     page_size: int = 30,
 835 |     max_length: int = 10000,
 836 | ) -> TextContent:
 837 |     """List conversations with filtering options."""
 838 |     page_size = max(1, min(page_size, 100))
 839 | 
 840 |     try:
 841 |         response = client.conversational_ai.conversations.list(
 842 |             cursor=cursor,
 843 |             agent_id=agent_id,
 844 |             call_start_before_unix=call_start_before_unix,
 845 |             call_start_after_unix=call_start_after_unix,
 846 |             page_size=page_size,
 847 |         )
 848 | 
 849 |         if not response.conversations:
 850 |             return TextContent(type="text", text="No conversations found.")
 851 | 
 852 |         conv_list = []
 853 |         for conv in response.conversations:
 854 |             start_time = datetime.fromtimestamp(conv.start_time_unix_secs).strftime(
 855 |                 "%Y-%m-%d %H:%M:%S"
 856 |             )
 857 | 
 858 |             conv_info = f"""Conversation ID: {conv.conversation_id}
 859 | Status: {conv.status}
 860 | Agent: {conv.agent_name or 'N/A'} (ID: {conv.agent_id})
 861 | Started: {start_time}
 862 | Duration: {conv.call_duration_secs} seconds
 863 | Messages: {conv.message_count}
 864 | Call Successful: {conv.call_successful}"""
 865 | 
 866 |             conv_list.append(conv_info)
 867 | 
 868 |         formatted_list = "\n\n".join(conv_list)
 869 | 
 870 |         pagination_info = f"Showing {len(response.conversations)} conversations"
 871 |         if response.has_more:
 872 |             pagination_info += f" (more available, next cursor: {response.next_cursor})"
 873 | 
 874 |         full_text = f"{pagination_info}\n\n{formatted_list}"
 875 | 
 876 |         # Use utility to handle large text content
 877 |         result_text = handle_large_text(full_text, max_length, "conversation list")
 878 | 
 879 |         # If content was saved to file, prepend pagination info
 880 |         if result_text != full_text:
 881 |             result_text = f"{pagination_info}\n\n{result_text}"
 882 | 
 883 |         return TextContent(type="text", text=result_text)
 884 | 
 885 |     except Exception as e:
 886 |         make_error(f"Failed to list conversations: {str(e)}")
 887 |         # This line is unreachable but satisfies type checker
 888 |         return TextContent(type="text", text="")
 889 | 
 890 | 
 891 | @mcp.tool(
 892 |     description=f"""Transform audio from one voice to another using provided audio files. {get_output_mode_description(output_mode)}.
 893 | 
 894 |     ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.
 895 |     """
 896 | )
 897 | def speech_to_speech(
 898 |     input_file_path: str,
 899 |     voice_name: str = "Adam",
 900 |     output_directory: str | None = None,
 901 | ) -> Union[TextContent, EmbeddedResource]:
 902 |     voices = client.voices.search(search=voice_name)
 903 | 
 904 |     if len(voices.voices) == 0:
 905 |         make_error("No voice found with that name.")
 906 | 
 907 |     voice = next((v for v in voices.voices if v.name == voice_name), None)
 908 | 
 909 |     if voice is None:
 910 |         make_error(f"Voice with name: {voice_name} does not exist.")
 911 | 
 912 |     assert voice is not None  # Type assertion for type checker
 913 |     file_path = handle_input_file(input_file_path)
 914 |     output_path = make_output_path(output_directory, base_path)
 915 |     output_file_name = make_output_file("sts", file_path.name, "mp3")
 916 | 
 917 |     with file_path.open("rb") as f:
 918 |         audio_bytes = f.read()
 919 | 
 920 |     audio_data = client.speech_to_speech.convert(
 921 |         model_id="eleven_multilingual_sts_v2",
 922 |         voice_id=voice.voice_id,
 923 |         audio=audio_bytes,
 924 |     )
 925 | 
 926 |     audio_bytes = b"".join(audio_data)
 927 | 
 928 |     # Handle different output modes
 929 |     return handle_output_mode(audio_bytes, output_path, output_file_name, output_mode)
 930 | 
 931 | 
 932 | @mcp.tool(
 933 |     description=f"""Create voice previews from a text prompt. Creates three previews with slight variations. {get_output_mode_description(output_mode)}.
 934 |     
 935 |     If no text is provided, the tool will auto-generate text.
 936 | 
 937 |     Voice preview files are saved as: voice_design_(generated_voice_id)_(timestamp).mp3
 938 | 
 939 |     Example file name: voice_design_Ya2J5uIa5Pq14DNPsbC1_20250403_164949.mp3
 940 | 
 941 |     ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.
 942 |     """
 943 | )
 944 | def text_to_voice(
 945 |     voice_description: str,
 946 |     text: str | None = None,
 947 |     output_directory: str | None = None,
 948 | ) -> list[EmbeddedResource] | TextContent:
 949 |     if voice_description == "":
 950 |         make_error("Voice description is required.")
 951 | 
 952 |     previews = client.text_to_voice.create_previews(
 953 |         voice_description=voice_description,
 954 |         text=text,
 955 |         auto_generate_text=text is None,
 956 |     )
 957 | 
 958 |     output_path = make_output_path(output_directory, base_path)
 959 | 
 960 |     generated_voice_ids = []
 961 |     results = []
 962 | 
 963 |     for preview in previews.previews:
 964 |         output_file_name = make_output_file(
 965 |             "voice_design", preview.generated_voice_id, "mp3", full_id=True
 966 |         )
 967 |         generated_voice_ids.append(preview.generated_voice_id)
 968 |         audio_bytes = base64.b64decode(preview.audio_base_64)
 969 | 
 970 |         # Handle different output modes
 971 |         result = handle_output_mode(
 972 |             audio_bytes, output_path, output_file_name, output_mode
 973 |         )
 974 |         results.append(result)
 975 | 
 976 |     # Use centralized multiple files output handling
 977 |     additional_info = f"Generated voice IDs are: {', '.join(generated_voice_ids)}"
 978 |     return handle_multiple_files_output_mode(results, output_mode, additional_info)
 979 | 
 980 | 
 981 | @mcp.tool(
 982 |     description="""Add a generated voice to the voice library. Uses the voice ID from the `text_to_voice` tool.
 983 | 
 984 |     ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.
 985 |     """
 986 | )
 987 | def create_voice_from_preview(
 988 |     generated_voice_id: str,
 989 |     voice_name: str,
 990 |     voice_description: str,
 991 | ) -> TextContent:
 992 |     voice = client.text_to_voice.create_voice_from_preview(
 993 |         voice_name=voice_name,
 994 |         voice_description=voice_description,
 995 |         generated_voice_id=generated_voice_id,
 996 |     )
 997 | 
 998 |     return TextContent(
 999 |         type="text",
1000 |         text=f"Success. Voice created: {voice.name} with ID: {voice.voice_id}",
1001 |     )
1002 | 
1003 | 
1004 | def _get_phone_number_by_id(phone_number_id: str):
1005 |     """Helper function to get phone number details by ID."""
1006 |     phone_numbers = client.conversational_ai.phone_numbers.list()
1007 |     for phone in phone_numbers:
1008 |         if phone.phone_number_id == phone_number_id:
1009 |             return phone
1010 |     make_error(f"Phone number with ID {phone_number_id} not found.")
1011 | 
1012 | 
1013 | @mcp.tool(
1014 |     description="""Make an outbound call using an ElevenLabs agent. Automatically detects provider type (Twilio or SIP trunk) and uses the appropriate API.
1015 | 
1016 |     ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.
1017 | 
1018 |     Args:
1019 |         agent_id: The ID of the agent that will handle the call
1020 |         agent_phone_number_id: The ID of the phone number to use for the call
1021 |         to_number: The phone number to call (E.164 format: +1xxxxxxxxxx)
1022 | 
1023 |     Returns:
1024 |         TextContent containing information about the call
1025 |     """
1026 | )
1027 | def make_outbound_call(
1028 |     agent_id: str,
1029 |     agent_phone_number_id: str,
1030 |     to_number: str,
1031 | ) -> TextContent:
1032 |     # Get phone number details to determine provider type
1033 |     phone_number = _get_phone_number_by_id(agent_phone_number_id)
1034 | 
1035 |     if phone_number.provider.lower() == "twilio":
1036 |         response = client.conversational_ai.twilio.outbound_call(
1037 |             agent_id=agent_id,
1038 |             agent_phone_number_id=agent_phone_number_id,
1039 |             to_number=to_number,
1040 |         )
1041 |         provider_info = "Twilio"
1042 |     elif phone_number.provider.lower() == "sip_trunk":
1043 |         response = client.conversational_ai.sip_trunk.outbound_call(
1044 |             agent_id=agent_id,
1045 |             agent_phone_number_id=agent_phone_number_id,
1046 |             to_number=to_number,
1047 |         )
1048 |         provider_info = "SIP trunk"
1049 |     else:
1050 |         make_error(f"Unsupported provider type: {phone_number.provider}")
1051 | 
1052 |     return TextContent(
1053 |         type="text", text=f"Outbound call initiated via {provider_info}: {response}."
1054 |     )
1055 | 
1056 | 
1057 | @mcp.tool(
1058 |     description="""Search for a voice across the entire ElevenLabs voice library.
1059 | 
1060 |     Args:
1061 |         page: Page number to return (0-indexed)
1062 |         page_size: Number of voices to return per page (1-100)
1063 |         search: Search term to filter voices by
1064 | 
1065 |     Returns:
1066 |         TextContent containing information about the shared voices
1067 |     """
1068 | )
1069 | def search_voice_library(
1070 |     page: int = 0,
1071 |     page_size: int = 10,
1072 |     search: str | None = None,
1073 | ) -> TextContent:
1074 |     response = client.voices.get_shared(
1075 |         page=page,
1076 |         page_size=page_size,
1077 |         search=search,
1078 |     )
1079 | 
1080 |     if not response.voices:
1081 |         return TextContent(
1082 |             type="text", text="No shared voices found with the specified criteria."
1083 |         )
1084 | 
1085 |     voice_list = []
1086 |     for voice in response.voices:
1087 |         language_info = "N/A"
1088 |         if hasattr(voice, "verified_languages") and voice.verified_languages:
1089 |             languages = []
1090 |             for lang in voice.verified_languages:
1091 |                 accent_info = (
1092 |                     f" ({lang.accent})"
1093 |                     if hasattr(lang, "accent") and lang.accent
1094 |                     else ""
1095 |                 )
1096 |                 languages.append(f"{lang.language}{accent_info}")
1097 |             language_info = ", ".join(languages)
1098 | 
1099 |         details = [
1100 |             f"Name: {voice.name}",
1101 |             f"ID: {voice.voice_id}",
1102 |             f"Category: {getattr(voice, 'category', 'N/A')}",
1103 |         ]
1104 |         # TODO: Make cleaner
1105 |         if hasattr(voice, "gender") and voice.gender:
1106 |             details.append(f"Gender: {voice.gender}")
1107 |         if hasattr(voice, "age") and voice.age:
1108 |             details.append(f"Age: {voice.age}")
1109 |         if hasattr(voice, "accent") and voice.accent:
1110 |             details.append(f"Accent: {voice.accent}")
1111 |         if hasattr(voice, "description") and voice.description:
1112 |             details.append(f"Description: {voice.description}")
1113 |         if hasattr(voice, "use_case") and voice.use_case:
1114 |             details.append(f"Use Case: {voice.use_case}")
1115 | 
1116 |         details.append(f"Languages: {language_info}")
1117 | 
1118 |         if hasattr(voice, "preview_url") and voice.preview_url:
1119 |             details.append(f"Preview URL: {voice.preview_url}")
1120 | 
1121 |         voice_info = "\n".join(details)
1122 |         voice_list.append(voice_info)
1123 | 
1124 |     formatted_info = "\n\n".join(voice_list)
1125 |     return TextContent(type="text", text=f"Shared Voices:\n\n{formatted_info}")
1126 | 
1127 | 
1128 | @mcp.tool(description="List all phone numbers associated with the ElevenLabs account")
1129 | def list_phone_numbers() -> TextContent:
1130 |     """List all phone numbers associated with the ElevenLabs account.
1131 | 
1132 |     Returns:
1133 |         TextContent containing formatted information about the phone numbers
1134 |     """
1135 |     response = client.conversational_ai.phone_numbers.list()
1136 | 
1137 |     if not response:
1138 |         return TextContent(type="text", text="No phone numbers found.")
1139 | 
1140 |     phone_info = []
1141 |     for phone in response:
1142 |         assigned_agent = "None"
1143 |         if phone.assigned_agent:
1144 |             assigned_agent = f"{phone.assigned_agent.agent_name} (ID: {phone.assigned_agent.agent_id})"
1145 | 
1146 |         phone_info.append(
1147 |             f"Phone Number: {phone.phone_number}\n"
1148 |             f"ID: {phone.phone_number_id}\n"
1149 |             f"Provider: {phone.provider}\n"
1150 |             f"Label: {phone.label}\n"
1151 |             f"Assigned Agent: {assigned_agent}"
1152 |         )
1153 | 
1154 |     formatted_info = "\n\n".join(phone_info)
1155 |     return TextContent(type="text", text=f"Phone Numbers:\n\n{formatted_info}")
1156 | 
1157 | 
1158 | @mcp.tool(description="Play an audio file. Supports WAV and MP3 formats.")
1159 | def play_audio(input_file_path: str) -> TextContent:
1160 |     file_path = handle_input_file(input_file_path)
1161 |     play(file_path.read_bytes(), use_ffmpeg=False)
1162 |     return TextContent(type="text", text=f"Successfully played audio file: {file_path}")
1163 | 
1164 | 
1165 | @mcp.tool(
1166 |     description="""Convert a prompt to music and save the output audio file to a given directory.
1167 |     The directory is optional; if not provided, the output file will be saved to $HOME/Desktop.
1168 | 
1169 |     Args:
1170 |         prompt: Prompt to convert to music. Must provide either prompt or composition_plan.
1171 |         output_directory: Directory to save the output audio file
1172 |         composition_plan: Composition plan to use for the music. Must provide either prompt or composition_plan.
1173 |         music_length_ms: Length of the generated music in milliseconds. Cannot be used if composition_plan is provided.
1174 | 
1175 |     ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user."""
1176 | )
1177 | def compose_music(
1178 |     prompt: str | None = None,
1179 |     output_directory: str | None = None,
1180 |     composition_plan: MusicPrompt | None = None,
1181 |     music_length_ms: int | None = None,
1182 | ) -> Union[TextContent, EmbeddedResource]:
1183 |     if prompt is None and composition_plan is None:
1184 |         make_error(
1185 |             "Either prompt or composition_plan must be provided."
1186 |         )
1187 | 
1188 |     if prompt is not None and composition_plan is not None:
1189 |         make_error("Only one of prompt or composition_plan may be provided")
1190 | 
1191 |     if music_length_ms is not None and composition_plan is not None:
1192 |         make_error("music_length_ms cannot be used if composition_plan is provided")
1193 | 
1194 |     output_path = make_output_path(output_directory, base_path)
1195 |     output_file_name = make_output_file("music", "", "mp3")
1196 | 
1197 |     audio_data = client.music.compose(
1198 |         prompt=prompt,
1199 |         music_length_ms=music_length_ms,
1200 |         composition_plan=composition_plan,
1201 |     )
1202 | 
1203 |     audio_bytes = b"".join(audio_data)
1204 | 
1205 |     # Handle different output modes
1206 |     return handle_output_mode(audio_bytes, output_path, output_file_name, output_mode)
1207 | 
1208 | 
1209 | @mcp.tool(
1210 |     description="""Create a composition plan for music generation. Usage of this endpoint does not cost any credits but is subject to rate limiting depending on your tier. Composition plans can be used when generating music with the compose_music tool.
1211 | 
1212 |     Args:
1213 |         prompt: Prompt to create a composition plan for
1214 |         music_length_ms: The length of the composition plan to generate in milliseconds. Must be between 10000ms and 300000ms. Optional - if not provided, the model will choose a length based on the prompt.
1215 |         source_composition_plan: An optional composition plan to use as a source for the new composition plan
1216 |     """
1217 | )
1218 | def create_composition_plan(
1219 |     prompt: str,
1220 |     music_length_ms: int | None = None,
1221 |     source_composition_plan: MusicPrompt | None = None,
1222 | ) -> MusicPrompt:
1223 |     composition_plan = client.music.composition_plan.create(
1224 |         prompt=prompt,
1225 |         music_length_ms=music_length_ms,
1226 |         source_composition_plan=source_composition_plan,
1227 |     )
1228 | 
1229 |     return composition_plan
1230 | 
1231 | 
1232 | def main():
1233 |     """Run the MCP server"""
1234 |     print("Starting MCP server")
1235 |     mcp.run()
1236 | 
1237 | 
1238 | if __name__ == "__main__":
1239 |     main()
1240 | 
```
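
The argument checks at the top of `compose_music` can be read as three rules: a `prompt` or a `composition_plan` is required, the two are mutually exclusive, and `music_length_ms` is only meaningful with a `prompt`. The sketch below restates those rules in isolation. It is not part of the repository: `validate_compose_args` is a hypothetical helper, and it raises `ValueError` as a stand-in for the project's `make_error`, whose exact exception type is not shown in this file.

```python
from __future__ import annotations


def validate_compose_args(
    prompt: str | None = None,
    composition_plan: object | None = None,
    music_length_ms: int | None = None,
) -> None:
    """Hypothetical helper mirroring the checks at the top of compose_music.

    Raises ValueError (an assumed stand-in for make_error) when the
    combination of arguments is not allowed.
    """
    # Rule 1: at least one source for the music must be given.
    if prompt is None and composition_plan is None:
        raise ValueError("Either prompt or composition_plan must be provided.")

    # Rule 2: the two sources are mutually exclusive.
    if prompt is not None and composition_plan is not None:
        raise ValueError("Only one of prompt or composition_plan may be provided.")

    # Rule 3: a composition plan carries its own length, so an explicit
    # length is rejected alongside it.
    if music_length_ms is not None and composition_plan is not None:
        raise ValueError(
            "music_length_ms cannot be used if composition_plan is provided."
        )


# A prompt with an explicit length is accepted and returns None.
validate_compose_args(prompt="calm piano", music_length_ms=30000)
```

Keeping the checks in one function like this would make them testable without calling the ElevenLabs API, which may matter because `compose_music` itself can incur costs.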