# Directory Structure ``` ├── .env.example ├── .github │ └── workflows │ ├── publish.yml │ └── test.yml ├── .gitignore ├── .pre-commit-config.yaml ├── Dockerfile ├── elevenlabs_mcp │ ├── __init__.py │ ├── __main__.py │ ├── convai.py │ ├── model.py │ ├── server.py │ └── utils.py ├── LICENSE ├── pyproject.toml ├── README.md ├── scripts │ ├── build.sh │ ├── deploy.sh │ ├── dev.sh │ ├── setup.sh │ └── test.sh ├── server.json ├── setup.py ├── tests │ ├── conftest.py │ └── test_utils.py └── uv.lock ``` # Files -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- ```yaml 1 | repos: 2 | - repo: https://github.com/astral-sh/ruff-pre-commit 3 | rev: v0.3.0 4 | hooks: 5 | - id: ruff 6 | args: [--fix, --exit-non-zero-on-fix] 7 | - id: ruff-format ``` -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- ``` 1 | *.pyc 2 | *.pyo 3 | *.pyd 4 | *.pyw 5 | *.pyz 6 | *.pywz 7 | 8 | .env 9 | .venv 10 | .cursor 11 | .cursorignore 12 | dist/ 13 | elevenlabs_mcp.egg-info/ 14 | .coverage 15 | coverage.xml 16 | .mcpregistry_github_token 17 | .mcpregistry_registry_token ``` -------------------------------------------------------------------------------- /.env.example: -------------------------------------------------------------------------------- ``` 1 | ELEVENLABS_API_KEY=PUT_YOUR_KEY_HERE 2 | ELEVENLABS_MCP_BASE_PATH=~/Desktop # optional base path for output files 3 | ELEVENLABS_API_RESIDENCY="us" # optional data residency location 4 | ELEVENLABS_MCP_OUTPUT_MODE=files # output mode: files, resources, or both ``` -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- ```markdown 1 |  2 | 3 | <div class="title-block" 
style="text-align: center;" align="center"> 4 | 5 | [](https://discord.gg/elevenlabs) 6 | [](https://x.com/ElevenLabsDevs) 7 | [](https://pypi.org/project/elevenlabs-mcp) 8 | [](https://github.com/elevenlabs/elevenlabs-mcp-server/actions/workflows/test.yml) 9 | 10 | </div> 11 | 12 | 13 | <p align="center"> 14 | Official ElevenLabs <a href="https://github.com/modelcontextprotocol">Model Context Protocol (MCP)</a> server that enables interaction with powerful Text to Speech and audio processing APIs. This server allows MCP clients like <a href="https://www.anthropic.com/claude">Claude Desktop</a>, <a href="https://www.cursor.so">Cursor</a>, <a href="https://codeium.com/windsurf">Windsurf</a>, <a href="https://github.com/openai/openai-agents-python">OpenAI Agents</a> and others to generate speech, clone voices, transcribe audio, and more. 15 | </p> 16 | 17 | <!-- 18 | mcp-name: io.github.elevenlabs/elevenlabs-mcp 19 | --> 20 | 21 | ## Quickstart with Claude Desktop 22 | 23 | 1. Get your API key from [ElevenLabs](https://elevenlabs.io/app/settings/api-keys). There is a free tier with 10k credits per month. 24 | 2. Install `uv` (Python package manager) with `curl -LsSf https://astral.sh/uv/install.sh | sh`, or see the `uv` [repo](https://github.com/astral-sh/uv) for additional install methods. 25 | 3. Go to Claude > Settings > Developer > Edit Config and add the following to `claude_desktop_config.json`: 26 | 27 | ``` 28 | { 29 | "mcpServers": { 30 | "ElevenLabs": { 31 | "command": "uvx", 32 | "args": ["elevenlabs-mcp"], 33 | "env": { 34 | "ELEVENLABS_API_KEY": "<insert-your-api-key-here>" 35 | } 36 | } 37 | } 38 | } 39 | 40 | ``` 41 | 42 | If you're using Windows, you will have to enable "Developer Mode" in Claude Desktop to use the MCP server. Click "Help" in the hamburger menu at the top left and select "Enable Developer Mode". 43 | 44 | ## Other MCP clients 45 | 46 | For other clients like Cursor and Windsurf, run: 47 | 1.
`pip install elevenlabs-mcp` 48 | 2. `python -m elevenlabs_mcp --api-key={{PUT_YOUR_API_KEY_HERE}} --print` to get the configuration. Paste it into the appropriate configuration directory specified by your MCP client. 49 | 50 | That's it. Your MCP client can now interact with ElevenLabs through its tools. 51 | 52 | ## Example usage 53 | 54 | ⚠️ Warning: ElevenLabs credits are needed to use these tools. 55 | 56 | Try asking Claude: 57 | 58 | - "Create an AI agent that speaks like a film noir detective and can answer questions about classic movies" 59 | - "Generate three voice variations for a wise, ancient dragon character, then I will choose my favorite voice to add to my voice library" 60 | - "Convert this recording of my voice to sound like a medieval knight" 61 | - "Create a soundscape of a thunderstorm in a dense jungle with animals reacting to the weather" 62 | - "Turn this speech into text, identify different speakers, then convert it back using unique voices for each person" 63 | 64 | ## Optional features 65 | 66 | ### File Output Configuration 67 | 68 | You can configure how the MCP server handles file outputs using these environment variables in your `claude_desktop_config.json`: 69 | 70 | - **`ELEVENLABS_MCP_BASE_PATH`**: Specify the base path for file operations with relative paths (default: `~/Desktop`) 71 | - **`ELEVENLABS_MCP_OUTPUT_MODE`**: Control how generated files are returned (default: `files`) 72 | 73 | #### Output Modes 74 | 75 | The `ELEVENLABS_MCP_OUTPUT_MODE` environment variable supports three modes: 76 | 77 | 1. **`files`** (default): Save files to disk and return file paths 78 | ```json 79 | "env": { 80 | "ELEVENLABS_API_KEY": "your-api-key", 81 | "ELEVENLABS_MCP_OUTPUT_MODE": "files" 82 | } 83 | ``` 84 | 85 | 2.
**`resources`**: Return files as MCP resources; binary data is base64-encoded, text is returned as UTF-8 text 86 | ```json 87 | "env": { 88 | "ELEVENLABS_API_KEY": "your-api-key", 89 | "ELEVENLABS_MCP_OUTPUT_MODE": "resources" 90 | } 91 | ``` 92 | 93 | 3. **`both`**: Save files to disk AND return as MCP resources 94 | ```json 95 | "env": { 96 | "ELEVENLABS_API_KEY": "your-api-key", 97 | "ELEVENLABS_MCP_OUTPUT_MODE": "both" 98 | } 99 | ``` 100 | 101 | **Resource Mode Benefits:** 102 | - Files are returned directly in the MCP response as base64-encoded data 103 | - No disk I/O required - useful for containerized or serverless environments 104 | - MCP clients can access file content immediately without file system access 105 | - In `both` mode, resources can be fetched later using the `elevenlabs://filename` URI pattern 106 | 107 | **Use Cases:** 108 | - `files`: Traditional file-based workflows, local development 109 | - `resources`: Cloud environments, MCP clients without file system access 110 | - `both`: Maximum flexibility, caching, and resource sharing scenarios 111 | 112 | ### Data residency keys 113 | 114 | You can specify the data residency region with the `ELEVENLABS_API_RESIDENCY` environment variable. Defaults to `"us"`. 115 | 116 | **Note:** Data residency is an enterprise only feature. See [the docs](https://elevenlabs.io/docs/product-guides/administration/data-residency#overview) for more details. 117 | 118 | ## Contributing 119 | 120 | If you want to contribute or run from source: 121 | 122 | 1. Clone the repository: 123 | 124 | ```bash 125 | git clone https://github.com/elevenlabs/elevenlabs-mcp 126 | cd elevenlabs-mcp 127 | ``` 128 | 129 | 2. Create a virtual environment and install dependencies [using uv](https://github.com/astral-sh/uv): 130 | 131 | ```bash 132 | uv venv 133 | source .venv/bin/activate 134 | uv pip install -e ".[dev]" 135 | ``` 136 | 137 | 3. 
Copy `.env.example` to `.env` and add your ElevenLabs API key: 138 | 139 | ```bash 140 | cp .env.example .env 141 | # Edit .env and add your API key 142 | ``` 143 | 144 | 4. Run the tests to make sure everything is working: 145 | 146 | ```bash 147 | ./scripts/test.sh 148 | # Or with options 149 | ./scripts/test.sh --verbose --fail-fast 150 | ``` 151 | 152 | 5. Install the server in Claude Desktop: `mcp install elevenlabs_mcp/server.py` 153 | 154 | 6. Debug and test locally with MCP Inspector: `mcp dev elevenlabs_mcp/server.py` 155 | 156 | ## Troubleshooting 157 | 158 | Logs when running with Claude Desktop can be found at: 159 | 160 | - **Windows**: `%APPDATA%\Claude\logs\mcp-server-elevenlabs.log` 161 | - **macOS**: `~/Library/Logs/Claude/mcp-server-elevenlabs.log` 162 | 163 | ### Timeouts when using certain tools 164 | 165 | Certain ElevenLabs API operations, like voice design and audio isolation, can take a long time to resolve. When using the MCP inspector in dev mode, you might get timeout errors despite the tool completing its intended task. 166 | 167 | This shouldn't occur when using a client like Claude. 168 | 169 | ### MCP ElevenLabs: spawn uvx ENOENT 170 | 171 | If you encounter the error "MCP ElevenLabs: spawn uvx ENOENT", confirm its absolute path by running this command in your terminal: 172 | 173 | ```bash 174 | which uvx 175 | ``` 176 | 177 | Once you obtain the absolute path (e.g., `/usr/local/bin/uvx`), update your configuration to use that path (e.g., `"command": "/usr/local/bin/uvx"`). This ensures that the correct executable is referenced. 
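The path lookup described above can also be scripted. The following is a minimal sketch, not part of this repository: `build_claude_config` is a hypothetical helper name, and the sketch assumes `uvx` is on the `PATH` of the shell you run it from. It uses `shutil.which` from the Python standard library to resolve the absolute path and prints a config entry in the same shape as the Quickstart example.

```python
import json
import shutil


def build_claude_config(api_key: str, command: str = "uvx") -> dict:
    """Build a Claude Desktop config entry that launches the server via an absolute path.

    Hypothetical helper for illustration; it is not shipped with elevenlabs-mcp.
    """
    # shutil.which behaves like the shell's `which`: absolute path, or None if not found.
    resolved = shutil.which(command)
    return {
        "mcpServers": {
            "ElevenLabs": {
                # Fall back to the bare command name when it cannot be resolved.
                "command": resolved or command,
                "args": ["elevenlabs-mcp"],
                "env": {"ELEVENLABS_API_KEY": api_key},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_claude_config("<insert-your-api-key-here>"), indent=2))
```

If `uvx` cannot be found, the sketch falls back to the bare command name, so the output is still valid JSON but the ENOENT error would remain until `uv` is installed or its location is added to `PATH`.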
178 | 179 | 180 | 181 | ``` -------------------------------------------------------------------------------- /scripts/build.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | rm -rf dist/ build/ *.egg-info/ 3 | uv build ``` -------------------------------------------------------------------------------- /elevenlabs_mcp/__init__.py: -------------------------------------------------------------------------------- ```python 1 | """ElevenLabs MCP Server package.""" 2 | 3 | __version__ = "0.2.1" 4 | ``` -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- ```python 1 | from setuptools import setup, find_packages 2 | 3 | setup( 4 | packages=find_packages(), 5 | include_package_data=True, 6 | ) 7 | ``` -------------------------------------------------------------------------------- /scripts/dev.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | uv run fastmcp dev elevenlabs_mcp/server.py --with python-dotenv --with elevenlabs --with fuzzywuzzy --with python-Levenshtein --with sounddevice --with soundfile --with-editable . ``` -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- ```dockerfile 1 | FROM python:3.11-slim 2 | 3 | # Install system dependencies 4 | RUN apt-get update && apt-get install -y gcc && rm -rf /var/lib/apt/lists/* 5 | 6 | WORKDIR /app 7 | 8 | # Copy the application code to the container 9 | COPY . . 10 | 11 | # Upgrade pip and install the package 12 | RUN pip install --upgrade pip \ 13 | && pip install --no-cache-dir . 
14 | 15 | # Command to run the MCP server 16 | CMD ["elevenlabs-mcp"] ``` -------------------------------------------------------------------------------- /tests/conftest.py: -------------------------------------------------------------------------------- ```python 1 | import pytest 2 | from pathlib import Path 3 | import tempfile 4 | 5 | 6 | @pytest.fixture 7 | def temp_dir(): 8 | with tempfile.TemporaryDirectory() as temp_dir: 9 | yield Path(temp_dir) 10 | 11 | 12 | @pytest.fixture 13 | def sample_audio_file(temp_dir): 14 | audio_file = temp_dir / "test.mp3" 15 | audio_file.touch() 16 | return audio_file 17 | 18 | 19 | @pytest.fixture 20 | def sample_video_file(temp_dir): 21 | video_file = temp_dir / "test.mp4" 22 | video_file.touch() 23 | return video_file 24 | ``` -------------------------------------------------------------------------------- /scripts/deploy.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | 3 | # Check if environment argument is provided 4 | if [[ $# -lt 1 ]]; then 5 | echo "Usage: $0 [test|prod]" 6 | exit 1 7 | fi 8 | 9 | # Clean previous builds 10 | rm -rf dist/ build/ *.egg-info/ 11 | 12 | # Build the package 13 | uv build 14 | 15 | if [ "$1" = "test" ]; then 16 | uv run twine upload --repository testpypi dist/* --verbose 17 | elif [ "$1" = "prod" ]; then 18 | uv run twine upload --repository pypi dist/* 19 | else 20 | echo "Please specify 'test' or 'prod' as the argument" 21 | exit 1 22 | fi ``` -------------------------------------------------------------------------------- /elevenlabs_mcp/model.py: -------------------------------------------------------------------------------- ```python 1 | from pydantic import BaseModel 2 | from typing import Dict, Optional 3 | 4 | 5 | class McpVoice(BaseModel): 6 | id: str 7 | name: str 8 | category: str 9 | fine_tuning_status: Optional[Dict] = None 10 | 11 | 12 | class ConvAiAgentListItem(BaseModel): 13 | name: str 14 | agent_id: str 15 | 
16 | 17 | class ConvaiAgent(BaseModel): 18 | name: str 19 | agent_id: str 20 | system_prompt: str 21 | voice_id: str | None 22 | language: str 23 | llm: str 24 | 25 | 26 | class McpLanguage(BaseModel): 27 | language_id: str 28 | name: str 29 | 30 | 31 | class McpModel(BaseModel): 32 | id: str 33 | name: str 34 | languages: list[McpLanguage] 35 | ``` -------------------------------------------------------------------------------- /.github/workflows/test.yml: -------------------------------------------------------------------------------- ```yaml 1 | name: Run Tests 2 | 3 | on: 4 | push: 5 | branches: 6 | - main 7 | pull_request: 8 | jobs: 9 | test: 10 | runs-on: ubuntu-latest 11 | steps: 12 | - uses: actions/checkout@v3 13 | 14 | - name: Set up Python 3.11 15 | uses: actions/setup-python@v4 16 | with: 17 | python-version: "3.11" 18 | 19 | - name: Install uv 20 | run: python -m pip install uv 21 | 22 | - name: Install dependencies 23 | run: | 24 | uv pip install --system -e ".[dev]" 25 | 26 | - name: Run tests 27 | run: | 28 | uv run pytest --cov=elevenlabs_mcp --cov-report=xml 29 | 30 | - name: Upload coverage to Codecov 31 | uses: codecov/codecov-action@v3 32 | with: 33 | file: ./coverage.xml 34 | fail_ci_if_error: false 35 | verbose: true 36 | ``` -------------------------------------------------------------------------------- /scripts/test.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | 3 | # Set default variables 4 | COVERAGE=true 5 | VERBOSE=false 6 | FAIL_FAST=false 7 | 8 | # Process command-line arguments 9 | while [[ $# -gt 0 ]]; do 10 | case $1 in 11 | --no-coverage) 12 | COVERAGE=false 13 | shift 14 | ;; 15 | --verbose|-v) 16 | VERBOSE=true 17 | shift 18 | ;; 19 | --fail-fast|-f) 20 | FAIL_FAST=true 21 | shift 22 | ;; 23 | *) 24 | echo "Unknown option: $1" 25 | echo "Usage: ./test.sh [--no-coverage] [--verbose|-v] [--fail-fast|-f]" 26 | exit 1 27 | ;; 28 | esac 29 | done 30 | 31 | # Build the 
command 32 | CMD="python -m pytest" 33 | 34 | if [ "$COVERAGE" = true ]; then 35 | CMD="$CMD --cov=elevenlabs_mcp" 36 | fi 37 | 38 | if [ "$VERBOSE" = true ]; then 39 | CMD="$CMD -v" 40 | fi 41 | 42 | if [ "$FAIL_FAST" = true ]; then 43 | CMD="$CMD -x" 44 | fi 45 | 46 | # Run the tests 47 | echo "Running tests with command: $CMD" 48 | $CMD ``` -------------------------------------------------------------------------------- /scripts/setup.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | 3 | # Ensure uv is available 4 | if ! command -v uv &> /dev/null; then 5 | echo "Error: uv is not installed. Please install it first:" 6 | echo "pip install uv" 7 | exit 1 8 | fi 9 | 10 | # Create or update virtual environment 11 | echo "Creating/updating virtual environment..." 12 | uv venv .venv 13 | 14 | # Activate virtual environment based on shell 15 | if [[ "$SHELL" == */zsh ]]; then 16 | source .venv/bin/activate 17 | elif [[ "$SHELL" == */bash ]]; then 18 | source .venv/bin/activate 19 | else 20 | echo "Please activate the virtual environment manually:" 21 | echo "source .venv/bin/activate" 22 | fi 23 | 24 | # Install dependencies 25 | echo "Installing dependencies with uv..." 26 | uv pip install -e ".[dev]" 27 | 28 | # Install pre-commit hooks 29 | echo "Setting up pre-commit hooks..." 30 | pre-commit install 31 | 32 | echo "Setup complete! Virtual environment is ready." 
``` -------------------------------------------------------------------------------- /.github/workflows/publish.yml: -------------------------------------------------------------------------------- ```yaml 1 | name: Publish Python Package 2 | 3 | on: 4 | push: 5 | tags: 6 | - "v*" 7 | 8 | jobs: 9 | deploy: 10 | runs-on: ubuntu-latest 11 | steps: 12 | - uses: actions/checkout@v3 13 | 14 | - name: Set up Python 15 | uses: actions/setup-python@v4 16 | with: 17 | python-version: "3.11" 18 | 19 | - name: Install uv 20 | run: pip install uv 21 | 22 | - name: Verify tag matches package version 23 | run: | 24 | # Extract version from tag (remove 'v' prefix) 25 | TAG_VERSION=${GITHUB_REF#refs/tags/v} 26 | 27 | # Extract version from pyproject.toml 28 | PACKAGE_VERSION=$(grep -o 'version = "[^"]*"' pyproject.toml | cut -d'"' -f2) 29 | 30 | echo "Tag version: $TAG_VERSION" 31 | echo "Package version: $PACKAGE_VERSION" 32 | 33 | # Verify versions match 34 | if [ "$TAG_VERSION" != "$PACKAGE_VERSION" ]; then 35 | echo "Error: Tag version ($TAG_VERSION) does not match package version ($PACKAGE_VERSION)" 36 | exit 1 37 | fi 38 | 39 | - name: Install dependencies 40 | run: | 41 | uv pip install --system -e ".[dev]" 42 | 43 | - name: Run tests 44 | run: | 45 | uv run pytest --cov=elevenlabs_mcp 46 | 47 | - name: Build package 48 | run: | 49 | uv build 50 | 51 | - name: Publish to PyPI 52 | uses: pypa/gh-action-pypi-publish@release/v1 53 | with: 54 | user: __token__ 55 | password: ${{ secrets.PYPI_API_TOKEN }} 56 | skip-existing: true 57 | ``` -------------------------------------------------------------------------------- /server.json: -------------------------------------------------------------------------------- ```json 1 | { 2 | "$schema": "https://static.modelcontextprotocol.io/schemas/2025-07-09/server.schema.json", 3 | "name": "io.github.elevenlabs/elevenlabs-mcp", 4 | "description": "MCP server that enables interaction with Text to Speech, Voice Agents and audio processing 
APIs", 5 | "status": "active", 6 | "repository": { 7 | "url": "https://github.com/elevenlabs/elevenlabs-mcp", 8 | "source": "github" 9 | }, 10 | "version": "0.9.0", 11 | "packages": [ 12 | { 13 | "registry_type": "pypi", 14 | "registry_base_url": "https://pypi.org", 15 | "identifier": "elevenlabs-mcp", 16 | "version": "0.8.1", 17 | "transport": { 18 | "type": "stdio" 19 | }, 20 | "environment_variables": [ 21 | { 22 | "description": "Your ElevenLabs API key", 23 | "is_required": true, 24 | "format": "string", 25 | "is_secret": true, 26 | "name": "ELEVENLABS_API_KEY" 27 | }, 28 | { 29 | "description": "The base path for the MCP server. Defaults to $HOME/Desktop if not provided.", 30 | "is_required": false, 31 | "format": "string", 32 | "is_secret": false, 33 | "name": "ELEVENLABS_MCP_BASE_PATH" 34 | }, 35 | { 36 | "description": "The optional data residency region. Defaults to 'us' if not provided.", 37 | "is_required": false, 38 | "format": "string", 39 | "is_secret": false, 40 | "name": "ELEVENLABS_API_RESIDENCY" 41 | }, 42 | { 43 | "description": "The output mode for the MCP server. 
Defaults to 'files' if not provided.", 44 | "is_required": false, 45 | "format": "string", 46 | "is_secret": false, 47 | "name": "ELEVENLABS_MCP_OUTPUT_MODE" 48 | } 49 | ] 50 | } 51 | ] 52 | } ``` -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- ```toml 1 | [project] 2 | name = "elevenlabs-mcp" 3 | version = "0.9.0" 4 | description = "ElevenLabs MCP Server" 5 | authors = [ 6 | { name = "Jacek Duszenko", email = "[email protected]" }, 7 | { name = "Paul Asjes", email = "[email protected]" }, 8 | { name = "Louis Jordan", email = "[email protected]" }, 9 | { name = "Luke Harries", email = "[email protected]" }, 10 | ] 11 | readme = "README.md" 12 | license = { file = "LICENSE" } 13 | classifiers = [ 14 | "Development Status :: 4 - Beta", 15 | "Intended Audience :: Developers", 16 | "License :: OSI Approved :: MIT License", 17 | "Programming Language :: Python :: 3", 18 | "Programming Language :: Python :: 3.11", 19 | "Programming Language :: Python :: 3.12", 20 | ] 21 | keywords = [ 22 | "elevenlabs", 23 | "mcp", 24 | "text-to-speech", 25 | "speech-to-text", 26 | "voice-cloning", 27 | ] 28 | requires-python = ">=3.11" 29 | dependencies = [ 30 | "mcp[cli]>=1.6.0", 31 | "fastapi==0.109.2", 32 | "uvicorn==0.27.1", 33 | "python-dotenv==1.0.1", 34 | "pydantic>=2.6.1", 35 | "httpx==0.28.1", 36 | "elevenlabs>=2.13.0", 37 | "fuzzywuzzy==0.18.0", 38 | "python-Levenshtein>=0.25.0", 39 | "sounddevice==0.5.1", 40 | "soundfile==0.13.1", 41 | ] 42 | 43 | [project.scripts] 44 | elevenlabs-mcp = "elevenlabs_mcp.server:main" 45 | 46 | [project.optional-dependencies] 47 | dev = [ 48 | "pre-commit==3.6.2", 49 | "ruff==0.3.0", 50 | "fastmcp==0.4.1", 51 | "pytest==8.0.0", 52 | "pytest-cov==4.1.0", 53 | "twine==6.1.0", 54 | "build>=1.0.3", 55 | ] 56 | 57 | [build-system] 58 | requires = ["setuptools>=45", "wheel"] 59 | build-backend = "setuptools.build_meta" 60 
| 61 | [tool.pytest.ini_options] 62 | testpaths = ["tests"] 63 | python_files = ["test_*.py"] 64 | addopts = "-v --cov=elevenlabs_mcp --cov-report=term-missing" 65 | 66 | [dependency-groups] 67 | dev = [ 68 | "build>=1.2.2.post1", 69 | "fastmcp>=0.4.1", 70 | "pre-commit>=3.6.2", 71 | "pytest>=8.0.0", 72 | "pytest-cov>=4.1.0", 73 | "ruff>=0.3.0", 74 | "twine>=6.1.0", 75 | ] 76 | ``` -------------------------------------------------------------------------------- /tests/test_utils.py: -------------------------------------------------------------------------------- ```python 1 | import pytest 2 | from pathlib import Path 3 | import tempfile 4 | from elevenlabs_mcp.utils import ( 5 | ElevenLabsMcpError, 6 | make_error, 7 | is_file_writeable, 8 | make_output_file, 9 | make_output_path, 10 | find_similar_filenames, 11 | try_find_similar_files, 12 | handle_input_file, 13 | ) 14 | 15 | 16 | def test_make_error(): 17 | with pytest.raises(ElevenLabsMcpError): 18 | make_error("Test error") 19 | 20 | 21 | def test_is_file_writeable(): 22 | with tempfile.TemporaryDirectory() as temp_dir: 23 | temp_path = Path(temp_dir) 24 | assert is_file_writeable(temp_path) is True 25 | assert is_file_writeable(temp_path / "nonexistent.txt") is True 26 | 27 | 28 | def test_make_output_file(): 29 | tool = "test" 30 | text = "hello world" 31 | result = make_output_file(tool, text, "mp3") 32 | assert result.name.startswith("test_hello") 33 | assert result.suffix == ".mp3" 34 | 35 | 36 | def test_make_output_path(): 37 | with tempfile.TemporaryDirectory() as temp_dir: 38 | result = make_output_path(temp_dir) 39 | assert result == Path(temp_dir) 40 | assert result.exists() 41 | assert result.is_dir() 42 | 43 | 44 | def test_find_similar_filenames(): 45 | with tempfile.TemporaryDirectory() as temp_dir: 46 | temp_path = Path(temp_dir) 47 | test_file = temp_path / "test_file.txt" 48 | similar_file = temp_path / "test_file_2.txt" 49 | different_file = temp_path / "different.txt" 50 | 51 | 
test_file.touch() 52 | similar_file.touch() 53 | different_file.touch() 54 | 55 | results = find_similar_filenames(str(test_file), temp_path) 56 | assert len(results) > 0 57 | assert any(str(similar_file) in str(r[0]) for r in results) 58 | 59 | 60 | def test_try_find_similar_files(): 61 | with tempfile.TemporaryDirectory() as temp_dir: 62 | temp_path = Path(temp_dir) 63 | test_file = temp_path / "test_file.mp3" 64 | similar_file = temp_path / "test_file_2.mp3" 65 | different_file = temp_path / "different.txt" 66 | 67 | test_file.touch() 68 | similar_file.touch() 69 | different_file.touch() 70 | 71 | results = try_find_similar_files(str(test_file), temp_path) 72 | assert len(results) > 0 73 | assert any(str(similar_file) in str(r) for r in results) 74 | 75 | 76 | def test_handle_input_file(): 77 | with tempfile.TemporaryDirectory() as temp_dir: 78 | temp_path = Path(temp_dir) 79 | test_file = temp_path / "test.mp3" 80 | 81 | with open(test_file, "wb") as f: 82 | f.write(b"\xff\xfb\x90\x64\x00") 83 | 84 | result = handle_input_file(str(test_file)) 85 | assert result == test_file 86 | 87 | with pytest.raises(ElevenLabsMcpError): 88 | handle_input_file(str(temp_path / "nonexistent.mp3")) 89 | ``` -------------------------------------------------------------------------------- /elevenlabs_mcp/__main__.py: -------------------------------------------------------------------------------- ```python 1 | import os 2 | import json 3 | from pathlib import Path 4 | import sys 5 | from dotenv import load_dotenv 6 | import argparse 7 | 8 | load_dotenv() 9 | 10 | 11 | def get_claude_config_path() -> Path | None: 12 | """Get the Claude config directory based on platform.""" 13 | if sys.platform == "win32": 14 | path = Path(Path.home(), "AppData", "Roaming", "Claude") 15 | elif sys.platform == "darwin": 16 | path = Path(Path.home(), "Library", "Application Support", "Claude") 17 | elif sys.platform.startswith("linux"): 18 | path = Path( 19 | os.environ.get("XDG_CONFIG_HOME", 
Path.home() / ".config"), "Claude" 20 | ) 21 | else: 22 | return None 23 | 24 | if path.exists(): 25 | return path 26 | return None 27 | 28 | 29 | def get_python_path(): 30 | return sys.executable 31 | 32 | 33 | def generate_config(api_key: str | None = None): 34 | module_dir = Path(__file__).resolve().parent 35 | server_path = module_dir / "server.py" 36 | python_path = get_python_path() 37 | 38 | final_api_key = api_key or os.environ.get("ELEVENLABS_API_KEY") 39 | if not final_api_key: 40 | print("Error: ElevenLabs API key is required.") 41 | print("Please either:") 42 | print(" 1. Pass the API key using --api-key argument, or") 43 | print(" 2. Set the ELEVENLABS_API_KEY environment variable, or") 44 | print(" 3. Add ELEVENLABS_API_KEY to your .env file") 45 | sys.exit(1) 46 | 47 | config = { 48 | "mcpServers": { 49 | "ElevenLabs": { 50 | "command": python_path, 51 | "args": [ 52 | str(server_path), 53 | ], 54 | "env": {"ELEVENLABS_API_KEY": final_api_key}, 55 | } 56 | } 57 | } 58 | 59 | return config 60 | 61 | 62 | if __name__ == "__main__": 63 | parser = argparse.ArgumentParser() 64 | parser.add_argument( 65 | "--print", 66 | action="store_true", 67 | help="Print config to screen instead of writing to file", 68 | ) 69 | parser.add_argument( 70 | "--api-key", 71 | help="ElevenLabs API key (alternatively, set ELEVENLABS_API_KEY environment variable)", 72 | ) 73 | parser.add_argument( 74 | "--config-path", 75 | type=Path, 76 | help="Custom path to Claude config directory", 77 | ) 78 | args = parser.parse_args() 79 | 80 | config = generate_config(args.api_key) 81 | 82 | if args.print: 83 | print(json.dumps(config, indent=2)) 84 | else: 85 | claude_path = args.config_path if args.config_path else get_claude_config_path() 86 | if claude_path is None: 87 | print( 88 | "Could not find Claude config path automatically. Please specify it using --config-path argument. The argument should be the absolute path of the directory that contains claude_desktop_config.json."
89 | ) 90 | sys.exit(1) 91 | 92 | claude_path.mkdir(parents=True, exist_ok=True) 93 | print("Writing config to", claude_path / "claude_desktop_config.json") 94 | with open(claude_path / "claude_desktop_config.json", "w") as f: 95 | json.dump(config, f, indent=2) 96 | ``` -------------------------------------------------------------------------------- /elevenlabs_mcp/convai.py: -------------------------------------------------------------------------------- ```python 1 | def create_conversation_config( 2 | language: str, 3 | system_prompt: str, 4 | llm: str, 5 | first_message: str | None, 6 | temperature: float, 7 | max_tokens: int | None, 8 | asr_quality: str, 9 | voice_id: str | None, 10 | model_id: str, 11 | optimize_streaming_latency: int, 12 | stability: float, 13 | similarity_boost: float, 14 | turn_timeout: int, 15 | max_duration_seconds: int, 16 | ) -> dict: 17 | return { 18 | "agent": { 19 | "language": language, 20 | "prompt": { 21 | "prompt": system_prompt, 22 | "llm": llm, 23 | "tools": [{"type": "system", "name": "end_call", "description": ""}], 24 | "knowledge_base": [], 25 | "temperature": temperature, 26 | **({"max_tokens": max_tokens} if max_tokens else {}), 27 | }, 28 | **({"first_message": first_message} if first_message else {}), 29 | "dynamic_variables": {"dynamic_variable_placeholders": {}}, 30 | }, 31 | "asr": { 32 | "quality": asr_quality, 33 | "provider": "elevenlabs", 34 | "user_input_audio_format": "pcm_16000", 35 | "keywords": [], 36 | }, 37 | "tts": { 38 | **({"voice_id": voice_id} if voice_id else {}), 39 | "model_id": model_id, 40 | "agent_output_audio_format": "pcm_16000", 41 | "optimize_streaming_latency": optimize_streaming_latency, 42 | "stability": stability, 43 | "similarity_boost": similarity_boost, 44 | }, 45 | "turn": {"turn_timeout": turn_timeout}, 46 | "conversation": { 47 | "max_duration_seconds": max_duration_seconds, 48 | "client_events": [ 49 | "audio", 50 | "interruption", 51 | "user_transcript", 52 | "agent_response", 
53 | "agent_response_correction", 54 | ], 55 | }, 56 | "language_presets": {}, 57 | "is_blocked_ivc": False, 58 | "is_blocked_non_ivc": False, 59 | } 60 | 61 | 62 | def create_platform_settings( 63 | record_voice: bool, 64 | retention_days: int, 65 | ) -> dict: 66 | return { 67 | "widget": { 68 | "variant": "full", 69 | "avatar": {"type": "orb", "color_1": "#6DB035", "color_2": "#F5CABB"}, 70 | "feedback_mode": "during", 71 | "terms_text": '#### Terms and conditions\n\nBy clicking "Agree," and each time I interact with this AI agent, I consent to the recording, storage, and sharing of my communications with third-party service providers, and as described in the Privacy Policy.\nIf you do not wish to have your conversations recorded, please refrain from using this service.', 72 | "show_avatar_when_collapsed": True, 73 | }, 74 | "evaluation": {}, 75 | "auth": {"allowlist": []}, 76 | "overrides": {}, 77 | "call_limits": {"agent_concurrency_limit": -1, "daily_limit": 100000}, 78 | "privacy": { 79 | "record_voice": record_voice, 80 | "retention_days": retention_days, 81 | "delete_transcript_and_pii": True, 82 | "delete_audio": True, 83 | "apply_to_existing_conversations": False, 84 | }, 85 | "data_collection": {}, 86 | } 87 | ``` -------------------------------------------------------------------------------- /elevenlabs_mcp/utils.py: -------------------------------------------------------------------------------- ```python 1 | import os 2 | import tempfile 3 | import base64 4 | from pathlib import Path 5 | from datetime import datetime 6 | from fuzzywuzzy import fuzz 7 | from typing import Union 8 | from mcp.types import ( 9 | EmbeddedResource, 10 | TextResourceContents, 11 | BlobResourceContents, 12 | TextContent, 13 | ) 14 | 15 | 16 | class ElevenLabsMcpError(Exception): 17 | pass 18 | 19 | 20 | def make_error(error_text: str): 21 | raise ElevenLabsMcpError(error_text) 22 | 23 | 24 | def is_file_writeable(path: Path) -> bool: 25 | if path.exists(): 26 | return 
os.access(path, os.W_OK) 27 | parent_dir = path.parent 28 | return os.access(parent_dir, os.W_OK) 29 | 30 | 31 | def make_output_file( 32 | tool: str, text: str, extension: str, full_id: bool = False 33 | ) -> Path: 34 | id = text if full_id else text[:5] 35 | 36 | output_file_name = f"{tool}_{id.replace(' ', '_')}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.{extension}" 37 | return Path(output_file_name) 38 | 39 | 40 | def make_output_path( 41 | output_directory: str | None, base_path: str | None = None 42 | ) -> Path: 43 | output_path = None 44 | if output_directory is None: 45 | base = base_path 46 | if base and base.strip(): 47 | output_path = Path(os.path.expanduser(base)) 48 | else: 49 | output_path = Path.home() / "Desktop" 50 | elif not os.path.isabs(output_directory) and base_path: 51 | output_path = Path(os.path.expanduser(base_path)) / Path(output_directory) 52 | else: 53 | output_path = Path(os.path.expanduser(output_directory)) 54 | if not is_file_writeable(output_path): 55 | make_error(f"Directory ({output_path}) is not writeable") 56 | output_path.mkdir(parents=True, exist_ok=True) 57 | return output_path 58 | 59 | 60 | def find_similar_filenames( 61 | target_file: str, directory: Path, threshold: int = 70 62 | ) -> list[tuple[str, int]]: 63 | """ 64 | Find files with names similar to the target file using fuzzy matching. 
65 | 66 | Args: 67 | target_file (str): The reference filename to compare against 68 | directory (Path): Directory to search in, including its subdirectories 69 | threshold (int): Similarity threshold (0 to 100, where 100 is identical) 70 | 71 | Returns: 72 | list: (file path, similarity score) tuples, sorted by descending score 73 | """ 74 | target_filename = os.path.basename(target_file) 75 | similar_files = [] 76 | for root, _, files in os.walk(directory): 77 | for filename in files: 78 | if ( 79 | filename == target_filename 80 | and os.path.join(root, filename) == target_file 81 | ): 82 | continue 83 | similarity = fuzz.token_sort_ratio(target_filename, filename) 84 | 85 | if similarity >= threshold: 86 | file_path = Path(root) / filename 87 | similar_files.append((file_path, similarity)) 88 | 89 | similar_files.sort(key=lambda x: x[1], reverse=True) 90 | 91 | return similar_files 92 | 93 | 94 | def try_find_similar_files( 95 | filename: str, directory: Path, take_n: int = 5 96 | ) -> list[Path]: 97 | similar_files = find_similar_filenames(filename, directory) 98 | if not similar_files: 99 | return [] 100 | 101 | filtered_files = [] 102 | 103 | for path, _ in similar_files[:take_n]: 104 | if check_audio_file(path): 105 | filtered_files.append(path) 106 | 107 | return filtered_files 108 | 109 | 110 | def check_audio_file(path: Path) -> bool: 111 | audio_extensions = { 112 | ".wav", 113 | ".mp3", 114 | ".m4a", 115 | ".aac", 116 | ".ogg", 117 | ".flac", 118 | ".mp4", 119 | ".avi", 120 | ".mov", 121 | ".wmv", 122 | } 123 | return path.suffix.lower() in audio_extensions 124 | 125 | 126 | def handle_input_file(file_path: str, audio_content_check: bool = True) -> Path: 127 | if not os.path.isabs(file_path) and not os.environ.get("ELEVENLABS_MCP_BASE_PATH"): 128 | make_error( 129 | "File path must be an absolute path if ELEVENLABS_MCP_BASE_PATH is not set" 130 | ) 131 | path = Path(file_path) 132 | if not path.exists() and path.parent.exists(): 133 | parent_directory = 
path.parent 134 | similar_files = try_find_similar_files(path.name, parent_directory) 135 | similar_files_formatted = ",".join([str(file) for file in similar_files]) 136 | if similar_files: 137 | make_error( 138 | f"File ({path}) does not exist. Did you mean any of these files: {similar_files_formatted}?" 139 | ) 140 | make_error(f"File ({path}) does not exist") 141 | elif not path.exists(): 142 | make_error(f"File ({path}) does not exist") 143 | elif not path.is_file(): 144 | make_error(f"File ({path}) is not a file") 145 | 146 | if audio_content_check and not check_audio_file(path): 147 | make_error(f"File ({path}) is not an audio or video file") 148 | return path 149 | 150 | 151 | def handle_large_text( 152 | text: str, max_length: int = 10000, content_type: str = "content" 153 | ): 154 | """ 155 | Handle large text content by saving to temporary file if it exceeds max_length. 156 | 157 | Args: 158 | text: The text content to handle 159 | max_length: Maximum character length before saving to temp file 160 | content_type: Description of the content type for user messages 161 | 162 | Returns: 163 | str: Either the original text or a message with temp file path 164 | """ 165 | if len(text) > max_length: 166 | with tempfile.NamedTemporaryFile( 167 | mode="w", suffix=".txt", delete=False, encoding="utf-8" 168 | ) as temp_file: 169 | temp_file.write(text) 170 | temp_path = temp_file.name 171 | 172 | return f"{content_type.capitalize()} saved to temporary file: {temp_path}\nUse the Read tool to access the full {content_type}." 173 | 174 | return text 175 | 176 | 177 | def parse_conversation_transcript(transcript_entries, max_length: int = 50000): 178 | """ 179 | Parse conversation transcript entries into a formatted string. 180 | If transcript is too long, save to temporary file and return file path. 
181 | 182 | Args: 183 | transcript_entries: List of transcript entries from conversation response 184 | max_length: Maximum character length before saving to temp file 185 | 186 | Returns: 187 | tuple: (transcript_text_or_path, is_temp_file) 188 | """ 189 | transcript_lines = [] 190 | for entry in transcript_entries: 191 | speaker = getattr(entry, "role", "Unknown") 192 | text = getattr(entry, "message", getattr(entry, "text", "")) 193 | timestamp = getattr(entry, "timestamp", None) 194 | 195 | if timestamp: 196 | transcript_lines.append(f"[{timestamp}] {speaker}: {text}") 197 | else: 198 | transcript_lines.append(f"{speaker}: {text}") 199 | 200 | transcript = ( 201 | "\n".join(transcript_lines) if transcript_lines else "No transcript available" 202 | ) 203 | 204 | # Check if transcript is too long for LLM context window 205 | if len(transcript) > max_length: 206 | # Create temporary file 207 | temp_file = tempfile.SpooledTemporaryFile( 208 | mode="w+", max_size=0, encoding="utf-8" 209 | ) 210 | temp_file.write(transcript) 211 | temp_file.seek(0) 212 | 213 | # Get a persistent temporary file path 214 | with tempfile.NamedTemporaryFile( 215 | mode="w", suffix=".txt", delete=False, encoding="utf-8" 216 | ) as persistent_temp: 217 | persistent_temp.write(transcript) 218 | temp_path = persistent_temp.name 219 | 220 | return ( 221 | f"Transcript saved to temporary file: {temp_path}\nUse the Read tool to access the full transcript.", 222 | True, 223 | ) 224 | 225 | return transcript, False 226 | 227 | 228 | def parse_location(api_residency: str | None) -> str: 229 | """ 230 | Parse the API residency and return the corresponding origin URL. 
231 | """ 232 | origin_map = { 233 | "us": "https://api.elevenlabs.io", 234 | "eu-residency": "https://api.eu.residency.elevenlabs.io", 235 | "in-residency": "https://api.in.residency.elevenlabs.io", 236 | "global": "https://api.elevenlabs.io", 237 | } 238 | 239 | if not api_residency or not api_residency.strip(): 240 | return origin_map["us"] 241 | 242 | api_residency = api_residency.strip().lower() 243 | 244 | if api_residency not in origin_map: 245 | valid_options = ", ".join(f"'{k}'" for k in origin_map.keys()) 246 | raise ValueError(f"ELEVENLABS_API_RESIDENCY must be one of {valid_options}") 247 | 248 | return origin_map[api_residency] 249 | def get_mime_type(file_extension: str) -> str: 250 | """ 251 | Get MIME type for a given file extension. 252 | 253 | Args: 254 | file_extension: File extension (with or without dot) 255 | 256 | Returns: 257 | str: MIME type string 258 | """ 259 | # Remove leading dot if present 260 | ext = file_extension.lstrip(".") 261 | 262 | mime_types = { 263 | "mp3": "audio/mpeg", 264 | "wav": "audio/wav", 265 | "ogg": "audio/ogg", 266 | "flac": "audio/flac", 267 | "m4a": "audio/mp4", 268 | "aac": "audio/aac", 269 | "opus": "audio/opus", 270 | "txt": "text/plain", 271 | "json": "application/json", 272 | "xml": "application/xml", 273 | "html": "text/html", 274 | "csv": "text/csv", 275 | "mp4": "video/mp4", 276 | "avi": "video/x-msvideo", 277 | "mov": "video/quicktime", 278 | "wmv": "video/x-ms-wmv", 279 | } 280 | 281 | return mime_types.get(ext.lower(), "application/octet-stream") 282 | 283 | 284 | def generate_resource_uri(filename: str) -> str: 285 | """ 286 | Generate a resource URI for a given filename. 
287 | 288 | Args: 289 | filename: The filename to generate URI for 290 | 291 | Returns: 292 | str: Resource URI in format elevenlabs://filename 293 | """ 294 | return f"elevenlabs://{filename}" 295 | 296 | 297 | def create_resource_response( 298 | file_data: bytes, filename: str, file_extension: str, directory: Path | None = None 299 | ) -> EmbeddedResource: 300 | """ 301 | Create a proper MCP EmbeddedResource response. 302 | 303 | Args: 304 | file_data: Raw file data as bytes 305 | filename: Name of the file 306 | file_extension: File extension for MIME type detection 307 | directory: Optional directory where the file is or would be saved; used to embed path in URI 308 | 309 | Returns: 310 | EmbeddedResource: Proper MCP resource object 311 | """ 312 | mime_type = get_mime_type(file_extension) 313 | if directory is not None: 314 | full_path = (directory / filename) 315 | resource_uri = f"elevenlabs://{full_path.as_posix()}" 316 | else: 317 | resource_uri = generate_resource_uri(filename) 318 | 319 | # For text files, use TextResourceContents 320 | if mime_type.startswith("text/"): 321 | try: 322 | text_content = file_data.decode("utf-8") 323 | return EmbeddedResource( 324 | type="resource", 325 | resource=TextResourceContents( 326 | uri=resource_uri, mimeType=mime_type, text=text_content 327 | ), 328 | ) 329 | except UnicodeDecodeError: 330 | # Fall back to binary if decode fails 331 | pass 332 | 333 | # For binary files (audio, etc.), use BlobResourceContents 334 | base64_data = base64.b64encode(file_data).decode("utf-8") 335 | return EmbeddedResource( 336 | type="resource", 337 | resource=BlobResourceContents( 338 | uri=resource_uri, mimeType=mime_type, blob=base64_data 339 | ), 340 | ) 341 | 342 | 343 | def handle_output_mode( 344 | file_data: bytes, 345 | output_path: Path, 346 | filename: str, 347 | output_mode: str, 348 | success_message: str = None, 349 | ) -> Union[TextContent, EmbeddedResource]: 350 | """ 351 | Handle different output modes for file 
generation. 352 | 353 | Args: 354 | file_data: Raw file data as bytes 355 | output_path: Path where file should be saved 356 | filename: Name of the file 357 | output_mode: Output mode ('files', 'resources', or 'both') 358 | success_message: Custom success message for files mode (optional) 359 | 360 | Returns: 361 | Union[TextContent, EmbeddedResource]: TextContent for 'files' mode, 362 | EmbeddedResource for 'resources' and 'both' modes 363 | """ 364 | file_extension = Path(filename).suffix.lstrip(".") 365 | full_file_path = output_path / filename 366 | 367 | if output_mode == "files": 368 | # Save to disk and return TextContent with success message 369 | output_path.mkdir(parents=True, exist_ok=True) 370 | with open(full_file_path, "wb") as f: 371 | f.write(file_data) 372 | 373 | if success_message and "{file_path}" in success_message: 374 | message = success_message.replace("{file_path}", str(full_file_path)) 375 | else: 376 | message = success_message or f"Success. File saved as: {full_file_path}" 377 | return TextContent(type="text", text=message) 378 | 379 | elif output_mode == "resources": 380 | # Return as EmbeddedResource without saving to disk 381 | return create_resource_response(file_data, filename, file_extension, directory=output_path) 382 | 383 | elif output_mode == "both": 384 | # Save to disk AND return as EmbeddedResource 385 | output_path.mkdir(parents=True, exist_ok=True) 386 | with open(full_file_path, "wb") as f: 387 | f.write(file_data) 388 | return create_resource_response(file_data, filename, file_extension, directory=output_path) 389 | 390 | else: 391 | raise ValueError( 392 | f"Invalid output mode: {output_mode}. 
Must be 'files', 'resources', or 'both'" 393 | ) 394 | 395 | 396 | def handle_multiple_files_output_mode( 397 | results: list[Union[TextContent, EmbeddedResource]], 398 | output_mode: str, 399 | additional_info: str = None, 400 | ) -> Union[TextContent, list[EmbeddedResource]]: 401 | """ 402 | Handle different output modes for multiple file generation. 403 | 404 | Args: 405 | results: List of results from handle_output_mode calls 406 | output_mode: Output mode ('files', 'resources', or 'both') 407 | additional_info: Additional information to include in files mode message 408 | 409 | Returns: 410 | Union[TextContent, list[EmbeddedResource]]: TextContent for 'files' mode, 411 | list of EmbeddedResource for 'resources' and 'both' modes 412 | """ 413 | if output_mode == "files": 414 | # Extract file paths from TextContent objects and create combined message 415 | file_paths = [] 416 | for result in results: 417 | if isinstance(result, TextContent): 418 | # Extract file path from the success message 419 | text = result.text 420 | if "File saved as: " in text: 421 | path = ( 422 | text.split("File saved as: ")[1].split(".")[0] 423 | + "." 424 | + text.split(".")[-1].split(" ")[0] 425 | ) 426 | file_paths.append(path) 427 | 428 | message = f"Success. Files saved at: {', '.join(file_paths)}" 429 | if additional_info: 430 | message += f". {additional_info}" 431 | 432 | return TextContent(type="text", text=message) 433 | 434 | elif output_mode in ["resources", "both"]: 435 | # Return list of EmbeddedResource objects 436 | embedded_resources = [] 437 | for result in results: 438 | if isinstance(result, EmbeddedResource): 439 | embedded_resources.append(result) 440 | 441 | if not embedded_resources: 442 | return TextContent(type="text", text="No files generated") 443 | 444 | return embedded_resources 445 | 446 | else: 447 | raise ValueError( 448 | f"Invalid output mode: {output_mode}. 
Must be 'files', 'resources', or 'both'" 449 | ) 450 | 451 | 452 | def get_output_mode_description(output_mode: str) -> str: 453 | """ 454 | Generate a dynamic description for the current output mode. 455 | 456 | Args: 457 | output_mode: The current output mode ('files', 'resources', or 'both') 458 | 459 | Returns: 460 | str: Description of how the tool will behave based on the output mode 461 | """ 462 | if output_mode == "files": 463 | return "Saves output file to directory (default: $HOME/Desktop)" 464 | elif output_mode == "resources": 465 | return "Returns output as base64-encoded MCP resource" 466 | elif output_mode == "both": 467 | return "Saves file to directory (default: $HOME/Desktop) AND returns as base64-encoded MCP resource" 468 | else: 469 | return "Output behavior depends on ELEVENLABS_MCP_OUTPUT_MODE setting" 470 | ``` -------------------------------------------------------------------------------- /elevenlabs_mcp/server.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | ElevenLabs MCP Server 3 | 4 | ⚠️ IMPORTANT: This server provides access to ElevenLabs API endpoints which may incur costs. 5 | Each tool that makes an API call is marked with a cost warning. Please follow these guidelines: 6 | 7 | 1. Only use tools when explicitly requested by the user 8 | 2. For tools that generate audio, consider the length of the text as it affects costs 9 | 3. Some operations like voice cloning or text-to-voice may have higher costs 10 | 11 | Tools without cost warnings in their description are free to use as they only read existing data. 
12 | """ 13 | 14 | import httpx 15 | import os 16 | import base64 17 | from datetime import datetime 18 | from io import BytesIO 19 | from typing import Literal, Union 20 | from dotenv import load_dotenv 21 | from mcp.server.fastmcp import FastMCP 22 | from mcp.types import ( 23 | TextContent, 24 | Resource, 25 | EmbeddedResource, 26 | ) 27 | from elevenlabs.client import ElevenLabs 28 | from elevenlabs.types import MusicPrompt 29 | from elevenlabs_mcp.model import McpVoice, McpModel, McpLanguage 30 | from elevenlabs_mcp.utils import ( 31 | make_error, 32 | make_output_path, 33 | make_output_file, 34 | handle_input_file, 35 | parse_conversation_transcript, 36 | handle_large_text, 37 | parse_location, 38 | get_mime_type, 39 | handle_output_mode, 40 | handle_multiple_files_output_mode, 41 | get_output_mode_description, 42 | ) 43 | 44 | from elevenlabs_mcp.convai import create_conversation_config, create_platform_settings 45 | from elevenlabs.types.knowledge_base_locator import KnowledgeBaseLocator 46 | 47 | from elevenlabs.play import play 48 | from elevenlabs_mcp import __version__ 49 | from pathlib import Path 50 | 51 | load_dotenv() 52 | api_key = os.getenv("ELEVENLABS_API_KEY") 53 | base_path = os.getenv("ELEVENLABS_MCP_BASE_PATH") 54 | output_mode = os.getenv("ELEVENLABS_MCP_OUTPUT_MODE", "files").strip().lower() 55 | DEFAULT_VOICE_ID = os.getenv("ELEVENLABS_DEFAULT_VOICE_ID", "cgSgspJ2msm6clMCkdW9") 56 | 57 | if output_mode not in {"files", "resources", "both"}: 58 | raise ValueError("ELEVENLABS_MCP_OUTPUT_MODE must be one of: 'files', 'resources', 'both'") 59 | if not api_key: 60 | raise ValueError("ELEVENLABS_API_KEY environment variable is required") 61 | 62 | origin = parse_location(os.getenv("ELEVENLABS_API_RESIDENCY")) 63 | 64 | # Add custom client to ElevenLabs to set User-Agent header 65 | custom_client = httpx.Client( 66 | headers={ 67 | "User-Agent": f"ElevenLabs-MCP/{__version__}", 68 | }, 69 | ) 70 | 71 | client = ElevenLabs(api_key=api_key, 
httpx_client=custom_client, base_url=origin) 72 | mcp = FastMCP("ElevenLabs") 73 | 74 | 75 | def format_diarized_transcript(transcription) -> str: 76 | """Format transcript with speaker labels from diarized response.""" 77 | try: 78 | # Try to access words array - the exact attribute might vary 79 | words = None 80 | if hasattr(transcription, "words"): 81 | words = transcription.words 82 | elif hasattr(transcription, "__dict__"): 83 | # Try to find words in the response dict 84 | for key, value in transcription.__dict__.items(): 85 | if key == "words" or ( 86 | isinstance(value, list) 87 | and len(value) > 0 88 | and ( 89 | hasattr(value[0], "speaker_id") 90 | if hasattr(value[0], "__dict__") 91 | else ( 92 | "speaker_id" in value[0] 93 | if isinstance(value[0], dict) 94 | else False 95 | ) 96 | ) 97 | ): 98 | words = value 99 | break 100 | 101 | if not words: 102 | return transcription.text 103 | 104 | formatted_lines = [] 105 | current_speaker = None 106 | current_text = [] 107 | 108 | for word in words: 109 | # Get speaker_id - might be an attribute or dict key 110 | word_speaker = None 111 | if hasattr(word, "speaker_id"): 112 | word_speaker = word.speaker_id 113 | elif isinstance(word, dict) and "speaker_id" in word: 114 | word_speaker = word["speaker_id"] 115 | 116 | # Get text - might be an attribute or dict key 117 | word_text = None 118 | if hasattr(word, "text"): 119 | word_text = word.text 120 | elif isinstance(word, dict) and "text" in word: 121 | word_text = word["text"] 122 | 123 | if not word_speaker or not word_text: 124 | continue 125 | 126 | # Skip spacing/punctuation types if they exist 127 | if hasattr(word, "type") and word.type == "spacing": 128 | continue 129 | elif isinstance(word, dict) and word.get("type") == "spacing": 130 | continue 131 | 132 | if current_speaker != word_speaker: 133 | # Save previous speaker's text 134 | if current_speaker and current_text: 135 | speaker_label = current_speaker.upper().replace("_", " ") 136 | 
formatted_lines.append(f"{speaker_label}: {' '.join(current_text)}") 137 | 138 | # Start new speaker 139 | current_speaker = word_speaker 140 | current_text = [word_text.strip()] 141 | else: 142 | current_text.append(word_text.strip()) 143 | 144 | # Add final speaker's text 145 | if current_speaker and current_text: 146 | speaker_label = current_speaker.upper().replace("_", " ") 147 | formatted_lines.append(f"{speaker_label}: {' '.join(current_text)}") 148 | 149 | return "\n\n".join(formatted_lines) 150 | 151 | except Exception: 152 | # Fallback to regular text if something goes wrong 153 | return transcription.text 154 | @mcp.resource("elevenlabs://{filename}") 155 | def get_elevenlabs_resource(filename: str) -> Resource: 156 | """ 157 | Resource handler for ElevenLabs generated files. 158 | """ 159 | candidate = Path(filename) 160 | base_dir = make_output_path(None, base_path) 161 | 162 | if candidate.is_absolute(): 163 | file_path = candidate.resolve() 164 | else: 165 | base_dir_resolved = base_dir.resolve() 166 | resolved_file = (base_dir_resolved / candidate).resolve() 167 | try: 168 | resolved_file.relative_to(base_dir_resolved) 169 | except ValueError: 170 | make_error( 171 | f"Resource path ({resolved_file}) is outside of allowed directory {base_dir_resolved}" 172 | ) 173 | file_path = resolved_file 174 | 175 | if not file_path.exists(): 176 | raise FileNotFoundError(f"Resource file not found: {filename}") 177 | 178 | # Read the file and determine MIME type 179 | try: 180 | with open(file_path, "rb") as f: 181 | file_data = f.read() 182 | except IOError as e: 183 | raise FileNotFoundError(f"Failed to read resource file {filename}: {e}") 184 | 185 | file_extension = file_path.suffix.lstrip(".") 186 | mime_type = get_mime_type(file_extension) 187 | 188 | # For text files, return text content 189 | if mime_type.startswith("text/"): 190 | try: 191 | text_content = file_data.decode("utf-8") 192 | return Resource( 193 | uri=f"elevenlabs://{filename}", 
mimeType=mime_type, text=text_content 194 | ) 195 | except UnicodeDecodeError: 196 | make_error( 197 | f"Failed to decode text resource {filename} as UTF-8; MIME type {mime_type} may be incorrect or file is corrupt" 198 | ) 199 | 200 | # For binary files, return base64 encoded data 201 | base64_data = base64.b64encode(file_data).decode("utf-8") 202 | return Resource( 203 | uri=f"elevenlabs://{filename}", mimeType=mime_type, data=base64_data 204 | ) 205 | 206 | 207 | @mcp.tool( 208 | description=f"""Convert text to speech with a given voice. {get_output_mode_description(output_mode)}. 209 | 210 | Only one of voice_id or voice_name can be provided. If none are provided, the default voice will be used. 211 | 212 | ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user. 213 | 214 | Args: 215 | text (str): The text to convert to speech. 216 | voice_name (str, optional): The name of the voice to use. 217 | model_id (str, optional): The model ID to use for speech synthesis. Options include: 218 | - eleven_multilingual_v2: High quality multilingual model (29 languages) 219 | - eleven_flash_v2_5: Fastest model with ultra-low latency (32 languages) 220 | - eleven_turbo_v2_5: Balanced quality and speed (32 languages) 221 | - eleven_flash_v2: Fast English-only model 222 | - eleven_turbo_v2: Balanced English-only model 223 | - eleven_monolingual_v1: Legacy English model 224 | Defaults to eleven_multilingual_v2 or environment variable ELEVENLABS_MODEL_ID. 225 | stability (float, optional): Stability of the generated audio. Determines how stable the voice is and the randomness between each generation. Lower values introduce broader emotional range for the voice. Higher values can result in a monotonous voice with limited emotion. Range is 0 to 1. 226 | similarity_boost (float, optional): Similarity boost of the generated audio. 
Determines how closely the AI should adhere to the original voice when attempting to replicate it. Range is 0 to 1. 227 | style (float, optional): Style of the generated audio. Determines the style exaggeration of the voice. This setting attempts to amplify the style of the original speaker. It does consume additional computational resources and might increase latency if set to anything other than 0. Range is 0 to 1. 228 | use_speaker_boost (bool, optional): Use speaker boost of the generated audio. This setting boosts the similarity to the original speaker. Using this setting requires a slightly higher computational load, which in turn increases latency. 229 | speed (float, optional): Speed of the generated audio. Controls the speed of the generated speech. Values range from 0.7 to 1.2, with 1.0 being the default speed. Lower values create slower, more deliberate speech while higher values produce faster-paced speech. Extreme values can impact the quality of the generated speech. Range is 0.7 to 1.2. 230 | output_directory (str, optional): Directory where files should be saved (only used when saving files). 231 | Defaults to $HOME/Desktop if not provided. 232 | language: ISO 639-1 language code for the voice. 233 | output_format (str, optional): Output format of the generated audio. Formatted as codec_sample_rate_bitrate. So an mp3 with 22.05kHz sample rate at 32kbps is represented as mp3_22050_32. MP3 with 192kbps bitrate requires you to be subscribed to Creator tier or above. PCM with 44.1kHz sample rate requires you to be subscribed to Pro tier or above. Note that the μ-law format (sometimes written mu-law, often approximated as u-law) is commonly used for Twilio audio inputs. 234 | Defaults to "mp3_44100_128". 
Must be one of: 235 | mp3_22050_32 236 | mp3_44100_32 237 | mp3_44100_64 238 | mp3_44100_96 239 | mp3_44100_128 240 | mp3_44100_192 241 | pcm_8000 242 | pcm_16000 243 | pcm_22050 244 | pcm_24000 245 | pcm_44100 246 | ulaw_8000 247 | alaw_8000 248 | opus_48000_32 249 | opus_48000_64 250 | opus_48000_96 251 | opus_48000_128 252 | opus_48000_192 253 | 254 | Returns: 255 | Text content with file path or MCP resource with audio data, depending on output mode. 256 | """ 257 | ) 258 | def text_to_speech( 259 | text: str, 260 | voice_name: str | None = None, 261 | output_directory: str | None = None, 262 | voice_id: str | None = None, 263 | stability: float = 0.5, 264 | similarity_boost: float = 0.75, 265 | style: float = 0, 266 | use_speaker_boost: bool = True, 267 | speed: float = 1.0, 268 | language: str = "en", 269 | output_format: str = "mp3_44100_128", 270 | model_id: str | None = None, 271 | ) -> Union[TextContent, EmbeddedResource]: 272 | if text == "": 273 | make_error("Text is required.") 274 | 275 | if voice_id is not None and voice_name is not None: 276 | make_error("voice_id and voice_name cannot both be provided.") 277 | 278 | voice = None 279 | if voice_id is not None: 280 | voice = client.voices.get(voice_id=voice_id) 281 | elif voice_name is not None: 282 | voices = client.voices.search(search=voice_name) 283 | if len(voices.voices) == 0: 284 | make_error("No voices found with that name.") 285 | voice = next((v for v in voices.voices if v.name == voice_name), None) 286 | if voice is None: 287 | make_error(f"Voice with name: {voice_name} does not exist.") 288 | 289 | voice_id = voice.voice_id if voice else DEFAULT_VOICE_ID 290 | 291 | output_path = make_output_path(output_directory, base_path) 292 | output_file_name = make_output_file("tts", text, "mp3") 293 | 294 | if model_id is None: 295 | model_id = ( 296 | "eleven_flash_v2_5" 297 | if language in ["hu", "no", "vi"] 298 | else "eleven_multilingual_v2" 299 | ) 300 | 301 | audio_data = 
client.text_to_speech.convert( 302 | text=text, 303 | voice_id=voice_id, 304 | model_id=model_id, 305 | output_format=output_format, 306 | voice_settings={ 307 | "stability": stability, 308 | "similarity_boost": similarity_boost, 309 | "style": style, 310 | "use_speaker_boost": use_speaker_boost, 311 | "speed": speed, 312 | }, 313 | ) 314 | audio_bytes = b"".join(audio_data) 315 | 316 | # Handle different output modes 317 | success_message = f"Success. File saved as: {{file_path}}. Voice used: {voice.name if voice else DEFAULT_VOICE_ID}" 318 | return handle_output_mode( 319 | audio_bytes, output_path, output_file_name, output_mode, success_message 320 | ) 321 | 322 | 323 | @mcp.tool( 324 | description=f"""Transcribe speech from an audio file. When save_transcript_to_file=True: {get_output_mode_description(output_mode)}. When return_transcript_to_client_directly=True, always returns text directly regardless of output mode. 325 | 326 | ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user. 327 | 328 | Args: 329 | input_file_path: Path to the audio file to transcribe 330 | language_code: ISO 639-3 language code for transcription. If not provided, the language will be detected automatically. 331 | diarize: Whether to diarize the audio file. If True, the transcription is annotated with the speaker of each segment. 332 | save_transcript_to_file: Whether to save the transcript to a file. 333 | return_transcript_to_client_directly: Whether to return the transcript to the client directly. 334 | output_directory: Directory where files should be saved (only used when saving files). 335 | Defaults to $HOME/Desktop if not provided. 336 | 337 | Returns: 338 | TextContent containing the transcription or MCP resource with transcript data. 
339 | """ 340 | ) 341 | def speech_to_text( 342 | input_file_path: str, 343 | language_code: str | None = None, 344 | diarize: bool = False, 345 | save_transcript_to_file: bool = True, 346 | return_transcript_to_client_directly: bool = False, 347 | output_directory: str | None = None, 348 | ) -> Union[TextContent, EmbeddedResource]: 349 | if not save_transcript_to_file and not return_transcript_to_client_directly: 350 | make_error("Must save transcript to file or return it to the client directly.") 351 | file_path = handle_input_file(input_file_path) 352 | if save_transcript_to_file: 353 | output_path = make_output_path(output_directory, base_path) 354 | output_file_name = make_output_file("stt", file_path.name, "txt") 355 | with file_path.open("rb") as f: 356 | audio_bytes = f.read() 357 | 358 | if language_code == "" or language_code is None: 359 | language_code = None 360 | 361 | transcription = client.speech_to_text.convert( 362 | model_id="scribe_v1", 363 | file=audio_bytes, 364 | language_code=language_code, 365 | enable_logging=True, 366 | diarize=diarize, 367 | tag_audio_events=True, 368 | ) 369 | 370 | # Format transcript with speaker identification if diarization was enabled 371 | if diarize: 372 | formatted_transcript = format_diarized_transcript(transcription) 373 | else: 374 | formatted_transcript = transcription.text 375 | 376 | if return_transcript_to_client_directly: 377 | return TextContent(type="text", text=formatted_transcript) 378 | 379 | if save_transcript_to_file: 380 | transcript_bytes = formatted_transcript.encode("utf-8") 381 | 382 | # Handle different output modes; handle_output_mode replaces the literal {file_path} placeholder with the saved transcript's path (not the input audio path) 383 | success_message = "Transcription saved to {file_path}" 384 | return handle_output_mode( 385 | transcript_bytes, 386 | output_path, 387 | output_file_name, 388 | output_mode, 389 | success_message, 390 | ) 391 | 392 | # This should not be reached due to validation at the start of the function 393 | return TextContent(type="text", text="No output mode specified") 394 | 395 | 
396 | @mcp.tool( 397 | description=f"""Convert a text description of a sound effect into audio with a given duration. {get_output_mode_description(output_mode)}. 398 | 399 | Duration must be between 0.5 and 5 seconds. 400 | 401 | ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user. 402 | 403 | Args: 404 | text: Text description of the sound effect 405 | duration_seconds: Duration of the sound effect in seconds 406 | output_directory: Directory where files should be saved (only used when saving files). 407 | Defaults to $HOME/Desktop if not provided. 408 | loop: Whether to loop the sound effect. Defaults to False. 409 | output_format (str, optional): Output format of the generated audio. Formatted as codec_sample_rate_bitrate. So an mp3 with 22.05kHz sample rate at 32kbps is represented as mp3_22050_32. MP3 with 192kbps bitrate requires you to be subscribed to Creator tier or above. PCM with 44.1kHz sample rate requires you to be subscribed to Pro tier or above. Note that the μ-law format (sometimes written mu-law, often approximated as u-law) is commonly used for Twilio audio inputs. 410 | Defaults to "mp3_44100_128". 
Must be one of: 411 | mp3_22050_32 412 | mp3_44100_32 413 | mp3_44100_64 414 | mp3_44100_96 415 | mp3_44100_128 416 | mp3_44100_192 417 | pcm_8000 418 | pcm_16000 419 | pcm_22050 420 | pcm_24000 421 | pcm_44100 422 | ulaw_8000 423 | alaw_8000 424 | opus_48000_32 425 | opus_48000_64 426 | opus_48000_96 427 | opus_48000_128 428 | opus_48000_192 429 | """ 430 | ) 431 | def text_to_sound_effects( 432 | text: str, 433 | duration_seconds: float = 2.0, 434 | output_directory: str | None = None, 435 | output_format: str = "mp3_44100_128", 436 | loop: bool = False, 437 | ) -> Union[TextContent, EmbeddedResource]: 438 | if duration_seconds < 0.5 or duration_seconds > 5: 439 | make_error("Duration must be between 0.5 and 5 seconds") 440 | output_path = make_output_path(output_directory, base_path) 441 | output_file_name = make_output_file("sfx", text, "mp3") 442 | 443 | audio_data = client.text_to_sound_effects.convert( 444 | text=text, 445 | output_format=output_format, 446 | duration_seconds=duration_seconds, 447 | loop=loop, 448 | ) 449 | audio_bytes = b"".join(audio_data) 450 | 451 | # Handle different output modes 452 | return handle_output_mode(audio_bytes, output_path, output_file_name, output_mode) 453 | 454 | 455 | @mcp.tool( 456 | description=""" 457 | Search for existing voices, a voice that has already been added to the user's ElevenLabs voice library. 458 | Searches in name, description, labels and category. 459 | 460 | Args: 461 | search: Search term to filter voices by. Searches in name, description, labels and category. 462 | sort: Which field to sort by. `created_at_unix` might not be available for older voices. 463 | sort_direction: Sort order, either ascending or descending. 464 | 465 | Returns: 466 | List of voices that match the search criteria. 
467 | """ 468 | ) 469 | def search_voices( 470 | search: str | None = None, 471 | sort: Literal["created_at_unix", "name"] = "name", 472 | sort_direction: Literal["asc", "desc"] = "desc", 473 | ) -> list[McpVoice]: 474 | response = client.voices.search( 475 | search=search, sort=sort, sort_direction=sort_direction 476 | ) 477 | return [ 478 | McpVoice(id=voice.voice_id, name=voice.name, category=voice.category) 479 | for voice in response.voices 480 | ] 481 | 482 | 483 | @mcp.tool(description="List all available models") 484 | def list_models() -> list[McpModel]: 485 | response = client.models.list() 486 | return [ 487 | McpModel( 488 | id=model.model_id, 489 | name=model.name, 490 | languages=[ 491 | McpLanguage(language_id=lang.language_id, name=lang.name) 492 | for lang in model.languages 493 | ], 494 | ) 495 | for model in response 496 | ] 497 | 498 | 499 | @mcp.tool(description="Get details of a specific voice") 500 | def get_voice(voice_id: str) -> McpVoice: 501 | """Get details of a specific voice.""" 502 | response = client.voices.get(voice_id=voice_id) 503 | return McpVoice( 504 | id=response.voice_id, 505 | name=response.name, 506 | category=response.category, 507 | fine_tuning_status=response.fine_tuning.state, 508 | ) 509 | 510 | 511 | @mcp.tool( 512 | description="""Create an instant voice clone of a voice using provided audio files. 513 | 514 | ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user. 
515 | """ 516 | ) 517 | def voice_clone( 518 | name: str, files: list[str], description: str | None = None 519 | ) -> TextContent: 520 | input_files = [str(handle_input_file(file).absolute()) for file in files] 521 | voice = client.voices.ivc.create( 522 | name=name, description=description, files=input_files 523 | ) 524 | 525 | return TextContent( 526 | type="text", 527 | text=f"""Voice cloned successfully: Name: {voice.name} 528 | ID: {voice.voice_id} 529 | Category: {voice.category} 530 | Description: {voice.description or "N/A"}""", 531 | ) 532 | 533 | 534 | @mcp.tool( 535 | description=f"""Isolate audio from a file. {get_output_mode_description(output_mode)}. 536 | 537 | ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user. 538 | """ 539 | ) 540 | def isolate_audio( 541 | input_file_path: str, output_directory: str | None = None 542 | ) -> Union[TextContent, EmbeddedResource]: 543 | file_path = handle_input_file(input_file_path) 544 | output_path = make_output_path(output_directory, base_path) 545 | output_file_name = make_output_file("iso", file_path.name, "mp3") 546 | with file_path.open("rb") as f: 547 | audio_bytes = f.read() 548 | audio_data = client.audio_isolation.convert( 549 | audio=audio_bytes, 550 | ) 551 | audio_bytes = b"".join(audio_data) 552 | 553 | # Handle different output modes 554 | return handle_output_mode(audio_bytes, output_path, output_file_name, output_mode) 555 | 556 | 557 | @mcp.tool( 558 | description="Check the current subscription status. Could be used to measure the usage of the API." 559 | ) 560 | def check_subscription() -> TextContent: 561 | subscription = client.user.subscription.get() 562 | return TextContent(type="text", text=f"{subscription.model_dump_json(indent=2)}") 563 | 564 | 565 | @mcp.tool( 566 | description="""Create a conversational AI agent with custom configuration. 
567 | 568 | ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user. 569 | 570 | Args: 571 | name: Name of the agent 572 | first_message: First message the agent will say, e.g. "Hi, how can I help you today?" 573 | system_prompt: System prompt for the agent 574 | voice_id: ID of the voice to use for the agent 575 | language: ISO 639-1 language code for the agent 576 | llm: LLM to use for the agent 577 | temperature: Temperature for the agent. The lower the temperature, the more deterministic the agent's responses will be. Range is 0 to 1. 578 | max_tokens: Maximum number of tokens to generate. 579 | asr_quality: Quality of the ASR. `high` or `low`. 580 | model_id: ID of the ElevenLabs model to use for the agent. 581 | optimize_streaming_latency: Optimize streaming latency. Range is 0 to 4. 582 | stability: Stability for the agent. Range is 0 to 1. 583 | similarity_boost: Similarity boost for the agent. Range is 0 to 1. 584 | turn_timeout: Timeout for the agent to respond in seconds. Defaults to 7 seconds. 585 | max_duration_seconds: Maximum duration of a conversation in seconds. Defaults to 300 seconds (5 minutes). 586 | record_voice: Whether to record the agent's voice. 587 | retention_days: Number of days to retain the agent's data. 
588 | """ 589 | ) 590 | def create_agent( 591 | name: str, 592 | first_message: str, 593 | system_prompt: str, 594 | voice_id: str | None = DEFAULT_VOICE_ID, 595 | language: str = "en", 596 | llm: str = "gemini-2.0-flash-001", 597 | temperature: float = 0.5, 598 | max_tokens: int | None = None, 599 | asr_quality: str = "high", 600 | model_id: str = "eleven_turbo_v2", 601 | optimize_streaming_latency: int = 3, 602 | stability: float = 0.5, 603 | similarity_boost: float = 0.8, 604 | turn_timeout: int = 7, 605 | max_duration_seconds: int = 300, 606 | record_voice: bool = True, 607 | retention_days: int = 730, 608 | ) -> TextContent: 609 | conversation_config = create_conversation_config( 610 | language=language, 611 | system_prompt=system_prompt, 612 | llm=llm, 613 | first_message=first_message, 614 | temperature=temperature, 615 | max_tokens=max_tokens, 616 | asr_quality=asr_quality, 617 | voice_id=voice_id, 618 | model_id=model_id, 619 | optimize_streaming_latency=optimize_streaming_latency, 620 | stability=stability, 621 | similarity_boost=similarity_boost, 622 | turn_timeout=turn_timeout, 623 | max_duration_seconds=max_duration_seconds, 624 | ) 625 | 626 | platform_settings = create_platform_settings( 627 | record_voice=record_voice, 628 | retention_days=retention_days, 629 | ) 630 | 631 | response = client.conversational_ai.agents.create( 632 | name=name, 633 | conversation_config=conversation_config, 634 | platform_settings=platform_settings, 635 | ) 636 | 637 | return TextContent( 638 | type="text", 639 | text=f"""Agent created successfully: Name: {name}, Agent ID: {response.agent_id}, System Prompt: {system_prompt}, Voice ID: {voice_id or "Default"}, Language: {language}, LLM: {llm}, You can use this agent ID for future interactions with the agent.""", 640 | ) 641 | 642 | 643 | @mcp.tool( 644 | description="""Add a knowledge base to ElevenLabs workspace. Allowed types are epub, pdf, docx, txt, html. 
645 | 646 | ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user. 647 | 648 | Args: 649 | agent_id: ID of the agent to add the knowledge base to. 650 | knowledge_base_name: Name of the knowledge base. 651 | url: URL of the knowledge base. 652 | input_file_path: Path to the file to add to the knowledge base. 653 | text: Text to add to the knowledge base. 654 | """ 655 | ) 656 | def add_knowledge_base_to_agent( 657 | agent_id: str, 658 | knowledge_base_name: str, 659 | url: str | None = None, 660 | input_file_path: str | None = None, 661 | text: str | None = None, 662 | ) -> TextContent: 663 | provided_params = [ 664 | param for param in [url, input_file_path, text] if param is not None 665 | ] 666 | if len(provided_params) == 0: 667 | make_error("Must provide either a URL, a file, or text") 668 | if len(provided_params) > 1: 669 | make_error("Must provide exactly one of: URL, file, or text") 670 | 671 | if url is not None: 672 | response = client.conversational_ai.knowledge_base.documents.create_from_url( 673 | name=knowledge_base_name, 674 | url=url, 675 | ) 676 | else: 677 | if text is not None: 678 | text_bytes = text.encode("utf-8") 679 | text_io = BytesIO(text_bytes) 680 | text_io.name = "text.txt" 681 | text_io.content_type = "text/plain" 682 | file = text_io 683 | elif input_file_path is not None: 684 | path = handle_input_file( 685 | file_path=input_file_path, audio_content_check=False 686 | ) 687 | file = open(path, "rb") 688 | 689 | response = client.conversational_ai.knowledge_base.documents.create_from_file( 690 | name=knowledge_base_name, 691 | file=file, 692 | ) 693 | 694 | agent = client.conversational_ai.agents.get(agent_id=agent_id) 695 | 696 | agent_config = agent.conversation_config.agent 697 | knowledge_base_list = ( 698 | agent_config.get("prompt", {}).get("knowledge_base", []) if agent_config else [] 699 | ) 700 | knowledge_base_list.append( 701 | KnowledgeBaseLocator( 
702 | type="url" if url is not None else "file", # `file` is unbound when a URL was given, so test `url` instead 703 | name=knowledge_base_name, 704 | id=response.id, 705 | ) 706 | ) 707 | 708 | if agent_config and "prompt" not in agent_config: 709 | agent_config["prompt"] = {} 710 | if agent_config: 711 | agent_config["prompt"]["knowledge_base"] = knowledge_base_list 712 | 713 | client.conversational_ai.agents.update( 714 | agent_id=agent_id, conversation_config=agent.conversation_config 715 | ) 716 | return TextContent( 717 | type="text", 718 | text=f"""Knowledge base created with ID: {response.id} and added to agent {agent_id} successfully.""", 719 | ) 720 | 721 | 722 | @mcp.tool(description="List all available conversational AI agents") 723 | def list_agents() -> TextContent: 724 | """List all available conversational AI agents. 725 | 726 | Returns: 727 | TextContent with a formatted list of available agents 728 | """ 729 | response = client.conversational_ai.agents.list() 730 | 731 | if not response.agents: 732 | return TextContent(type="text", text="No agents found.") 733 | 734 | agent_list = ",".join( 735 | f"{agent.name} (ID: {agent.agent_id})" for agent in response.agents 736 | ) 737 | 738 | return TextContent(type="text", text=f"Available agents: {agent_list}") 739 | 740 | 741 | @mcp.tool(description="Get details about a specific conversational AI agent") 742 | def get_agent(agent_id: str) -> TextContent: 743 | """Get details about a specific conversational AI agent. 
744 | 745 | Args: 746 | agent_id: The ID of the agent to retrieve 747 | 748 | Returns: 749 | TextContent with detailed information about the agent 750 | """ 751 | response = client.conversational_ai.agents.get(agent_id=agent_id) 752 | 753 | voice_info = "None" 754 | if response.conversation_config.tts: 755 | voice_info = f"Voice ID: {response.conversation_config.tts.voice_id}" 756 | 757 | return TextContent( 758 | type="text", 759 | text=f"Agent Details: Name: {response.name}, Agent ID: {response.agent_id}, Voice Configuration: {voice_info}, Created At: {datetime.fromtimestamp(response.metadata.created_at_unix_secs).strftime('%Y-%m-%d %H:%M:%S')}", 760 | ) 761 | 762 | 763 | @mcp.tool( 764 | description="""Gets conversation with transcript. Returns: conversation details and full transcript. Use when: analyzing completed agent conversations. 765 | 766 | Args: 767 | conversation_id: The unique identifier of the conversation to retrieve, you can get the ids from the list_conversations tool. 
768 | """ 769 | ) 770 | def get_conversation( 771 | conversation_id: str, 772 | ) -> TextContent: 773 | """Get conversation details with transcript""" 774 | try: 775 | response = client.conversational_ai.conversations.get(conversation_id) 776 | 777 | # Parse transcript using utility function 778 | transcript, _ = parse_conversation_transcript(response.transcript) 779 | 780 | response_text = f"""Conversation Details: 781 | ID: {response.conversation_id} 782 | Status: {response.status} 783 | Agent ID: {response.agent_id} 784 | Message Count: {len(response.transcript)} 785 | 786 | Transcript: 787 | {transcript}""" 788 | 789 | if response.metadata: 790 | metadata = response.metadata 791 | duration = getattr( 792 | metadata, 793 | "call_duration_secs", 794 | getattr(metadata, "duration_seconds", "N/A"), 795 | ) 796 | started_at = getattr( 797 | metadata, "start_time_unix_secs", getattr(metadata, "started_at", "N/A") 798 | ) 799 | response_text += ( 800 | f"\n\nMetadata:\nDuration: {duration} seconds\nStarted: {started_at}" 801 | ) 802 | 803 | if response.analysis: 804 | analysis_summary = getattr( 805 | response.analysis, "summary", "Analysis available but no summary" 806 | ) 807 | response_text += f"\n\nAnalysis:\n{analysis_summary}" 808 | 809 | return TextContent(type="text", text=response_text) 810 | 811 | except Exception as e: 812 | make_error(f"Failed to fetch conversation: {str(e)}") 813 | # satisfies type checker 814 | return TextContent(type="text", text="") 815 | 816 | 817 | @mcp.tool( 818 | description="""Lists agent conversations. Returns: conversation list with metadata. Use when: asked about conversation history. 
819 | 820 | Args: 821 | agent_id (str, optional): Filter conversations by specific agent ID 822 | cursor (str, optional): Pagination cursor for retrieving next page of results 823 | call_start_before_unix (int, optional): Filter conversations that started before this Unix timestamp 824 | call_start_after_unix (int, optional): Filter conversations that started after this Unix timestamp 825 | page_size (int, optional): Number of conversations to return per page (1-100, defaults to 30) 826 | max_length (int, optional): Maximum character length of the response text (defaults to 10000) 827 | """ 828 | ) 829 | def list_conversations( 830 | agent_id: str | None = None, 831 | cursor: str | None = None, 832 | call_start_before_unix: int | None = None, 833 | call_start_after_unix: int | None = None, 834 | page_size: int = 30, 835 | max_length: int = 10000, 836 | ) -> TextContent: 837 | """List conversations with filtering options.""" 838 | page_size = min(page_size, 100) 839 | 840 | try: 841 | response = client.conversational_ai.conversations.list( 842 | cursor=cursor, 843 | agent_id=agent_id, 844 | call_start_before_unix=call_start_before_unix, 845 | call_start_after_unix=call_start_after_unix, 846 | page_size=page_size, 847 | ) 848 | 849 | if not response.conversations: 850 | return TextContent(type="text", text="No conversations found.") 851 | 852 | conv_list = [] 853 | for conv in response.conversations: 854 | start_time = datetime.fromtimestamp(conv.start_time_unix_secs).strftime( 855 | "%Y-%m-%d %H:%M:%S" 856 | ) 857 | 858 | conv_info = f"""Conversation ID: {conv.conversation_id} 859 | Status: {conv.status} 860 | Agent: {conv.agent_name or 'N/A'} (ID: {conv.agent_id}) 861 | Started: {start_time} 862 | Duration: {conv.call_duration_secs} seconds 863 | Messages: {conv.message_count} 864 | Call Successful: {conv.call_successful}""" 865 | 866 | conv_list.append(conv_info) 867 | 868 | formatted_list = "\n\n".join(conv_list) 869 | 870 | pagination_info = f"Showing 
{len(response.conversations)} conversations" 871 | if response.has_more: 872 | pagination_info += f" (more available, next cursor: {response.next_cursor})" 873 | 874 | full_text = f"{pagination_info}\n\n{formatted_list}" 875 | 876 | # Use utility to handle large text content 877 | result_text = handle_large_text(full_text, max_length, "conversation list") 878 | 879 | # If content was saved to file, prepend pagination info 880 | if result_text != full_text: 881 | result_text = f"{pagination_info}\n\n{result_text}" 882 | 883 | return TextContent(type="text", text=result_text) 884 | 885 | except Exception as e: 886 | make_error(f"Failed to list conversations: {str(e)}") 887 | # This line is unreachable but satisfies type checker 888 | return TextContent(type="text", text="") 889 | 890 | 891 | @mcp.tool( 892 | description=f"""Transform audio from one voice to another using provided audio files. {get_output_mode_description(output_mode)}. 893 | 894 | ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user. 
895 | """ 896 | ) 897 | def speech_to_speech( 898 | input_file_path: str, 899 | voice_name: str = "Adam", 900 | output_directory: str | None = None, 901 | ) -> Union[TextContent, EmbeddedResource]: 902 | voices = client.voices.search(search=voice_name) 903 | 904 | if len(voices.voices) == 0: 905 | make_error("No voice found with that name.") 906 | 907 | voice = next((v for v in voices.voices if v.name == voice_name), None) 908 | 909 | if voice is None: 910 | make_error(f"Voice with name: {voice_name} does not exist.") 911 | 912 | assert voice is not None # Type assertion for type checker 913 | file_path = handle_input_file(input_file_path) 914 | output_path = make_output_path(output_directory, base_path) 915 | output_file_name = make_output_file("sts", file_path.name, "mp3") 916 | 917 | with file_path.open("rb") as f: 918 | audio_bytes = f.read() 919 | 920 | audio_data = client.speech_to_speech.convert( 921 | model_id="eleven_multilingual_sts_v2", 922 | voice_id=voice.voice_id, 923 | audio=audio_bytes, 924 | ) 925 | 926 | audio_bytes = b"".join(audio_data) 927 | 928 | # Handle different output modes 929 | return handle_output_mode(audio_bytes, output_path, output_file_name, output_mode) 930 | 931 | 932 | @mcp.tool( 933 | description=f"""Create voice previews from a text prompt. Creates three previews with slight variations. {get_output_mode_description(output_mode)}. 934 | 935 | If no text is provided, the tool will auto-generate text. 936 | 937 | Voice preview files are saved as: voice_design_(generated_voice_id)_(timestamp).mp3 938 | 939 | Example file name: voice_design_Ya2J5uIa5Pq14DNPsbC1_20250403_164949.mp3 940 | 941 | ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user. 
942 | """ 943 | ) 944 | def text_to_voice( 945 | voice_description: str, 946 | text: str | None = None, 947 | output_directory: str | None = None, 948 | ) -> list[EmbeddedResource] | TextContent: 949 | if voice_description == "": 950 | make_error("Voice description is required.") 951 | 952 | previews = client.text_to_voice.create_previews( 953 | voice_description=voice_description, 954 | text=text, 955 | auto_generate_text=True if text is None else False, 956 | ) 957 | 958 | output_path = make_output_path(output_directory, base_path) 959 | 960 | generated_voice_ids = [] 961 | results = [] 962 | 963 | for preview in previews.previews: 964 | output_file_name = make_output_file( 965 | "voice_design", preview.generated_voice_id, "mp3", full_id=True 966 | ) 967 | generated_voice_ids.append(preview.generated_voice_id) 968 | audio_bytes = base64.b64decode(preview.audio_base_64) 969 | 970 | # Handle different output modes 971 | result = handle_output_mode( 972 | audio_bytes, output_path, output_file_name, output_mode 973 | ) 974 | results.append(result) 975 | 976 | # Use centralized multiple files output handling 977 | additional_info = f"Generated voice IDs are: {', '.join(generated_voice_ids)}" 978 | return handle_multiple_files_output_mode(results, output_mode, additional_info) 979 | 980 | 981 | @mcp.tool( 982 | description="""Add a generated voice to the voice library. Uses the voice ID from the `text_to_voice` tool. 983 | 984 | ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user. 
985 | """ 986 | ) 987 | def create_voice_from_preview( 988 | generated_voice_id: str, 989 | voice_name: str, 990 | voice_description: str, 991 | ) -> TextContent: 992 | voice = client.text_to_voice.create_voice_from_preview( 993 | voice_name=voice_name, 994 | voice_description=voice_description, 995 | generated_voice_id=generated_voice_id, 996 | ) 997 | 998 | return TextContent( 999 | type="text", 1000 | text=f"Success. Voice created: {voice.name} with ID:{voice.voice_id}", 1001 | ) 1002 | 1003 | 1004 | def _get_phone_number_by_id(phone_number_id: str): 1005 | """Helper function to get phone number details by ID.""" 1006 | phone_numbers = client.conversational_ai.phone_numbers.list() 1007 | for phone in phone_numbers: 1008 | if phone.phone_number_id == phone_number_id: 1009 | return phone 1010 | make_error(f"Phone number with ID {phone_number_id} not found.") 1011 | 1012 | 1013 | @mcp.tool( 1014 | description="""Make an outbound call using an ElevenLabs agent. Automatically detects provider type (Twilio or SIP trunk) and uses the appropriate API. 1015 | 1016 | ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user. 
1017 | 1018 | Args: 1019 | agent_id: The ID of the agent that will handle the call 1020 | agent_phone_number_id: The ID of the phone number to use for the call 1021 | to_number: The phone number to call (E.164 format: +1xxxxxxxxxx) 1022 | 1023 | Returns: 1024 | TextContent containing information about the call 1025 | """ 1026 | ) 1027 | def make_outbound_call( 1028 | agent_id: str, 1029 | agent_phone_number_id: str, 1030 | to_number: str, 1031 | ) -> TextContent: 1032 | # Get phone number details to determine provider type 1033 | phone_number = _get_phone_number_by_id(agent_phone_number_id) 1034 | 1035 | if phone_number.provider.lower() == "twilio": 1036 | response = client.conversational_ai.twilio.outbound_call( 1037 | agent_id=agent_id, 1038 | agent_phone_number_id=agent_phone_number_id, 1039 | to_number=to_number, 1040 | ) 1041 | provider_info = "Twilio" 1042 | elif phone_number.provider.lower() == "sip_trunk": 1043 | response = client.conversational_ai.sip_trunk.outbound_call( 1044 | agent_id=agent_id, 1045 | agent_phone_number_id=agent_phone_number_id, 1046 | to_number=to_number, 1047 | ) 1048 | provider_info = "SIP trunk" 1049 | else: 1050 | make_error(f"Unsupported provider type: {phone_number.provider}") 1051 | 1052 | return TextContent( 1053 | type="text", text=f"Outbound call initiated via {provider_info}: {response}." 1054 | ) 1055 | 1056 | 1057 | @mcp.tool( 1058 | description="""Search for a voice across the entire ElevenLabs voice library. 
1059 | 1060 | Args: 1061 | page: Page number to return (0-indexed) 1062 | page_size: Number of voices to return per page (1-100) 1063 | search: Search term to filter voices by 1064 | 1065 | Returns: 1066 | TextContent containing information about the shared voices 1067 | """ 1068 | ) 1069 | def search_voice_library( 1070 | page: int = 0, 1071 | page_size: int = 10, 1072 | search: str | None = None, 1073 | ) -> TextContent: 1074 | response = client.voices.get_shared( 1075 | page=page, 1076 | page_size=page_size, 1077 | search=search, 1078 | ) 1079 | 1080 | if not response.voices: 1081 | return TextContent( 1082 | type="text", text="No shared voices found with the specified criteria." 1083 | ) 1084 | 1085 | voice_list = [] 1086 | for voice in response.voices: 1087 | language_info = "N/A" 1088 | if hasattr(voice, "verified_languages") and voice.verified_languages: 1089 | languages = [] 1090 | for lang in voice.verified_languages: 1091 | accent_info = ( 1092 | f" ({lang.accent})" 1093 | if hasattr(lang, "accent") and lang.accent 1094 | else "" 1095 | ) 1096 | languages.append(f"{lang.language}{accent_info}") 1097 | language_info = ", ".join(languages) 1098 | 1099 | details = [ 1100 | f"Name: {voice.name}", 1101 | f"ID: {voice.voice_id}", 1102 | f"Category: {getattr(voice, 'category', 'N/A')}", 1103 | ] 1104 | # TODO: Make cleaner 1105 | if hasattr(voice, "gender") and voice.gender: 1106 | details.append(f"Gender: {voice.gender}") 1107 | if hasattr(voice, "age") and voice.age: 1108 | details.append(f"Age: {voice.age}") 1109 | if hasattr(voice, "accent") and voice.accent: 1110 | details.append(f"Accent: {voice.accent}") 1111 | if hasattr(voice, "description") and voice.description: 1112 | details.append(f"Description: {voice.description}") 1113 | if hasattr(voice, "use_case") and voice.use_case: 1114 | details.append(f"Use Case: {voice.use_case}") 1115 | 1116 | details.append(f"Languages: {language_info}") 1117 | 1118 | if hasattr(voice, "preview_url") and 
voice.preview_url: 1119 | details.append(f"Preview URL: {voice.preview_url}") 1120 | 1121 | voice_info = "\n".join(details) 1122 | voice_list.append(voice_info) 1123 | 1124 | formatted_info = "\n\n".join(voice_list) 1125 | return TextContent(type="text", text=f"Shared Voices:\n\n{formatted_info}") 1126 | 1127 | 1128 | @mcp.tool(description="List all phone numbers associated with the ElevenLabs account") 1129 | def list_phone_numbers() -> TextContent: 1130 | """List all phone numbers associated with the ElevenLabs account. 1131 | 1132 | Returns: 1133 | TextContent containing formatted information about the phone numbers 1134 | """ 1135 | response = client.conversational_ai.phone_numbers.list() 1136 | 1137 | if not response: 1138 | return TextContent(type="text", text="No phone numbers found.") 1139 | 1140 | phone_info = [] 1141 | for phone in response: 1142 | assigned_agent = "None" 1143 | if phone.assigned_agent: 1144 | assigned_agent = f"{phone.assigned_agent.agent_name} (ID: {phone.assigned_agent.agent_id})" 1145 | 1146 | phone_info.append( 1147 | f"Phone Number: {phone.phone_number}\n" 1148 | f"ID: {phone.phone_number_id}\n" 1149 | f"Provider: {phone.provider}\n" 1150 | f"Label: {phone.label}\n" 1151 | f"Assigned Agent: {assigned_agent}" 1152 | ) 1153 | 1154 | formatted_info = "\n\n".join(phone_info) 1155 | return TextContent(type="text", text=f"Phone Numbers:\n\n{formatted_info}") 1156 | 1157 | 1158 | @mcp.tool(description="Play an audio file. Supports WAV and MP3 formats.") 1159 | def play_audio(input_file_path: str) -> TextContent: 1160 | file_path = handle_input_file(input_file_path) 1161 | play(file_path.read_bytes(), use_ffmpeg=False) # read_bytes() closes the file handle 1162 | return TextContent(type="text", text=f"Successfully played audio file: {file_path}") 1163 | 1164 | 1165 | @mcp.tool( 1166 | description="""Convert a prompt to music and save the output audio file to a given directory. 1167 | The directory is optional; if not provided, the output file will be saved to $HOME/Desktop. 
1168 | 1169 | Args: 1170 | prompt: Prompt to convert to music. Must provide either prompt or composition_plan. 1171 | output_directory: Directory to save the output audio file 1172 | composition_plan: Composition plan to use for the music. Must provide either prompt or composition_plan. 1173 | music_length_ms: Length of the generated music in milliseconds. Cannot be used if composition_plan is provided. 1174 | 1175 | ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.""" 1176 | ) 1177 | def compose_music( 1178 | prompt: str | None = None, 1179 | output_directory: str | None = None, 1180 | composition_plan: MusicPrompt | None = None, 1181 | music_length_ms: int | None = None, 1182 | ) -> Union[TextContent, EmbeddedResource]: 1183 | if prompt is None and composition_plan is None: 1184 | make_error( 1185 | f"Either prompt or composition_plan must be provided. Prompt: {prompt}" 1186 | ) 1187 | 1188 | if prompt is not None and composition_plan is not None: 1189 | make_error("Only one of prompt or composition_plan must be provided") 1190 | 1191 | if music_length_ms is not None and composition_plan is not None: 1192 | make_error("music_length_ms cannot be used if composition_plan is provided") 1193 | 1194 | output_path = make_output_path(output_directory, base_path) 1195 | output_file_name = make_output_file("music", "", "mp3") 1196 | 1197 | audio_data = client.music.compose( 1198 | prompt=prompt, 1199 | music_length_ms=music_length_ms, 1200 | composition_plan=composition_plan, 1201 | ) 1202 | 1203 | audio_bytes = b"".join(audio_data) 1204 | 1205 | # Handle different output modes 1206 | return handle_output_mode(audio_bytes, output_path, output_file_name, output_mode) 1207 | 1208 | 1209 | @mcp.tool( 1210 | description="""Create a composition plan for music generation. Usage of this endpoint does not cost any credits but is subject to rate limiting depending on your tier. 
Composition plans can be used when generating music with the compose_music tool. 1211 | 1212 | Args: 1213 | prompt: Prompt to create a composition plan for 1214 | music_length_ms: The length of the composition plan to generate in milliseconds. Must be between 10000ms and 300000ms. Optional - if not provided, the model will choose a length based on the prompt. 1215 | source_composition_plan: An optional composition plan to use as a source for the new composition plan 1216 | """ 1217 | ) 1218 | def create_composition_plan( 1219 | prompt: str, 1220 | music_length_ms: int | None = None, 1221 | source_composition_plan: MusicPrompt | None = None, 1222 | ) -> MusicPrompt: 1223 | composition_plan = client.music.composition_plan.create( 1224 | prompt=prompt, 1225 | music_length_ms=music_length_ms, 1226 | source_composition_plan=source_composition_plan, 1227 | ) 1228 | 1229 | return composition_plan 1230 | 1231 | 1232 | def main(): 1233 | """Run the MCP server""" 1234 | print("Starting MCP server") 1235 | mcp.run() 1236 | 1237 | 1238 | if __name__ == "__main__": 1239 | main() 1240 | ```
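
The `text_to_sound_effects` docstring in the file above describes `output_format` as `codec_sample_rate_bitrate` (for example `mp3_22050_32`), with some values such as `pcm_16000` or `ulaw_8000` carrying no bitrate field. As a rough illustration of that naming scheme, the sketch below splits such a string into its parts. `OutputFormat` and `parse_output_format` are hypothetical names introduced here for illustration; they are not part of `elevenlabs_mcp` or the ElevenLabs SDK, and the sketch does not check the value against the list of formats the API accepts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OutputFormat:
    """Hypothetical container for the parts of an output_format string."""

    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None when the string has no bitrate field


def parse_output_format(value: str) -> OutputFormat:
    """Split e.g. "mp3_44100_128" into codec, sample rate and bitrate.

    Assumption: fields are separated by underscores and every field after
    the codec is a plain decimal integer, as in the documented examples.
    """
    parts = value.split("_")
    well_formed = (
        len(parts) in (2, 3)
        and parts[0] != ""
        and all(p.isdigit() for p in parts[1:])
    )
    if not well_formed:
        raise ValueError(f"Unrecognized output format: {value!r}")

    bitrate = int(parts[2]) if len(parts) == 3 else None
    return OutputFormat(
        codec=parts[0], sample_rate_hz=int(parts[1]), bitrate_kbps=bitrate
    )


if __name__ == "__main__":
    for name in ("mp3_44100_128", "pcm_16000", "ulaw_8000"):
        print(name, "->", parse_output_format(name))
```

A helper along these lines could, for instance, pick a file extension from the codec; the tool shown above instead always passes `"mp3"` to `make_output_file`, whatever `output_format` is requested.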