# Directory Structure ``` ├── .github │ └── workflows │ ├── publish.yaml │ └── python-ci.yaml ├── .gitignore ├── .pre-commit-config.yaml ├── .python-version ├── atla_mcp_server │ ├── __init__.py │ ├── __main__.py │ ├── debug.py │ └── server.py ├── CONTRIBUTING.md ├── LICENSE ├── pyproject.toml └── README.md ``` # Files -------------------------------------------------------------------------------- /.python-version: -------------------------------------------------------------------------------- ``` 1 | 3.11 2 | ``` -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- ``` 1 | # Python-generated files 2 | __pycache__/ 3 | *.py[oc] 4 | build/ 5 | dist/ 6 | wheels/ 7 | *.egg-info 8 | 9 | # Virtual environments 10 | .venv 11 | 12 | # Lock files 13 | uv.lock 14 | ``` -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- ```yaml 1 | repos: 2 | - repo: https://github.com/pre-commit/pre-commit-hooks 3 | rev: v4.5.0 4 | hooks: 5 | - id: check-yaml 6 | - id: check-json 7 | - id: check-toml 8 | - id: check-merge-conflict 9 | - id: end-of-file-fixer 10 | - id: trailing-whitespace 11 | - id: mixed-line-ending 12 | - id: check-case-conflict 13 | - id: detect-private-key 14 | 15 | - repo: https://github.com/astral-sh/ruff-pre-commit 16 | rev: v0.9.7 17 | hooks: 18 | # Run the linter 19 | - id: ruff 20 | args: [--fix] 21 | # Run the formatter 22 | - id: ruff-format 23 | ``` -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- ```markdown 1 | # Atla MCP Server 2 | 3 | > [!CAUTION] 4 | > This repository was archived on July 21, 2025. The Atla API is no longer active. 5 | 6 | An MCP server implementation providing a standardized interface for LLMs to interact with the Atla API for state-of-the-art LLMJ evaluation. 7 | 8 | > Learn more about Atla [here](https://docs.atla-ai.com). Learn more about the Model Context Protocol [here](https://modelcontextprotocol.io). 9 | 10 | <a href="https://glama.ai/mcp/servers/@atla-ai/atla-mcp-server"> 11 | <img width="380" height="200" src="https://glama.ai/mcp/servers/@atla-ai/atla-mcp-server/badge" alt="Atla MCP server" /> 12 | </a> 13 | 14 | ## Available Tools 15 | 16 | - `evaluate_llm_response`: Evaluate an LLM's response to a prompt using a given evaluation criteria. This function uses an Atla evaluation model under the hood to return a dictionary containing a score for the model's response and a textual critique containing feedback on the model's response. 17 | - `evaluate_llm_response_on_multiple_criteria`: Evaluate an LLM's response to a prompt across _multiple_ evaluation criteria. This function uses an Atla evaluation model under the hood to return a list of dictionaries, each containing an evaluation score and critique for a given criteria. 18 | 19 | ## Usage 20 | 21 | > To use the MCP server, you will need an Atla API key. You can find your existing API key [here](https://www.atla-ai.com/sign-in) or create a new one [here](https://www.atla-ai.com/sign-up). 22 | 23 | ### Installation 24 | 25 | > We recommend using `uv` to manage the Python environment. See [here](https://docs.astral.sh/uv/getting-started/installation/) for installation instructions. 
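For example, on macOS or Linux, `uv` can be installed with its standalone installer (the same command this repository's CI workflow uses); see the `uv` documentation linked above for other platforms and installation methods:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```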
26 | 27 | ### Manually running the server 28 | 29 | Once you have `uv` installed and have your Atla API key, you can manually run the MCP server using `uvx` (which is provided by `uv`): 30 | 31 | ```bash 32 | ATLA_API_KEY=<your-api-key> uvx atla-mcp-server 33 | ``` 34 | 35 | ### Connecting to the server 36 | 37 | > Having issues or need help connecting to another client? Feel free to open an issue or [contact us](mailto:[email protected])! 38 | 39 | #### OpenAI Agents SDK 40 | 41 | > For more details on using the OpenAI Agents SDK with MCP servers, refer to the [official documentation](https://openai.github.io/openai-agents-python/). 42 | 43 | 1. Install the OpenAI Agents SDK: 44 | 45 | ```shell 46 | pip install openai-agents 47 | ``` 48 | 49 | 2. Use the OpenAI Agents SDK to connect to the server: 50 | 51 | ```python 52 | import os 53 | 54 | from agents import Agent 55 | from agents.mcp import MCPServerStdio 56 | 57 | async with MCPServerStdio( 58 | params={ 59 | "command": "uvx", 60 | "args": ["atla-mcp-server"], 61 | "env": {"ATLA_API_KEY": os.environ.get("ATLA_API_KEY")} 62 | } 63 | ) as atla_mcp_server: 64 | ... 65 | ``` 66 | 67 | #### Claude Desktop 68 | 69 | > For more details on configuring MCP servers in Claude Desktop, refer to the [official MCP quickstart guide](https://modelcontextprotocol.io/quickstart/user). 70 | 71 | 1. Add the following to your `claude_desktop_config.json` file: 72 | 73 | ```json 74 | { 75 | "mcpServers": { 76 | "atla-mcp-server": { 77 | "command": "uvx", 78 | "args": ["atla-mcp-server"], 79 | "env": { 80 | "ATLA_API_KEY": "<your-atla-api-key>" 81 | } 82 | } 83 | } 84 | } 85 | ``` 86 | 87 | 2. **Restart Claude Desktop** to apply the changes. 88 | 89 | You should now see options from `atla-mcp-server` in the list of available MCP tools. 90 | 91 | #### Cursor 92 | 93 | > For more details on configuring MCP servers in Cursor, refer to the [official documentation](https://docs.cursor.com/context/model-context-protocol). 94 | 95 | 1. Add the following to your `.cursor/mcp.json` file: 96 | 97 | ```json 98 | { 99 | "mcpServers": { 100 | "atla-mcp-server": { 101 | "command": "uvx", 102 | "args": ["atla-mcp-server"], 103 | "env": { 104 | "ATLA_API_KEY": "<your-atla-api-key>" 105 | } 106 | } 107 | } 108 | } 109 | ``` 110 | 111 | You should now see `atla-mcp-server` in the list of available MCP servers. 112 | 113 | ## Contributing 114 | 115 | Contributions are welcome! Please see the [CONTRIBUTING.md](CONTRIBUTING.md) file for details. 116 | 117 | ## License 118 | 119 | This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details. 120 | ``` -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- ```markdown 1 | # Contributing to Atla MCP Server 2 | 3 | We welcome contributions to the Atla MCP Server! This document provides guidelines and steps for contributing. 4 | 5 | ## Development Setup 6 | 7 | Follow the installation steps in the [README.md](README.md#installation), making sure to install the development dependencies: 8 | 9 | ```shell 10 | uv pip install -e ".[dev]" 11 | pre-commit install # Set up git hooks 12 | ``` 13 | 14 | ## Making Changes 15 | 16 | 1. Fork the repository on GitHub 17 | 2. Clone your fork locally 18 | 3. Create a new branch for your changes 19 | 4. Make your changes 20 | 5. Commit your changes (pre-commit hooks will run automatically) 21 | 6. Push to your fork 22 | 7. 
Submit a pull request from your fork to our main repository 23 | 24 | ## Questions? 25 | 26 | Feel free to open an issue if you have questions or run into problems. 27 | ``` -------------------------------------------------------------------------------- /atla_mcp_server/__init__.py: -------------------------------------------------------------------------------- ```python 1 | """An MCP server implementation providing a standardized interface for LLMs to interact with the Atla API.""" # noqa: E501 2 | ``` -------------------------------------------------------------------------------- /atla_mcp_server/debug.py: -------------------------------------------------------------------------------- ```python 1 | """File for debugging the Atla MCP Server via the MCP Inspector.""" 2 | 3 | import os 4 | 5 | from atla_mcp_server.server import app_factory 6 | 7 | app = app_factory(atla_api_key=os.getenv("ATLA_API_KEY", "")) 8 | ``` -------------------------------------------------------------------------------- /.github/workflows/python-ci.yaml: -------------------------------------------------------------------------------- ```yaml 1 | name: Python CI 2 | 3 | on: 4 | push: 5 | branches: [ main ] 6 | pull_request: 7 | branches: [ main ] 8 | 9 | jobs: 10 | python-ci: 11 | runs-on: ubuntu-latest 12 | steps: 13 | - uses: actions/checkout@v4 14 | - uses: actions/setup-python@v5 15 | with: 16 | python-version: "3.11" 17 | 18 | - name: Install uv 19 | run: curl -LsSf https://astral.sh/uv/install.sh | sh 20 | 21 | - name: Setup Python environment 22 | run: | 23 | uv venv 24 | . .venv/bin/activate 25 | uv pip install -e ".[dev]" 26 | 27 | - name: Run ruff checks 28 | run: | 29 | . .venv/bin/activate 30 | ruff check . 31 | ruff format --check . 32 | 33 | - name: Run mypy checks 34 | run: | 35 | . .venv/bin/activate 36 | dmypy run -- . 37 | ``` -------------------------------------------------------------------------------- /.github/workflows/publish.yaml: -------------------------------------------------------------------------------- ```yaml 1 | name: Publishing 2 | 3 | on: 4 | release: 5 | types: [published] 6 | 7 | jobs: 8 | build: 9 | runs-on: ubuntu-latest 10 | name: Build distribution 11 | steps: 12 | - uses: actions/checkout@v4 13 | 14 | - name: Install uv 15 | uses: astral-sh/setup-uv@v3 16 | 17 | - name: Build 18 | run: uv build 19 | 20 | - name: Upload artifacts 21 | uses: actions/upload-artifact@v4 22 | with: 23 | name: release-dists 24 | path: dist/ 25 | 26 | pypi-publish: 27 | name: Upload release to PyPI 28 | runs-on: ubuntu-latest 29 | environment: release 30 | needs: [build] 31 | permissions: 32 | id-token: write 33 | 34 | steps: 35 | - name: Retrieve release distribution 36 | uses: actions/download-artifact@v4 37 | with: 38 | name: release-dists 39 | path: dist/ 40 | 41 | - name: Publish package distribution to PyPI 42 | uses: pypa/gh-action-pypi-publish@release/v1 43 | ``` -------------------------------------------------------------------------------- /atla_mcp_server/__main__.py: -------------------------------------------------------------------------------- ```python 1 | """Entrypoint for the Atla MCP Server.""" 2 | 3 | import argparse 4 | import os 5 | 6 | from atla_mcp_server.server import app_factory 7 | 8 | 9 | def main(): 10 | """Entrypoint for the Atla MCP Server.""" 11 | print("Starting Atla MCP Server with stdio transport...") 12 | 13 | parser = argparse.ArgumentParser() 14 | parser.add_argument( 15 | "--atla-api-key", 16 | type=str, 17 | required=False, 18 | help="Atla API key. 
Can also be set via ATLA_API_KEY environment variable.", 19 | ) 20 | args = parser.parse_args() 21 | 22 | if args.atla_api_key: 23 | print("Using Atla API key from --atla-api-key CLI argument...") 24 | atla_api_key = args.atla_api_key 25 | elif os.getenv("ATLA_API_KEY"): 26 | atla_api_key = os.getenv("ATLA_API_KEY") 27 | print("Using Atla API key from ATLA_API_KEY environment variable...") 28 | else: 29 | parser.error( 30 | "Atla API key must be provided either via --atla-api-key argument " 31 | "or ATLA_API_KEY environment variable" 32 | ) 33 | 34 | print("Creating server...") 35 | app = app_factory(atla_api_key) 36 | 37 | print("Running server...") 38 | app.run(transport="stdio") 39 | 40 | 41 | if __name__ == "__main__": 42 | main() 43 | ``` -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- ```toml 1 | [build-system] 2 | requires = ["hatchling", "uv-dynamic-versioning"] 3 | build-backend = "hatchling.build" 4 | 5 | [tool.hatch.version] 6 | source = "uv-dynamic-versioning" 7 | 8 | [tool.uv-dynamic-versioning] 9 | vcs = "git" 10 | style = "pep440" 11 | bump = true 12 | 13 | [tool.hatch.build.targets.wheel] 14 | packages = ["atla_mcp_server"] 15 | 16 | [tool.hatch.build.targets.sdist] 17 | packages = ["atla_mcp_server"] 18 | 19 | [project] 20 | name = "atla-mcp-server" 21 | dynamic = ["version"] 22 | description = "An MCP server implementation providing a standardized interface for LLMs to interact with the Atla API." 23 | readme = "README.md" 24 | requires-python = ">=3.11" 25 | authors = [ 26 | { name="Atla", email="[email protected]" } 27 | ] 28 | license = { text = "MIT" } 29 | classifiers = [ 30 | "Development Status :: 4 - Beta", 31 | "Intended Audience :: Developers", 32 | "License :: OSI Approved :: MIT License", 33 | "Programming Language :: Python :: 3", 34 | "Programming Language :: Python :: 3.11", 35 | ] 36 | dependencies = [ 37 | "atla>=0.6.0", 38 | "mcp[cli]>=1.6.0", 39 | ] 40 | 41 | [project.optional-dependencies] 42 | dev = [ 43 | "mypy>=1.15.0", 44 | "pre-commit>=3.7.1", 45 | "ruff>=0.9.7", 46 | ] 47 | 48 | [project.scripts] 49 | atla-mcp-server = "atla_mcp_server.__main__:main" 50 | 51 | [project.urls] 52 | Homepage = "https://atla-ai.com" 53 | Repository = "https://github.com/atla-ai/atla-mcp-server" 54 | Issues = "https://github.com/atla-ai/atla-mcp-server/issues" 55 | 56 | [tool.mypy] 57 | exclude = ['.venv'] 58 | explicit_package_bases = false 59 | follow_untyped_imports = true 60 | implicit_optional = false 61 | mypy_path = ["atla_mcp_server"] 62 | plugins = ['pydantic.mypy'] 63 | python_version = "3.11" 64 | 65 | [tool.ruff] 66 | line-length = 90 67 | indent-width = 4 68 | 69 | [tool.ruff.lint] 70 | exclude = [".venv"] 71 | # See: https://docs.astral.sh/ruff/rules/ 72 | select = [ 73 | "B", # Bugbear 74 | "C", # Complexity 75 | "E", # Pycodestyle 76 | "F", # Pyflakes 77 | "I", # Isort 78 | "RUF", # Ruff 79 | "W", # Pycodestyle 80 | "D", # Docstrings 81 | ] 82 | ignore = [] 83 | fixable = ["ALL"] 84 | unfixable = [] 85 | 86 | [tool.ruff.lint.isort] 87 | known-first-party = ["atla_mcp_server"] 88 | 89 | [tool.ruff.lint.pydocstyle] 90 | convention = "google" 91 | 92 | [tool.ruff.format] 93 | quote-style = "double" 94 | ``` -------------------------------------------------------------------------------- /atla_mcp_server/server.py: -------------------------------------------------------------------------------- ```python 1 | """MCP server 
implementation.""" 2 | 3 | import asyncio 4 | from contextlib import asynccontextmanager 5 | from dataclasses import dataclass 6 | from textwrap import dedent 7 | from typing import Annotated, AsyncIterator, Literal, Optional, cast 8 | 9 | from atla import AsyncAtla 10 | from mcp.server.fastmcp import Context, FastMCP 11 | from pydantic import WithJsonSchema 12 | 13 | # config 14 | 15 | 16 | @dataclass 17 | class MCPState: 18 | """State of the MCP server.""" 19 | 20 | atla_client: AsyncAtla 21 | 22 | 23 | # types 24 | 25 | AnnotatedLlmPrompt = Annotated[ 26 | str, 27 | WithJsonSchema( 28 | { 29 | "description": dedent( 30 | """The prompt given to an LLM to generate the `llm_response` to be \ 31 | evaluated.""" 32 | ), 33 | "examples": [ 34 | "What is the capital of the moon?", 35 | "Explain the difference between supervised and unsupervised learning.", 36 | "Can you summarize the main idea behind transformers in NLP?", 37 | ], 38 | } 39 | ), 40 | ] 41 | 42 | AnnotatedLlmResponse = Annotated[ 43 | str, 44 | WithJsonSchema( 45 | { 46 | "description": dedent( 47 | """The output generated by the model in response to the `llm_prompt`, \ 48 | which needs to be evaluated.""" 49 | ), 50 | "examples": [ 51 | dedent( 52 | """The Moon doesn't have a capital — it has no countries, \ 53 | governments, or permanent residents""" 54 | ), 55 | dedent( 56 | """Supervised learning uses labeled data to train models to make \ 57 | predictions or classifications. Unsupervised learning, on the other \ 58 | hand, works with unlabeled data to uncover hidden patterns or \ 59 | groupings, such as through clustering or dimensionality reduction.""" 60 | ), 61 | dedent( 62 | """Transformers are neural network architectures designed for \ 63 | sequence modeling tasks like NLP. They rely on self-attention \ 64 | mechanisms to weigh the importance of different input tokens, \ 65 | enabling parallel processing of input data. Unlike RNNs, they don't \ 66 | process sequentially, which allows for faster training and better \ 67 | handling of long-range dependencies.""" 68 | ), 69 | ], 70 | } 71 | ), 72 | ] 73 | 74 | AnnotatedEvaluationCriteria = Annotated[ 75 | str, 76 | WithJsonSchema( 77 | { 78 | "description": dedent( 79 | """The specific criteria or instructions on which to evaluate the \ 80 | model output. A good evaluation criteria should provide the model \ 81 | with: (1) a description of the evaluation task, (2) a rubric of \ 82 | possible scores and their corresponding criteria, and (3) a \ 83 | final sentence clarifying expected score format. A good evaluation \ 84 | criteria should also be specific and focus on a single aspect of \ 85 | the model output. To evaluate a model's response on multiple \ 86 | criteria, use the `evaluate_llm_response_on_multiple_criteria` \ 87 | function and create individual criteria for each relevant evaluation \ 88 | task. Typical rubrics score responses either on a Likert scale from \ 89 | 1 to 5 or binary scale with scores of 'Yes' or 'No', depending on \ 90 | the specific evaluation task.""" 91 | ), 92 | "examples": [ 93 | dedent( 94 | """Evaluate how well the response fulfills the requirements of the instruction by providing relevant information. This includes responding in accordance with the explicit and implicit purpose of given instruction. 95 | 96 | Score 1: The response is completely unrelated to the instruction, or the model entirely misunderstands the instruction. 
97 | Score 2: Most of the key points in the response are irrelevant to the instruction, and the response misses major requirements of the instruction. 98 | Score 3: Some major points in the response contain irrelevant information or miss some requirements of the instruction. 99 | Score 4: The response is relevant to the instruction but misses minor requirements of the instruction. 100 | Score 5: The response is perfectly relevant to the instruction, and the model fulfills all of the requirements of the instruction. 101 | 102 | Your score should be an integer between 1 and 5.""" # noqa: E501 103 | ), 104 | dedent( 105 | """Evaluate whether the information provided in the response is correct given the reference response. 106 | Ignore differences in punctuation and phrasing between the response and reference response. 107 | It is okay if the response contains more information than the reference response, as long as it does not contain any conflicting statements. 108 | 109 | Binary scoring 110 | "No": The response is not factually accurate when compared against the reference response or includes conflicting statements. 111 | "Yes": The response is supported by the reference response and does not contain conflicting statements. 112 | 113 | Your score should be either "No" or "Yes". 114 | """ # noqa: E501 115 | ), 116 | ], 117 | } 118 | ), 119 | ] 120 | 121 | 122 | AnnotatedExpectedLlmOutput = Annotated[ 123 | Optional[str], 124 | WithJsonSchema( 125 | { 126 | "description": dedent( 127 | """A reference or ideal answer to compare against the `llm_response`. \ 128 | This is useful in cases where a specific output is expected from \ 129 | the model. Defaults to None.""" 130 | ) 131 | } 132 | ), 133 | ] 134 | 135 | AnnotatedLlmContext = Annotated[ 136 | Optional[str], 137 | WithJsonSchema( 138 | { 139 | "description": dedent( 140 | """Additional context or information provided to the model during \ 141 | generation. This is useful in cases where the model was provided \ 142 | with additional information that is not part of the `llm_prompt` \ 143 | or `expected_llm_output` (e.g., a RAG retrieval context). \ 144 | Defaults to None.""" 145 | ) 146 | } 147 | ), 148 | ] 149 | 150 | AnnotatedModelId = Annotated[ 151 | Literal["atla-selene", "atla-selene-mini"], 152 | WithJsonSchema( 153 | { 154 | "description": dedent( 155 | """The Atla model ID to use for evaluation. `atla-selene` is the \ 156 | flagship Atla model, optimized for the highest all-round performance. \ 157 | `atla-selene-mini` is a compact model that is generally faster and \ 158 | cheaper to run. Defaults to `atla-selene`.""" 159 | ) 160 | } 161 | ), 162 | ] 163 | 164 | # tools 165 | 166 | 167 | async def evaluate_llm_response( 168 | ctx: Context, 169 | evaluation_criteria: AnnotatedEvaluationCriteria, 170 | llm_prompt: AnnotatedLlmPrompt, 171 | llm_response: AnnotatedLlmResponse, 172 | expected_llm_output: AnnotatedExpectedLlmOutput = None, 173 | llm_context: AnnotatedLlmContext = None, 174 | model_id: AnnotatedModelId = "atla-selene", 175 | ) -> dict[str, str]: 176 | """Evaluate an LLM's response to a prompt using a given evaluation criteria. 177 | 178 | This function uses an Atla evaluation model under the hood to return a dictionary 179 | containing a score for the model's response and a textual critique containing 180 | feedback on the model's response. 181 | 182 | Returns: 183 | dict[str, str]: A dictionary containing the evaluation score and critique, in 184 | the format `{"score": <score>, "critique": <critique>}`. 
185 | """ 186 | state = cast(MCPState, ctx.request_context.lifespan_context) 187 | result = await state.atla_client.evaluation.create( 188 | model_id=model_id, 189 | model_input=llm_prompt, 190 | model_output=llm_response, 191 | evaluation_criteria=evaluation_criteria, 192 | expected_model_output=expected_llm_output, 193 | model_context=llm_context, 194 | ) 195 | 196 | return { 197 | "score": result.result.evaluation.score, 198 | "critique": result.result.evaluation.critique, 199 | } 200 | 201 | 202 | async def evaluate_llm_response_on_multiple_criteria( 203 | ctx: Context, 204 | evaluation_criteria_list: list[AnnotatedEvaluationCriteria], 205 | llm_prompt: AnnotatedLlmPrompt, 206 | llm_response: AnnotatedLlmResponse, 207 | expected_llm_output: AnnotatedExpectedLlmOutput = None, 208 | llm_context: AnnotatedLlmContext = None, 209 | model_id: AnnotatedModelId = "atla-selene", 210 | ) -> list[dict[str, str]]: 211 | """Evaluate an LLM's response to a prompt across *multiple* evaluation criteria. 212 | 213 | This function uses an Atla evaluation model under the hood to return a list of 214 | dictionaries, each containing an evaluation score and critique for a given 215 | criteria. 216 | 217 | Returns: 218 | list[dict[str, str]]: A list of dictionaries containing the evaluation score 219 | and critique, in the format `{"score": <score>, "critique": <critique>}`. 220 | The order of the dictionaries in the list will match the order of the 221 | criteria in the `evaluation_criteria_list` argument. 222 | """ 223 | tasks = [ 224 | evaluate_llm_response( 225 | ctx=ctx, 226 | evaluation_criteria=criterion, 227 | llm_prompt=llm_prompt, 228 | llm_response=llm_response, 229 | expected_llm_output=expected_llm_output, 230 | llm_context=llm_context, 231 | model_id=model_id, 232 | ) 233 | for criterion in evaluation_criteria_list 234 | ] 235 | results = await asyncio.gather(*tasks) 236 | return results 237 | 238 | 239 | # app factory 240 | 241 | 242 | def app_factory(atla_api_key: str) -> FastMCP: 243 | """Factory function to create an Atla MCP server with the given API key.""" 244 | 245 | @asynccontextmanager 246 | async def lifespan(_: FastMCP) -> AsyncIterator[MCPState]: 247 | async with AsyncAtla( 248 | api_key=atla_api_key, 249 | default_headers={ 250 | "X-Atla-Source": "mcp-server", 251 | }, 252 | ) as client: 253 | yield MCPState(atla_client=client) 254 | 255 | mcp = FastMCP("Atla", lifespan=lifespan) 256 | mcp.tool()(evaluate_llm_response) 257 | mcp.tool()(evaluate_llm_response_on_multiple_criteria) 258 | 259 | return mcp 260 | ```