This is page 2 of 2. Use http://codebase.md/kevinwatt/yt-dlp-mcp?lines=true&page={x} to view the full context.

# Directory Structure

```
├── .claude
│   └── skills
│       └── mcp-builder
│           ├── LICENSE.txt
│           ├── reference
│           │   ├── evaluation.md
│           │   ├── mcp_best_practices.md
│           │   ├── node_mcp_server.md
│           │   └── python_mcp_server.md
│           ├── scripts
│           │   ├── connections.py
│           │   ├── evaluation.py
│           │   ├── example_evaluation.xml
│           │   └── requirements.txt
│           └── SKILL.md
├── .gitignore
├── .npmignore
├── .prettierrc
├── CHANGELOG.md
├── CLAUDE.md
├── docs
│   ├── api.md
│   ├── configuration.md
│   ├── contributing.md
│   ├── error-handling.md
│   └── search-feature-demo.md
├── eslint.config.mjs
├── jest.config.mjs
├── LICENSE
├── package-lock.json
├── package.json
├── README.md
├── src
│   ├── __tests__
│   │   ├── audio.test.ts
│   │   ├── index.test.ts
│   │   ├── metadata.test.ts
│   │   ├── search.test.ts
│   │   ├── subtitle.test.ts
│   │   └── video.test.ts
│   ├── config.ts
│   ├── index.mts
│   └── modules
│       ├── audio.ts
│       ├── metadata.ts
│       ├── search.ts
│       ├── subtitle.ts
│       ├── utils.ts
│       └── video.ts
├── test-bilibili.mjs
├── test-mcp.mjs
├── test-real-video.mjs
├── tsconfig.jest.json
└── tsconfig.json
```

# Files

--------------------------------------------------------------------------------
/.claude/skills/mcp-builder/scripts/evaluation.py:
--------------------------------------------------------------------------------

```python
  1 | """MCP Server Evaluation Harness
  2 | 
  3 | This script evaluates MCP servers by running test questions against them using Claude.
  4 | """
  5 | 
  6 | import argparse
  7 | import asyncio
  8 | import json
  9 | import re
 10 | import sys
 11 | import time
 12 | import traceback
 13 | import xml.etree.ElementTree as ET
 14 | from pathlib import Path
 15 | from typing import Any
 16 | 
 17 | from anthropic import Anthropic
 18 | 
 19 | from connections import create_connection
 20 | 
 21 | EVALUATION_PROMPT = """You are an AI assistant with access to tools.
 22 | 
 23 | When given a task, you MUST:
 24 | 1. Use the available tools to complete the task
 25 | 2. Provide a summary of each step in your approach, wrapped in <summary> tags
 26 | 3. Provide feedback on the tools provided, wrapped in <feedback> tags
 27 | 4. Provide your final response, wrapped in <response> tags
 28 | 
 29 | Summary Requirements:
 30 | - In your <summary> tags, you must explain:
 31 |   - The steps you took to complete the task
 32 |   - Which tools you used, in what order, and why
 33 |   - The inputs you provided to each tool
 34 |   - The outputs you received from each tool
 35 |   - A summary of how you arrived at the response
 36 | 
 37 | Feedback Requirements:
 38 | - In your <feedback> tags, provide constructive feedback on the tools:
 39 |   - Comment on tool names: Are they clear and descriptive?
 40 |   - Comment on input parameters: Are they well-documented? Are required vs optional parameters clear?
 41 |   - Comment on descriptions: Do they accurately describe what the tool does?
 42 |   - Comment on any errors encountered during tool usage: Did the tool fail to execute? Did the tool return too many tokens?
 43 |   - Identify specific areas for improvement and explain WHY they would help
 44 |   - Be specific and actionable in your suggestions
 45 | 
 46 | Response Requirements:
 47 | - Your response should be concise and directly address what was asked
 48 | - Always wrap your final response in <response> tags
 49 | - If you cannot solve the task, return <response>NOT_FOUND</response>
 50 | - For numeric responses, provide just the number
 51 | - For IDs, provide just the ID
 52 | - For names or text, provide the exact text requested
 53 | - Your response should go last"""
 54 | 
 55 | 
 56 | def parse_evaluation_file(file_path: Path) -> list[dict[str, Any]]:
 57 |     """Parse XML evaluation file with qa_pair elements."""
 58 |     try:
 59 |         tree = ET.parse(file_path)
 60 |         root = tree.getroot()
 61 |         evaluations = []
 62 | 
 63 |         for qa_pair in root.findall(".//qa_pair"):
 64 |             question_elem = qa_pair.find("question")
 65 |             answer_elem = qa_pair.find("answer")
 66 | 
 67 |             if question_elem is not None and answer_elem is not None:
 68 |                 evaluations.append({
 69 |                     "question": (question_elem.text or "").strip(),
 70 |                     "answer": (answer_elem.text or "").strip(),
 71 |                 })
 72 | 
 73 |         return evaluations
 74 |     except Exception as e:
 75 |         print(f"Error parsing evaluation file {file_path}: {e}")
 76 |         return []
 77 | 
 78 | 
 79 | def extract_xml_content(text: str, tag: str) -> str | None:
 80 |     """Extract content from XML tags."""
 81 |     pattern = rf"<{tag}>(.*?)</{tag}>"
 82 |     matches = re.findall(pattern, text, re.DOTALL)
 83 |     return matches[-1].strip() if matches else None
 84 | 
 85 | 
 86 | async def agent_loop(
 87 |     client: Anthropic,
 88 |     model: str,
 89 |     question: str,
 90 |     tools: list[dict[str, Any]],
 91 |     connection: Any,
 92 | ) -> tuple[str, dict[str, Any]]:
 93 |     """Run the agent loop with MCP tools."""
 94 |     messages = [{"role": "user", "content": question}]
 95 | 
 96 |     response = await asyncio.to_thread(
 97 |         client.messages.create,
 98 |         model=model,
 99 |         max_tokens=4096,
100 |         system=EVALUATION_PROMPT,
101 |         messages=messages,
102 |         tools=tools,
103 |     )
104 | 
105 |     messages.append({"role": "assistant", "content": response.content})
106 | 
107 |     tool_metrics = {}
108 | 
109 |     while response.stop_reason == "tool_use":
110 |         # Answer every tool_use block in the response; the API rejects the
111 |         # next request if any tool_use lacks a matching tool_result.
112 |         tool_results = []
113 |         for tool_use in [b for b in response.content if b.type == "tool_use"]:
114 |             tool_name, tool_input = tool_use.name, tool_use.input
115 |             tool_start_ts = time.time()
116 |             try:
117 |                 tool_result = await connection.call_tool(tool_name, tool_input)
118 |                 tool_response = json.dumps(tool_result) if isinstance(tool_result, (dict, list)) else str(tool_result)
119 |             except Exception as e:
120 |                 tool_response = f"Error executing tool {tool_name}: {str(e)}\n"
121 |                 tool_response += traceback.format_exc()
122 |             tool_duration = time.time() - tool_start_ts
123 | 
124 |             if tool_name not in tool_metrics:
125 |                 tool_metrics[tool_name] = {"count": 0, "durations": []}
126 |             tool_metrics[tool_name]["count"] += 1
127 |             tool_metrics[tool_name]["durations"].append(tool_duration)
128 | 
129 |             tool_results.append({
130 |                 "type": "tool_result",
131 |                 "tool_use_id": tool_use.id,
132 |                 "content": tool_response,
133 |             })
134 | 
135 |         messages.append({"role": "user", "content": tool_results})
136 | 
137 |         response = await asyncio.to_thread(
138 |             client.messages.create,
139 |             model=model,
140 |             max_tokens=4096,
141 |             system=EVALUATION_PROMPT,
142 |             messages=messages,
143 |             tools=tools,
144 |         )
145 |         messages.append({"role": "assistant", "content": response.content})
146 | 
147 |     response_text = next(
148 |         (block.text for block in response.content if hasattr(block, "text")),
149 |         "",
150 |     )
151 |     return response_text, tool_metrics
152 | 
153 | 
154 | async def evaluate_single_task(
155 |     client: Anthropic,
156 |     model: str,
157 |     qa_pair: dict[str, Any],
158 |     tools: list[dict[str, Any]],
159 |     connection: Any,
160 |     task_index: int,
161 | ) -> dict[str, Any]:
162 |     """Evaluate a single QA pair with the given tools."""
163 |     start_time = time.time()
164 | 
165 |     print(f"Task {task_index + 1}: Running task with question: {qa_pair['question']}")
166 |     response, tool_metrics = await agent_loop(client, model, qa_pair["question"], tools, connection)
167 | 
168 |     response_value = extract_xml_content(response, "response")
169 |     summary = extract_xml_content(response, "summary")
170 |     feedback = extract_xml_content(response, "feedback")
171 | 
172 |     duration_seconds = time.time() - start_time
173 | 
174 |     return {
175 |         "question": qa_pair["question"],
176 |         "expected": qa_pair["answer"],
177 |         "actual": response_value,
178 |         "score": int(response_value == qa_pair["answer"]) if response_value else 0,
179 |         "total_duration": duration_seconds,
180 |         "tool_calls": tool_metrics,
181 |         "num_tool_calls": sum(len(metrics["durations"]) for metrics in tool_metrics.values()),
182 |         "summary": summary,
183 |         "feedback": feedback,
184 |     }
185 | 
186 | 
187 | REPORT_HEADER = """
188 | # Evaluation Report
189 | 
190 | ## Summary
191 | 
192 | - **Accuracy**: {correct}/{total} ({accuracy:.1f}%)
193 | - **Average Task Duration**: {average_duration_s:.2f}s
194 | - **Average Tool Calls per Task**: {average_tool_calls:.2f}
195 | - **Total Tool Calls**: {total_tool_calls}
196 | 
197 | ---
198 | """
199 | 
200 | TASK_TEMPLATE = """
201 | ### Task {task_num}
202 | 
203 | **Question**: {question}
204 | **Ground Truth Answer**: `{expected_answer}`
205 | **Actual Answer**: `{actual_answer}`
206 | **Correct**: {correct_indicator}
207 | **Duration**: {total_duration:.2f}s
208 | **Tool Calls**: {tool_calls}
209 | 
210 | **Summary**
211 | {summary}
212 | 
213 | **Feedback**
214 | {feedback}
215 | 
216 | ---
217 | """
218 | 
219 | 
220 | async def run_evaluation(
221 |     eval_path: Path,
222 |     connection: Any,
223 |     model: str = "claude-3-7-sonnet-20250219",
224 | ) -> str:
225 |     """Run evaluation with MCP server tools."""
226 |     print("🚀 Starting Evaluation")
227 | 
228 |     client = Anthropic()
229 | 
230 |     tools = await connection.list_tools()
231 |     print(f"📋 Loaded {len(tools)} tools from MCP server")
232 | 
233 |     qa_pairs = parse_evaluation_file(eval_path)
234 |     print(f"📋 Loaded {len(qa_pairs)} evaluation tasks")
235 | 
236 |     results = []
237 |     for i, qa_pair in enumerate(qa_pairs):
238 |         print(f"Processing task {i + 1}/{len(qa_pairs)}")
239 |         result = await evaluate_single_task(client, model, qa_pair, tools, connection, i)
240 |         results.append(result)
241 | 
242 |     correct = sum(r["score"] for r in results)
243 |     accuracy = (correct / len(results)) * 100 if results else 0
244 |     average_duration_s = sum(r["total_duration"] for r in results) / len(results) if results else 0
245 |     average_tool_calls = sum(r["num_tool_calls"] for r in results) / len(results) if results else 0
246 |     total_tool_calls = sum(r["num_tool_calls"] for r in results)
247 | 
248 |     report = REPORT_HEADER.format(
249 |         correct=correct,
250 |         total=len(results),
251 |         accuracy=accuracy,
252 |         average_duration_s=average_duration_s,
253 |         average_tool_calls=average_tool_calls,
254 |         total_tool_calls=total_tool_calls,
255 |     )
256 | 
257 |     report += "".join([
258 |         TASK_TEMPLATE.format(
259 |             task_num=i + 1,
260 |             question=qa_pair["question"],
261 |             expected_answer=qa_pair["answer"],
262 |             actual_answer=result["actual"] or "N/A",
263 |             correct_indicator="✅" if result["score"] else "❌",
264 |             total_duration=result["total_duration"],
265 |             tool_calls=json.dumps(result["tool_calls"], indent=2),
266 |             summary=result["summary"] or "N/A",
267 |             feedback=result["feedback"] or "N/A",
268 |         )
269 |         for i, (qa_pair, result) in enumerate(zip(qa_pairs, results))
270 |     ])
271 | 
272 |     return report
273 | 
274 | 
275 | def parse_headers(header_list: list[str]) -> dict[str, str]:
276 |     """Parse header strings in format 'Key: Value' into a dictionary."""
277 |     headers = {}
278 |     if not header_list:
279 |         return headers
280 | 
281 |     for header in header_list:
282 |         if ":" in header:
283 |             key, value = header.split(":", 1)
284 |             headers[key.strip()] = value.strip()
285 |         else:
286 |             print(f"Warning: Ignoring malformed header: {header}")
287 |     return headers
288 | 
289 | 
290 | def parse_env_vars(env_list: list[str]) -> dict[str, str]:
291 |     """Parse environment variable strings in format 'KEY=VALUE' into a dictionary."""
292 |     env = {}
293 |     if not env_list:
294 |         return env
295 | 
296 |     for env_var in env_list:
297 |         if "=" in env_var:
298 |             key, value = env_var.split("=", 1)
299 |             env[key.strip()] = value.strip()
300 |         else:
301 |             print(f"Warning: Ignoring malformed environment variable: {env_var}")
302 |     return env
303 | 
304 | 
305 | async def main():
306 |     parser = argparse.ArgumentParser(
307 |         description="Evaluate MCP servers using test questions",
308 |         formatter_class=argparse.RawDescriptionHelpFormatter,
309 |         epilog="""
310 | Examples:
311 |   # Evaluate a local stdio MCP server
312 |   python evaluation.py -t stdio -c python -a my_server.py eval.xml
313 | 
314 |   # Evaluate an SSE MCP server
315 |   python evaluation.py -t sse -u https://example.com/mcp -H "Authorization: Bearer token" eval.xml
316 | 
317 |   # Evaluate an HTTP MCP server with custom model
318 |   python evaluation.py -t http -u https://example.com/mcp -m claude-3-5-sonnet-20241022 eval.xml
319 |         """,
320 |     )
321 | 
322 |     parser.add_argument("eval_file", type=Path, help="Path to evaluation XML file")
323 |     parser.add_argument("-t", "--transport", choices=["stdio", "sse", "http"], default="stdio", help="Transport type (default: stdio)")
324 |     parser.add_argument("-m", "--model", default="claude-3-7-sonnet-20250219", help="Claude model to use (default: claude-3-7-sonnet-20250219)")
325 | 
326 |     stdio_group = parser.add_argument_group("stdio options")
327 |     stdio_group.add_argument("-c", "--command", help="Command to run MCP server (stdio only)")
328 |     stdio_group.add_argument("-a", "--args", nargs="+", help="Arguments for the command (stdio only)")
329 |     stdio_group.add_argument("-e", "--env", nargs="+", help="Environment variables in KEY=VALUE format (stdio only)")
330 | 
331 |     remote_group = parser.add_argument_group("sse/http options")
332 |     remote_group.add_argument("-u", "--url", help="MCP server URL (sse/http only)")
333 |     remote_group.add_argument("-H", "--header", nargs="+", dest="headers", help="HTTP headers in 'Key: Value' format (sse/http only)")
334 | 
335 |     parser.add_argument("-o", "--output", type=Path, help="Output file for evaluation report (default: stdout)")
336 | 
337 |     args = parser.parse_args()
338 | 
339 |     if not args.eval_file.exists():
340 |         print(f"Error: Evaluation file not found: {args.eval_file}")
341 |         sys.exit(1)
342 | 
343 |     headers = parse_headers(args.headers) if args.headers else None
344 |     env_vars = parse_env_vars(args.env) if args.env else None
345 | 
346 |     try:
347 |         connection = create_connection(
348 |             transport=args.transport,
349 |             command=args.command,
350 |             args=args.args,
351 |             env=env_vars,
352 |             url=args.url,
353 |             headers=headers,
354 |         )
355 |     except ValueError as e:
356 |         print(f"Error: {e}")
357 |         sys.exit(1)
358 | 
359 |     print(f"🔗 Connecting to MCP server via {args.transport}...")
360 | 
361 |     async with connection:
362 |         print("✅ Connected successfully")
363 |         report = await run_evaluation(args.eval_file, connection, args.model)
364 | 
365 |         if args.output:
366 |             args.output.write_text(report)
367 |             print(f"\n✅ Report saved to {args.output}")
368 |         else:
369 |             print("\n" + report)
370 | 
371 | 
372 | if __name__ == "__main__":
373 |     asyncio.run(main())
374 | 
```

--------------------------------------------------------------------------------
/.claude/skills/mcp-builder/SKILL.md:
--------------------------------------------------------------------------------

```markdown
  1 | ---
  2 | name: mcp-builder
  3 | description: Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).
  4 | license: Complete terms in LICENSE.txt
  5 | ---
  6 | 
  7 | # MCP Server Development Guide
  8 | 
  9 | ## Overview
 10 | 
 11 | To create high-quality MCP (Model Context Protocol) servers that enable LLMs to effectively interact with external services, use this skill. An MCP server provides tools that allow LLMs to access external services and APIs. The quality of an MCP server is measured by how well it enables LLMs to accomplish real-world tasks using the tools provided.
 12 | 
 13 | ---
 14 | 
 15 | # Process
 16 | 
 17 | ## 🚀 High-Level Workflow
 18 | 
 19 | Creating a high-quality MCP server involves four main phases:
 20 | 
 21 | ### Phase 1: Deep Research and Planning
 22 | 
 23 | #### 1.1 Understand Agent-Centric Design Principles
 24 | 
 25 | Before diving into implementation, understand how to design tools for AI agents by reviewing these principles:
 26 | 
 27 | **Build for Workflows, Not Just API Endpoints:**
 28 | - Don't simply wrap existing API endpoints - build thoughtful, high-impact workflow tools
 29 | - Consolidate related operations (e.g., a `schedule_event` tool that both checks availability and creates the event; see the sketch below)
 30 | - Focus on tools that enable complete tasks, not just individual API calls
 31 | - Consider what workflows agents actually need to accomplish
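
As a rough sketch of such a consolidated tool (assuming FastMCP; the service, input model, and availability check are all placeholders):

```python
from mcp.server.fastmcp import FastMCP
from pydantic import BaseModel

mcp = FastMCP("calendar_mcp")  # hypothetical service

class ScheduleEventInput(BaseModel):
    title: str
    attendees: list[str]
    start: str  # ISO 8601, e.g. "2024-03-01T10:00:00Z"
    end: str

async def _conflicts(attendees: list[str], start: str, end: str) -> list[str]:
    """Placeholder for the real availability API call."""
    return []

@mcp.tool(name="calendar_schedule_event")
async def calendar_schedule_event(params: ScheduleEventInput) -> str:
    """Check attendee availability and create the event in a single call."""
    busy = await _conflicts(params.attendees, params.start, params.end)
    if busy:
        return f"Conflict for {', '.join(busy)}. Try a different time window."
    # Placeholder for the real event-creation API call.
    return f"Scheduled '{params.title}' from {params.start} to {params.end}."
```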
 32 | 
 33 | **Optimize for Limited Context:**
 34 | - Agents have constrained context windows - make every token count
 35 | - Return high-signal information, not exhaustive data dumps
 36 | - Provide "concise" vs "detailed" response format options
 37 | - Default to human-readable identifiers over technical codes (names over IDs)
 38 | - Consider the agent's context budget as a scarce resource
 39 | 
 40 | **Design Actionable Error Messages:**
 41 | - Error messages should guide agents toward correct usage patterns
 42 | - Suggest specific next steps: "Try using filter='active_only' to reduce results"
 43 | - Make errors educational, not just diagnostic
 44 | - Help agents learn proper tool usage through clear feedback
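
Concretely, a tool might convert an oversized result set into a next step (a sketch; `MAX_RESULTS`, the `filter` parameter, and the message wording are illustrative):

```python
MAX_RESULTS = 50  # illustrative cap

def guard_result_size(results: list) -> str | None:
    """Return an actionable error message when a result set is too large."""
    if len(results) > MAX_RESULTS:
        return (
            f"Found {len(results)} results, more than the {MAX_RESULTS} this tool can return. "
            "Try using filter='active_only' or a narrower date range to reduce results."
        )
    return None
```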
 45 | 
 46 | **Follow Natural Task Subdivisions:**
 47 | - Tool names should reflect how humans think about tasks
 48 | - Group related tools with consistent prefixes for discoverability
 49 | - Design tools around natural workflows, not just API structure
 50 | 
 51 | **Use Evaluation-Driven Development:**
 52 | - Create realistic evaluation scenarios early
 53 | - Let agent feedback drive tool improvements
 54 | - Prototype quickly and iterate based on actual agent performance
 55 | 
 56 | #### 1.2 Study MCP Protocol Documentation
 57 | 
 58 | **Fetch the latest MCP protocol documentation:**
 59 | 
 60 | Use WebFetch to load: `https://modelcontextprotocol.io/llms-full.txt`
 61 | 
 62 | This comprehensive document contains the complete MCP specification and guidelines.
 63 | 
 64 | #### 1.3 Study Framework Documentation
 65 | 
 66 | **Load and read the following reference files:**
 67 | 
 68 | - **MCP Best Practices**: [📋 View Best Practices](./reference/mcp_best_practices.md) - Core guidelines for all MCP servers
 69 | 
 70 | **For Python implementations, also load:**
 71 | - **Python SDK Documentation**: Use WebFetch to load `https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md`
 72 | - [🐍 Python Implementation Guide](./reference/python_mcp_server.md) - Python-specific best practices and examples
 73 | 
 74 | **For Node/TypeScript implementations, also load:**
 75 | - **TypeScript SDK Documentation**: Use WebFetch to load `https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md`
 76 | - [⚡ TypeScript Implementation Guide](./reference/node_mcp_server.md) - Node/TypeScript-specific best practices and examples
 77 | 
 78 | #### 1.4 Exhaustively Study API Documentation
 79 | 
 80 | To integrate a service, read through **ALL** available API documentation:
 81 | - Official API reference documentation
 82 | - Authentication and authorization requirements
 83 | - Rate limiting and pagination patterns
 84 | - Error responses and status codes
 85 | - Available endpoints and their parameters
 86 | - Data models and schemas
 87 | 
 88 | **To gather comprehensive information, use web search and the WebFetch tool as needed.**
 89 | 
 90 | #### 1.5 Create a Comprehensive Implementation Plan
 91 | 
 92 | Based on your research, create a detailed plan that includes:
 93 | 
 94 | **Tool Selection:**
 95 | - List the most valuable endpoints/operations to implement
 96 | - Prioritize tools that enable the most common and important use cases
 97 | - Consider which tools work together to enable complex workflows
 98 | 
 99 | **Shared Utilities and Helpers:**
100 | - Identify common API request patterns
101 | - Plan pagination helpers
102 | - Design filtering and formatting utilities
103 | - Plan error handling strategies
104 | 
105 | **Input/Output Design:**
106 | - Define input validation models (Pydantic for Python, Zod for TypeScript)
107 | - Design consistent response formats (e.g., JSON or Markdown), and configurable levels of detail (e.g., Detailed or Concise)
108 | - Plan for large-scale usage (thousands of users/resources)
109 | - Implement character limits and truncation strategies (e.g., 25,000 tokens)
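
One way the truncation strategy can look (a sketch; the constant's value and the message wording are placeholder choices):

```python
CHARACTER_LIMIT = 100_000  # rough stand-in for a ~25,000-token budget

def truncate_response(text: str, limit: int = CHARACTER_LIMIT) -> str:
    """Clip oversized tool output and tell the agent how to get the rest."""
    if len(text) <= limit:
        return text
    return text[:limit] + "\n[Output truncated. Narrow the query or request the next page.]"
```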
110 | 
111 | **Error Handling Strategy:**
112 | - Plan graceful failure modes
113 | - Design clear, actionable, LLM-friendly, natural language error messages which prompt further action
114 | - Consider rate limiting and timeout scenarios
115 | - Handle authentication and authorization errors
116 | 
117 | ---
118 | 
119 | ### Phase 2: Implementation
120 | 
121 | Now that you have a comprehensive plan, begin implementation following language-specific best practices.
122 | 
123 | #### 2.1 Set Up Project Structure
124 | 
125 | **For Python:**
126 | - Create a single `.py` file or organize into modules if complex (see [🐍 Python Guide](./reference/python_mcp_server.md))
127 | - Use the MCP Python SDK for tool registration
128 | - Define Pydantic models for input validation
129 | 
130 | **For Node/TypeScript:**
131 | - Create proper project structure (see [⚡ TypeScript Guide](./reference/node_mcp_server.md))
132 | - Set up `package.json` and `tsconfig.json`
133 | - Use MCP TypeScript SDK
134 | - Define Zod schemas for input validation
135 | 
136 | #### 2.2 Implement Core Infrastructure First
137 | 
138 | **To begin implementation, create shared utilities before implementing tools:**
139 | - API request helper functions
140 | - Error handling utilities
141 | - Response formatting functions (JSON and Markdown)
142 | - Pagination helpers
143 | - Authentication/token management
144 | 
145 | #### 2.3 Implement Tools Systematically
146 | 
147 | For each tool in the plan:
148 | 
149 | **Define Input Schema:**
150 | - Use Pydantic (Python) or Zod (TypeScript) for validation
151 | - Include proper constraints (min/max length, regex patterns, min/max values, ranges)
152 | - Provide clear, descriptive field descriptions
153 | - Include diverse examples in field descriptions
154 | 
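A minimal Pydantic sketch of those points (field names are illustrative):

```python
from pydantic import BaseModel, Field

class SearchUsersInput(BaseModel):
    query: str = Field(
        ...,
        min_length=2,
        max_length=200,
        description="Free-text search, e.g. 'jane doe' or 'jane@example.com'",
    )
    limit: int = Field(10, ge=1, le=100, description="Maximum number of results")
```
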
155 | **Write Comprehensive Docstrings/Descriptions:**
156 | - One-line summary of what the tool does
157 | - Detailed explanation of purpose and functionality
158 | - Explicit parameter types with examples
159 | - Complete return type schema
160 | - Usage examples (when to use, when not to use)
161 | - Error handling documentation, which outlines how to proceed given specific errors
162 | 
163 | **Implement Tool Logic:**
164 | - Use shared utilities to avoid code duplication
165 | - Follow async/await patterns for all I/O
166 | - Implement proper error handling
167 | - Support multiple response formats (JSON and Markdown)
168 | - Respect pagination parameters
169 | - Check character limits and truncate appropriately
170 | 
171 | **Add Tool Annotations:**
172 | - `readOnlyHint`: true (for read-only operations)
173 | - `destructiveHint`: false (for non-destructive operations)
174 | - `idempotentHint`: true (if repeated calls have same effect)
175 | - `openWorldHint`: true (if interacting with external systems)
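
With FastMCP these map onto the `annotations` argument of `@mcp.tool` (a sketch; the GitHub tool and its parameters are hypothetical):

```python
@mcp.tool(
    name="github_list_issues",
    annotations={
        "readOnlyHint": True,      # only reads issue data
        "destructiveHint": False,  # never modifies anything
        "idempotentHint": True,    # repeated calls return the same view
        "openWorldHint": True,     # talks to an external API
    },
)
async def github_list_issues(repo: str, state: str = "open") -> str:
    """List issues for a repository (body omitted in this sketch)."""
    ...
```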
176 | 
177 | #### 2.4 Follow Language-Specific Best Practices
178 | 
179 | **At this point, load the appropriate language guide:**
180 | 
181 | **For Python: Load [🐍 Python Implementation Guide](./reference/python_mcp_server.md) and ensure the following:**
182 | - Using MCP Python SDK with proper tool registration
183 | - Pydantic v2 models with `model_config`
184 | - Type hints throughout
185 | - Async/await for all I/O operations
186 | - Proper imports organization
187 | - Module-level constants (CHARACTER_LIMIT, API_BASE_URL)
188 | 
189 | **For Node/TypeScript: Load [⚡ TypeScript Implementation Guide](./reference/node_mcp_server.md) and ensure the following:**
190 | - Using `server.registerTool` properly
191 | - Zod schemas with `.strict()`
192 | - TypeScript strict mode enabled
193 | - No `any` types - use proper types
194 | - Explicit Promise<T> return types
195 | - Build process configured (`npm run build`)
196 | 
197 | ---
198 | 
199 | ### Phase 3: Review and Refine
200 | 
201 | After initial implementation:
202 | 
203 | #### 3.1 Code Quality Review
204 | 
205 | To ensure quality, review the code for:
206 | - **DRY Principle**: No duplicated code between tools
207 | - **Composability**: Shared logic extracted into functions
208 | - **Consistency**: Similar operations return similar formats
209 | - **Error Handling**: All external calls have error handling
210 | - **Type Safety**: Full type coverage (Python type hints, TypeScript types)
211 | - **Documentation**: Every tool has comprehensive docstrings/descriptions
212 | 
213 | #### 3.2 Test and Build
214 | 
215 | **Important:** MCP servers are long-running processes that wait for requests over stdio (stdin/stdout) or sse/http. Running them directly in your main process (e.g., `python server.py` or `node dist/index.js`) will cause your process to hang indefinitely.
216 | 
217 | **Safe ways to test the server:**
218 | - Use the evaluation harness (see Phase 4) - recommended approach
219 | - Run the server in tmux to keep it outside your main process
220 | - Use a timeout when testing: `timeout 5s python server.py`
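
For example, `tmux new-session -d -s mcp-server 'python server.py'` starts the server in a detached session (the session name here is arbitrary), leaving your main process free to run the evaluation harness against it.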
221 | 
222 | **For Python:**
223 | - Verify Python syntax: `python -m py_compile your_server.py`
224 | - Check imports work correctly by reviewing the file
225 | - To manually test: Run server in tmux, then test with evaluation harness in main process
226 | - Or use the evaluation harness directly (it manages the server for stdio transport)
227 | 
228 | **For Node/TypeScript:**
229 | - Run `npm run build` and ensure it completes without errors
230 | - Verify dist/index.js is created
231 | - To manually test: Run server in tmux, then test with evaluation harness in main process
232 | - Or use the evaluation harness directly (it manages the server for stdio transport)
233 | 
234 | #### 3.3 Use Quality Checklist
235 | 
236 | To verify implementation quality, load the appropriate checklist from the language-specific guide:
237 | - Python: see "Quality Checklist" in [🐍 Python Guide](./reference/python_mcp_server.md)
238 | - Node/TypeScript: see "Quality Checklist" in [⚡ TypeScript Guide](./reference/node_mcp_server.md)
239 | 
240 | ---
241 | 
242 | ### Phase 4: Create Evaluations
243 | 
244 | After implementing your MCP server, create comprehensive evaluations to test its effectiveness.
245 | 
246 | **Load [✅ Evaluation Guide](./reference/evaluation.md) for complete evaluation guidelines.**
247 | 
248 | #### 4.1 Understand Evaluation Purpose
249 | 
250 | Evaluations test whether LLMs can effectively use your MCP server to answer realistic, complex questions.
251 | 
252 | #### 4.2 Create 10 Evaluation Questions
253 | 
254 | To create effective evaluations, follow the process outlined in the evaluation guide:
255 | 
256 | 1. **Tool Inspection**: List available tools and understand their capabilities
257 | 2. **Content Exploration**: Use READ-ONLY operations to explore available data
258 | 3. **Question Generation**: Create 10 complex, realistic questions
259 | 4. **Answer Verification**: Solve each question yourself to verify answers
260 | 
261 | #### 4.3 Evaluation Requirements
262 | 
263 | Each question must be:
264 | - **Independent**: Not dependent on other questions
265 | - **Read-only**: Only non-destructive operations required
266 | - **Complex**: Requiring multiple tool calls and deep exploration
267 | - **Realistic**: Based on real use cases humans would care about
268 | - **Verifiable**: Single, clear answer that can be verified by string comparison
269 | - **Stable**: Answer won't change over time
270 | 
271 | #### 4.4 Output Format
272 | 
273 | Create an XML file with this structure:
274 | 
275 | ```xml
276 | <evaluation>
277 |   <qa_pair>
278 |     <question>Find discussions about AI model launches with animal codenames. One model needed a specific safety designation that uses the format ASL-X. What number X was being determined for the model named after a spotted wild cat?</question>
279 |     <answer>3</answer>
280 |   </qa_pair>
281 | <!-- More qa_pairs... -->
282 | </evaluation>
283 | ```
284 | 
285 | ---
286 | 
287 | # Reference Files
288 | 
289 | ## 📚 Documentation Library
290 | 
291 | Load these resources as needed during development:
292 | 
293 | ### Core MCP Documentation (Load First)
294 | - **MCP Protocol**: Fetch from `https://modelcontextprotocol.io/llms-full.txt` - Complete MCP specification
295 | - [📋 MCP Best Practices](./reference/mcp_best_practices.md) - Universal MCP guidelines including:
296 |   - Server and tool naming conventions
297 |   - Response format guidelines (JSON vs Markdown)
298 |   - Pagination best practices
299 |   - Character limits and truncation strategies
300 |   - Tool development guidelines
301 |   - Security and error handling standards
302 | 
303 | ### SDK Documentation (Load During Phase 1/2)
304 | - **Python SDK**: Fetch from `https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md`
305 | - **TypeScript SDK**: Fetch from `https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md`
306 | 
307 | ### Language-Specific Implementation Guides (Load During Phase 2)
308 | - [🐍 Python Implementation Guide](./reference/python_mcp_server.md) - Complete Python/FastMCP guide with:
309 |   - Server initialization patterns
310 |   - Pydantic model examples
311 |   - Tool registration with `@mcp.tool`
312 |   - Complete working examples
313 |   - Quality checklist
314 | 
315 | - [⚡ TypeScript Implementation Guide](./reference/node_mcp_server.md) - Complete TypeScript guide with:
316 |   - Project structure
317 |   - Zod schema patterns
318 |   - Tool registration with `server.registerTool`
319 |   - Complete working examples
320 |   - Quality checklist
321 | 
322 | ### Evaluation Guide (Load During Phase 4)
323 | - [✅ Evaluation Guide](./reference/evaluation.md) - Complete evaluation creation guide with:
324 |   - Question creation guidelines
325 |   - Answer verification strategies
326 |   - XML format specifications
327 |   - Example questions and answers
328 |   - Running an evaluation with the provided scripts
329 | 
```

--------------------------------------------------------------------------------
/.claude/skills/mcp-builder/reference/evaluation.md:
--------------------------------------------------------------------------------

```markdown
  1 | # MCP Server Evaluation Guide
  2 | 
  3 | ## Overview
  4 | 
  5 | This document provides guidance on creating comprehensive evaluations for MCP servers. Evaluations test whether LLMs can effectively use your MCP server to answer realistic, complex questions using only the tools provided.
  6 | 
  7 | ---
  8 | 
  9 | ## Quick Reference
 10 | 
 11 | ### Evaluation Requirements
 12 | - Create 10 human-readable questions
 13 | - Questions must be READ-ONLY, INDEPENDENT, NON-DESTRUCTIVE
 14 | - Each question requires multiple tool calls (potentially dozens)
 15 | - Answers must be single, verifiable values
 16 | - Answers must be STABLE (won't change over time)
 17 | 
 18 | ### Output Format
 19 | ```xml
 20 | <evaluation>
 21 |    <qa_pair>
 22 |       <question>Your question here</question>
 23 |       <answer>Single verifiable answer</answer>
 24 |    </qa_pair>
 25 | </evaluation>
 26 | ```
 27 | 
 28 | ---
 29 | 
 30 | ## Purpose of Evaluations
 31 | 
 32 | The measure of an MCP server's quality is NOT how well or comprehensively it implements tools, but how well those implementations (input/output schemas, docstrings/descriptions, functionality) enable an LLM with no other context and access ONLY to the MCP server to answer realistic and difficult questions.
 33 | 
 34 | ## Evaluation Overview
 35 | 
 36 | Create 10 human-readable questions requiring ONLY READ-ONLY, INDEPENDENT, NON-DESTRUCTIVE, and IDEMPOTENT operations to answer. Each question should be:
 37 | - Realistic
 38 | - Clear and concise
 39 | - Unambiguous
 40 | - Complex, requiring potentially dozens of tool calls or steps
 41 | - Answerable with a single, verifiable value that you identify in advance
 42 | 
 43 | ## Question Guidelines
 44 | 
 45 | ### Core Requirements
 46 | 
 47 | 1. **Questions MUST be independent**
 48 |    - Each question should NOT depend on the answer to any other question
 49 |    - Should not assume prior write operations from processing another question
 50 | 
 51 | 2. **Questions MUST require ONLY NON-DESTRUCTIVE AND IDEMPOTENT tool use**
 52 |    - Should not instruct or require modifying state to arrive at the correct answer
 53 | 
 54 | 3. **Questions must be REALISTIC, CLEAR, CONCISE, and COMPLEX**
 55 |    - Must require another LLM to use multiple (potentially dozens of) tools or steps to answer
 56 | 
 57 | ### Complexity and Depth
 58 | 
 59 | 4. **Questions must require deep exploration**
 60 |    - Consider multi-hop questions requiring multiple sub-questions and sequential tool calls
 61 |    - Each step should benefit from information found in previous steps
 62 | 
 63 | 5. **Questions may require extensive paging**
 64 |    - May need paging through multiple pages of results
 65 |    - May require querying old data (1-2 years old) to find niche information
 66 |    - The questions must be DIFFICULT
 67 | 
 68 | 6. **Questions must require deep understanding**
 69 |    - Rather than surface-level knowledge
 70 |    - May pose complex ideas as True/False questions requiring evidence
 71 |    - May use multiple-choice format where LLM must search different hypotheses
 72 | 
 73 | 7. **Questions must not be solvable with straightforward keyword search**
 74 |    - Do not include specific keywords from the target content
 75 |    - Use synonyms, related concepts, or paraphrases
 76 |    - Require multiple searches, analyzing multiple related items, extracting context, then deriving the answer
 77 | 
 78 | ### Tool Testing
 79 | 
 80 | 8. **Questions should stress-test tool return values**
 81 |    - May elicit tools returning large JSON objects or lists, overwhelming the LLM
 82 |    - Should require understanding multiple modalities of data:
 83 |      - IDs and names
 84 |      - Timestamps and datetimes (months, days, years, seconds)
 85 |      - File IDs, names, extensions, and mimetypes
 86 |      - URLs, GIDs, etc.
 87 |    - Should probe the tool's ability to return all useful forms of data
 88 | 
 89 | 9. **Questions should MOSTLY reflect real human use cases**
 90 |    - The kinds of information retrieval tasks that HUMANS assisted by an LLM would care about
 91 | 
 92 | 10. **Questions may require dozens of tool calls**
 93 |     - This challenges LLMs with limited context
 94 |     - Encourages MCP server tools to reduce information returned
 95 | 
 96 | 11. **Include ambiguous questions**
 97 |     - May be ambiguous OR require difficult decisions on which tools to call
 98 |     - Force the LLM to potentially make mistakes or misinterpret
 99 |     - Ensure that despite AMBIGUITY, there is STILL A SINGLE VERIFIABLE ANSWER
100 | 
101 | ### Stability
102 | 
103 | 12. **Questions must be designed so the answer DOES NOT CHANGE**
104 |     - Do not ask questions that rely on "current state" which is dynamic
105 |     - For example, do not count:
106 |       - Number of reactions to a post
107 |       - Number of replies to a thread
108 |       - Number of members in a channel
109 | 
110 | 13. **DO NOT let the MCP server RESTRICT the kinds of questions you create**
111 |     - Create challenging and complex questions
112 |     - Some may not be solvable with the available MCP server tools
113 |     - Questions may require specific output formats (datetime vs. epoch time, JSON vs. MARKDOWN)
114 |     - Questions may require dozens of tool calls to complete
115 | 
116 | ## Answer Guidelines
117 | 
118 | ### Verification
119 | 
120 | 1. **Answers must be VERIFIABLE via direct string comparison**
121 |    - If the answer can be re-written in many formats, clearly specify the output format in the QUESTION
122 |    - Examples: "Use YYYY/MM/DD.", "Respond True or False.", "Answer A, B, C, or D and nothing else."
123 |    - Answer should be a single VERIFIABLE value such as:
124 |      - User ID, user name, display name, first name, last name
125 |      - Channel ID, channel name
126 |      - Message ID, string
127 |      - URL, title
128 |      - Numerical quantity
129 |      - Timestamp, datetime
130 |      - Boolean (for True/False questions)
131 |      - Email address, phone number
132 |      - File ID, file name, file extension
133 |      - Multiple choice answer
134 |    - Answers must not require special formatting or complex, structured output
135 |    - Answer will be verified using DIRECT STRING COMPARISON
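
This mirrors how the bundled harness (`scripts/evaluation.py`) scores answers: the text extracted from the model's `<response>` tag must equal the expected answer exactly:

```python
# Simplified from scripts/evaluation.py: strict string equality
score = int(response_value == qa_pair["answer"]) if response_value else 0
```

So extra words, different casing, or a reformatted date all count as wrong; specify the expected format in the question.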
136 | 
137 | ### Readability
138 | 
139 | 2. **Answers should generally prefer HUMAN-READABLE formats**
140 |    - Examples: names, first name, last name, datetime, file name, message string, URL, yes/no, true/false, a/b/c/d
141 |    - Rather than opaque IDs (though IDs are acceptable)
142 |    - The VAST MAJORITY of answers should be human-readable
143 | 
144 | ### Stability
145 | 
146 | 3. **Answers must be STABLE/STATIONARY**
147 |    - Look at old content (e.g., conversations that have ended, projects that have launched, questions answered)
148 |    - Create QUESTIONS based on "closed" concepts that will always return the same answer
149 |    - Questions may ask to consider a fixed time window to insulate from non-stationary answers
150 |    - Rely on context UNLIKELY to change
151 |    - Example: if finding a paper name, be SPECIFIC enough so answer is not confused with papers published later
152 | 
153 | 4. **Answers must be CLEAR and UNAMBIGUOUS**
154 |    - Questions must be designed so there is a single, clear answer
155 |    - Answer can be derived from using the MCP server tools
156 | 
157 | ### Diversity
158 | 
159 | 5. **Answers must be DIVERSE**
160 |    - Answer should be a single VERIFIABLE value in diverse modalities and formats
161 |    - User concept: user ID, user name, display name, first name, last name, email address, phone number
162 |    - Channel concept: channel ID, channel name, channel topic
163 |    - Message concept: message ID, message string, timestamp, month, day, year
164 | 
165 | 6. **Answers must NOT be complex structures**
166 |    - Not a list of values
167 |    - Not a complex object
168 |    - Not a list of IDs or strings
169 |    - Not natural language text
170 |    - UNLESS the answer can be straightforwardly verified using DIRECT STRING COMPARISON
171 |    - And can be realistically reproduced
172 |    - It should be unlikely that an LLM would return the same list in any other order or format
173 | 
174 | ## Evaluation Process
175 | 
176 | ### Step 1: Documentation Inspection
177 | 
178 | Read the documentation of the target API to understand:
179 | - Available endpoints and functionality
180 | - If ambiguity exists, fetch additional information from the web
181 | - Parallelize this step AS MUCH AS POSSIBLE
182 | - Ensure each subagent is ONLY examining documentation from the file system or on the web
183 | 
184 | ### Step 2: Tool Inspection
185 | 
186 | List the tools available in the MCP server:
187 | - Inspect the MCP server directly
188 | - Understand input/output schemas, docstrings, and descriptions
189 | - WITHOUT calling the tools themselves at this stage
190 | 
191 | ### Step 3: Developing Understanding
192 | 
193 | Repeat steps 1 & 2 until you have a good understanding:
194 | - Iterate multiple times
195 | - Think about the kinds of tasks you want to create
196 | - Refine your understanding
197 | - At NO stage should you READ the code of the MCP server implementation itself
198 | - Use your intuition and understanding to create reasonable, realistic, but VERY challenging tasks
199 | 
200 | ### Step 4: Read-Only Content Inspection
201 | 
202 | After understanding the API and tools, USE the MCP server tools:
203 | - Inspect content using READ-ONLY and NON-DESTRUCTIVE operations ONLY
204 | - Goal: identify specific content (e.g., users, channels, messages, projects, tasks) for creating realistic questions
205 | - Should NOT call any tools that modify state
206 | - Will NOT read the code of the MCP server implementation itself
207 | - Parallelize this step with individual sub-agents pursuing independent explorations
208 | - Ensure each subagent is only performing READ-ONLY, NON-DESTRUCTIVE, and IDEMPOTENT operations
209 | - BE CAREFUL: SOME TOOLS may return LOTS OF DATA which would cause you to run out of CONTEXT
210 | - Make INCREMENTAL, SMALL, AND TARGETED tool calls for exploration
211 | - In all tool call requests, use the `limit` parameter to limit results (<10)
212 | - Use pagination
213 | 
214 | ### Step 5: Task Generation
215 | 
216 | After inspecting the content, create 10 human-readable questions:
217 | - An LLM should be able to answer these with the MCP server
218 | - Follow all question and answer guidelines above
219 | 
220 | ## Output Format
221 | 
222 | Each QA pair consists of a question and an answer. The output should be an XML file with this structure:
223 | 
224 | ```xml
225 | <evaluation>
226 |    <qa_pair>
227 |       <question>Find the project created in Q2 2024 with the highest number of completed tasks. What is the project name?</question>
228 |       <answer>Website Redesign</answer>
229 |    </qa_pair>
230 |    <qa_pair>
231 |       <question>Search for issues labeled as "bug" that were closed in March 2024. Which user closed the most issues? Provide their username.</question>
232 |       <answer>sarah_dev</answer>
233 |    </qa_pair>
234 |    <qa_pair>
235 |       <question>Look for pull requests that modified files in the /api directory and were merged between January 1 and January 31, 2024. How many different contributors worked on these PRs?</question>
236 |       <answer>7</answer>
237 |    </qa_pair>
238 |    <qa_pair>
239 |       <question>Find the repository with the most stars that was created before 2023. What is the repository name?</question>
240 |       <answer>data-pipeline</answer>
241 |    </qa_pair>
242 | </evaluation>
243 | ```
244 | 
245 | ## Evaluation Examples
246 | 
247 | ### Good Questions
248 | 
249 | **Example 1: Multi-hop question requiring deep exploration (GitHub MCP)**
250 | ```xml
251 | <qa_pair>
252 |    <question>Find the repository that was archived in Q3 2023 and had previously been the most forked project in the organization. What was the primary programming language used in that repository?</question>
253 |    <answer>Python</answer>
254 | </qa_pair>
255 | ```
256 | 
257 | This question is good because:
258 | - Requires multiple searches to find archived repositories
259 | - Needs to identify which had the most forks before archival
260 | - Requires examining repository details for the language
261 | - Answer is a simple, verifiable value
262 | - Based on historical (closed) data that won't change
263 | 
264 | **Example 2: Requires understanding context without keyword matching (Project Management MCP)**
265 | ```xml
266 | <qa_pair>
267 |    <question>Locate the initiative focused on improving customer onboarding that was completed in late 2023. The project lead created a retrospective document after completion. What was the lead's role title at that time?</question>
268 |    <answer>Product Manager</answer>
269 | </qa_pair>
270 | ```
271 | 
272 | This question is good because:
273 | - Doesn't use specific project name ("initiative focused on improving customer onboarding")
274 | - Requires finding completed projects from specific timeframe
275 | - Needs to identify the project lead and their role
276 | - Requires understanding context from retrospective documents
277 | - Answer is human-readable and stable
278 | - Based on completed work (won't change)
279 | 
280 | **Example 3: Complex aggregation requiring multiple steps (Issue Tracker MCP)**
281 | ```xml
282 | <qa_pair>
283 |    <question>Among all bugs reported in January 2024 that were marked as critical priority, which assignee resolved the highest percentage of their assigned bugs within 48 hours? Provide the assignee's username.</question>
284 |    <answer>alex_eng</answer>
285 | </qa_pair>
286 | ```
287 | 
288 | This question is good because:
289 | - Requires filtering bugs by date, priority, and status
290 | - Needs to group by assignee and calculate resolution rates
291 | - Requires understanding timestamps to determine 48-hour windows
292 | - Tests pagination (potentially many bugs to process)
293 | - Answer is a single username
294 | - Based on historical data from specific time period
295 | 
296 | **Example 4: Requires synthesis across multiple data types (CRM MCP)**
297 | ```xml
298 | <qa_pair>
299 |    <question>Find the account that upgraded from the Starter to Enterprise plan in Q4 2023 and had the highest annual contract value. What industry does this account operate in?</question>
300 |    <answer>Healthcare</answer>
301 | </qa_pair>
302 | ```
303 | 
304 | This question is good because:
305 | - Requires understanding subscription tier changes
306 | - Needs to identify upgrade events in specific timeframe
307 | - Requires comparing contract values
308 | - Must access account industry information
309 | - Answer is simple and verifiable
310 | - Based on completed historical transactions
311 | 
312 | ### Poor Questions
313 | 
314 | **Example 1: Answer changes over time**
315 | ```xml
316 | <qa_pair>
317 |    <question>How many open issues are currently assigned to the engineering team?</question>
318 |    <answer>47</answer>
319 | </qa_pair>
320 | ```
321 | 
322 | This question is poor because:
323 | - The answer will change as issues are created, closed, or reassigned
324 | - Not based on stable/stationary data
325 | - Relies on "current state" which is dynamic
326 | 
327 | **Example 2: Too easy with keyword search**
328 | ```xml
329 | <qa_pair>
330 |    <question>Find the pull request with title "Add authentication feature" and tell me who created it.</question>
331 |    <answer>developer123</answer>
332 | </qa_pair>
333 | ```
334 | 
335 | This question is poor because:
336 | - Can be solved with a straightforward keyword search for exact title
337 | - Doesn't require deep exploration or understanding
338 | - No synthesis or analysis needed
339 | 
340 | **Example 3: Ambiguous answer format**
341 | ```xml
342 | <qa_pair>
343 |    <question>List all the repositories that have Python as their primary language.</question>
344 |    <answer>repo1, repo2, repo3, data-pipeline, ml-tools</answer>
345 | </qa_pair>
346 | ```
347 | 
348 | This question is poor because:
349 | - Answer is a list that could be returned in any order
350 | - Difficult to verify with direct string comparison
351 | - LLM might format differently (JSON array, comma-separated, newline-separated)
352 | - Better to ask for a specific aggregate (count) or superlative (most stars)
353 | 
354 | ## Verification Process
355 | 
356 | After creating evaluations:
357 | 
358 | 1. **Examine the XML file** to understand the schema
359 | 2. **Load each task instruction** and, in parallel, attempt to solve each task YOURSELF using the MCP server and tools to identify the correct answer
360 | 3. **Flag any operations** that require WRITE or DESTRUCTIVE operations
361 | 4. **Accumulate all CORRECT answers** and replace any incorrect answers in the document
362 | 5. **Remove any `<qa_pair>`** that require WRITE or DESTRUCTIVE operations
363 | 
364 | Remember to parallelize solving tasks to avoid running out of context, then accumulate all answers and make changes to the file at the end.
365 | 
366 | ## Tips for Creating Quality Evaluations
367 | 
368 | 1. **Think Hard and Plan Ahead** before generating tasks
369 | 2. **Parallelize Where Opportunity Arises** to speed up the process and manage context
370 | 3. **Focus on Realistic Use Cases** that humans would actually want to accomplish
371 | 4. **Create Challenging Questions** that test the limits of the MCP server's capabilities
372 | 5. **Ensure Stability** by using historical data and closed concepts
373 | 6. **Verify Answers** by solving the questions yourself using the MCP server tools
374 | 7. **Iterate and Refine** based on what you learn during the process
375 | 
376 | ---
377 | 
378 | # Running Evaluations
379 | 
380 | After creating your evaluation file, you can use the provided evaluation harness to test your MCP server.
381 | 
382 | ## Setup
383 | 
384 | 1. **Install Dependencies**
385 | 
386 |    ```bash
387 |    pip install -r scripts/requirements.txt
388 |    ```
389 | 
390 |    Or install manually:
391 |    ```bash
392 |    pip install anthropic mcp
393 |    ```
394 | 
395 | 2. **Set API Key**
396 | 
397 |    ```bash
398 |    export ANTHROPIC_API_KEY=your_api_key_here
399 |    ```
400 | 
401 | ## Evaluation File Format
402 | 
403 | Evaluation files use XML format with `<qa_pair>` elements:
404 | 
405 | ```xml
406 | <evaluation>
407 |    <qa_pair>
408 |       <question>Find the project created in Q2 2024 with the highest number of completed tasks. What is the project name?</question>
409 |       <answer>Website Redesign</answer>
410 |    </qa_pair>
411 |    <qa_pair>
412 |       <question>Search for issues labeled as "bug" that were closed in March 2024. Which user closed the most issues? Provide their username.</question>
413 |       <answer>sarah_dev</answer>
414 |    </qa_pair>
415 | </evaluation>
416 | ```
417 | 
418 | ## Running Evaluations
419 | 
420 | The evaluation script (`scripts/evaluation.py`) supports three transport types:
421 | 
422 | **Important:**
423 | - **stdio transport**: The evaluation script automatically launches and manages the MCP server process for you. Do not run the server manually.
424 | - **sse/http transports**: You must start the MCP server separately before running the evaluation. The script connects to the already-running server at the specified URL.
425 | 
426 | ### 1. Local STDIO Server
427 | 
428 | For locally-run MCP servers (script launches the server automatically):
429 | 
430 | ```bash
431 | python scripts/evaluation.py \
432 |   -t stdio \
433 |   -c python \
434 |   -a my_mcp_server.py \
435 |   evaluation.xml
436 | ```
437 | 
438 | With environment variables:
439 | ```bash
440 | python scripts/evaluation.py \
441 |   -t stdio \
442 |   -c python \
443 |   -a my_mcp_server.py \
444 |   -e API_KEY=abc123 \
445 |   -e DEBUG=true \
446 |   evaluation.xml
447 | ```
448 | 
449 | ### 2. Server-Sent Events (SSE)
450 | 
451 | For SSE-based MCP servers (you must start the server first):
452 | 
453 | ```bash
454 | python scripts/evaluation.py \
455 |   -t sse \
456 |   -u https://example.com/mcp \
457 |   -H "Authorization: Bearer token123" \
458 |   -H "X-Custom-Header: value" \
459 |   evaluation.xml
460 | ```
461 | 
462 | ### 3. HTTP (Streamable HTTP)
463 | 
464 | For HTTP-based MCP servers (you must start the server first):
465 | 
466 | ```bash
467 | python scripts/evaluation.py \
468 |   -t http \
469 |   -u https://example.com/mcp \
470 |   -H "Authorization: Bearer token123" \
471 |   evaluation.xml
472 | ```
473 | 
474 | ## Command-Line Options
475 | 
476 | ```
477 | usage: evaluation.py [-h] [-t {stdio,sse,http}] [-m MODEL] [-c COMMAND]
478 |                      [-a ARGS [ARGS ...]] [-e ENV [ENV ...]] [-u URL]
479 |                      [-H HEADERS [HEADERS ...]] [-o OUTPUT]
480 |                      eval_file
481 | 
482 | positional arguments:
483 |   eval_file             Path to evaluation XML file
484 | 
485 | optional arguments:
486 |   -h, --help            Show help message
487 |   -t, --transport       Transport type: stdio, sse, or http (default: stdio)
488 |   -m, --model           Claude model to use (default: claude-3-7-sonnet-20250219)
489 |   -o, --output          Output file for report (default: print to stdout)
490 | 
491 | stdio options:
492 |   -c, --command         Command to run MCP server (e.g., python, node)
493 |   -a, --args            Arguments for the command (e.g., server.py)
494 |   -e, --env             Environment variables in KEY=VALUE format
495 | 
496 | sse/http options:
497 |   -u, --url             MCP server URL
498 |   -H, --header          HTTP headers in 'Key: Value' format
499 | ```
500 | 
501 | ## Output
502 | 
503 | The evaluation script generates a detailed report including:
504 | 
505 | - **Summary Statistics**:
506 |   - Accuracy (correct/total)
507 |   - Average task duration
508 |   - Average tool calls per task
509 |   - Total tool calls
510 | 
511 | - **Per-Task Results**:
512 |   - Prompt and expected response
513 |   - Actual response from the agent
514 |   - Whether the answer was correct (✅/❌)
515 |   - Duration and tool call details
516 |   - Agent's summary of its approach
517 |   - Agent's feedback on the tools
518 | 
519 | ### Save Report to File
520 | 
521 | ```bash
522 | python scripts/evaluation.py \
523 |   -t stdio \
524 |   -c python \
525 |   -a my_server.py \
526 |   -o evaluation_report.md \
527 |   evaluation.xml
528 | ```
529 | 
530 | ## Complete Example Workflow
531 | 
532 | Here's a complete example of creating and running an evaluation:
533 | 
534 | 1. **Create your evaluation file** (`my_evaluation.xml`):
535 | 
536 | ```xml
537 | <evaluation>
538 |    <qa_pair>
539 |       <question>Find the user who created the most issues in January 2024. What is their username?</question>
540 |       <answer>alice_developer</answer>
541 |    </qa_pair>
542 |    <qa_pair>
543 |       <question>Among all pull requests merged in Q1 2024, which repository had the highest number? Provide the repository name.</question>
544 |       <answer>backend-api</answer>
545 |    </qa_pair>
546 |    <qa_pair>
547 |       <question>Find the project that was completed in December 2023 and had the longest duration from start to finish. How many days did it take?</question>
548 |       <answer>127</answer>
549 |    </qa_pair>
550 | </evaluation>
551 | ```
552 | 
553 | 2. **Install dependencies**:
554 | 
555 | ```bash
556 | pip install -r scripts/requirements.txt
557 | export ANTHROPIC_API_KEY=your_api_key
558 | ```
559 | 
560 | 3. **Run evaluation**:
561 | 
562 | ```bash
563 | python scripts/evaluation.py \
564 |   -t stdio \
565 |   -c python \
566 |   -a github_mcp_server.py \
567 |   -e GITHUB_TOKEN=ghp_xxx \
568 |   -o github_eval_report.md \
569 |   my_evaluation.xml
570 | ```
571 | 
572 | 4. **Review the report** in `github_eval_report.md` to:
573 |    - See which questions passed/failed
574 |    - Read the agent's feedback on your tools
575 |    - Identify areas for improvement
576 |    - Iterate on your MCP server design
577 | 
578 | ## Troubleshooting
579 | 
580 | ### Connection Errors
581 | 
582 | If you get connection errors:
583 | - **STDIO**: Verify the command and arguments are correct
584 | - **SSE/HTTP**: Check the URL is accessible and headers are correct
585 | - Ensure any required API keys are set in environment variables or headers
586 | 
587 | ### Low Accuracy
588 | 
589 | If many evaluations fail:
590 | - Review the agent's feedback for each task
591 | - Check if tool descriptions are clear and comprehensive
592 | - Verify input parameters are well-documented
593 | - Consider whether tools return too much or too little data
594 | - Ensure error messages are actionable
595 | 
596 | ### Timeout Issues
597 | 
598 | If tasks are timing out:
599 | - Use a more capable model (e.g., `claude-3-7-sonnet-20250219`)
600 | - Check if tools are returning too much data
601 | - Verify pagination is working correctly
602 | - Consider simplifying complex questions
```

--------------------------------------------------------------------------------
/.claude/skills/mcp-builder/reference/python_mcp_server.md:
--------------------------------------------------------------------------------

```markdown
  1 | # Python MCP Server Implementation Guide
  2 | 
  3 | ## Overview
  4 | 
  5 | This document provides Python-specific best practices and examples for implementing MCP servers using the MCP Python SDK. It covers server setup, tool registration patterns, input validation with Pydantic, error handling, and complete working examples.
  6 | 
  7 | ---
  8 | 
  9 | ## Quick Reference
 10 | 
 11 | ### Key Imports
 12 | ```python
 13 | from mcp.server.fastmcp import FastMCP
 14 | from pydantic import BaseModel, Field, field_validator, ConfigDict
 15 | from typing import Optional, List, Dict, Any
 16 | from enum import Enum
 17 | import httpx
 18 | ```
 19 | 
 20 | ### Server Initialization
 21 | ```python
 22 | mcp = FastMCP("service_mcp")
 23 | ```
 24 | 
 25 | ### Tool Registration Pattern
 26 | ```python
 27 | @mcp.tool(name="tool_name", annotations={...})
 28 | async def tool_function(params: InputModel) -> str:
 29 |     # Implementation
 30 |     pass
 31 | ```
 32 | 
 33 | ---
 34 | 
 35 | ## MCP Python SDK and FastMCP
 36 | 
 37 | The official MCP Python SDK provides FastMCP, a high-level framework for building MCP servers. It provides:
 38 | - Automatic description and inputSchema generation from function signatures and docstrings
 39 | - Pydantic model integration for input validation
 40 | - Decorator-based tool registration with `@mcp.tool`
 41 | 
 42 | **For complete SDK documentation, use WebFetch to load:**
 43 | `https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md`
 44 | 
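Putting these pieces together, a minimal runnable server looks like this (the `echo` tool is purely illustrative):

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("example_mcp")

@mcp.tool()
async def echo(text: str) -> str:
    '''Return the given text unchanged.'''
    return text

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```
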
 45 | ## Server Naming Convention
 46 | 
 47 | Python MCP servers must follow this naming pattern:
 48 | - **Format**: `{service}_mcp` (lowercase with underscores)
 49 | - **Examples**: `github_mcp`, `jira_mcp`, `stripe_mcp`
 50 | 
 51 | The name should be:
 52 | - General (not tied to specific features)
 53 | - Descriptive of the service/API being integrated
 54 | - Easy to infer from the task description
 55 | - Without version numbers or dates
 56 | 
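For example:

```python
mcp = FastMCP("github_mcp")   # good: general, lowercase, service-scoped
# Avoid: "github_issue_search_mcp_v2" (feature-specific and versioned)
```
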
 57 | ## Tool Implementation
 58 | 
 59 | ### Tool Naming
 60 | 
 61 | Use snake_case for tool names (e.g., "search_users", "create_project", "get_channel_info") with clear, action-oriented names.
 62 | 
 63 | **Avoid Naming Conflicts**: Include the service context to prevent overlaps:
 64 | - Use "slack_send_message" instead of just "send_message"
 65 | - Use "github_create_issue" instead of just "create_issue"
 66 | - Use "asana_list_tasks" instead of just "list_tasks"
 67 | 
 68 | ### Tool Structure with FastMCP
 69 | 
 70 | Tools are defined using the `@mcp.tool` decorator with Pydantic models for input validation:
 71 | 
 72 | ```python
 73 | from pydantic import BaseModel, Field, ConfigDict
 74 | from mcp.server.fastmcp import FastMCP
 75 | 
 76 | # Initialize the MCP server
 77 | mcp = FastMCP("example_mcp")
 78 | 
 79 | # Define Pydantic model for input validation
 80 | class ServiceToolInput(BaseModel):
 81 |     '''Input model for service tool operation.'''
 82 |     model_config = ConfigDict(
 83 |         str_strip_whitespace=True,  # Auto-strip whitespace from strings
 84 |         validate_assignment=True,    # Validate on assignment
 85 |         extra='forbid'              # Forbid extra fields
 86 |     )
 87 | 
 88 |     param1: str = Field(..., description="First parameter description (e.g., 'user123', 'project-abc')", min_length=1, max_length=100)
 89 |     param2: Optional[int] = Field(default=None, description="Optional integer parameter with constraints", ge=0, le=1000)
 90 |     tags: Optional[List[str]] = Field(default_factory=list, description="List of tags to apply", max_length=10)
 91 | 
 92 | @mcp.tool(
 93 |     name="service_tool_name",
 94 |     annotations={
 95 |         "title": "Human-Readable Tool Title",
 96 |         "readOnlyHint": True,     # Tool does not modify environment
 97 |         "destructiveHint": False,  # Tool does not perform destructive operations
 98 |         "idempotentHint": True,    # Repeated calls have no additional effect
 99 |         "openWorldHint": False     # Tool does not interact with external entities
100 |     }
101 | )
102 | async def service_tool_name(params: ServiceToolInput) -> str:
103 |     '''Tool description automatically becomes the 'description' field.
104 | 
105 |     This tool performs a specific operation on the service. It validates all inputs
106 |     using the ServiceToolInput Pydantic model before processing.
107 | 
108 |     Args:
109 |         params (ServiceToolInput): Validated input parameters containing:
110 |             - param1 (str): First parameter description
111 |             - param2 (Optional[int]): Optional parameter with default
112 |             - tags (Optional[List[str]]): List of tags
113 | 
114 |     Returns:
115 |         str: JSON-formatted response containing operation results
116 |     '''
117 |     # Implementation here
118 |     pass
119 | ```
120 | 
121 | ## Pydantic v2 Key Features
122 | 
123 | - Use `model_config` instead of nested `Config` class
124 | - Use `field_validator` instead of deprecated `validator`
125 | - Use `model_dump()` instead of deprecated `dict()`
126 | - Validators require `@classmethod` decorator
127 | - Type hints are required for validator methods
128 | 
129 | ```python
130 | from pydantic import BaseModel, Field, field_validator, ConfigDict
131 | 
132 | class CreateUserInput(BaseModel):
133 |     model_config = ConfigDict(
134 |         str_strip_whitespace=True,
135 |         validate_assignment=True
136 |     )
137 | 
138 |     name: str = Field(..., description="User's full name", min_length=1, max_length=100)
139 |     email: str = Field(..., description="User's email address", pattern=r'^[\w\.-]+@[\w\.-]+\.\w+$')
140 |     age: int = Field(..., description="User's age", ge=0, le=150)
141 | 
142 |     @field_validator('email')
143 |     @classmethod
144 |     def validate_email(cls, v: str) -> str:
145 |         if not v.strip():
146 |             raise ValueError("Email cannot be empty")
147 |         return v.lower()
148 | ```
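
For example, serializing a validated model uses `model_dump()` rather than the deprecated `dict()`:

```python
user = CreateUserInput(name="Jane Doe", email="jane@example.com", age=34)
payload = user.model_dump()  # {'name': 'Jane Doe', 'email': 'jane@example.com', 'age': 34}
```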
149 | 
150 | ## Response Format Options
151 | 
152 | Support multiple output formats for flexibility:
153 | 
154 | ```python
155 | from enum import Enum
156 | 
157 | class ResponseFormat(str, Enum):
158 |     '''Output format for tool responses.'''
159 |     MARKDOWN = "markdown"
160 |     JSON = "json"
161 | 
162 | class UserSearchInput(BaseModel):
163 |     query: str = Field(..., description="Search query")
164 |     response_format: ResponseFormat = Field(
165 |         default=ResponseFormat.MARKDOWN,
166 |         description="Output format: 'markdown' for human-readable or 'json' for machine-readable"
167 |     )
168 | ```
169 | 
170 | **Markdown format** (see the sketch after these lists):
171 | - Use headers, lists, and formatting for clarity
172 | - Convert timestamps to human-readable format (e.g., "2024-01-15 10:30:00 UTC" instead of epoch)
173 | - Show display names with IDs in parentheses (e.g., "@john.doe (U123456)")
174 | - Omit verbose metadata (e.g., show only one profile image URL, not all sizes)
175 | - Group related information logically
176 | 
177 | **JSON format**:
178 | - Return complete, structured data suitable for programmatic processing
179 | - Include all available fields and metadata
180 | - Use consistent field names and types
181 | 
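A minimal sketch of these conventions (the helper name and field layout are illustrative):

```python
from datetime import datetime, timezone

def format_user_line(user: dict) -> str:
    # Epoch seconds -> "YYYY-MM-DD HH:MM:SS UTC"; display name with ID in parentheses
    ts = datetime.fromtimestamp(user["created_at"], tz=timezone.utc)
    return f"@{user['name']} ({user['id']}) - joined {ts.strftime('%Y-%m-%d %H:%M:%S')} UTC"
```
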
182 | ## Pagination Implementation
183 | 
184 | For tools that list resources:
185 | 
186 | ```python
187 | class ListInput(BaseModel):
188 |     limit: Optional[int] = Field(default=20, description="Maximum results to return", ge=1, le=100)
189 |     offset: Optional[int] = Field(default=0, description="Number of results to skip for pagination", ge=0)
190 | 
191 | async def list_items(params: ListInput) -> str:
192 |     # Make API request with pagination
193 |     data = await api_request(limit=params.limit, offset=params.offset)
194 | 
195 |     # Return pagination info
196 |     response = {
197 |         "total": data["total"],
198 |         "count": len(data["items"]),
199 |         "offset": params.offset,
200 |         "items": data["items"],
201 |         "has_more": data["total"] > params.offset + len(data["items"]),
202 |         "next_offset": params.offset + len(data["items"]) if data["total"] > params.offset + len(data["items"]) else None
203 |     }
204 |     return json.dumps(response, indent=2)
205 | ```
206 | 
207 | ## Character Limits and Truncation
208 | 
209 | Add a CHARACTER_LIMIT constant to prevent overwhelming responses:
210 | 
211 | ```python
212 | # At module level
213 | CHARACTER_LIMIT = 25000  # Maximum response size in characters
214 | 
215 | async def search_tool(params: SearchInput) -> str:
216 |     result = generate_response(data)
217 | 
218 |     # Check character limit and truncate if needed
219 |     if len(result) > CHARACTER_LIMIT:
220 |         # Truncate data and add notice
221 |         truncated_data = data[:max(1, len(data) // 2)]
222 |         response["data"] = truncated_data
223 |         response["truncated"] = True
224 |         response["truncation_message"] = (
225 |             f"Response truncated from {len(data)} to {len(truncated_data)} items. "
226 |             f"Use 'offset' parameter or add filters to see more results."
227 |         )
228 |         result = json.dumps(response, indent=2)
229 | 
230 |     return result
231 | ```
232 | 
233 | ## Error Handling
234 | 
235 | Provide clear, actionable error messages:
236 | 
237 | ```python
238 | def _handle_api_error(e: Exception) -> str:
239 |     '''Consistent error formatting across all tools.'''
240 |     if isinstance(e, httpx.HTTPStatusError):
241 |         if e.response.status_code == 404:
242 |             return "Error: Resource not found. Please check the ID is correct."
243 |         elif e.response.status_code == 403:
244 |             return "Error: Permission denied. You don't have access to this resource."
245 |         elif e.response.status_code == 429:
246 |             return "Error: Rate limit exceeded. Please wait before making more requests."
247 |         return f"Error: API request failed with status {e.response.status_code}"
248 |     elif isinstance(e, httpx.TimeoutException):
249 |         return "Error: Request timed out. Please try again."
250 |     return f"Error: Unexpected error occurred: {type(e).__name__}"
251 | ```
252 | 
253 | ## Shared Utilities
254 | 
255 | Extract common functionality into reusable functions:
256 | 
257 | ```python
258 | # Shared API request function
259 | async def _make_api_request(endpoint: str, method: str = "GET", **kwargs) -> dict:
260 |     '''Reusable function for all API calls.'''
261 |     async with httpx.AsyncClient() as client:
262 |         response = await client.request(
263 |             method,
264 |             f"{API_BASE_URL}/{endpoint}",
265 |             timeout=30.0,
266 |             **kwargs
267 |         )
268 |         response.raise_for_status()
269 |         return response.json()
270 | ```
271 | 
272 | ## Async/Await Best Practices
273 | 
274 | Always use async/await for network requests and I/O operations:
275 | 
276 | ```python
277 | # Good: Async network request
278 | async def fetch_data(resource_id: str) -> dict:
279 |     async with httpx.AsyncClient() as client:
280 |         response = await client.get(f"{API_URL}/resource/{resource_id}")
281 |         response.raise_for_status()
282 |         return response.json()
283 | 
284 | # Bad: Synchronous request
285 | def fetch_data(resource_id: str) -> dict:
286 |     response = requests.get(f"{API_URL}/resource/{resource_id}")  # Blocks
287 |     return response.json()
288 | ```
289 | 
290 | ## Type Hints
291 | 
292 | Use type hints throughout:
293 | 
294 | ```python
295 | from typing import Optional, List, Dict, Any
296 | 
297 | async def get_user(user_id: str) -> Dict[str, Any]:
298 |     data = await fetch_user(user_id)
299 |     return {"id": data["id"], "name": data["name"]}
300 | ```
301 | 
302 | ## Tool Docstrings
303 | 
304 | Every tool must have comprehensive docstrings with explicit type information:
305 | 
306 | ```python
307 | async def search_users(params: UserSearchInput) -> str:
308 |     '''
309 |     Search for users in the Example system by name, email, or team.
310 | 
311 |     This tool searches across all user profiles in the Example platform,
312 |     supporting partial matches and various search filters. It does NOT
313 |     create or modify users, only searches existing ones.
314 | 
315 |     Args:
316 |         params (UserSearchInput): Validated input parameters containing:
317 |             - query (str): Search string to match against names/emails (e.g., "john", "@example.com", "team:marketing")
318 |             - limit (Optional[int]): Maximum results to return, between 1-100 (default: 20)
319 |             - offset (Optional[int]): Number of results to skip for pagination (default: 0)
320 | 
321 |     Returns:
322 |         str: JSON-formatted string containing search results with the following schema:
323 | 
324 |         Success response:
325 |         {
326 |             "total": int,           # Total number of matches found
327 |             "count": int,           # Number of results in this response
328 |             "offset": int,          # Current pagination offset
329 |             "users": [
330 |                 {
331 |                     "id": str,      # User ID (e.g., "U123456789")
332 |                     "name": str,    # Full name (e.g., "John Doe")
333 |                     "email": str,   # Email address (e.g., "[email protected]")
334 |                     "team": str     # Team name (e.g., "Marketing") - optional
335 |                 }
336 |             ]
337 |         }
338 | 
339 |         Error response:
340 |         "Error: <error message>" or "No users found matching '<query>'"
341 | 
342 |     Examples:
343 |         - Use when: "Find all marketing team members" -> params with query="team:marketing"
344 |         - Use when: "Search for John's account" -> params with query="john"
345 |         - Don't use when: You need to create a user (use example_create_user instead)
346 |         - Don't use when: You have a user ID and need full details (use example_get_user instead)
347 | 
348 |     Error Handling:
349 |         - Input validation errors are handled by Pydantic model
350 |         - Returns "Error: Rate limit exceeded" if too many requests (429 status)
351 |         - Returns "Error: Invalid API authentication" if API key is invalid (401 status)
352 |         - Returns formatted list of results or "No users found matching 'query'"
353 |     '''
354 | ```
355 | 
356 | ## Complete Example
357 | 
358 | See below for a complete Python MCP server example:
359 | 
360 | ```python
361 | #!/usr/bin/env python3
362 | '''
363 | MCP Server for Example Service.
364 | 
365 | This server provides tools to interact with Example API, including user search,
366 | project management, and data export capabilities.
367 | '''
368 | 
369 | from typing import Optional, List, Dict, Any
370 | from enum import Enum
371 | import httpx
372 | from pydantic import BaseModel, Field, field_validator, ConfigDict
373 | from mcp.server.fastmcp import FastMCP
374 | 
375 | # Initialize the MCP server
376 | mcp = FastMCP("example_mcp")
377 | 
378 | # Constants
379 | API_BASE_URL = "https://api.example.com/v1"
380 | CHARACTER_LIMIT = 25000  # Maximum response size in characters
381 | 
382 | # Enums
383 | class ResponseFormat(str, Enum):
384 |     '''Output format for tool responses.'''
385 |     MARKDOWN = "markdown"
386 |     JSON = "json"
387 | 
388 | # Pydantic Models for Input Validation
389 | class UserSearchInput(BaseModel):
390 |     '''Input model for user search operations.'''
391 |     model_config = ConfigDict(
392 |         str_strip_whitespace=True,
393 |         validate_assignment=True
394 |     )
395 | 
396 |     query: str = Field(..., description="Search string to match against names/emails", min_length=2, max_length=200)
397 |     limit: Optional[int] = Field(default=20, description="Maximum results to return", ge=1, le=100)
398 |     offset: Optional[int] = Field(default=0, description="Number of results to skip for pagination", ge=0)
399 |     response_format: ResponseFormat = Field(default=ResponseFormat.MARKDOWN, description="Output format")
400 | 
401 |     @field_validator('query')
402 |     @classmethod
403 |     def validate_query(cls, v: str) -> str:
404 |         if not v.strip():
405 |             raise ValueError("Query cannot be empty or whitespace only")
406 |         return v.strip()
407 | 
408 | # Shared utility functions
409 | async def _make_api_request(endpoint: str, method: str = "GET", **kwargs) -> dict:
410 |     '''Reusable function for all API calls.'''
411 |     async with httpx.AsyncClient() as client:
412 |         response = await client.request(
413 |             method,
414 |             f"{API_BASE_URL}/{endpoint}",
415 |             timeout=30.0,
416 |             **kwargs
417 |         )
418 |         response.raise_for_status()
419 |         return response.json()
420 | 
421 | def _handle_api_error(e: Exception) -> str:
422 |     '''Consistent error formatting across all tools.'''
423 |     if isinstance(e, httpx.HTTPStatusError):
424 |         if e.response.status_code == 404:
425 |             return "Error: Resource not found. Please check the ID is correct."
426 |         elif e.response.status_code == 403:
427 |             return "Error: Permission denied. You don't have access to this resource."
428 |         elif e.response.status_code == 429:
429 |             return "Error: Rate limit exceeded. Please wait before making more requests."
430 |         return f"Error: API request failed with status {e.response.status_code}"
431 |     elif isinstance(e, httpx.TimeoutException):
432 |         return "Error: Request timed out. Please try again."
433 |     return f"Error: Unexpected error occurred: {type(e).__name__}"
434 | 
435 | # Tool definitions
436 | @mcp.tool(
437 |     name="example_search_users",
438 |     annotations={
439 |         "title": "Search Example Users",
440 |         "readOnlyHint": True,
441 |         "destructiveHint": False,
442 |         "idempotentHint": True,
443 |         "openWorldHint": True
444 |     }
445 | )
446 | async def example_search_users(params: UserSearchInput) -> str:
447 |     '''Search for users in the Example system by name, email, or team.
448 | 
449 |     [Full docstring as shown above]
450 |     '''
451 |     try:
452 |         # Make API request using validated parameters
453 |         data = await _make_api_request(
454 |             "users/search",
455 |             params={
456 |                 "q": params.query,
457 |                 "limit": params.limit,
458 |                 "offset": params.offset
459 |             }
460 |         )
461 | 
462 |         users = data.get("users", [])
463 |         total = data.get("total", 0)
464 | 
465 |         if not users:
466 |             return f"No users found matching '{params.query}'"
467 | 
468 |         # Format response based on requested format
469 |         if params.response_format == ResponseFormat.MARKDOWN:
470 |             lines = [f"# User Search Results: '{params.query}'", ""]
471 |             lines.append(f"Found {total} users (showing {len(users)})")
472 |             lines.append("")
473 | 
474 |             for user in users:
475 |                 lines.append(f"## {user['name']} ({user['id']})")
476 |                 lines.append(f"- **Email**: {user['email']}")
477 |                 if user.get('team'):
478 |                     lines.append(f"- **Team**: {user['team']}")
479 |                 lines.append("")
480 | 
481 |             return "\n".join(lines)
482 | 
483 |         else:
484 |             # Machine-readable JSON format
485 |             import json
486 |             response = {
487 |                 "total": total,
488 |                 "count": len(users),
489 |                 "offset": params.offset,
490 |                 "users": users
491 |             }
492 |             return json.dumps(response, indent=2)
493 | 
494 |     except Exception as e:
495 |         return _handle_api_error(e)
496 | 
497 | if __name__ == "__main__":
498 |     mcp.run()
499 | ```
500 | 
501 | ---
502 | 
503 | ## Advanced FastMCP Features
504 | 
505 | ### Context Parameter Injection
506 | 
507 | FastMCP can automatically inject a `Context` parameter into tools for advanced capabilities like logging, progress reporting, resource reading, and user interaction:
508 | 
509 | ```python
510 | from mcp.server.fastmcp import FastMCP, Context
511 | 
512 | mcp = FastMCP("example_mcp")
513 | 
514 | @mcp.tool()
515 | async def advanced_search(query: str, ctx: Context) -> str:
516 |     '''Advanced tool with context access for logging and progress.'''
517 | 
518 |     # Report progress for long operations
519 |     await ctx.report_progress(0.25, "Starting search...")
520 | 
521 |     # Log information for debugging
522 |     await ctx.log_info("Processing query", {"query": query})  # attach extra context as needed
523 | 
524 |     # Perform search
525 |     results = await search_api(query)
526 |     await ctx.report_progress(0.75, "Formatting results...")
527 | 
528 |     # Access server configuration
529 |     server_name = ctx.fastmcp.name
530 | 
531 |     return format_results(results)
532 | 
533 | @mcp.tool()
534 | async def interactive_tool(resource_id: str, ctx: Context) -> str:
535 |     '''Tool that can request additional input from users.'''
536 | 
537 |     # Request sensitive information when needed
538 |     api_key = await ctx.elicit(
539 |         prompt="Please provide your API key:",
540 |         input_type="password"
541 |     )
542 | 
543 |     # Use the provided key
544 |     return await api_call(resource_id, api_key)
545 | ```
546 | 
547 | **Context capabilities:**
548 | - `ctx.report_progress(progress, message)` - Report progress for long operations
549 | - `ctx.log_info(message, data)` / `ctx.log_error()` / `ctx.log_debug()` - Logging
550 | - `ctx.elicit(prompt, input_type)` - Request input from users
551 | - `ctx.fastmcp.name` - Access server configuration
552 | - `ctx.read_resource(uri)` - Read MCP resources
553 | 
554 | ### Resource Registration
555 | 
556 | Expose data as resources for efficient, template-based access:
557 | 
558 | ```python
559 | @mcp.resource("file://documents/{name}")
560 | async def get_document(name: str) -> str:
561 |     '''Expose documents as MCP resources.
562 | 
563 |     Resources are useful for static or semi-static data that doesn't
564 |     require complex parameters. They use URI templates for flexible access.
565 |     '''
566 |     document_path = f"./docs/{name}"
567 |     with open(document_path, "r") as f:
568 |         return f.read()
569 | 
570 | @mcp.resource("config://settings/{key}")
571 | async def get_setting(key: str, ctx: Context) -> str:
572 |     '''Expose configuration as resources with context.'''
573 |     settings = await load_settings()
574 |     return json.dumps(settings.get(key, {}))
575 | ```
576 | 
577 | **When to use Resources vs Tools:**
578 | - **Resources**: For data access with simple parameters (URI templates)
579 | - **Tools**: For complex operations with validation and business logic
580 | 
581 | ### Structured Output Types
582 | 
583 | FastMCP supports multiple return types beyond strings:
584 | 
585 | ```python
586 | from typing import TypedDict
587 | from dataclasses import dataclass
588 | from pydantic import BaseModel
589 | 
590 | # TypedDict for structured returns
591 | class UserData(TypedDict):
592 |     id: str
593 |     name: str
594 |     email: str
595 | 
596 | @mcp.tool()
597 | async def get_user_typed(user_id: str) -> UserData:
598 |     '''Returns structured data - FastMCP handles serialization.'''
599 |     return {"id": user_id, "name": "John Doe", "email": "[email protected]"}
600 | 
601 | # Pydantic models for complex validation
602 | class DetailedUser(BaseModel):
603 |     id: str
604 |     name: str
605 |     email: str
606 |     created_at: datetime
607 |     metadata: Dict[str, Any]
608 | 
609 | @mcp.tool()
610 | async def get_user_detailed(user_id: str) -> DetailedUser:
611 |     '''Returns Pydantic model - automatically generates schema.'''
612 |     user = await fetch_user(user_id)
613 |     return DetailedUser(**user)
614 | ```
615 | 
616 | ### Lifespan Management
617 | 
618 | Initialize resources that persist across requests:
619 | 
620 | ```python
621 | from contextlib import asynccontextmanager
622 | 
623 | @asynccontextmanager
624 | async def app_lifespan():
625 |     '''Manage resources that live for the server's lifetime.'''
626 |     # Initialize connections, load config, etc.
627 |     db = await connect_to_database()
628 |     config = load_configuration()
629 | 
630 |     # Make available to all tools
631 |     yield {"db": db, "config": config}
632 | 
633 |     # Cleanup on shutdown
634 |     await db.close()
635 | 
636 | mcp = FastMCP("example_mcp", lifespan=app_lifespan)
637 | 
638 | @mcp.tool()
639 | async def query_data(query: str, ctx: Context) -> str:
640 |     '''Access lifespan resources through context.'''
641 |     db = ctx.request_context.lifespan_state["db"]
642 |     results = await db.query(query)
643 |     return format_results(results)
644 | ```
645 | 
646 | ### Multiple Transport Options
647 | 
648 | FastMCP supports different transport mechanisms:
649 | 
650 | ```python
651 | # Default: Stdio transport (for CLI tools)
652 | if __name__ == "__main__":
653 |     mcp.run()
654 | 
655 | # HTTP transport (for web services)
656 | if __name__ == "__main__":
657 |     mcp.run(transport="streamable_http", port=8000)
658 | 
659 | # SSE transport (for real-time updates)
660 | if __name__ == "__main__":
661 |     mcp.run(transport="sse", port=8000)
662 | ```
663 | 
664 | **Transport selection:**
665 | - **Stdio**: Command-line tools, subprocess integration
666 | - **HTTP**: Web services, remote access, multiple clients
667 | - **SSE**: Real-time updates, push notifications
668 | 
669 | ---
670 | 
671 | ## Code Best Practices
672 | 
673 | ### Code Composability and Reusability
674 | 
675 | Your implementation MUST prioritize composability and code reuse:
676 | 
677 | 1. **Extract Common Functionality**:
678 |    - Create reusable helper functions for operations used across multiple tools
679 |    - Build shared API clients for HTTP requests instead of duplicating code
680 |    - Centralize error handling logic in utility functions
681 |    - Extract business logic into dedicated functions that can be composed
682 |    - Extract shared markdown or JSON field selection & formatting functionality
683 | 
684 | 2. **Avoid Duplication**:
685 |    - NEVER copy-paste similar code between tools
686 |    - If you find yourself writing similar logic twice, extract it into a function
687 |    - Common operations like pagination, filtering, field selection, and formatting should be shared (see the sketch below)
688 |    - Authentication/authorization logic should be centralized
689 | 
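A minimal sketch of this principle (all names are illustrative):

```python
def _format_items_md(title: str, items: list[dict]) -> str:
    '''Shared markdown formatter reused by every list-style tool.'''
    lines = [f"# {title}", ""]
    for item in items:
        lines.append(f"- {item['name']} ({item['id']})")
    return "\n".join(lines)

# Both tools compose the same helper instead of duplicating formatting logic:
#   example_search_projects -> _format_items_md("Project Search Results", projects)
#   example_list_tasks      -> _format_items_md("Tasks", tasks)
```
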
690 | ### Python-Specific Best Practices
691 | 
692 | 1. **Use Type Hints**: Always include type annotations for function parameters and return values
693 | 2. **Pydantic Models**: Define clear Pydantic models for all input validation
694 | 3. **Avoid Manual Validation**: Let Pydantic handle input validation with constraints
695 | 4. **Proper Imports**: Group imports (standard library, third-party, local)
696 | 5. **Error Handling**: Use specific exception types (httpx.HTTPStatusError, not generic Exception)
697 | 6. **Async Context Managers**: Use `async with` for resources that need cleanup
698 | 7. **Constants**: Define module-level constants in UPPER_CASE
699 | 
700 | ## Quality Checklist
701 | 
702 | Before finalizing your Python MCP server implementation, ensure:
703 | 
704 | ### Strategic Design
705 | - [ ] Tools enable complete workflows, not just API endpoint wrappers
706 | - [ ] Tool names reflect natural task subdivisions
707 | - [ ] Response formats optimize for agent context efficiency
708 | - [ ] Human-readable identifiers used where appropriate
709 | - [ ] Error messages guide agents toward correct usage
710 | 
711 | ### Implementation Quality
712 | - [ ] FOCUSED IMPLEMENTATION: Most important and valuable tools implemented
713 | - [ ] All tools have descriptive names and documentation
714 | - [ ] Return types are consistent across similar operations
715 | - [ ] Error handling is implemented for all external calls
716 | - [ ] Server name follows format: `{service}_mcp`
717 | - [ ] All network operations use async/await
718 | - [ ] Common functionality is extracted into reusable functions
719 | - [ ] Error messages are clear, actionable, and educational
720 | - [ ] Outputs are properly validated and formatted
721 | 
722 | ### Tool Configuration
723 | - [ ] All tools implement 'name' and 'annotations' in the decorator
724 | - [ ] Annotations correctly set (readOnlyHint, destructiveHint, idempotentHint, openWorldHint)
725 | - [ ] All tools use Pydantic BaseModel for input validation with Field() definitions
726 | - [ ] All Pydantic Fields have explicit types and descriptions with constraints
727 | - [ ] All tools have comprehensive docstrings with explicit input/output types
728 | - [ ] Docstrings include complete schema structure for dict/JSON returns
729 | - [ ] Pydantic models handle input validation (no manual validation needed)
730 | 
731 | ### Advanced Features (where applicable)
732 | - [ ] Context injection used for logging, progress, or elicitation
733 | - [ ] Resources registered for appropriate data endpoints
734 | - [ ] Lifespan management implemented for persistent connections
735 | - [ ] Structured output types used (TypedDict, Pydantic models)
736 | - [ ] Appropriate transport configured (stdio, HTTP, SSE)
737 | 
738 | ### Code Quality
739 | - [ ] File includes proper imports including Pydantic imports
740 | - [ ] Pagination is properly implemented where applicable
741 | - [ ] Large responses check CHARACTER_LIMIT and truncate with clear messages
742 | - [ ] Filtering options are provided for potentially large result sets
743 | - [ ] All async functions are properly defined with `async def`
744 | - [ ] HTTP client usage follows async patterns with proper context managers
745 | - [ ] Type hints are used throughout the code
746 | - [ ] Constants are defined at module level in UPPER_CASE
747 | 
748 | ### Testing
749 | - [ ] Server runs successfully: `python your_server.py --help`
750 | - [ ] All imports resolve correctly
751 | - [ ] Sample tool calls work as expected
752 | - [ ] Error scenarios handled gracefully
```

--------------------------------------------------------------------------------
/.claude/skills/mcp-builder/reference/node_mcp_server.md:
--------------------------------------------------------------------------------

```markdown
  1 | # Node/TypeScript MCP Server Implementation Guide
  2 | 
  3 | ## Overview
  4 | 
  5 | This document provides Node/TypeScript-specific best practices and examples for implementing MCP servers using the MCP TypeScript SDK. It covers project structure, server setup, tool registration patterns, input validation with Zod, error handling, and complete working examples.
  6 | 
  7 | ---
  8 | 
  9 | ## Quick Reference
 10 | 
 11 | ### Key Imports
 12 | ```typescript
 13 | import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
 14 | import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
 15 | import { z } from "zod";
 16 | import axios, { AxiosError } from "axios";
 17 | ```
 18 | 
 19 | ### Server Initialization
 20 | ```typescript
 21 | const server = new McpServer({
 22 |   name: "service-mcp-server",
 23 |   version: "1.0.0"
 24 | });
 25 | ```
 26 | 
 27 | ### Tool Registration Pattern
 28 | ```typescript
 29 | server.registerTool("tool_name", {...config}, async (params) => {
 30 |   // Implementation
 31 | });
 32 | ```
 33 | 
 34 | ---
 35 | 
 36 | ## MCP TypeScript SDK
 37 | 
 38 | The official MCP TypeScript SDK provides:
 39 | - `McpServer` class for server initialization
 40 | - `registerTool` method for tool registration
 41 | - Zod schema integration for runtime input validation
 42 | - Type-safe tool handler implementations
 43 | 
 44 | See the MCP SDK documentation in the references for complete details.
 45 | 
 46 | ## Server Naming Convention
 47 | 
 48 | Node/TypeScript MCP servers must follow this naming pattern:
 49 | - **Format**: `{service}-mcp-server` (lowercase with hyphens)
 50 | - **Examples**: `github-mcp-server`, `jira-mcp-server`, `stripe-mcp-server`
 51 | 
 52 | The name should be:
 53 | - General (not tied to specific features)
 54 | - Descriptive of the service/API being integrated
 55 | - Easy to infer from the task description
 56 | - Without version numbers or dates
 57 | 
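For example:

```typescript
const server = new McpServer({
  name: "github-mcp-server",  // good: general, lowercase, hyphenated
  version: "1.0.0"            // avoid names like "github-issue-search-mcp-v2"
});
```
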
 58 | ## Project Structure
 59 | 
 60 | Create the following structure for Node/TypeScript MCP servers:
 61 | 
 62 | ```
 63 | {service}-mcp-server/
 64 | ├── package.json
 65 | ├── tsconfig.json
 66 | ├── README.md
 67 | ├── src/
 68 | │   ├── index.ts          # Main entry point with McpServer initialization
 69 | │   ├── types.ts          # TypeScript type definitions and interfaces
 70 | │   ├── tools/            # Tool implementations (one file per domain)
 71 | │   ├── services/         # API clients and shared utilities
 72 | │   ├── schemas/          # Zod validation schemas
 73 | │   └── constants.ts      # Shared constants (API_URL, CHARACTER_LIMIT, etc.)
 74 | └── dist/                 # Built JavaScript files (entry point: dist/index.js)
 75 | ```
 76 | 
 77 | ## Tool Implementation
 78 | 
 79 | ### Tool Naming
 80 | 
 81 | Use snake_case for tool names (e.g., "search_users", "create_project", "get_channel_info") with clear, action-oriented names.
 82 | 
 83 | **Avoid Naming Conflicts**: Include the service context to prevent overlaps:
 84 | - Use "slack_send_message" instead of just "send_message"
 85 | - Use "github_create_issue" instead of just "create_issue"
 86 | - Use "asana_list_tasks" instead of just "list_tasks"
 87 | 
 88 | ### Tool Structure
 89 | 
 90 | Tools are registered using the `registerTool` method with the following requirements:
 91 | - Use Zod schemas for runtime input validation and type safety
 92 | - The `description` field must be explicitly provided - JSDoc comments are NOT automatically extracted
 93 | - Explicitly provide `title`, `description`, `inputSchema`, and `annotations`
 94 | - The `inputSchema` must be a Zod schema object (not a JSON schema)
 95 | - Type all parameters and return values explicitly
 96 | 
 97 | ```typescript
 98 | import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
 99 | import { z } from "zod";
100 | 
101 | const server = new McpServer({
102 |   name: "example-mcp",
103 |   version: "1.0.0"
104 | });
105 | 
106 | // Zod schema for input validation
107 | const UserSearchInputSchema = z.object({
108 |   query: z.string()
109 |     .min(2, "Query must be at least 2 characters")
110 |     .max(200, "Query must not exceed 200 characters")
111 |     .describe("Search string to match against names/emails"),
112 |   limit: z.number()
113 |     .int()
114 |     .min(1)
115 |     .max(100)
116 |     .default(20)
117 |     .describe("Maximum results to return"),
118 |   offset: z.number()
119 |     .int()
120 |     .min(0)
121 |     .default(0)
122 |     .describe("Number of results to skip for pagination"),
123 |   response_format: z.nativeEnum(ResponseFormat)
124 |     .default(ResponseFormat.MARKDOWN)
125 |     .describe("Output format: 'markdown' for human-readable or 'json' for machine-readable")
126 | }).strict();
127 | 
128 | // Type definition from Zod schema
129 | type UserSearchInput = z.infer<typeof UserSearchInputSchema>;
130 | 
131 | server.registerTool(
132 |   "example_search_users",
133 |   {
134 |     title: "Search Example Users",
135 |     description: `Search for users in the Example system by name, email, or team.
136 | 
137 | This tool searches across all user profiles in the Example platform, supporting partial matches and various search filters. It does NOT create or modify users, only searches existing ones.
138 | 
139 | Args:
140 |   - query (string): Search string to match against names/emails
141 |   - limit (number): Maximum results to return, between 1-100 (default: 20)
142 |   - offset (number): Number of results to skip for pagination (default: 0)
143 |   - response_format ('markdown' | 'json'): Output format (default: 'markdown')
144 | 
145 | Returns:
146 |   For JSON format: Structured data with schema:
147 |   {
148 |     "total": number,           // Total number of matches found
149 |     "count": number,           // Number of results in this response
150 |     "offset": number,          // Current pagination offset
151 |     "users": [
152 |       {
153 |         "id": string,          // User ID (e.g., "U123456789")
154 |         "name": string,        // Full name (e.g., "John Doe")
155 |         "email": string,       // Email address
156 |         "team": string,        // Team name (optional)
157 |         "active": boolean      // Whether user is active
158 |       }
159 |     ],
160 |     "has_more": boolean,       // Whether more results are available
161 |     "next_offset": number      // Offset for next page (if has_more is true)
162 |   }
163 | 
164 | Examples:
165 |   - Use when: "Find all marketing team members" -> params with query="team:marketing"
166 |   - Use when: "Search for John's account" -> params with query="john"
167 |   - Don't use when: You need to create a user (use example_create_user instead)
168 | 
169 | Error Handling:
170 |   - Returns "Error: Rate limit exceeded" if too many requests (429 status)
171 |   - Returns "No users found matching '<query>'" if search returns empty`,
172 |     inputSchema: UserSearchInputSchema,
173 |     annotations: {
174 |       readOnlyHint: true,
175 |       destructiveHint: false,
176 |       idempotentHint: true,
177 |       openWorldHint: true
178 |     }
179 |   },
180 |   async (params: UserSearchInput) => {
181 |     try {
182 |       // Input validation is handled by Zod schema
183 |       // Make API request using validated parameters
184 |       const data = await makeApiRequest<any>(
185 |         "users/search",
186 |         "GET",
187 |         undefined,
188 |         {
189 |           q: params.query,
190 |           limit: params.limit,
191 |           offset: params.offset
192 |         }
193 |       );
194 | 
195 |       const users = data.users || [];
196 |       const total = data.total || 0;
197 | 
198 |       if (!users.length) {
199 |         return {
200 |           content: [{
201 |             type: "text",
202 |             text: `No users found matching '${params.query}'`
203 |           }]
204 |         };
205 |       }
206 | 
207 |       // Format response based on requested format
208 |       let result: string;
209 | 
210 |       if (params.response_format === ResponseFormat.MARKDOWN) {
211 |         // Human-readable markdown format
212 |         const lines: string[] = [`# User Search Results: '${params.query}'`, ""];
213 |         lines.push(`Found ${total} users (showing ${users.length})`);
214 |         lines.push("");
215 | 
216 |         for (const user of users) {
217 |           lines.push(`## ${user.name} (${user.id})`);
218 |           lines.push(`- **Email**: ${user.email}`);
219 |           if (user.team) {
220 |             lines.push(`- **Team**: ${user.team}`);
221 |           }
222 |           lines.push("");
223 |         }
224 | 
225 |         result = lines.join("\n");
226 | 
227 |       } else {
228 |         // Machine-readable JSON format
229 |         const response: any = {
230 |           total,
231 |           count: users.length,
232 |           offset: params.offset,
233 |           users: users.map((user: any) => ({
234 |             id: user.id,
235 |             name: user.name,
236 |             email: user.email,
237 |             ...(user.team ? { team: user.team } : {}),
238 |             active: user.active ?? true
239 |           }))
240 |         };
241 | 
242 |         // Add pagination info if there are more results
243 |         if (total > params.offset + users.length) {
244 |           response.has_more = true;
245 |           response.next_offset = params.offset + users.length;
246 |         }
247 | 
248 |         result = JSON.stringify(response, null, 2);
249 |       }
250 | 
251 |       return {
252 |         content: [{
253 |           type: "text",
254 |           text: result
255 |         }]
256 |       };
257 |     } catch (error) {
258 |       return {
259 |         content: [{
260 |           type: "text",
261 |           text: handleApiError(error)
262 |         }]
263 |       };
264 |     }
265 |   }
266 | );
267 | ```
268 | 
269 | ## Zod Schemas for Input Validation
270 | 
271 | Zod provides runtime type validation:
272 | 
273 | ```typescript
274 | import { z } from "zod";
275 | 
276 | // Basic schema with validation
277 | const CreateUserSchema = z.object({
278 |   name: z.string()
279 |     .min(1, "Name is required")
280 |     .max(100, "Name must not exceed 100 characters"),
281 |   email: z.string()
282 |     .email("Invalid email format"),
283 |   age: z.number()
284 |     .int("Age must be a whole number")
285 |     .min(0, "Age cannot be negative")
286 |     .max(150, "Age cannot be greater than 150")
287 | }).strict();  // Use .strict() to forbid extra fields
288 | 
289 | // Enums
290 | enum ResponseFormat {
291 |   MARKDOWN = "markdown",
292 |   JSON = "json"
293 | }
294 | 
295 | const SearchSchema = z.object({
296 |   response_format: z.nativeEnum(ResponseFormat)
297 |     .default(ResponseFormat.MARKDOWN)
298 |     .describe("Output format")
299 | });
300 | 
301 | // Optional fields with defaults
302 | const PaginationSchema = z.object({
303 |   limit: z.number()
304 |     .int()
305 |     .min(1)
306 |     .max(100)
307 |     .default(20)
308 |     .describe("Maximum results to return"),
309 |   offset: z.number()
310 |     .int()
311 |     .min(0)
312 |     .default(0)
313 |     .describe("Number of results to skip")
314 | });
315 | ```
316 | 
317 | ## Response Format Options
318 | 
319 | Support multiple output formats for flexibility:
320 | 
321 | ```typescript
322 | enum ResponseFormat {
323 |   MARKDOWN = "markdown",
324 |   JSON = "json"
325 | }
326 | 
327 | const inputSchema = z.object({
328 |   query: z.string(),
329 |   response_format: z.nativeEnum(ResponseFormat)
330 |     .default(ResponseFormat.MARKDOWN)
331 |     .describe("Output format: 'markdown' for human-readable or 'json' for machine-readable")
332 | });
333 | ```
334 | 
335 | **Markdown format** (see the sketch after these lists):
336 | - Use headers, lists, and formatting for clarity
337 | - Convert timestamps to human-readable format
338 | - Show display names with IDs in parentheses
339 | - Omit verbose metadata
340 | - Group related information logically
341 | 
342 | **JSON format**:
343 | - Return complete, structured data suitable for programmatic processing
344 | - Include all available fields and metadata
345 | - Use consistent field names and types
346 | 
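A minimal sketch of these conventions (the helper name and field layout are illustrative):

```typescript
function formatUserLine(user: { id: string; name: string; createdAt: number }): string {
  // Epoch seconds -> "YYYY-MM-DD HH:MM:SS UTC"; display name with ID in parentheses
  const ts = new Date(user.createdAt * 1000).toISOString().slice(0, 19).replace("T", " ");
  return `@${user.name} (${user.id}) - joined ${ts} UTC`;
}
```
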
347 | ## Pagination Implementation
348 | 
349 | For tools that list resources:
350 | 
351 | ```typescript
352 | const ListSchema = z.object({
353 |   limit: z.number().int().min(1).max(100).default(20),
354 |   offset: z.number().int().min(0).default(0)
355 | });
356 | 
357 | async function listItems(params: z.infer<typeof ListSchema>) {
358 |   const data = await apiRequest(params.limit, params.offset);
359 | 
360 |   const response = {
361 |     total: data.total,
362 |     count: data.items.length,
363 |     offset: params.offset,
364 |     items: data.items,
365 |     has_more: data.total > params.offset + data.items.length,
366 |     next_offset: data.total > params.offset + data.items.length
367 |       ? params.offset + data.items.length
368 |       : undefined
369 |   };
370 | 
371 |   return JSON.stringify(response, null, 2);
372 | }
373 | ```
374 | 
375 | ## Character Limits and Truncation
376 | 
377 | Add a CHARACTER_LIMIT constant to prevent overwhelming responses:
378 | 
379 | ```typescript
380 | // At module level in constants.ts
381 | export const CHARACTER_LIMIT = 25000;  // Maximum response size in characters
382 | 
383 | async function searchTool(params: SearchInput) {
384 |   let result = generateResponse(data);
385 | 
386 |   // Check character limit and truncate if needed ('data' and 'response' come from earlier in the handler)
387 |   if (result.length > CHARACTER_LIMIT) {
388 |     const truncatedData = data.slice(0, Math.max(1, Math.floor(data.length / 2)));
389 |     response.data = truncatedData;
390 |     response.truncated = true;
391 |     response.truncation_message =
392 |       `Response truncated from ${data.length} to ${truncatedData.length} items. ` +
393 |       `Use 'offset' parameter or add filters to see more results.`;
394 |     result = JSON.stringify(response, null, 2);
395 |   }
396 | 
397 |   return result;
398 | }
399 | ```
400 | 
401 | ## Error Handling
402 | 
403 | Provide clear, actionable error messages:
404 | 
405 | ```typescript
406 | import axios, { AxiosError } from "axios";
407 | 
408 | function handleApiError(error: unknown): string {
409 |   if (error instanceof AxiosError) {
410 |     if (error.response) {
411 |       switch (error.response.status) {
412 |         case 404:
413 |           return "Error: Resource not found. Please check the ID is correct.";
414 |         case 403:
415 |           return "Error: Permission denied. You don't have access to this resource.";
416 |         case 429:
417 |           return "Error: Rate limit exceeded. Please wait before making more requests.";
418 |         default:
419 |           return `Error: API request failed with status ${error.response.status}`;
420 |       }
421 |     } else if (error.code === "ECONNABORTED") {
422 |       return "Error: Request timed out. Please try again.";
423 |     }
424 |   }
425 |   return `Error: Unexpected error occurred: ${error instanceof Error ? error.message : String(error)}`;
426 | }
427 | ```
428 | 
429 | ## Shared Utilities
430 | 
431 | Extract common functionality into reusable functions:
432 | 
433 | ```typescript
434 | // Shared API request function
435 | async function makeApiRequest<T>(
436 |   endpoint: string,
437 |   method: "GET" | "POST" | "PUT" | "DELETE" = "GET",
438 |   data?: any,
439 |   params?: any
440 | ): Promise<T> {
441 |   try {
442 |     const response = await axios({
443 |       method,
444 |       url: `${API_BASE_URL}/${endpoint}`,
445 |       data,
446 |       params,
447 |       timeout: 30000,
448 |       headers: {
449 |         "Content-Type": "application/json",
450 |         "Accept": "application/json"
451 |       }
452 |     });
453 |     return response.data;
454 |   } catch (error) {
455 |     throw error;  // rethrow so callers can format via handleApiError
456 |   }
457 | }
458 | ```
459 | 
460 | ## Async/Await Best Practices
461 | 
462 | Always use async/await for network requests and I/O operations:
463 | 
464 | ```typescript
465 | // Good: Async network request
466 | async function fetchData(resourceId: string): Promise<ResourceData> {
467 |   const response = await axios.get(`${API_URL}/resource/${resourceId}`);
468 |   return response.data;
469 | }
470 | 
471 | // Bad: Promise chains
472 | function fetchData(resourceId: string): Promise<ResourceData> {
473 |   return axios.get(`${API_URL}/resource/${resourceId}`)
474 |     .then(response => response.data);  // Harder to read and maintain
475 | }
476 | ```
477 | 
478 | ## TypeScript Best Practices
479 | 
480 | 1. **Use Strict TypeScript**: Enable strict mode in tsconfig.json
481 | 2. **Define Interfaces**: Create clear interface definitions for all data structures
482 | 3. **Avoid `any`**: Use proper types or `unknown` instead of `any`
483 | 4. **Zod for Runtime Validation**: Use Zod schemas to validate external data
484 | 5. **Type Guards**: Create type guard functions for complex type checking
485 | 6. **Error Handling**: Always use try-catch with proper error type checking
486 | 7. **Null Safety**: Use optional chaining (`?.`) and nullish coalescing (`??`)
487 | 
488 | ```typescript
489 | // Good: Type-safe with Zod and interfaces
490 | interface UserResponse {
491 |   id: string;
492 |   name: string;
493 |   email: string;
494 |   team?: string;
495 |   active: boolean;
496 | }
497 | 
498 | const UserSchema = z.object({
499 |   id: z.string(),
500 |   name: z.string(),
501 |   email: z.string().email(),
502 |   team: z.string().optional(),
503 |   active: z.boolean()
504 | });
505 | 
506 | type User = z.infer<typeof UserSchema>;
507 | 
508 | async function getUser(id: string): Promise<User> {
509 |   const data = await apiCall(`/users/${id}`);
510 |   return UserSchema.parse(data);  // Runtime validation
511 | }
512 | 
513 | // Bad: Using any
514 | async function getUser(id: string): Promise<any> {
515 |   return await apiCall(`/users/${id}`);  // No type safety
516 | }
517 | ```
518 | 
519 | ## Package Configuration
520 | 
521 | ### package.json
522 | 
523 | ```json
524 | {
525 |   "name": "{service}-mcp-server",
526 |   "version": "1.0.0",
527 |   "description": "MCP server for {Service} API integration",
528 |   "type": "module",
529 |   "main": "dist/index.js",
530 |   "scripts": {
531 |     "start": "node dist/index.js",
532 |     "dev": "tsx watch src/index.ts",
533 |     "build": "tsc",
534 |     "clean": "rm -rf dist"
535 |   },
536 |   "engines": {
537 |     "node": ">=18"
538 |   },
539 |   "dependencies": {
540 |     "@modelcontextprotocol/sdk": "^1.6.1",
541 |     "axios": "^1.7.9",
542 |     "zod": "^3.23.8"
543 |   },
544 |   "devDependencies": {
545 |     "@types/node": "^22.10.0",
546 |     "tsx": "^4.19.2",
547 |     "typescript": "^5.7.2"
548 |   }
549 | }
550 | ```
551 | 
552 | ### tsconfig.json
553 | 
554 | ```json
555 | {
556 |   "compilerOptions": {
557 |     "target": "ES2022",
558 |     "module": "Node16",
559 |     "moduleResolution": "Node16",
560 |     "lib": ["ES2022"],
561 |     "outDir": "./dist",
562 |     "rootDir": "./src",
563 |     "strict": true,
564 |     "esModuleInterop": true,
565 |     "skipLibCheck": true,
566 |     "forceConsistentCasingInFileNames": true,
567 |     "declaration": true,
568 |     "declarationMap": true,
569 |     "sourceMap": true,
570 |     "allowSyntheticDefaultImports": true
571 |   },
572 |   "include": ["src/**/*"],
573 |   "exclude": ["node_modules", "dist"]
574 | }
575 | ```
576 | 
577 | ## Complete Example
578 | 
579 | ```typescript
580 | #!/usr/bin/env node
581 | /**
582 |  * MCP Server for Example Service.
583 |  *
584 |  * This server provides tools to interact with Example API, including user search,
585 |  * project management, and data export capabilities.
586 |  */
587 | 
588 | import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
589 | import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
590 | import { z } from "zod";
591 | import axios, { AxiosError } from "axios";
592 | 
593 | // Constants
594 | const API_BASE_URL = "https://api.example.com/v1";
595 | const CHARACTER_LIMIT = 25000;
596 | 
597 | // Enums
598 | enum ResponseFormat {
599 |   MARKDOWN = "markdown",
600 |   JSON = "json"
601 | }
602 | 
603 | // Zod schemas
604 | const UserSearchInputSchema = z.object({
605 |   query: z.string()
606 |     .min(2, "Query must be at least 2 characters")
607 |     .max(200, "Query must not exceed 200 characters")
608 |     .describe("Search string to match against names/emails"),
609 |   limit: z.number()
610 |     .int()
611 |     .min(1)
612 |     .max(100)
613 |     .default(20)
614 |     .describe("Maximum results to return"),
615 |   offset: z.number()
616 |     .int()
617 |     .min(0)
618 |     .default(0)
619 |     .describe("Number of results to skip for pagination"),
620 |   response_format: z.nativeEnum(ResponseFormat)
621 |     .default(ResponseFormat.MARKDOWN)
622 |     .describe("Output format: 'markdown' for human-readable or 'json' for machine-readable")
623 | }).strict();
624 | 
625 | type UserSearchInput = z.infer<typeof UserSearchInputSchema>;
626 | 
627 | // Shared utility functions
628 | async function makeApiRequest<T>(
629 |   endpoint: string,
630 |   method: "GET" | "POST" | "PUT" | "DELETE" = "GET",
631 |   data?: any,
632 |   params?: any
633 | ): Promise<T> {
634 |   try {
635 |     const response = await axios({
636 |       method,
637 |       url: `${API_BASE_URL}/${endpoint}`,
638 |       data,
639 |       params,
640 |       timeout: 30000,
641 |       headers: {
642 |         "Content-Type": "application/json",
643 |         "Accept": "application/json"
644 |       }
645 |     });
646 |     return response.data;
647 |   } catch (error) {
648 |     throw error;  // rethrow so callers can format via handleApiError
649 |   }
650 | }
651 | 
652 | function handleApiError(error: unknown): string {
653 |   if (error instanceof AxiosError) {
654 |     if (error.response) {
655 |       switch (error.response.status) {
656 |         case 404:
657 |           return "Error: Resource not found. Please check the ID is correct.";
658 |         case 403:
659 |           return "Error: Permission denied. You don't have access to this resource.";
660 |         case 429:
661 |           return "Error: Rate limit exceeded. Please wait before making more requests.";
662 |         default:
663 |           return `Error: API request failed with status ${error.response.status}`;
664 |       }
665 |     } else if (error.code === "ECONNABORTED") {
666 |       return "Error: Request timed out. Please try again.";
667 |     }
668 |   }
669 |   return `Error: Unexpected error occurred: ${error instanceof Error ? error.message : String(error)}`;
670 | }
671 | 
672 | // Create MCP server instance
673 | const server = new McpServer({
674 |   name: "example-mcp",
675 |   version: "1.0.0"
676 | });
677 | 
678 | // Register tools
679 | server.registerTool(
680 |   "example_search_users",
681 |   {
682 |     title: "Search Example Users",
683 |     description: `[Full description as shown above]`,
684 |     inputSchema: UserSearchInputSchema,
685 |     annotations: {
686 |       readOnlyHint: true,
687 |       destructiveHint: false,
688 |       idempotentHint: true,
689 |       openWorldHint: true
690 |     }
691 |   },
692 |   async (params: UserSearchInput) => {
693 |     // Implementation as shown above
694 |   }
695 | );
696 | 
697 | // Main function
698 | async function main() {
699 |   // Verify environment variables if needed
700 |   if (!process.env.EXAMPLE_API_KEY) {
701 |     console.error("ERROR: EXAMPLE_API_KEY environment variable is required");
702 |     process.exit(1);
703 |   }
704 | 
705 |   // Create transport
706 |   const transport = new StdioServerTransport();
707 | 
708 |   // Connect server to transport
709 |   await server.connect(transport);
710 | 
711 |   console.error("Example MCP server running via stdio");
712 | }
713 | 
714 | // Run the server
715 | main().catch((error) => {
716 |   console.error("Server error:", error);
717 |   process.exit(1);
718 | });
719 | ```
720 | 
721 | ---
722 | 
723 | ## Advanced MCP Features
724 | 
725 | ### Resource Registration
726 | 
727 | Expose data as resources for efficient, URI-based access:
728 | 
729 | ```typescript
730 | import { ResourceTemplate } from "@modelcontextprotocol/sdk/types.js";
731 | 
732 | // Register a resource with URI template
733 | server.registerResource(
734 |   {
735 |     uri: "file://documents/{name}",
736 |     name: "Document Resource",
737 |     description: "Access documents by name",
738 |     mimeType: "text/plain"
739 |   },
740 |   async (uri: string) => {
741 |     // Extract parameter from URI
742 |     const match = uri.match(/^file:\/\/documents\/(.+)$/);
743 |     if (!match) {
744 |       throw new Error("Invalid URI format");
745 |     }
746 | 
747 |     const documentName = match[1];
748 |     const content = await loadDocument(documentName);
749 | 
750 |     return {
751 |       contents: [{
752 |         uri,
753 |         mimeType: "text/plain",
754 |         text: content
755 |       }]
756 |     };
757 |   }
758 | );
759 | 
760 | // List available resources dynamically
761 | server.registerResourceList(async () => {
762 |   const documents = await getAvailableDocuments();
763 |   return {
764 |     resources: documents.map(doc => ({
765 |       uri: `file://documents/${doc.name}`,
766 |       name: doc.name,
767 |       mimeType: "text/plain",
768 |       description: doc.description
769 |     }))
770 |   };
771 | });
772 | ```
773 | 
774 | **When to use Resources vs Tools:**
775 | - **Resources**: For data access with simple URI-based parameters
776 | - **Tools**: For complex operations requiring validation and business logic
777 | - **Resources**: When data is relatively static or template-based
778 | - **Tools**: When operations have side effects or complex workflows
779 | 
780 | ### Multiple Transport Options
781 | 
782 | The TypeScript SDK supports different transport mechanisms:
783 | 
784 | ```typescript
785 | import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
786 | import { SSEServerTransport } from "@modelcontextprotocol/sdk/server/sse.js";
787 | 
788 | // Stdio transport (default - for CLI tools)
789 | const stdioTransport = new StdioServerTransport();
790 | await server.connect(stdioTransport);
791 | 
792 | // SSE transport (for real-time web updates)
793 | const sseTransport = new SSEServerTransport("/message", response); // 'response' is the HTTP response object from your web framework
794 | await server.connect(sseTransport);
795 | 
796 | // HTTP transport (for web services)
797 | // Configure based on your HTTP framework integration
798 | ```
799 | 
800 | **Transport selection guide:**
801 | - **Stdio**: Command-line tools, subprocess integration, local development
802 | - **HTTP**: Web services, remote access, multiple simultaneous clients
803 | - **SSE**: Real-time updates, server-push notifications, web dashboards
804 | 
805 | ### Notification Support
806 | 
807 | Notify clients when server state changes:
808 | 
809 | ```typescript
810 | // Notify when tools list changes
811 | server.notification({
812 |   method: "notifications/tools/list_changed"
813 | });
814 | 
815 | // Notify when resources change
816 | server.notification({
817 |   method: "notifications/resources/list_changed"
818 | });
819 | ```
820 | 
821 | Use notifications sparingly - only when server capabilities genuinely change.
822 | 
823 | ---
824 | 
825 | ## Code Best Practices
826 | 
827 | ### Code Composability and Reusability
828 | 
829 | Your implementation MUST prioritize composability and code reuse:
830 | 
831 | 1. **Extract Common Functionality**:
832 |    - Create reusable helper functions for operations used across multiple tools
833 |    - Build shared API clients for HTTP requests instead of duplicating code
834 |    - Centralize error handling logic in utility functions
835 |    - Extract business logic into dedicated functions that can be composed
836 |    - Extract shared markdown or JSON field selection & formatting functionality
837 | 
838 | 2. **Avoid Duplication**:
839 |    - NEVER copy-paste similar code between tools
840 |    - If you find yourself writing similar logic twice, extract it into a function
841 |    - Common operations like pagination, filtering, field selection, and formatting should be shared
842 |    - Authentication/authorization logic should be centralized
843 | 
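For instance, the `handleApiError` helper defined earlier can be combined with one shared wrapper that every tool handler reuses, instead of repeating try/catch in each tool. A sketch (the wrapper name and result type alias are illustrative):

```typescript
type ToolResult = { content: { type: "text"; text: string }[]; isError?: boolean };

// Shared wrapper: runs a tool's core logic and formats any failure
// with the centralized handleApiError() helper defined above.
async function runTool(operation: () => Promise<string>): Promise<ToolResult> {
  try {
    return { content: [{ type: "text", text: await operation() }] };
  } catch (error) {
    return { isError: true, content: [{ type: "text", text: handleApiError(error) }] };
  }
}
```
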
844 | ## Building and Running
845 | 
846 | Always build your TypeScript code before running:
847 | 
848 | ```bash
849 | # Build the project
850 | npm run build
851 | 
852 | # Run the server
853 | npm start
854 | 
855 | # Development with auto-reload
856 | npm run dev
857 | ```
858 | 
859 | Always ensure `npm run build` completes successfully before considering the implementation complete.
860 | 
861 | ## Quality Checklist
862 | 
863 | Before finalizing your Node/TypeScript MCP server implementation, ensure:
864 | 
865 | ### Strategic Design
866 | - [ ] Tools enable complete workflows, not just API endpoint wrappers
867 | - [ ] Tool names reflect natural task subdivisions
868 | - [ ] Response formats optimize for agent context efficiency
869 | - [ ] Human-readable identifiers used where appropriate
870 | - [ ] Error messages guide agents toward correct usage
871 | 
872 | ### Implementation Quality
873 | - [ ] FOCUSED IMPLEMENTATION: Most important and valuable tools implemented
874 | - [ ] All tools registered using `registerTool` with complete configuration
875 | - [ ] All tools include `title`, `description`, `inputSchema`, and `annotations`
876 | - [ ] Annotations correctly set (readOnlyHint, destructiveHint, idempotentHint, openWorldHint)
877 | - [ ] All tools use Zod schemas for runtime input validation with `.strict()` enforcement
878 | - [ ] All Zod schemas have proper constraints and descriptive error messages
879 | - [ ] All tools have comprehensive descriptions with explicit input/output types
880 | - [ ] Descriptions include return value examples and complete schema documentation
881 | - [ ] Error messages are clear, actionable, and educational
882 | 
883 | ### TypeScript Quality
884 | - [ ] TypeScript interfaces are defined for all data structures
885 | - [ ] Strict TypeScript is enabled in tsconfig.json
886 | - [ ] No use of `any` type - use `unknown` or proper types instead
887 | - [ ] All async functions have explicit Promise<T> return types
888 | - [ ] Error handling uses proper type guards (e.g., `axios.isAxiosError`, `z.ZodError`)
889 | 
890 | ### Advanced Features (where applicable)
891 | - [ ] Resources registered for appropriate data endpoints
892 | - [ ] Appropriate transport configured (stdio, HTTP, SSE)
893 | - [ ] Notifications implemented for dynamic server capabilities
894 | - [ ] Type-safe with SDK interfaces
895 | 
896 | ### Project Configuration
897 | - [ ] Package.json includes all necessary dependencies
898 | - [ ] Build script produces working JavaScript in dist/ directory
899 | - [ ] Main entry point is properly configured as dist/index.js
900 | - [ ] Server name follows format: `{service}-mcp-server`
901 | - [ ] tsconfig.json properly configured with strict mode
902 | 
903 | ### Code Quality
904 | - [ ] Pagination is properly implemented where applicable
905 | - [ ] Large responses check CHARACTER_LIMIT constant and truncate with clear messages
906 | - [ ] Filtering options are provided for potentially large result sets
907 | - [ ] All network operations handle timeouts and connection errors gracefully
908 | - [ ] Common functionality is extracted into reusable functions
909 | - [ ] Return types are consistent across similar operations
910 | 
911 | ### Testing and Build
912 | - [ ] `npm run build` completes successfully without errors
913 | - [ ] dist/index.js created and executable
914 | - [ ] Server runs: `node dist/index.js --help`
915 | - [ ] All imports resolve correctly
916 | - [ ] Sample tool calls work as expected
```

--------------------------------------------------------------------------------
/.claude/skills/mcp-builder/reference/mcp_best_practices.md:
--------------------------------------------------------------------------------

```markdown
  1 | # MCP Server Development Best Practices and Guidelines
  2 | 
  3 | ## Overview
  4 | 
  5 | This document compiles essential best practices and guidelines for building Model Context Protocol (MCP) servers. It covers naming conventions, tool design, response formats, pagination, error handling, security, and compliance requirements.
  6 | 
  7 | ---
  8 | 
  9 | ## Quick Reference
 10 | 
 11 | ### Server Naming
 12 | - **Python**: `{service}_mcp` (e.g., `slack_mcp`)
 13 | - **Node/TypeScript**: `{service}-mcp-server` (e.g., `slack-mcp-server`)
 14 | 
 15 | ### Tool Naming
 16 | - Use snake_case with service prefix
 17 | - Format: `{service}_{action}_{resource}`
 18 | - Example: `slack_send_message`, `github_create_issue`
 19 | 
 20 | ### Response Formats
 21 | - Support both JSON and Markdown formats
 22 | - JSON for programmatic processing
 23 | - Markdown for human readability
 24 | 
 25 | ### Pagination
 26 | - Always respect `limit` parameter
 27 | - Return `has_more`, `next_offset`, `total_count`
 28 | - Default to 20-50 items
 29 | 
 30 | ### Character Limits
 31 | - Set CHARACTER_LIMIT constant (typically 25,000)
 32 | - Truncate gracefully with clear messages
 33 | - Provide guidance on filtering
 34 | 
 35 | ---
 36 | 
 37 | ## Table of Contents
 38 | 1. Server Naming Conventions
 39 | 2. Tool Naming and Design
 40 | 3. Response Format Guidelines
 41 | 4. Pagination Best Practices
 42 | 5. Character Limits and Truncation
 43 | 6. Transport Options
 44 | 7. Tool Development Best Practices
 45 | 8. Transport Best Practices
 46 | 9. Testing Requirements
 47 | 10. OAuth and Security Best Practices
 48 | 11. Resource Management Best Practices
 49 | 12. Prompt Management Best Practices
 50 | 13. Error Handling Standards
 51 | 14. Documentation Requirements
 52 | 15. Compliance and Monitoring

 53 | ---
 54 | 
 55 | ## 1. Server Naming Conventions
 56 | 
 57 | Follow these standardized naming patterns for MCP servers:
 58 | 
 59 | **Python**: Use format `{service}_mcp` (lowercase with underscores)
 60 | - Examples: `slack_mcp`, `github_mcp`, `jira_mcp`, `stripe_mcp`
 61 | 
 62 | **Node/TypeScript**: Use format `{service}-mcp-server` (lowercase with hyphens)
 63 | - Examples: `slack-mcp-server`, `github-mcp-server`, `jira-mcp-server`
 64 | 
 65 | The name should be:
 66 | - General (not tied to specific features)
 67 | - Descriptive of the service/API being integrated
 68 | - Easy to infer from the task description
 69 | - Without version numbers or dates
 70 | 
 71 | ---
 72 | 
 73 | ## 2. Tool Naming and Design
 74 | 
 75 | ### Tool Naming Best Practices
 76 | 
 77 | 1. **Use snake_case**: `search_users`, `create_project`, `get_channel_info`
 78 | 2. **Include service prefix**: Anticipate that your MCP server may be used alongside other MCP servers
 79 |    - Use `slack_send_message` instead of just `send_message`
 80 |    - Use `github_create_issue` instead of just `create_issue`
 81 |    - Use `asana_list_tasks` instead of just `list_tasks`
 82 | 3. **Be action-oriented**: Start with verbs (get, list, search, create, etc.)
 83 | 4. **Be specific**: Avoid generic names that could conflict with other servers
 84 | 5. **Maintain consistency**: Use consistent naming patterns within your server
 85 | 
 86 | ### Tool Design Guidelines
 87 | 
 88 | - Tool descriptions must narrowly and unambiguously describe functionality
 89 | - Descriptions must precisely match actual functionality
 90 | - Should not create confusion with other MCP servers
 91 | - Should provide tool annotations (readOnlyHint, destructiveHint, idempotentHint, openWorldHint)
 92 | - Keep tool operations focused and atomic
 93 | 
 94 | ---
 95 | 
 96 | ## 3. Response Format Guidelines
 97 | 
 98 | All tools that return data should support multiple formats for flexibility:
 99 | 
100 | ### JSON Format (`response_format="json"`)
101 | - Machine-readable structured data
102 | - Include all available fields and metadata
103 | - Consistent field names and types
104 | - Suitable for programmatic processing
105 | - Use when LLMs need to process data further
106 | 
107 | ### Markdown Format (`response_format="markdown"`, typically default)
108 | - Human-readable formatted text
109 | - Use headers, lists, and formatting for clarity
110 | - Convert timestamps to human-readable format (e.g., "2024-01-15 10:30:00 UTC" instead of epoch)
111 | - Show display names with IDs in parentheses (e.g., "@john.doe (U123456)")
112 | - Omit verbose metadata (e.g., show only one profile image URL, not all sizes)
113 | - Group related information logically
114 | - Use when presenting information to users
115 | 
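As a sketch of how a tool might implement this switch (the `User` shape and helper name are illustrative assumptions, not part of any SDK):

```typescript
// Illustrative record shape; real tools would use their API's fields.
interface User {
  id: string;
  name: string;
  createdAt: string; // ISO 8601 timestamp
}

function formatUsers(users: User[], format: "json" | "markdown" = "markdown"): string {
  if (format === "json") {
    // Machine-readable: keep every field with consistent names and types.
    return JSON.stringify({ count: users.length, users }, null, 2);
  }
  // Human-readable: display name with ID in parentheses, readable timestamp.
  return users
    .map(u => `- ${u.name} (${u.id}), created ${new Date(u.createdAt).toUTCString()}`)
    .join("\n");
}
```
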
116 | ---
117 | 
118 | ## 4. Pagination Best Practices
119 | 
120 | For tools that list resources:
121 | 
122 | - **Always respect the `limit` parameter**: Never load all results when a limit is specified
123 | - **Implement pagination**: Use `offset` or cursor-based pagination
124 | - **Return pagination metadata**: Include `has_more`, `next_offset`/`next_cursor`, `total_count`
125 | - **Never load all results into memory**: Especially important for large datasets
126 | - **Default to reasonable limits**: 20-50 items is typical
127 | - **Include clear pagination info in responses**: Make it easy for LLMs to request more data
128 | 
129 | Example pagination response structure:
130 | ```json
131 | {
132 |   "total": 150,
133 |   "count": 20,
134 |   "offset": 0,
135 |   "items": [...],
136 |   "has_more": true,
137 |   "next_offset": 20
138 | }
139 | ```
140 | 
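A minimal TypeScript helper that produces this structure is sketched below. It slices an already-fetched array for brevity; for large datasets, push `offset`/`limit` into the upstream query instead, per the rule above about not loading all results into memory:

```typescript
interface Page<T> {
  total: number;
  count: number;
  offset: number;
  items: T[];
  has_more: boolean;
  next_offset: number | null;
}

// Simplified sketch: turns a list into one page plus pagination metadata.
function paginate<T>(allItems: T[], offset = 0, limit = 20): Page<T> {
  const items = allItems.slice(offset, offset + limit);
  const hasMore = offset + items.length < allItems.length;
  return {
    total: allItems.length,
    count: items.length,
    offset,
    items,
    has_more: hasMore,
    next_offset: hasMore ? offset + items.length : null,
  };
}
```
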
141 | ---
142 | 
143 | ## 5. Character Limits and Truncation
144 | 
145 | To prevent overwhelming responses with too much data:
146 | 
147 | - **Define CHARACTER_LIMIT constant**: Typically 25,000 characters at module level
148 | - **Check response size before returning**: Measure the final response length
149 | - **Truncate gracefully with clear indicators**: Let the LLM know data was truncated
150 | - **Provide guidance on filtering**: Suggest how to use parameters to reduce results
151 | - **Include truncation metadata**: Show what was truncated and how to get more
152 | 
153 | Example truncation handling:
154 | ```python
155 | CHARACTER_LIMIT = 25000
156 | 
157 | if len(json.dumps(response)) > CHARACTER_LIMIT:
158 |     response["items"] = data[:max(1, len(data) // 2)]
159 |     response["truncated"] = True
160 |     response["truncation_message"] = (
161 |         f"Response truncated from {len(data)} to {len(response['items'])} items. "
162 |         f"Use 'offset' parameter or add filters to see more results."
163 |     )
164 | ```
165 | 
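The same idea in TypeScript, as a rough sketch (the `items`/`truncated` response shape is an assumption matching the pagination example above):

```typescript
const CHARACTER_LIMIT = 25000;

interface ListResponse<T> {
  items: T[];
  truncated?: boolean;
  truncation_message?: string;
}

// Halve the item list until the serialized payload fits under the limit.
function enforceCharacterLimit<T>(response: ListResponse<T>): ListResponse<T> {
  const originalCount = response.items.length;
  while (JSON.stringify(response).length > CHARACTER_LIMIT && response.items.length > 1) {
    response.items = response.items.slice(0, Math.ceil(response.items.length / 2));
  }
  if (response.items.length < originalCount) {
    response.truncated = true;
    response.truncation_message =
      `Response truncated from ${originalCount} to ${response.items.length} items. ` +
      `Use 'offset' or add filters to see more results.`;
  }
  return response;
}
```
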
166 | ---
167 | 
168 | ## 6. Transport Options
169 | 
170 | MCP servers support multiple transport mechanisms for different deployment scenarios:
171 | 
172 | ### Stdio Transport
173 | 
174 | **Best for**: Command-line tools, local integrations, subprocess execution
175 | 
176 | **Characteristics**:
177 | - Standard input/output stream communication
178 | - Simple setup, no network configuration needed
179 | - Runs as a subprocess of the client
180 | - Ideal for desktop applications and CLI tools
181 | 
182 | **Use when**:
183 | - Building tools for local development environments
184 | - Integrating with desktop applications (e.g., Claude Desktop)
185 | - Creating command-line utilities
186 | - Single-user, single-session scenarios
187 | 
188 | ### HTTP Transport
189 | 
190 | **Best for**: Web services, remote access, multi-client scenarios
191 | 
192 | **Characteristics**:
193 | - Request-response pattern over HTTP
194 | - Supports multiple simultaneous clients
195 | - Can be deployed as a web service
196 | - Requires network configuration and security considerations
197 | 
198 | **Use when**:
199 | - Serving multiple clients simultaneously
200 | - Deploying as a cloud service
201 | - Integration with web applications
202 | - Need for load balancing or scaling
203 | 
204 | ### Server-Sent Events (SSE) Transport
205 | 
206 | **Best for**: Real-time updates, push notifications, streaming data
207 | 
208 | **Characteristics**:
209 | - One-way server-to-client streaming over HTTP
210 | - Enables real-time updates without polling
211 | - Long-lived connections for continuous data flow
212 | - Built on standard HTTP infrastructure
213 | 
214 | **Use when**:
215 | - Clients need real-time data updates
216 | - Implementing push notifications
217 | - Streaming logs or monitoring data
218 | - Progressive result delivery for long operations
219 | 
220 | ### Transport Selection Criteria
221 | 
222 | | Criterion | Stdio | HTTP | SSE |
223 | |-----------|-------|------|-----|
224 | | **Deployment** | Local | Remote | Remote |
225 | | **Clients** | Single | Multiple | Multiple |
226 | | **Communication** | Bidirectional | Request-Response | Server-Push |
227 | | **Complexity** | Low | Medium | Medium-High |
228 | | **Real-time** | No | No | Yes |
229 | 
230 | ---
231 | 
232 | ## 7. Tool Development Best Practices
233 | 
234 | ### General Guidelines
235 | 1. Tool names should be descriptive and action-oriented
236 | 2. Use parameter validation with detailed JSON schemas
237 | 3. Include examples in tool descriptions
238 | 4. Implement proper error handling and validation
239 | 5. Use progress reporting for long operations
240 | 6. Keep tool operations focused and atomic
241 | 7. Document expected return value structures
242 | 8. Implement proper timeouts
243 | 9. Consider rate limiting for resource-intensive operations
244 | 10. Log tool usage for debugging and monitoring
245 | 
246 | ### Security Considerations for Tools
247 | 
248 | #### Input Validation
249 | - Validate all parameters against schema
250 | - Sanitize file paths and system commands
251 | - Validate URLs and external identifiers
252 | - Check parameter sizes and ranges
253 | - Prevent command injection
254 | 
255 | #### Access Control
256 | - Implement authentication where needed
257 | - Use appropriate authorization checks
258 | - Audit tool usage
259 | - Rate limit requests
260 | - Monitor for abuse
261 | 
262 | #### Error Handling
263 | - Don't expose internal errors to clients
264 | - Log security-relevant errors
265 | - Handle timeouts appropriately
266 | - Clean up resources after errors
267 | - Validate return values
268 | 
269 | ### Tool Annotations
270 | - Provide readOnlyHint and destructiveHint annotations
271 | - Remember annotations are hints, not security guarantees
272 | - Clients should not make security-critical decisions based solely on annotations
273 | 
274 | ---
275 | 
276 | ## 8. Transport Best Practices
277 | 
278 | ### General Transport Guidelines
279 | 1. Handle connection lifecycle properly
280 | 2. Implement proper error handling
281 | 3. Use appropriate timeout values
282 | 4. Implement connection state management
283 | 5. Clean up resources on disconnection
284 | 
285 | ### Security Best Practices for Transport
286 | - Follow security considerations for DNS rebinding attacks
287 | - Implement proper authentication mechanisms
288 | - Validate message formats
289 | - Handle malformed messages gracefully
290 | 
291 | ### Stdio Transport Specific
292 | - Local MCP servers should NOT log to stdout (interferes with protocol)
293 | - Use stderr for logging messages
294 | - Handle standard I/O streams properly
295 | 
296 | ---
297 | 
298 | ## 9. Testing Requirements
299 | 
300 | A comprehensive testing strategy should cover:
301 | 
302 | ### Functional Testing
303 | - Verify correct execution with valid/invalid inputs
304 | 
305 | ### Integration Testing
306 | - Test interaction with external systems
307 | 
308 | ### Security Testing
309 | - Validate auth, input sanitization, rate limiting
310 | 
311 | ### Performance Testing
312 | - Check behavior under load, timeouts
313 | 
314 | ### Error Handling
315 | - Ensure proper error reporting and cleanup
316 | 
317 | ---
318 | 
319 | ## 10. OAuth and Security Best Practices
320 | 
321 | ### Authentication and Authorization
322 | 
323 | MCP servers that connect to external services should implement proper authentication:
324 | 
325 | **OAuth 2.1 Implementation:**
326 | - Use secure OAuth 2.1 with certificates from recognized authorities
327 | - Validate access tokens before processing requests
328 | - Only accept tokens specifically intended for your server
329 | - Reject tokens without proper audience claims
330 | - Never pass through tokens received from MCP clients
331 | 
332 | **API Key Management:**
333 | - Store API keys in environment variables, never in code
334 | - Validate keys on server startup
335 | - Provide clear error messages when authentication fails
336 | - Use secure transmission for sensitive credentials
337 | 
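A small startup check along these lines keeps key handling in one place (a sketch; `EXAMPLE_API_KEY` is a placeholder name):

```typescript
// Fail fast at startup when a required credential is missing.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    // Log to stderr so a stdio transport's protocol stream stays clean.
    console.error(`ERROR: ${name} environment variable is required`);
    process.exit(1);
  }
  return value;
}

const apiKey = requireEnv("EXAMPLE_API_KEY");
```
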
338 | ### Input Validation and Security
339 | 
340 | **Always validate inputs:**
341 | - Sanitize file paths to prevent directory traversal
342 | - Validate URLs and external identifiers
343 | - Check parameter sizes and ranges
344 | - Prevent command injection in system calls
345 | - Use schema validation (Pydantic/Zod) for all inputs
346 | 
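A sketch of both points in TypeScript with Zod (the base-directory policy shown is an assumption to adapt to your server):

```typescript
import path from "node:path";
import { z } from "zod";

// Schema validation: .strict() rejects fields not declared here,
// and min/max bound the parameter size.
const ReadFileInput = z.object({
  filepath: z.string().min(1).max(1024),
}).strict();

// Path sanitization: the resolved path must stay inside the allowed base directory.
function resolveSafePath(baseDir: string, requested: string): string {
  const resolved = path.resolve(baseDir, requested);
  if (!resolved.startsWith(path.resolve(baseDir) + path.sep)) {
    throw new Error("Error: path escapes the allowed directory");
  }
  return resolved;
}
```
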
347 | **Error handling security:**
348 | - Don't expose internal errors to clients
349 | - Log security-relevant errors server-side
350 | - Provide helpful but not revealing error messages
351 | - Clean up resources after errors
352 | 
353 | ### Privacy and Data Protection
354 | 
355 | **Data collection principles:**
356 | - Only collect data strictly necessary for functionality
357 | - Don't collect extraneous conversation data
358 | - Don't collect PII unless explicitly required for the tool's purpose
359 | - Provide clear information about what data is accessed
360 | 
361 | **Data transmission:**
362 | - Don't send data to servers outside your organization without disclosure
363 | - Use secure transmission (HTTPS) for all network communication
364 | - Validate certificates for external services
365 | 
366 | ---
367 | 
368 | ## 11. Resource Management Best Practices
369 | 
370 | 1. Only suggest necessary resources
371 | 2. Use clear, descriptive names for roots
372 | 3. Handle resource boundaries properly
373 | 4. Respect client control over resources
374 | 5. Use model-controlled primitives (tools) for automatic data exposure
375 | 
376 | ---
377 | 
378 | ## 12. Prompt Management Best Practices
379 | 
380 | - Clients should show users proposed prompts
381 | - Users should be able to modify or reject prompts
382 | - Clients should show users completions
383 | - Users should be able to modify or reject completions
384 | - Consider costs when using sampling
385 | 
386 | ---
387 | 
388 | ## 13. Error Handling Standards
389 | 
390 | - Use standard JSON-RPC error codes
391 | - Report tool errors within result objects (not protocol-level)
392 | - Provide helpful, specific error messages
393 | - Don't expose internal implementation details
394 | - Clean up resources properly on errors
395 | 
396 | ---
397 | 
398 | ## 14. Documentation Requirements
399 | 
400 | - Provide clear documentation of all tools and capabilities
401 | - Include working examples (at least 3 per major feature)
402 | - Document security considerations
403 | - Specify required permissions and access levels
404 | - Document rate limits and performance characteristics
405 | 
406 | ---
407 | 
408 | ## 15. Compliance and Monitoring
409 | 
410 | - Implement logging for debugging and monitoring
411 | - Track tool usage patterns
412 | - Monitor for potential abuse
413 | - Maintain audit trails for security-relevant operations
414 | - Be prepared for ongoing compliance reviews
415 | 
416 | ---
417 | 
418 | ## Summary
419 | 
420 | These best practices represent the comprehensive guidelines for building secure, efficient, and compliant MCP servers that work well within the ecosystem. Developers should follow these guidelines to ensure their MCP servers meet the standards for inclusion in the MCP directory and provide a safe, reliable experience for users.
421 | 
422 | 
423 | ----------
424 | 
425 | 
426 | # Tools
427 | 
428 | > Enable LLMs to perform actions through your server
429 | 
430 | Tools are a powerful primitive in the Model Context Protocol (MCP) that enable servers to expose executable functionality to clients. Through tools, LLMs can interact with external systems, perform computations, and take actions in the real world.
431 | 
432 | <Note>
433 |   Tools are designed to be **model-controlled**, meaning that tools are exposed from servers to clients with the intention of the AI model being able to automatically invoke them (with a human in the loop to grant approval).
434 | </Note>
435 | 
436 | ## Overview
437 | 
438 | Tools in MCP allow servers to expose executable functions that can be invoked by clients and used by LLMs to perform actions. Key aspects of tools include:
439 | 
440 | * **Discovery**: Clients can obtain a list of available tools by sending a `tools/list` request
441 | * **Invocation**: Tools are called using the `tools/call` request, where servers perform the requested operation and return results
442 | * **Flexibility**: Tools can range from simple calculations to complex API interactions
443 | 
444 | Like [resources](/docs/concepts/resources), tools are identified by unique names and can include descriptions to guide their usage. However, unlike resources, tools represent dynamic operations that can modify state or interact with external systems.
445 | 
446 | ## Tool definition structure
447 | 
448 | Each tool is defined with the following structure:
449 | 
450 | ```typescript
451 | {
452 |   name: string;          // Unique identifier for the tool
453 |   description?: string;  // Human-readable description
454 |   inputSchema: {         // JSON Schema for the tool's parameters
455 |     type: "object",
456 |     properties: { ... }  // Tool-specific parameters
457 |   },
458 |   annotations?: {        // Optional hints about tool behavior
459 |     title?: string;      // Human-readable title for the tool
460 |     readOnlyHint?: boolean;    // If true, the tool does not modify its environment
461 |     destructiveHint?: boolean; // If true, the tool may perform destructive updates
462 |     idempotentHint?: boolean;  // If true, repeated calls with same args have no additional effect
463 |     openWorldHint?: boolean;   // If true, tool interacts with external entities
464 |   }
465 | }
466 | ```
467 | 
468 | ## Implementing tools
469 | 
470 | Here's an example of implementing a basic tool in an MCP server:
471 | 
472 | <Tabs>
473 |   <Tab title="TypeScript">
474 |     ```typescript
475 |     const server = new Server({
476 |       name: "example-server",
477 |       version: "1.0.0"
478 |     }, {
479 |       capabilities: {
480 |         tools: {}
481 |       }
482 |     });
483 | 
484 |     // Define available tools
485 |     server.setRequestHandler(ListToolsRequestSchema, async () => {
486 |       return {
487 |         tools: [{
488 |           name: "calculate_sum",
489 |           description: "Add two numbers together",
490 |           inputSchema: {
491 |             type: "object",
492 |             properties: {
493 |               a: { type: "number" },
494 |               b: { type: "number" }
495 |             },
496 |             required: ["a", "b"]
497 |           }
498 |         }]
499 |       };
500 |     });
501 | 
502 |     // Handle tool execution
503 |     server.setRequestHandler(CallToolRequestSchema, async (request) => {
504 |       if (request.params.name === "calculate_sum") {
505 |         const { a, b } = request.params.arguments;
506 |         return {
507 |           content: [
508 |             {
509 |               type: "text",
510 |               text: String(a + b)
511 |             }
512 |           ]
513 |         };
514 |       }
515 |       throw new Error("Tool not found");
516 |     });
517 |     ```
518 |   </Tab>
519 | 
520 |   <Tab title="Python">
521 |     ```python
522 |     app = Server("example-server")
523 | 
524 |     @app.list_tools()
525 |     async def list_tools() -> list[types.Tool]:
526 |         return [
527 |             types.Tool(
528 |                 name="calculate_sum",
529 |                 description="Add two numbers together",
530 |                 inputSchema={
531 |                     "type": "object",
532 |                     "properties": {
533 |                         "a": {"type": "number"},
534 |                         "b": {"type": "number"}
535 |                     },
536 |                     "required": ["a", "b"]
537 |                 }
538 |             )
539 |         ]
540 | 
541 |     @app.call_tool()
542 |     async def call_tool(
543 |         name: str,
544 |         arguments: dict
545 |     ) -> list[types.TextContent | types.ImageContent | types.EmbeddedResource]:
546 |         if name == "calculate_sum":
547 |             a = arguments["a"]
548 |             b = arguments["b"]
549 |             result = a + b
550 |             return [types.TextContent(type="text", text=str(result))]
551 |         raise ValueError(f"Tool not found: {name}")
552 |     ```
553 |   </Tab>
554 | </Tabs>
555 | 
556 | ## Example tool patterns
557 | 
558 | Here are some examples of types of tools that a server could provide:
559 | 
560 | ### System operations
561 | 
562 | Tools that interact with the local system:
563 | 
564 | ```typescript
565 | {
566 |   name: "execute_command",
567 |   description: "Run a shell command",
568 |   inputSchema: {
569 |     type: "object",
570 |     properties: {
571 |       command: { type: "string" },
572 |       args: { type: "array", items: { type: "string" } }
573 |     }
574 |   }
575 | }
576 | ```
577 | 
578 | ### API integrations
579 | 
580 | Tools that wrap external APIs:
581 | 
582 | ```typescript
583 | {
584 |   name: "github_create_issue",
585 |   description: "Create a GitHub issue",
586 |   inputSchema: {
587 |     type: "object",
588 |     properties: {
589 |       title: { type: "string" },
590 |       body: { type: "string" },
591 |       labels: { type: "array", items: { type: "string" } }
592 |     }
593 |   }
594 | }
595 | ```
596 | 
597 | ### Data processing
598 | 
599 | Tools that transform or analyze data:
600 | 
601 | ```typescript
602 | {
603 |   name: "analyze_csv",
604 |   description: "Analyze a CSV file",
605 |   inputSchema: {
606 |     type: "object",
607 |     properties: {
608 |       filepath: { type: "string" },
609 |       operations: {
610 |         type: "array",
611 |         items: {
612 |           enum: ["sum", "average", "count"]
613 |         }
614 |       }
615 |     }
616 |   }
617 | }
618 | ```
619 | 
620 | ## Best practices
621 | 
622 | When implementing tools:
623 | 
624 | 1. Provide clear, descriptive names and descriptions
625 | 2. Use detailed JSON Schema definitions for parameters
626 | 3. Include examples in tool descriptions to demonstrate how the model should use them
627 | 4. Implement proper error handling and validation
628 | 5. Use progress reporting for long operations
629 | 6. Keep tool operations focused and atomic
630 | 7. Document expected return value structures
631 | 8. Implement proper timeouts
632 | 9. Consider rate limiting for resource-intensive operations
633 | 10. Log tool usage for debugging and monitoring
634 | 
635 | ### Tool name conflicts
636 | 
637 | MCP client applications and MCP server proxies may encounter tool name conflicts when building their own tool lists. For example, two connected MCP servers `web1` and `web2` may both expose a tool named `search_web`.
638 | 
639 | Applications may disambiguate tools with one of the following strategies (among others; this list is not exhaustive):
640 | 
641 | * Concatenating a unique, user-defined server name with the tool name, e.g. `web1___search_web` and `web2___search_web`. This strategy may be preferable when unique server names are already provided by the user in a configuration file.
642 | * Generating a random prefix for the tool name, e.g. `jrwxs___search_web` and `6cq52___search_web`. This strategy may be preferable in server proxies where user-defined unique names are not available.
643 | * Using the server URI as a prefix for the tool name, e.g. `web1.example.com:search_web` and `web2.example.com:search_web`. This strategy may be suitable when working with remote MCP servers.
644 | 
645 | Note that the server-provided name from the initialization flow is not guaranteed to be unique and is not generally suitable for disambiguation purposes.
646 | 
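A client-side sketch of the first strategy, assuming unique server names come from the user's configuration file:

```typescript
interface ToolInfo {
  name: string;
  description?: string;
}

// Prefix each tool with its server's user-defined name so that
// `search_web` from two servers cannot collide in the combined list.
function namespaceTools(serverName: string, tools: ToolInfo[]): ToolInfo[] {
  return tools.map(tool => ({
    ...tool,
    name: `${serverName}___${tool.name}`, // e.g. web1___search_web
  }));
}
```
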
647 | ## Security considerations
648 | 
649 | When exposing tools:
650 | 
651 | ### Input validation
652 | 
653 | * Validate all parameters against the schema
654 | * Sanitize file paths and system commands
655 | * Validate URLs and external identifiers
656 | * Check parameter sizes and ranges
657 | * Prevent command injection
658 | 
659 | ### Access control
660 | 
661 | * Implement authentication where needed
662 | * Use appropriate authorization checks
663 | * Audit tool usage
664 | * Rate limit requests
665 | * Monitor for abuse
666 | 
667 | ### Error handling
668 | 
669 | * Don't expose internal errors to clients
670 | * Log security-relevant errors
671 | * Handle timeouts appropriately
672 | * Clean up resources after errors
673 | * Validate return values
674 | 
675 | ## Tool discovery and updates
676 | 
677 | MCP supports dynamic tool discovery:
678 | 
679 | 1. Clients can list available tools at any time
680 | 2. Servers can notify clients when tools change using `notifications/tools/list_changed`
681 | 3. Tools can be added or removed during runtime
682 | 4. Tool definitions can be updated (though this should be done carefully)
683 | 
684 | ## Error handling
685 | 
686 | Tool errors should be reported within the result object, not as MCP protocol-level errors. This allows the LLM to see and potentially handle the error. When a tool encounters an error:
687 | 
688 | 1. Set `isError` to `true` in the result
689 | 2. Include error details in the `content` array
690 | 
691 | Here's an example of proper error handling for tools:
692 | 
693 | <Tabs>
694 |   <Tab title="TypeScript">
695 |     ```typescript
696 |     try {
697 |       // Tool operation
698 |       const result = performOperation();
699 |       return {
700 |         content: [
701 |           {
702 |             type: "text",
703 |             text: `Operation successful: ${result}`
704 |           }
705 |         ]
706 |       };
707 |     } catch (error) {
708 |       return {
709 |         isError: true,
710 |         content: [
711 |           {
712 |             type: "text",
713 |             text: `Error: ${error.message}`
714 |           }
715 |         ]
716 |       };
717 |     }
718 |     ```
719 |   </Tab>
720 | 
721 |   <Tab title="Python">
722 |     ```python
723 |     try:
724 |         # Tool operation
725 |         result = perform_operation()
726 |         return types.CallToolResult(
727 |             content=[
728 |                 types.TextContent(
729 |                     type="text",
730 |                     text=f"Operation successful: {result}"
731 |                 )
732 |             ]
733 |         )
734 |     except Exception as error:
735 |         return types.CallToolResult(
736 |             isError=True,
737 |             content=[
738 |                 types.TextContent(
739 |                     type="text",
740 |                     text=f"Error: {str(error)}"
741 |                 )
742 |             ]
743 |         )
744 |     ```
745 |   </Tab>
746 | </Tabs>
747 | 
748 | This approach allows the LLM to see that an error occurred and potentially take corrective action or request human intervention.
749 | 
750 | ## Tool annotations
751 | 
752 | Tool annotations provide additional metadata about a tool's behavior, helping clients understand how to present and manage tools. These annotations are hints that describe the nature and impact of a tool, but should not be relied upon for security decisions.
753 | 
754 | ### Purpose of tool annotations
755 | 
756 | Tool annotations serve several key purposes:
757 | 
758 | 1. Provide UX-specific information without affecting model context
759 | 2. Help clients categorize and present tools appropriately
760 | 3. Convey information about a tool's potential side effects
761 | 4. Assist in developing intuitive interfaces for tool approval
762 | 
763 | ### Available tool annotations
764 | 
765 | The MCP specification defines the following annotations for tools:
766 | 
767 | | Annotation        | Type    | Default | Description                                                                                                                          |
768 | | ----------------- | ------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------ |
769 | | `title`           | string  | -       | A human-readable title for the tool, useful for UI display                                                                           |
770 | | `readOnlyHint`    | boolean | false   | If true, indicates the tool does not modify its environment                                                                          |
771 | | `destructiveHint` | boolean | true    | If true, the tool may perform destructive updates (only meaningful when `readOnlyHint` is false)                                     |
772 | | `idempotentHint`  | boolean | false   | If true, calling the tool repeatedly with the same arguments has no additional effect (only meaningful when `readOnlyHint` is false) |
773 | | `openWorldHint`   | boolean | true    | If true, the tool may interact with an "open world" of external entities                                                             |
774 | 
775 | ### Example usage
776 | 
777 | Here's how to define tools with annotations for different scenarios:
778 | 
779 | ```typescript
780 | // A read-only search tool
781 | {
782 |   name: "web_search",
783 |   description: "Search the web for information",
784 |   inputSchema: {
785 |     type: "object",
786 |     properties: {
787 |       query: { type: "string" }
788 |     },
789 |     required: ["query"]
790 |   },
791 |   annotations: {
792 |     title: "Web Search",
793 |     readOnlyHint: true,
794 |     openWorldHint: true
795 |   }
796 | }
797 | 
798 | // A destructive file deletion tool
799 | {
800 |   name: "delete_file",
801 |   description: "Delete a file from the filesystem",
802 |   inputSchema: {
803 |     type: "object",
804 |     properties: {
805 |       path: { type: "string" }
806 |     },
807 |     required: ["path"]
808 |   },
809 |   annotations: {
810 |     title: "Delete File",
811 |     readOnlyHint: false,
812 |     destructiveHint: true,
813 |     idempotentHint: true,
814 |     openWorldHint: false
815 |   }
816 | }
817 | 
818 | // A non-destructive database record creation tool
819 | {
820 |   name: "create_record",
821 |   description: "Create a new record in the database",
822 |   inputSchema: {
823 |     type: "object",
824 |     properties: {
825 |       table: { type: "string" },
826 |       data: { type: "object" }
827 |     },
828 |     required: ["table", "data"]
829 |   },
830 |   annotations: {
831 |     title: "Create Database Record",
832 |     readOnlyHint: false,
833 |     destructiveHint: false,
834 |     idempotentHint: false,
835 |     openWorldHint: false
836 |   }
837 | }
838 | ```
839 | 
840 | ### Integrating annotations in server implementation
841 | 
842 | <Tabs>
843 |   <Tab title="TypeScript">
844 |     ```typescript
845 |     server.setRequestHandler(ListToolsRequestSchema, async () => {
846 |       return {
847 |         tools: [{
848 |           name: "calculate_sum",
849 |           description: "Add two numbers together",
850 |           inputSchema: {
851 |             type: "object",
852 |             properties: {
853 |               a: { type: "number" },
854 |               b: { type: "number" }
855 |             },
856 |             required: ["a", "b"]
857 |           },
858 |           annotations: {
859 |             title: "Calculate Sum",
860 |             readOnlyHint: true,
861 |             openWorldHint: false
862 |           }
863 |         }]
864 |       };
865 |     });
866 |     ```
867 |   </Tab>
868 | 
869 |   <Tab title="Python">
870 |     ```python
871 |     from mcp.server.fastmcp import FastMCP
872 | 
873 |     mcp = FastMCP("example-server")
874 | 
875 |     @mcp.tool(
876 |         annotations={
877 |             "title": "Calculate Sum",
878 |             "readOnlyHint": True,
879 |             "openWorldHint": False
880 |         }
881 |     )
882 |     async def calculate_sum(a: float, b: float) -> str:
883 |         """Add two numbers together.
884 | 
885 |         Args:
886 |             a: First number to add
887 |             b: Second number to add
888 |         """
889 |         result = a + b
890 |         return str(result)
891 |     ```
892 |   </Tab>
893 | </Tabs>
894 | 
895 | ### Best practices for tool annotations
896 | 
897 | 1. **Be accurate about side effects**: Clearly indicate whether a tool modifies its environment and whether those modifications are destructive.
898 | 
899 | 2. **Use descriptive titles**: Provide human-friendly titles that clearly describe the tool's purpose.
900 | 
901 | 3. **Indicate idempotency properly**: Mark tools as idempotent only if repeated calls with the same arguments truly have no additional effect.
902 | 
903 | 4. **Set appropriate open/closed world hints**: Indicate whether a tool interacts with a closed system (like a database) or an open system (like the web).
904 | 
905 | 5. **Remember annotations are hints**: All properties in ToolAnnotations are hints and not guaranteed to provide a faithful description of tool behavior. Clients should never make security-critical decisions based solely on annotations.
906 | 
907 | ## Testing tools
908 | 
909 | A comprehensive testing strategy for MCP tools should cover:
910 | 
911 | * **Functional testing**: Verify tools execute correctly with valid inputs and handle invalid inputs appropriately
912 | * **Integration testing**: Test tool interaction with external systems using both real and mocked dependencies
913 | * **Security testing**: Validate authentication, authorization, input sanitization, and rate limiting
914 | * **Performance testing**: Check behavior under load, timeout handling, and resource cleanup
915 | * **Error handling**: Ensure tools properly report errors through the MCP protocol and clean up resources
916 | 
```