# llm_gateway_mcp_server
This is page 1 of 45. Use http://codebase.md/dicklesworthstone/llm_gateway_mcp_server?lines=true&page={x} to view the full context.

# Directory Structure

```
├── .cursorignore
├── .env.example
├── .envrc
├── .gitignore
├── additional_features.md
├── check_api_keys.py
├── completion_support.py
├── comprehensive_test.py
├── docker-compose.yml
├── Dockerfile
├── empirically_measured_model_speeds.json
├── error_handling.py
├── example_structured_tool.py
├── examples
│   ├── __init__.py
│   ├── advanced_agent_flows_using_unified_memory_system_demo.py
│   ├── advanced_extraction_demo.py
│   ├── advanced_unified_memory_system_demo.py
│   ├── advanced_vector_search_demo.py
│   ├── analytics_reporting_demo.py
│   ├── audio_transcription_demo.py
│   ├── basic_completion_demo.py
│   ├── cache_demo.py
│   ├── claude_integration_demo.py
│   ├── compare_synthesize_demo.py
│   ├── cost_optimization.py
│   ├── data
│   │   ├── sample_event.txt
│   │   ├── Steve_Jobs_Introducing_The_iPhone_compressed.md
│   │   └── Steve_Jobs_Introducing_The_iPhone_compressed.mp3
│   ├── docstring_refiner_demo.py
│   ├── document_conversion_and_processing_demo.py
│   ├── entity_relation_graph_demo.py
│   ├── filesystem_operations_demo.py
│   ├── grok_integration_demo.py
│   ├── local_text_tools_demo.py
│   ├── marqo_fused_search_demo.py
│   ├── measure_model_speeds.py
│   ├── meta_api_demo.py
│   ├── multi_provider_demo.py
│   ├── ollama_integration_demo.py
│   ├── prompt_templates_demo.py
│   ├── python_sandbox_demo.py
│   ├── rag_example.py
│   ├── research_workflow_demo.py
│   ├── sample
│   │   ├── article.txt
│   │   ├── backprop_paper.pdf
│   │   ├── buffett.pdf
│   │   ├── contract_link.txt
│   │   ├── legal_contract.txt
│   │   ├── medical_case.txt
│   │   ├── northwind.db
│   │   ├── research_paper.txt
│   │   ├── sample_data.json
│   │   └── text_classification_samples
│   │       ├── email_classification.txt
│   │       ├── news_samples.txt
│   │       ├── product_reviews.txt
│   │       └── support_tickets.txt
│   ├── sample_docs
│   │   └── downloaded
│   │       └── attention_is_all_you_need.pdf
│   ├── sentiment_analysis_demo.py
│   ├── simple_completion_demo.py
│   ├── single_shot_synthesis_demo.py
│   ├── smart_browser_demo.py
│   ├── sql_database_demo.py
│   ├── sse_client_demo.py
│   ├── test_code_extraction.py
│   ├── test_content_detection.py
│   ├── test_ollama.py
│   ├── text_classification_demo.py
│   ├── text_redline_demo.py
│   ├── tool_composition_examples.py
│   ├── tournament_code_demo.py
│   ├── tournament_text_demo.py
│   ├── unified_memory_system_demo.py
│   ├── vector_search_demo.py
│   ├── web_automation_instruction_packs.py
│   └── workflow_delegation_demo.py
├── LICENSE
├── list_models.py
├── marqo_index_config.json.example
├── mcp_protocol_schema_2025-03-25_version.json
├── mcp_python_lib_docs.md
├── mcp_tool_context_estimator.py
├── model_preferences.py
├── pyproject.toml
├── quick_test.py
├── README.md
├── resource_annotations.py
├── run_all_demo_scripts_and_check_for_errors.py
├── storage
│   └── smart_browser_internal
│       ├── locator_cache.db
│       ├── readability.js
│       └── storage_state.enc
├── test_client.py
├── test_connection.py
├── TEST_README.md
├── test_sse_client.py
├── test_stdio_client.py
├── tests
│   ├── __init__.py
│   ├── conftest.py
│   ├── integration
│   │   ├── __init__.py
│   │   └── test_server.py
│   ├── manual
│   │   ├── test_extraction_advanced.py
│   │   └── test_extraction.py
│   └── unit
│       ├── __init__.py
│       ├── test_cache.py
│       ├── test_providers.py
│       └── test_tools.py
├── TODO.md
├── tool_annotations.py
├── tools_list.json
├── ultimate_mcp_banner.webp
├── ultimate_mcp_logo.webp
├── ultimate_mcp_server
│   ├── __init__.py
│   ├── __main__.py
│   ├── cli
│   │   ├── __init__.py
│   │   ├── __main__.py
│   │   ├── commands.py
│   │   ├── helpers.py
│   │   └── typer_cli.py
│   ├── clients
│   │   ├── __init__.py
│   │   ├── completion_client.py
│   │   └── rag_client.py
│   ├── config
│   │   └── examples
│   │       └── filesystem_config.yaml
│   ├── config.py
│   ├── constants.py
│   ├── core
│   │   ├── __init__.py
│   │   ├── evaluation
│   │   │   ├── base.py
│   │   │   └── evaluators.py
│   │   ├── providers
│   │   │   ├── __init__.py
│   │   │   ├── anthropic.py
│   │   │   ├── base.py
│   │   │   ├── deepseek.py
│   │   │   ├── gemini.py
│   │   │   ├── grok.py
│   │   │   ├── ollama.py
│   │   │   ├── openai.py
│   │   │   └── openrouter.py
│   │   ├── server.py
│   │   ├── state_store.py
│   │   ├── tournaments
│   │   │   ├── manager.py
│   │   │   ├── tasks.py
│   │   │   └── utils.py
│   │   └── ums_api
│   │       ├── __init__.py
│   │       ├── ums_database.py
│   │       ├── ums_endpoints.py
│   │       ├── ums_models.py
│   │       └── ums_services.py
│   ├── exceptions.py
│   ├── graceful_shutdown.py
│   ├── services
│   │   ├── __init__.py
│   │   ├── analytics
│   │   │   ├── __init__.py
│   │   │   ├── metrics.py
│   │   │   └── reporting.py
│   │   ├── cache
│   │   │   ├── __init__.py
│   │   │   ├── cache_service.py
│   │   │   ├── persistence.py
│   │   │   ├── strategies.py
│   │   │   └── utils.py
│   │   ├── cache.py
│   │   ├── document.py
│   │   ├── knowledge_base
│   │   │   ├── __init__.py
│   │   │   ├── feedback.py
│   │   │   ├── manager.py
│   │   │   ├── rag_engine.py
│   │   │   ├── retriever.py
│   │   │   └── utils.py
│   │   ├── prompts
│   │   │   ├── __init__.py
│   │   │   ├── repository.py
│   │   │   └── templates.py
│   │   ├── prompts.py
│   │   └── vector
│   │       ├── __init__.py
│   │       ├── embeddings.py
│   │       └── vector_service.py
│   ├── tool_token_counter.py
│   ├── tools
│   │   ├── __init__.py
│   │   ├── audio_transcription.py
│   │   ├── base.py
│   │   ├── completion.py
│   │   ├── docstring_refiner.py
│   │   ├── document_conversion_and_processing.py
│   │   ├── enhanced-ums-lookbook.html
│   │   ├── entity_relation_graph.py
│   │   ├── excel_spreadsheet_automation.py
│   │   ├── extraction.py
│   │   ├── filesystem.py
│   │   ├── html_to_markdown.py
│   │   ├── local_text_tools.py
│   │   ├── marqo_fused_search.py
│   │   ├── meta_api_tool.py
│   │   ├── ocr_tools.py
│   │   ├── optimization.py
│   │   ├── provider.py
│   │   ├── pyodide_boot_template.html
│   │   ├── python_sandbox.py
│   │   ├── rag.py
│   │   ├── redline-compiled.css
│   │   ├── sentiment_analysis.py
│   │   ├── single_shot_synthesis.py
│   │   ├── smart_browser.py
│   │   ├── sql_databases.py
│   │   ├── text_classification.py
│   │   ├── text_redline_tools.py
│   │   ├── tournament.py
│   │   ├── ums_explorer.html
│   │   └── unified_memory_system.py
│   ├── utils
│   │   ├── __init__.py
│   │   ├── async_utils.py
│   │   ├── display.py
│   │   ├── logging
│   │   │   ├── __init__.py
│   │   │   ├── console.py
│   │   │   ├── emojis.py
│   │   │   ├── formatter.py
│   │   │   ├── logger.py
│   │   │   ├── panels.py
│   │   │   ├── progress.py
│   │   │   └── themes.py
│   │   ├── parse_yaml.py
│   │   ├── parsing.py
│   │   ├── security.py
│   │   └── text.py
│   └── working_memory_api.py
├── unified_memory_system_technical_analysis.md
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.envrc:
--------------------------------------------------------------------------------

```
1 | source .venv/bin/activate   # use existing venv
2 | 
```

--------------------------------------------------------------------------------
/.cursorignore:
--------------------------------------------------------------------------------

```
1 | # Add directories or file patterns to ignore during indexing (e.g. foo/ or *.csv)
2 | /data/sec_filings
3 | /data/projects/smartedgar/sec_filings
```

--------------------------------------------------------------------------------
/.env.example:
--------------------------------------------------------------------------------

```
 1 | # Ultimate MCP Server
 2 | # Environment Variables Configuration Example
 3 | # Copy this file to .env and fill in your values
 4 | 
 5 | # Server Configuration
 6 | SERVER_NAME=Ultimate MCP Server
 7 | SERVER_PORT=8013
 8 | SERVER_HOST=0.0.0.0
 9 | SERVER_WORKERS=4
10 | SERVER_DEBUG=false
11 | 
12 | # Logging Configuration
13 | LOG_LEVEL=INFO                        
14 | LOG_FILE=logs/ultimate_mcp_server.log         
15 | USE_RICH_LOGGING=true                 
16 | 
17 | # Cache Configuration
18 | CACHE_ENABLED=true                    
19 | CACHE_TTL=86400                       
20 | CACHE_DIR=.cache                      
21 | CACHE_MAX_ENTRIES=10000               
22 | CACHE_FUZZY_MATCH=true                
23 | 
24 | # Provider API Keys
25 | OPENAI_API_KEY=sk-...                 
26 | ANTHROPIC_API_KEY=sk-ant-...          
27 | DEEPSEEK_API_KEY=sk-...               
28 | GEMINI_API_KEY=...                    
29 | OPENROUTER_API_KEY=sk-...
30 | 
31 | # Provider Default Models
32 | OPENAI_DEFAULT_MODEL=gpt-4.1-mini      
33 | ANTHROPIC_DEFAULT_MODEL=claude-3-5-haiku-20241022
34 | DEEPSEEK_DEFAULT_MODEL=deepseek-chat 
35 | GEMINI_DEFAULT_MODEL=gemini-2.5-pro-preview-03-25
36 | OPENROUTER_DEFAULT_MODEL=mistralai/mistral-nemo
37 | 
38 | DEFAULT_PROVIDER=anthropic
39 | 
40 | # Provider Token Limits
41 | OPENAI_MAX_TOKENS=8192              
42 | ANTHROPIC_MAX_TOKENS=200000         
43 | DEEPSEEK_MAX_TOKENS=8192            
44 | GEMINI_MAX_TOKENS=8192              
45 | OPENROUTER_MAX_TOKENS=8192
46 | 
47 | # Vector Embedding Service
48 | EMBEDDING_CACHE_DIR=.embeddings     
49 | EMBEDDING_DEFAULT_MODEL=text-embedding-3-small 
50 | 
51 | # Advanced Configuration
52 | REQUEST_TIMEOUT=60               
53 | RATE_LIMIT_ENABLED=false         
54 | MAX_CONCURRENT_REQUESTS=20       
55 | 
56 | # Playwright Configuration
57 | PLAYWRIGHT_BROWSER_DEFAULT=chromium # chromium, firefox, webkit
58 | PLAYWRIGHT_HEADLESS_DEFAULT=false
59 | PLAYWRIGHT_DEFAULT_TIMEOUT=30000 # ms
60 | PLAYWRIGHT_DEFAULT_USER_DATA_DIR= # Path for persistent sessions
61 | PLAYWRIGHT_EXECUTABLE_PATH= # Path to custom browser binary
62 | 
63 | # OCR Configuration
64 | OCR_TESSERACT_PATH=/usr/bin/tesseract  # Path to Tesseract executable
65 | OCR_POPPLER_PATH=/usr/bin              # Path to Poppler binaries (for pdf2image)
66 | OCR_DPI=300                            # Default DPI for PDF rendering
67 | OCR_DEFAULT_LANGUAGE=eng               # Default OCR language 
68 | 
69 | # File system configuration
70 | BROWSER_AUTOMATION_OUTPUT_DIR=browser_demo_outputs
71 | BROWSER_AUTOMATION_REPORT_DIR=browser_demo_outputs/reports
72 | BROWSER_AUTOMATION_SCREENSHOTS_DIR=browser_demo_outputs/screenshots
73 | FILESYSTEM__ALLOWED_DIRECTORIES='["/home/ubuntu/ultimate_mcp_server/browser_demo_outputs", "/home/ubuntu/ultimate_mcp_server/browser_demo_outputs/reports", "/home/ubuntu/ultimate_mcp_server/browser_demo_outputs/screenshots", "/home/ubuntu/ultimate_mcp_server/storage", "/home/ubuntu/ultimate_mcp_server/examples/redline_outputs"]'
```
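
For orientation, here is a minimal sketch of reading these values from Python, assuming `python-dotenv`; the server's own `config.py` may parse them differently.

```python
# Minimal sketch, assuming python-dotenv; it illustrates the conventions
# used in the example file rather than the server's actual config loader.
import json
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

port = int(os.getenv("SERVER_PORT", "8013"))
cache_enabled = os.getenv("CACHE_ENABLED", "true").lower() == "true"

# FILESYSTEM__ALLOWED_DIRECTORIES packs a JSON array into one variable;
# the double underscore conventionally marks a nested setting.
allowed_dirs = json.loads(os.getenv("FILESYSTEM__ALLOWED_DIRECTORIES", "[]"))

print(port, cache_enabled, allowed_dirs)
```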

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
  1 | # Byte-compiled / optimized / DLL files
  2 | __pycache__/
  3 | *.py[cod]
  4 | *$py.class
  5 | 
  6 | # C extensions
  7 | *.so
  8 | 
  9 | # Distribution / packaging
 10 | .Python
 11 | build/
 12 | develop-eggs/
 13 | dist/
 14 | downloads/
 15 | eggs/
 16 | .eggs/
 17 | lib/
 18 | lib64/
 19 | parts/
 20 | sdist/
 21 | var/
 22 | wheels/
 23 | share/python-wheels/
 24 | *.egg-info/
 25 | .installed.cfg
 26 | *.egg
 27 | MANIFEST
 28 | 
 29 | # PyInstaller
 30 | #  Usually these files are written by a python script from a template
 31 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 32 | *.manifest
 33 | *.spec
 34 | 
 35 | # Installer logs
 36 | pip-log.txt
 37 | pip-delete-this-directory.txt
 38 | 
 39 | # Unit test / coverage reports
 40 | htmlcov/
 41 | .tox/
 42 | .nox/
 43 | .coverage
 44 | .coverage.*
 45 | .cache
 46 | nosetests.xml
 47 | coverage.xml
 48 | *.cover
 49 | *.py,cover
 50 | .hypothesis/
 51 | .pytest_cache/
 52 | cover/
 53 | 
 54 | # Translations
 55 | *.mo
 56 | *.pot
 57 | 
 58 | # Django stuff:
 59 | *.log
 60 | local_settings.py
 61 | db.sqlite3
 62 | db.sqlite3-journal
 63 | 
 64 | # Flask stuff:
 65 | instance/
 66 | .webassets-cache
 67 | 
 68 | # Scrapy stuff:
 69 | .scrapy
 70 | 
 71 | # Sphinx documentation
 72 | docs/_build/
 73 | 
 74 | # PyBuilder
 75 | .pybuilder/
 76 | target/
 77 | 
 78 | # Jupyter Notebook
 79 | .ipynb_checkpoints
 80 | 
 81 | # IPython
 82 | profile_default/
 83 | ipython_config.py
 84 | 
 85 | # pyenv
 86 | #   For a library or package, you might want to ignore these files since the code is
 87 | #   intended to run in multiple environments; otherwise, check them in:
 88 | # .python-version
 89 | 
 90 | # pipenv
 91 | #   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
 92 | #   However, in case of collaboration, if having platform-specific dependencies or dependencies
 93 | #   having no cross-platform support, pipenv may install dependencies that don't work, or not
 94 | #   install all needed dependencies.
 95 | #Pipfile.lock
 96 | 
 97 | # poetry
 98 | #   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
 99 | #   This is especially recommended for binary packages to ensure reproducibility, and is more
100 | #   commonly ignored for libraries.
101 | #   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
102 | #poetry.lock
103 | 
104 | # pdm
105 | #   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
106 | #pdm.lock
107 | #   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
108 | #   in version control.
109 | #   https://pdm.fming.dev/#use-with-ide
110 | .pdm.toml
111 | 
112 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
113 | __pypackages__/
114 | 
115 | # Celery stuff
116 | celerybeat-schedule
117 | celerybeat.pid
118 | 
119 | # SageMath parsed files
120 | *.sage.py
121 | 
122 | # Environments
123 | .venv
124 | env/
125 | venv/
126 | ENV/
127 | env.bak/
128 | venv.bak/
129 | 
130 | # Spyder project settings
131 | .spyderproject
132 | .spyproject
133 | 
134 | # Rope project settings
135 | .ropeproject
136 | 
137 | # mkdocs documentation
138 | /site
139 | 
140 | # mypy
141 | .mypy_cache/
142 | .dmypy.json
143 | dmypy.json
144 | 
145 | # Pyre type checker
146 | .pyre/
147 | 
148 | # pytype static type analyzer
149 | .pytype/
150 | 
151 | # Cython debug symbols
152 | cython_debug/
153 | 
154 | # PyCharm
155 | #  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
156 | #  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
157 | #  and can be added to the global gitignore or merged into this file.  For a more nuclear
158 | #  option (not recommended) you can uncomment the following to ignore the entire idea folder.
159 | #.idea/
160 | 
161 | #ignore .bin model files and .sqlite files
162 | *.bin
163 | *.sqlite
164 | *.sqlite-shm
165 | *.sqlite-wal
166 | *.sqlite-journal
167 | *.gguf
168 | folder_of_source_documents__original_format
169 | folder_of_source_documents__converted_to_plaintext
170 | *.zip
171 | # *.md
172 | generated_transcript_metadata_tables
173 | generated_transcript_combined_texts
174 | downloaded_audio
175 | old_logs
176 | bulk_transcript_optimizer_single_file.txt
177 | processing_request_metadata_json_files
178 | temp_result_data_folders
179 | fonts
180 | temp_results/*
181 | generate_graded_quiz.ts
182 | bulk_youtube_transcript_optimizer.log.*
183 | fix_my_documents_single_file__nextjs_frontend.txt
184 | chroma_db
185 | sec_filings.db
186 | finished_processed_filings.db
187 | sec_filings
188 | sec_filings.db-shm
189 | sec_filings.db-wal
190 | minified_code.js
191 | regenerate_markdown_progress.json.bak
192 | .env
193 | storage/tournaments/
194 | cache/cache.pkl
195 | cache/disk_cache/cache.db
196 | marqo_index_config.json
197 | marqo_docstring_cache.json
198 | cache/disk_cache/cache.db-shm
199 | cache/disk_cache/cache.db-wal
200 | logs
201 | *.db-shm
202 | *.db-wal
203 | cache/disk_cache/cache.db-shm
204 | cache/disk_cache/cache.db-wal
205 | demo_database.db
206 | models/
207 | test_input.wav
208 | browser_demo_outputs/
209 | examples/redline_outputs
210 | quantum_computing_papers/
211 | examples/data/
212 | examples/output/
213 | all_demo_script_console_output_log.txt
214 | examples/data/Steve_Jobs_Introducing_The_iPhone_compressed.md
215 | conversion_outputs/
216 | error_log.txt
217 | test_documents/
218 | complete_text_sent_to_llm_api_provider_to_register_active_toolset.txt
219 | all_tools_sent_to_llm.json
220 | current_tools_sent_to_llm.json
221 | demo_output.txt
222 | browser_outputs/
223 | examples/demo.db
224 | downloaded_files/
225 | examples/sample/backprop_paper.md
226 | examples/sample/downloaded/
227 | storage/sb_demo_outputs/
228 | storage/smart_browser_scratch/autopilot_demo/
229 | storage/smart_browser_internal/*.db
230 | storage/smart_browser_internal/locator_cache.db
231 | unified_agent_memory.db
232 | advanced_demo_memory.db
233 | output.txt
234 | advanced_agent_flow_memory.db
235 | debug_mod_tree_with_ids.xml
236 | debug_orig_tree_with_ids.xml
237 | redline.css
238 | tailwind.config.js
239 | storage/single_shot_synthesis
240 | storage/unified_agent_memory.db-journal
241 | storage/exercise_mental_health_quiz.html
242 | storage/exercise_mental_health_report.md
243 | storage/unified_agent_memory.dbOLD
244 | 
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
   1 | # 🧠 Ultimate MCP Server
   2 | 
   3 | <div align="center">
   4 | 
   5 | [![Python 3.13+](https://img.shields.io/badge/python-3.13+-blue.svg)](https://www.python.org/downloads/)
   6 | [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
   7 | [![MCP Protocol](https://img.shields.io/badge/Protocol-MCP-purple.svg)](https://github.com/modelcontextprotocol)
   8 | 
   9 | ### A comprehensive Model Context Protocol (MCP) server providing advanced AI agents with dozens of powerful capabilities for cognitive augmentation, tool use, and intelligent orchestration
  10 | 
  11 | <img src="https://raw.githubusercontent.com/Dicklesworthstone/ultimate_mcp_server/refs/heads/main/ultimate_mcp_banner.webp" alt="Illustration" width="600"/>
  12 | 
  13 | **[Getting Started](#getting-started) • [Key Features](#key-features) • [Usage Examples](#usage-examples) • [Architecture](#architecture)**
  14 | 
  15 | </div>
  16 | 
  17 | ---
  18 | 
  19 | ## 🤖 What is Ultimate MCP Server?
  20 | 
  21 | **Ultimate MCP Server** is a comprehensive MCP-native system that serves as a complete AI agent operating system. It exposes dozens of powerful capabilities through the Model Context Protocol, enabling advanced AI agents to access a rich ecosystem of tools, cognitive systems, and specialized services.
  22 | 
  23 | While it includes intelligent task delegation from sophisticated models (e.g., Claude 3.7 Sonnet) to cost-effective ones (e.g., Gemini 2.0 Flash-Lite), this is just one facet of its extensive functionality. The server provides unified access to multiple LLM providers while optimizing for **cost**, **performance**, and **quality**.
  24 | 
  25 | The system offers integrated cognitive memory systems, browser automation, Excel manipulation, document processing, command-line utilities, dynamic API integration, OCR capabilities, vector operations, entity relation graphs, SQL database interactions, audio transcription, and much more. These capabilities transform an AI agent from a conversational interface into a powerful autonomous system capable of complex, multi-step operations across digital environments.
  26 | 
  27 | <div align="center">
  28 | 
  29 | <img src="https://raw.githubusercontent.com/Dicklesworthstone/ultimate_mcp_server/refs/heads/main/ultimate_mcp_logo.webp" alt="Illustration" width="600"/>
  30 | 
  31 | </div>
  32 | 
  33 | ---
  34 | ## 🎯 Vision: The Complete AI Agent Operating System
  35 | At its core, Ultimate MCP Server represents a fundamental shift in how AI agents operate in digital environments. It serves as a comprehensive operating system for AI, providing:
  36 | 
  37 | - 🧠 A unified cognitive architecture that enables persistent memory, reasoning, and contextual awareness
  38 | - ⚙️ Seamless access to dozens of specialized tools spanning web browsing, document processing, data analysis, and more
  39 | - 💻 Direct system-level capabilities for filesystem operations, database interactions, and command-line utilities
  40 | - 🔄 Dynamic workflow capabilities for complex multi-step task orchestration and execution
  41 | - 🌐 Intelligent integration of various LLM providers with cost, quality, and performance optimization
  42 | - 🚀 Advanced vector operations, knowledge graphs, and retrieval-augmented generation for enhanced AI capabilities
  43 | 
  44 | This approach mirrors how sophisticated operating systems provide applications with access to hardware, services, and resources, but it is designed specifically for augmenting AI agents with powerful new capabilities beyond their native abilities.
  45 | 
  46 | ---
  47 | 
  48 | ## 🔌 MCP-Native Architecture
  49 | 
  50 | The server is built entirely on the [Model Context Protocol (MCP)](https://github.com/modelcontextprotocol), making it specifically designed to work with AI agents like Claude. All functionality is exposed through standardized MCP tools that can be directly called by these agents, creating a seamless integration layer between AI agents and a comprehensive ecosystem of capabilities, services, and external systems.
  51 | 
  52 | ---
  53 | 
  54 | ## 🧬 Core Use Cases: AI Agent Augmentation and Ecosystem
  55 | 
  56 | The Ultimate MCP Server transforms AI agents like Claude 3.7 Sonnet into autonomous systems capable of sophisticated operations across digital environments:
  57 | 
  58 | ```plaintext
  59 |                         interacts with
  60 | ┌─────────────┐ ────────────────────────► ┌───────────────────┐         ┌──────────────┐
  61 | │ Claude 3.7  │                           │   Ultimate MCP    │ ───────►│ LLM Providers│
  62 | │   (Agent)   │ ◄──────────────────────── │     Server        │ ◄───────│ External     │
  63 | └─────────────┘      returns results      └───────────────────┘         │ Systems      │
  64 |                                                 │                        └──────────────┘
  65 |                                                 ▼
  66 |                       ┌─────────────────────────────────────────────┐
  67 |                       │ Cognitive Memory Systems                    │
  68 |                       │ Web & Data: Browser, DB, RAG, Vector Search │
  69 |                       │ Documents: Excel, OCR, PDF, Filesystem      │
  70 |                       │ Analysis: Entity Graphs, Classification     │
  71 |                       │ Integration: APIs, CLI, Audio, Multimedia   │
  72 |                       └─────────────────────────────────────────────┘
  73 | ```
  74 | 
  75 | **Example workflow:**
  76 | 
  77 | 1. An AI agent receives a complex task requiring multiple capabilities beyond its native abilities
  78 | 2. The agent uses the Ultimate MCP Server to access specialized tools and services as needed
  79 | 3. The agent can leverage the cognitive memory system to maintain state and context across operations
  80 | 4. Complex tasks like research, data analysis, document creation, and multimedia processing become possible
  81 | 5. The agent can orchestrate multi-step workflows combining various tools in sophisticated sequences
  82 | 6. Results are returned in standard MCP format, enabling the agent to understand and work with them
  83 | 7. One important benefit is cost optimization through delegating appropriate tasks to more efficient models
  84 | 
  85 | This integration unlocks transformative capabilities that enable AI agents to autonomously complete complex projects while intelligently utilizing resources - including potentially saving 70-90% on API costs by using specialized tools and cost-effective models where appropriate.
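
As a concrete sketch of steps 2 and 6, a client built on the official `mcp` Python SDK can discover and call the server's tools. The `/sse` endpoint path and the tool arguments below are illustrative assumptions, not specified by this README:

```python
# Hypothetical client sketch using the MCP Python SDK (package `mcp`).
# Assumes the SSE transport on the default port 8013; the endpoint path
# and tool arguments are assumptions for illustration only.
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client


async def main() -> None:
    async with sse_client("http://localhost:8013/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()  # step 2: tool discovery
            print([t.name for t in tools.tools])
            result = await session.call_tool(   # invoke a tool by name
                "generate_completion",
                {"prompt": "Summarize MCP in one sentence."},
            )
            print(result.content)               # step 6: standard MCP result


asyncio.run(main())
```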
  86 | 
  87 | ---
  88 | 
  89 | ## 💡 Why Use Ultimate MCP Server?
  90 | 
  91 | ### 🧰 Comprehensive AI Agent Toolkit
  92 | A unified hub enabling advanced AI agents to access an extensive ecosystem of tools:
  93 | -   🌐 Perform complex web automation tasks (**Playwright** integration).
  94 | -   📊 Manipulate and analyze **Excel** spreadsheets with deep integration.
  95 | -   🧠 Access rich **cognitive memory** systems for persistent agent state.
  96 | -   💾 Interact securely with the **filesystem**.
  97 | -   🗄️ Interact with **databases** through SQL operations.
  98 | -   🖼️ Process documents with **OCR** capabilities.
  99 | -   🔍 Perform sophisticated **vector search** and **RAG** operations.
 100 | -   🏷️ Utilize specialized **text processing** and **classification**.
 101 | -   ⌨️ Leverage command-line tools like **ripgrep**, **awk**, **sed**, **jq**.
 102 | -   🔌 Dynamically integrate external **REST APIs**.
 103 | -   ✨ Use **meta tools** for self-discovery, optimization, and documentation refinement.
 104 | 
 105 | ### 💵 Cost Optimization
 106 | API costs for advanced models can be substantial. Ultimate MCP Server helps reduce costs by:
 107 | -   📉 Routing appropriate tasks to cheaper models (e.g., $0.01/1K tokens vs $0.15/1K tokens).
 108 | -   ⚡ Implementing **advanced caching** (exact, semantic, task-aware) to avoid redundant API calls.
 109 | -   💰 Tracking and **optimizing costs** across providers.
 110 | -   🧭 Enabling **cost-aware task routing** decisions.
 111 | -   🛠️ Handling routine processing with specialized non-LLM tools (filesystem, CLI utils, etc.).
 112 | 
 113 | ### 🌐 Provider Abstraction
 114 | Avoid provider lock-in with a unified interface:
 115 | -   🔗 Standard API for **OpenAI**, **Anthropic (Claude)**, **Google (Gemini)**, **xAI (Grok)**, **DeepSeek**, and **OpenRouter**.
 116 | -   ⚙️ Consistent parameter handling and response formatting.
 117 | -   🔄 Ability to **swap providers** without changing application code.
 118 | -   🛡️ Protection against provider-specific outages and limitations through fallback mechanisms.
 119 | 
 120 | ### 📑 Comprehensive Document and Data Processing
 121 | Process documents and data efficiently:
 122 | -   ✂️ Break documents into semantically meaningful **chunks**.
 123 | -   🚀 Process chunks in **parallel** across multiple models.
 124 | -   📊 Extract **structured data** (JSON, tables, key-value) from unstructured text.
 125 | -   ✍️ Generate **summaries** and insights from large texts.
 126 | -   🔁 Convert formats (**HTML to Markdown**, documents to structured data).
 127 | -   👁️ Apply **OCR** to images and PDFs with optional LLM enhancement.
 128 | 
 129 | ---
 130 | 
 131 | ## 🚀 Key Features
 132 | 
 133 | ### 🔌 MCP Protocol Integration
 134 | -   **Native MCP Server**: Built on the Model Context Protocol for seamless AI agent integration.
 135 | -   **MCP Tool Framework**: All functionality exposed through standardized MCP tools with clear schemas.
 136 | -   **Tool Composition**: Tools can be combined in workflows using dependencies.
 137 | -   **Tool Discovery**: Supports dynamic listing and capability discovery for agents.
 138 | 
 139 | ### 🤖 Intelligent Task Delegation
 140 | -   **Task Routing**: Analyzes tasks and routes to appropriate models or specialized tools.
 141 | -   **Provider Selection**: Chooses provider/model based on task requirements, cost, quality, or speed preferences.
 142 | -   **Cost-Performance Balancing**: Optimizes delegation strategy.
 143 | -   **Delegation Tracking**: Monitors delegation patterns, costs, and outcomes (via Analytics).
 144 | 
 145 | ### 🌍 Provider Integration
 146 | -   **Multi-Provider Support**: First-class support for OpenAI, Anthropic, Google, DeepSeek, xAI (Grok), OpenRouter. Extensible architecture.
 147 | -   **Model Management**: Handles different model capabilities, context windows, and pricing. Automatic selection and fallback mechanisms.
 148 | 
 149 | ### 💾 Advanced Caching
 150 | -   **Multi-level Caching**: Exact match, semantic similarity, and task-aware strategies.
 151 | -   **Persistent Cache**: Disk-based persistence (e.g., DiskCache) with fast in-memory access layer.
 152 | -   **Cache Analytics**: Tracks cache hit rates, estimated cost savings.
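
A minimal sketch of the exact-match layer (the semantic and task-aware strategies are more involved); the actual cache service may key requests differently:

```python
# Sketch of an exact-match cache key: hash the full request so identical
# calls hit the cache. Not the server's actual implementation.
import hashlib
import json


def cache_key(provider: str, model: str, prompt: str, **params) -> str:
    # Deterministic serialization so identical requests hash identically.
    payload = json.dumps(
        {"provider": provider, "model": model, "prompt": prompt, "params": params},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key = cache_key("openai", "gpt-4.1-mini", "Hello, world!", temperature=0.7)
```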
 153 | 
 154 | ### 📄 Document Tools
 155 | -   **Smart Chunking**: Token-based, semantic boundary detection, structural analysis methods. Configurable overlap.
 156 | -   **Document Operations**: Summarization (paragraph, bullets), entity extraction, question generation, batch processing.
 157 | 
 158 | ### 📁 Secure Filesystem Operations
 159 | -   **Path Management**: Robust validation, normalization, symlink security checks, configurable allowed directories.
 160 | -   **File Operations**: Read/write with encoding handling, smart text editing/replacement, metadata retrieval.
 161 | -   **Directory Operations**: Creation, listing, tree visualization, secure move/copy.
 162 | -   **Search Capabilities**: Recursive search with pattern matching and filtering.
 163 | -   **Security Focus**: Designed to prevent directory traversal and enforce boundaries.
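
The traversal protection described above amounts to resolving a path (symlinks and `..` included) before comparing it against the allowed roots; a minimal sketch, not the server's actual implementation:

```python
# Illustrative path validation; the allowed directory is borrowed from
# the .env example earlier in this repository.
import os

ALLOWED_DIRECTORIES = ["/home/ubuntu/ultimate_mcp_server/storage"]


def validate_path(requested: str) -> str:
    # realpath() resolves symlinks and ".." segments first, so a symlink
    # pointing outside an allowed root is rejected as well.
    resolved = os.path.realpath(requested)
    for root in ALLOWED_DIRECTORIES:
        root_resolved = os.path.realpath(root)
        if os.path.commonpath([resolved, root_resolved]) == root_resolved:
            return resolved
    raise PermissionError(f"Path outside allowed directories: {requested}")
```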
 164 | 
 165 | ### ✨ Autonomous Tool Documentation Refiner
 166 | -   **Automated Improvement**: Systematically analyzes, tests, and refines MCP tool documentation (docstrings, schemas, examples).
 167 | -   **Agent Simulation**: Identifies ambiguities from an LLM agent's perspective.
 168 | -   **Adaptive Testing**: Generates and executes schema-aware test cases.
 169 | -   **Failure Analysis**: Uses LLM ensembles to diagnose documentation weaknesses.
 170 | -   **Iterative Refinement**: Continuously improves documentation quality.
 171 | -   **(See dedicated section for more details)**
 172 | 
 173 | ### 🌐 Browser Automation with Playwright
 174 | -   **Full Control**: Navigate, click, type, scrape data, screenshots, PDFs, file up/download, JS execution.
 175 | -   **Research**: Automate searches across engines, extract structured data, monitor sites.
 176 | -   **Synthesis**: Combine findings from multiple web sources into reports.
 177 | 
 178 | ### 🧠 Cognitive & Agent Memory System
 179 | -   **Memory Hierarchy**: Working, episodic, semantic, procedural levels.
 180 | -   **Knowledge Management**: Store/retrieve memories with metadata, relationships, importance tracking.
 181 | -   **Workflow Tracking**: Record agent actions, reasoning chains, artifacts, dependencies.
 182 | -   **Smart Operations**: Memory consolidation, reflection generation, relevance-based optimization, decay.
 183 | 
 184 | ### 📊 Excel Spreadsheet Automation
 185 | -   **Direct Manipulation**: Create, modify, format Excel files via natural language or structured instructions. Analyze formulas.
 186 | -   **Template Learning**: Learn from examples, adapt templates, apply formatting patterns.
 187 | -   **VBA Macro Generation**: Generate VBA code from instructions for complex automation.
 188 | 
 189 | ### 🏗️ Structured Data Extraction
 190 | -   **JSON Extraction**: Extract structured JSON with schema validation.
 191 | -   **Table Extraction**: Extract tables in multiple formats (JSON, CSV, Markdown).
 192 | -   **Key-Value Extraction**: Simple K/V pair extraction.
 193 | -   **Semantic Schema Inference**: Attempt to generate schemas from text.
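
A sketch of the schema-validation step, assuming the `jsonschema` package; the schema and model output below are hypothetical:

```python
# Validate extracted JSON against a schema; on failure, a caller could
# retry the extraction with the error message as feedback.
import json

from jsonschema import ValidationError, validate

schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "amount": {"type": "number"}},
    "required": ["name", "amount"],
}

llm_output = '{"name": "Invoice 42", "amount": 199.5}'  # hypothetical output

try:
    data = json.loads(llm_output)
    validate(instance=data, schema=schema)
except (json.JSONDecodeError, ValidationError) as exc:
    data = None
    print(f"Extraction failed validation: {exc}")
```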
 194 | 
 195 | ### ⚔️ Tournament Mode
 196 | -   **Model Competitions**: Run head-to-head comparisons for code or text generation tasks.
 197 | -   **Multi-Model Evaluation**: Compare outputs from different models/providers simultaneously.
 198 | -   **Performance Metrics**: Evaluate correctness, efficiency, style, etc. Persist results.
 199 | 
 200 | ### 🗄️ SQL Database Interactions
 201 | -   **Query Execution**: Run SQL queries against various DB types (SQLite, PostgreSQL, etc. via SQLAlchemy).
 202 | -   **Schema Analysis**: Analyze schemas, suggest optimizations (using LLM).
 203 | -   **Data Exploration**: Browse tables, visualize contents.
 204 | -   **Query Generation**: Generate SQL from natural language descriptions.
 205 | 
 206 | ### 🔗 Entity Relation Graphs
 207 | -   **Entity Extraction**: Identify entities (people, orgs, locations, etc.).
 208 | -   **Relationship Mapping**: Discover and map connections between entities.
 209 | -   **Knowledge Graph Construction**: Build persistent graphs (e.g., using NetworkX).
 210 | -   **Graph Querying**: Extract insights using graph traversal or LLM-based queries.
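
A toy sketch of such a graph with NetworkX (named above as an example backend); the entities are illustrative:

```python
import networkx as nx

G = nx.DiGraph()
# Entities become typed nodes; relations become labeled edges.
G.add_node("Ada Lovelace", type="person")
G.add_node("Analytical Engine", type="artifact")
G.add_edge("Ada Lovelace", "Analytical Engine", relation="wrote notes on")

# Traversal-style query: everything directly connected to an entity.
for _, target, attrs in G.out_edges("Ada Lovelace", data=True):
    print(f"Ada Lovelace --{attrs['relation']}--> {target}")
```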
 211 | 
 212 | ### 🔎 Advanced Vector Operations
 213 | -   **Semantic Search**: Find similar content using vector embeddings.
 214 | -   **Vector Storage Integration**: Interfaces with vector databases or local stores.
 215 | -   **Hybrid Search**: Combines keyword and semantic search (e.g., via Marqo integration).
 216 | -   **Batched Processing**: Efficient embedding generation and searching for large datasets.
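
At its core, semantic search ranks stored embeddings by cosine similarity against a query embedding; a minimal NumPy sketch with random stand-in vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
doc_vectors = rng.normal(size=(1000, 384))  # stand-in for stored embeddings
query = rng.normal(size=384)                # stand-in for a query embedding

# Normalize so the dot product equals cosine similarity.
doc_norm = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
query_norm = query / np.linalg.norm(query)

scores = doc_norm @ query_norm
top_k = np.argsort(scores)[::-1][:5]        # indices of the 5 best matches
print(top_k, scores[top_k])
```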
 217 | 
 218 | ### 📚 Retrieval-Augmented Generation (RAG)
 219 | -   **Contextual Generation**: Augments prompts with relevant retrieved documents/chunks.
 220 | -   **Accuracy Improvement**: Reduces hallucinations by grounding responses in provided context.
 221 | -   **Workflow Integration**: Seamlessly combines retrieval (vector/keyword search) with generation. Customizable strategies.
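
Schematically, RAG retrieves relevant chunks and splices them into the prompt before generation; a minimal sketch in which `retrieve` is a placeholder rather than the server's API:

```python
def retrieve(query: str, k: int = 3) -> list[str]:
    # Placeholder: in practice this would be a vector/keyword search.
    docs = ["Clean energy reduces emissions.", "Solar costs fell sharply."]
    return docs[:k]


def build_rag_prompt(query: str, chunks: list[str]) -> str:
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer using ONLY the context below, citing chunk numbers.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


print(build_rag_prompt("What are the benefits of clean energy?",
                       retrieve("clean energy")))
```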
 222 | 
 223 | ### 🎙️ Audio Transcription
 224 | -   **Speech-to-Text**: Convert audio files (e.g., WAV, MP3) to text using models like Whisper.
 225 | -   **Speaker Diarization**: Identify different speakers (if supported by the model/library).
 226 | -   **Transcript Enhancement**: Clean and format transcripts using LLMs.
 227 | -   **Multi-language Support**: Handles various languages based on the underlying transcription model.
 228 | 
 229 | ### 🏷️ Text Classification
 230 | -   **Custom Classifiers**: Apply text classification models (potentially fine-tuned or using zero-shot LLMs).
 231 | -   **Multi-label Classification**: Assign multiple categories.
 232 | -   **Confidence Scoring**: Provide probabilities for classifications.
 233 | -   **Batch Processing**: Classify large document sets efficiently.
 234 | 
 235 | ### 👁️ OCR Tools
 236 | -   **PDF/Image Extraction**: Uses Tesseract or other OCR engines, enhanced with LLM correction/formatting.
 237 | -   **Preprocessing**: Image denoising, thresholding, deskewing options.
 238 | -   **Structure Analysis**: Extracts PDF metadata and structure.
 239 | -   **Batch Processing**: Handles multiple files concurrently.
 240 | -   **(Requires `ocr` extra dependencies: `uv pip install -e ".[ocr]"`)**
 241 | 
 242 | ### 📝 Text Redline Tools
 243 | -   **HTML Redline Generation**: Visual diffs (insertions, deletions, moves) between text/HTML. Standalone HTML output.
 244 | -   **Document Comparison**: Compares various formats with intuitive highlighting.
 245 | 
 246 | ### 🔄 HTML to Markdown Conversion
 247 | -   **Intelligent Conversion**: Detects content type, uses libraries like `readability-lxml`, `trafilatura`, `markdownify`.
 248 | -   **Content Extraction**: Filters boilerplate, preserves structure (tables, links).
 249 | -   **Markdown Optimization**: Cleans and normalizes output.
 250 | 
 251 | ### 📈 Workflow Optimization Tools
 252 | -   **Cost Estimation/Comparison**: Pre-execution cost estimates, model cost comparisons.
 253 | -   **Model Selection Guidance**: Recommends models based on task, budget, performance needs.
 254 | -   **Workflow Execution Engine**: Runs multi-stage pipelines with dependencies, parallel execution, variable passing.
 255 | 
 256 | ### 💻 Local Text Processing Tools (CLI Integration)
 257 | -   **Offline Power**: Securely wrap and expose command-line tools like `ripgrep` (fast regex search), `awk` (text processing), `sed` (stream editor), `jq` (JSON processing) as MCP tools. Process text locally without API calls.
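
Securely wrapping a CLI tool mostly comes down to never invoking a shell and passing an explicit argument list; a minimal sketch around `ripgrep`:

```python
import subprocess


def run_ripgrep(pattern: str, path: str, timeout: float = 10.0) -> str:
    # A list (not a string) avoids shell injection entirely; "--" stops
    # ripgrep from interpreting the pattern as an option flag.
    cmd = ["rg", "--max-count", "100", "--", pattern, path]
    result = subprocess.run(
        cmd, capture_output=True, text=True, timeout=timeout, check=False
    )
    return result.stdout
```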
 258 | 
 259 | ### ⏱️ Model Performance Benchmarking
 260 | -   **Empirical Measurement**: Tools to measure actual speed (tokens/sec), latency across providers/models.
 261 | -   **Performance Profiles**: Generate comparative reports based on real-world performance.
 262 | -   **Data-Driven Optimization**: Use benchmark data to inform routing decisions.
 263 | 
 264 | ### 📡 Multiple Transport Modes
 265 | -   **Streamable-HTTP (Recommended)**: Modern HTTP transport with streaming request/response bodies, optimal for HTTP-based MCP clients.
 266 | -   **Server-Sent Events (SSE)**: Legacy HTTP transport using server-sent events for real-time streaming.
 267 | -   **Standard I/O (stdio)**: Direct process communication for embedded integrations.
 268 | -   **Real-time Streaming**: Token-by-token updates for LLM completions across all HTTP transports.
 269 | -   **Progress Monitoring**: Track progress of long-running jobs (chunking, batch processing).
 270 | -   **Event-Based Architecture**: Subscribe to specific server events.
 271 | 
 272 | ### ✨ Multi-Model Synthesis
 273 | -   **Comparative Analysis**: Analyze outputs from multiple models side-by-side.
 274 | -   **Response Synthesis**: Combine best elements, generate meta-responses, create consensus outputs.
 275 | -   **Collaborative Reasoning**: Implement workflows where different models handle different steps.
 276 | 
 277 | ### 🧩 Extended Model Support
 278 | -   **Grok Integration**: Native support for xAI's Grok.
 279 | -   **DeepSeek Support**: Optimized handling for DeepSeek models.
 280 | -   **OpenRouter Integration**: Access a wide variety via OpenRouter API key.
 281 | -   **Gemini Integration**: Comprehensive support for Google's Gemini models.
 282 | -   **Anthropic Integration**: Full support for Claude models including Claude 3.5 Sonnet and Haiku.
 283 | -   **OpenAI Integration**: Complete support for GPT-3.5, GPT-4, and newer models.
 284 | 
 285 | ### 🔧 Meta Tools for Self-Improvement & Dynamic Integration
 286 | -   **Tool Discovery**: Agents can query available tools, parameters, descriptions (`list_tools`).
 287 | -   **Usage Recommendations**: Get AI-driven advice on tool selection/combination for tasks.
 288 | -   **External API Integration**: Dynamically register REST APIs via OpenAPI specs, making endpoints available as callable MCP tools (`register_api`, `call_dynamic_tool`).
 289 | -   **Documentation Generation**: Part of the Autonomous Refiner feature.
 290 | 
 291 | ### 📊 Analytics and Reporting
 292 | -   **Usage Tracking**: Monitors tokens, costs, requests, success/error rates per provider/model/tool.
 293 | -   **Real-Time Monitoring**: Live dashboard or stream of usage stats.
 294 | -   **Detailed Reporting**: Generate historical cost/usage reports, identify trends, export data.
 295 | -   **Optimization Insights**: Helps identify expensive operations or inefficient patterns.
 296 | 
 297 | ### 📜 Prompt Templates and Management
 298 | -   **Jinja2 Templates**: Create reusable, dynamic prompts with variables, conditionals, includes.
 299 | -   **Prompt Repository**: Store, retrieve, categorize, and version control prompts.
 300 | -   **Metadata**: Add descriptions, authorship, usage examples to templates.
 301 | -   **Optimization**: Test and compare template performance and token usage.
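
A minimal Jinja2 example of the kind of template described above; the repository layer adds storage, categorization, and versioning on top:

```python
from jinja2 import Template

template = Template(
    "You are a {{ role }}.\n"
    "{% if examples %}Examples:\n"
    "{% for ex in examples %}- {{ ex }}\n{% endfor %}"
    "{% endif %}"
    "Task: {{ task }}"
)
print(template.render(role="code reviewer",
                      examples=["Prefer small functions"],
                      task="Review this diff."))
```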
 302 | 
 303 | ### 🛡️ Error Handling and Resilience
 304 | -   **Intelligent Retries**: Automatic retries with exponential backoff for transient errors (rate limits, network issues).
 305 | -   **Fallback Mechanisms**: Configurable provider fallbacks on primary failure.
 306 | -   **Detailed Error Reporting**: Captures comprehensive error context for debugging.
 307 | -   **Input Validation**: Pre-flight checks for common issues (e.g., token limits, required parameters).
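
A generic sketch of retry-with-exponential-backoff matching the behavior described above; the server's own logic may differ in detail:

```python
import asyncio
import random


async def with_retries(coro_factory, max_retries: int = 3, base_delay: float = 1.0):
    for attempt in range(max_retries + 1):
        try:
            return await coro_factory()
        except Exception:
            # In practice only transient errors (rate limits, timeouts)
            # should be retried; others should propagate immediately.
            if attempt == max_retries:
                raise
            # Exponential backoff with jitter: ~1s, ~2s, ~4s, ...
            delay = base_delay * (2 ** attempt) * (0.5 + random.random())
            await asyncio.sleep(delay)
```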
 308 | 
 309 | ### ⚙️ System Features
 310 | -   **Rich Logging**: Colorful, informative console logs via `Rich`.
 311 | -   **Health Monitoring**: `/healthz` endpoint for readiness checks.
 312 | -   **Command-Line Interface**: `umcp` CLI for management and interaction.
 313 | 
 314 | ---
 315 | 
 316 | ## 📦 Getting Started
 317 | 
 318 | ### 🧪 Install
 319 | 
 320 | ```bash
 321 | # Install uv (fast Python package manager) if you don't have it:
 322 | curl -LsSf https://astral.sh/uv/install.sh | sh
 323 | 
 324 | # Clone the repository
 325 | git clone https://github.com/Dicklesworthstone/ultimate_mcp_server.git
 326 | cd ultimate_mcp_server
 327 | 
 328 | # Create a virtual environment and install dependencies using uv:
 329 | uv venv --python 3.13
 330 | source .venv/bin/activate
 331 | uv lock --upgrade
 332 | uv sync --all-extras
 333 | ```
 334 | *Note: The `uv sync --all-extras` command installs all optional extras defined in the project (e.g., OCR, Browser Automation, Excel). If you only need specific extras, adjust your project dependencies and run `uv sync` without `--all-extras`.*
 335 | 
 336 | ### ⚙️ .env Configuration
 337 | 
 338 | Create a file named `.env` in the root directory of the cloned repository. Add your API keys and any desired configuration overrides:
 339 | 
 340 | ```bash
 341 | # --- API Keys (at least one provider required) ---
 342 | OPENAI_API_KEY=your_openai_sk-...
 343 | ANTHROPIC_API_KEY=your_anthropic_sk-...
 344 | GEMINI_API_KEY=your_google_ai_studio_key... # For Google AI Studio (Gemini API)
 345 | # Or use GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/service-account-key.json for Vertex AI
 346 | DEEPSEEK_API_KEY=your_deepseek_key...
 347 | OPENROUTER_API_KEY=your_openrouter_key...
 348 | GROK_API_KEY=your_grok_key... # For Grok via xAI API
 349 | 
 350 | # --- Server Configuration (Defaults shown) ---
 351 | GATEWAY_SERVER_PORT=8013
 352 | GATEWAY_SERVER_HOST=127.0.0.1 # Change to 0.0.0.0 to listen on all interfaces (needed for Docker/external access)
 353 | # GATEWAY_API_PREFIX=/
 354 | 
 355 | # --- Logging Configuration (Defaults shown) ---
 356 | LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR, CRITICAL
 357 | USE_RICH_LOGGING=true # Set to false for plain text logs
 358 | 
 359 | # --- Cache Configuration (Defaults shown) ---
 360 | GATEWAY_CACHE_ENABLED=true
 361 | GATEWAY_CACHE_TTL=86400 # Default Time-To-Live in seconds (24 hours)
 362 | # GATEWAY_CACHE_TYPE=memory # Options might include 'memory', 'redis', 'diskcache' (check implementation)
 363 | # GATEWAY_CACHE_MAX_SIZE=1000 # Example: Max number of items for memory cache
 364 | # GATEWAY_CACHE_DIR=./.cache # Directory for disk cache storage
 365 | 
 366 | # --- Provider Timeouts & Retries (Defaults shown) ---
 367 | # GATEWAY_PROVIDER_TIMEOUT=120 # Default timeout in seconds for API calls
 368 | # GATEWAY_PROVIDER_MAX_RETRIES=3 # Default max retries on failure
 369 | 
 370 | # --- Provider-Specific Configuration ---
 371 | # GATEWAY_OPENAI_DEFAULT_MODEL=gpt-4.1-mini # Customize default model
 372 | # GATEWAY_ANTHROPIC_DEFAULT_MODEL=claude-3-5-sonnet-20241022 # Customize default model
 373 | # GATEWAY_GEMINI_DEFAULT_MODEL=gemini-2.0-pro # Customize default model
 374 | 
 375 | # --- Tool Specific Config (Examples) ---
 376 | # FILESYSTEM__ALLOWED_DIRECTORIES=["/path/to/safe/dir1","/path/to/safe/dir2"] # For Filesystem tools (JSON array)
 377 | # GATEWAY_AGENT_MEMORY_DB_PATH=unified_agent_memory.db # Path for agent memory database
 378 | # GATEWAY_PROMPT_TEMPLATES_DIR=./prompt_templates # Directory for prompt templates
 379 | ```
 380 | 
 381 | ### ▶️ Run
 382 | 
 383 | Make sure your virtual environment is active (`source .venv/bin/activate`).
 384 | 
 385 | ```bash
 386 | # Start the MCP server with all registered tools found
 387 | umcp run
 388 | 
 389 | # Start the server including only specific tools
 390 | umcp run --include-tools completion chunk_document read_file write_file
 391 | 
 392 | # Start the server excluding specific tools
 393 | umcp run --exclude-tools browser_init browser_navigate research_and_synthesize_report
 394 | 
 395 | # Start with Docker (ensure .env file exists in the project root or pass environment variables)
 396 | docker compose up --build # Add --build the first time or after changes
 397 | ```
 398 | 
 399 | Once running, the server will typically be available at `http://localhost:8013` (or the host/port configured in your `.env` or command line). You should see log output indicating the server has started and which tools are registered.
 400 | 
 401 | ## 💻 Command Line Interface (CLI)
 402 | 
 403 | The Ultimate MCP Server provides a powerful command-line interface (CLI) through the `umcp` command that allows you to manage the server, interact with LLM providers, test features, and explore examples. This section details all available commands and their options.
 404 | 
 405 | ### 🌟 Global Options
 406 | 
 407 | The `umcp` command supports the following global option:
 408 | 
 409 | ```bash
 410 | umcp --version  # Display version information
 411 | ```
 412 | 
 413 | ### 🚀 Server Management
 414 | 
 415 | #### Starting the Server
 416 | 
 417 | The `run` command starts the Ultimate MCP Server with specified options:
 418 | 
 419 | ```bash
 420 | # Basic server start with default settings from .env
 421 | umcp run
 422 | 
 423 | # Run on a specific host (-h) and port (-p)
 424 | umcp run -h 0.0.0.0 -p 9000
 425 | 
 426 | # Run with multiple worker processes (-w)
 427 | umcp run -w 4
 428 | 
 429 | # Enable debug logging (-d)
 430 | umcp run -d
 431 | 
 432 | # Use stdio transport (-t)
 433 | umcp run -t stdio
 434 | 
 435 | # Use streamable-http transport (recommended for HTTP clients)
 436 | umcp run -t shttp
 437 | 
 438 | # Run only with specific tools (no shortcut for --include-tools)
 439 | umcp run --include-tools completion chunk_document read_file write_file
 440 | 
 441 | # Run with all tools except certain ones (no shortcut for --exclude-tools)
 442 | umcp run --exclude-tools browser_init browser_navigate
 443 | ```
 444 | 
 445 | Example output:
 446 | ```
 447 | ┌─ Starting Ultimate MCP Server ───────────────────┐
 448 | │ Host: 0.0.0.0                                    │
 449 | │ Port: 9000                                       │
 450 | │ Workers: 4                                       │
 451 | │ Transport mode: streamable-http                  │
 452 | └──────────────────────────────────────────────────┘
 453 | 
 454 | INFO:     Started server process [12345]
 455 | INFO:     Waiting for application startup.
 456 | INFO:     Application startup complete.
 457 | INFO:     Uvicorn running on http://0.0.0.0:9000 (Press CTRL+C to quit)
 458 | ```
 459 | 
 460 | Available options:
 461 | - `-h, --host`: Host or IP address to bind the server to (default: from .env)
 462 | - `-p, --port`: Port to listen on (default: from .env)
 463 | - `-w, --workers`: Number of worker processes to spawn (default: from .env)
 464 | - `-t, --transport-mode`: Transport mode for server communication ('shttp' for streamable-http, 'sse', or 'stdio', default: shttp)
 465 | - `-d, --debug`: Enable debug logging
 466 | - `--include-tools`: List of tool names to include (space-separated)
 467 | - `--exclude-tools`: List of tool names to exclude (space-separated)
 468 | 
 469 | ### 🔌 Provider Management
 470 | 
 471 | #### Listing Providers
 472 | 
 473 | The `providers` command displays information about configured LLM providers:
 474 | 
 475 | ```bash
 476 | # List all configured providers
 477 | umcp providers
 478 | 
 479 | # Check API keys (-c) for all configured providers
 480 | umcp providers -c
 481 | 
 482 | # List available models (no shortcut for --models)
 483 | umcp providers --models
 484 | 
 485 | # Check keys and list models
 486 | umcp providers -c --models
 487 | ```
 488 | 
 489 | Example output:
 490 | ```
 491 | ┌─ LLM Providers ────────────────────────────────────────────────────┐
 492 | │ Provider   Status   Default Model                 API Key           │
 493 | ├────────────────────────────────────────────────────────────────────┤
 494 | │ openai     ✓        gpt-4.1-mini                  sk-...5vX [VALID] │
 495 | │ anthropic  ✓        claude-3-5-sonnet-20241022    sk-...Hr [VALID]  │
 496 | │ gemini     ✓        gemini-2.0-pro                [VALID]           │
 497 | │ deepseek   ✗        deepseek-chat                 [NOT CONFIGURED]  │
 498 | │ openrouter ✓        --                            [VALID]           │
 499 | │ grok       ✓        grok-1                        [VALID]           │
 500 | └────────────────────────────────────────────────────────────────────┘
 501 | ```
 502 | 
 503 | With `--models`:
 504 | ```
 505 | OPENAI MODELS:
 506 |   - gpt-4.1-mini
 507 |   - gpt-4o
 508 |   - gpt-4-0125-preview
 509 |   - gpt-3.5-turbo
 510 | 
 511 | ANTHROPIC MODELS:
 512 |   - claude-3-5-sonnet-20241022
 513 |   - claude-3-5-haiku-20241022
 514 |   - claude-3-opus-20240229
 515 |   ...
 516 | ```
 517 | 
 518 | Available options:
 519 | - `-c, --check`: Check API keys for all configured providers
 520 | - `--models`: List available models for each provider
 521 | 
 522 | #### Testing a Provider
 523 | 
 524 | The `test` command allows you to test a specific provider:
 525 | 
 526 | ```bash
 527 | # Test the default OpenAI model with a simple prompt
 528 | umcp test openai
 529 | 
 530 | # Test a specific model (--model) with a custom prompt (--prompt)
 531 | umcp test anthropic --model claude-3-5-haiku-20241022 --prompt "Write a short poem about coding."
 532 | 
 533 | # Test Gemini with a different prompt
 534 | umcp test gemini --prompt "What are three interesting AI research papers from 2024?"
 535 | ```
 536 | 
 537 | Example output:
 538 | ```
 539 | Testing provider 'anthropic'...
 540 | 
 541 | Provider: anthropic
 542 | Model: claude-3-5-haiku-20241022
 543 | Prompt: Write a short poem about coding.
 544 | 
 545 | ❯ Response:
 546 | Code flows like water,
 547 | Logic cascades through the mind—
 548 | Bugs bloom like flowers.
 549 | 
 550 | Tokens: 13 input, 19 output
 551 | Cost: $0.00006
 552 | Response time: 0.82s
 553 | ```
 554 | 
 555 | Available options:
 556 | - `--model`: Model ID to test (defaults to the provider's default)
 557 | - `--prompt`: Prompt text to send (default: "Hello, world!")
 558 | 
 559 | ### ⚡ Direct Text Generation
 560 | 
 561 | The `complete` command lets you generate text directly from the CLI:
 562 | 
 563 | ```bash
 564 | # Generate text with default provider (OpenAI) using a prompt (--prompt)
 565 | umcp complete --prompt "Write a concise explanation of quantum computing."
 566 | 
 567 | # Specify a provider (--provider) and model (--model)
 568 | umcp complete --provider anthropic --model claude-3-5-sonnet-20241022 --prompt "What are the key differences between Rust and Go?"
 569 | 
 570 | # Use a system prompt (--system)
 571 | umcp complete --provider openai --model gpt-4o --system "You are an expert programmer..." --prompt "Explain dependency injection."
 572 | 
 573 | # Stream the response token by token (-s)
 574 | umcp complete --provider openai --prompt "Count from 1 to 10." -s
 575 | 
 576 | # Adjust temperature (--temperature) and token limit (--max-tokens)
 577 | umcp complete --provider gemini --temperature 1.2 --max-tokens 250 --prompt "Generate a creative sci-fi story opening."
 578 | 
 579 | # Read prompt from stdin (no --prompt needed)
 580 | echo "Tell me about space exploration." | umcp complete
 581 | ```
 582 | 
 583 | Example output:
 584 | ```
 585 | Quantum computing uses quantum bits (qubits) that can exist in multiple states simultaneously, unlike classical bits (0 or 1). This quantum superposition, along with entanglement, allows quantum computers to process vast amounts of information in parallel, potentially solving certain complex problems exponentially faster than classical computers. Applications include cryptography, materials science, and optimization problems.
 586 | 
 587 | Tokens: 13 input, 72 output
 588 | Cost: $0.00006
 589 | Response time: 0.37s
 590 | ```
 591 | 
 592 | Available options:
 593 | - `--provider`: Provider to use (default: openai)
 594 | - `--model`: Model ID (defaults to provider's default)
 595 | - `--prompt`: Prompt text (reads from stdin if not provided)
 596 | - `--temperature`: Sampling temperature (0.0-2.0, default: 0.7)
 597 | - `--max-tokens`: Maximum tokens to generate
 598 | - `--system`: System prompt for providers that support it
 599 | - `-s, --stream`: Stream the response token by token
 600 | 
 601 | ### 💾 Cache Management
 602 | 
 603 | The `cache` command allows you to view or clear the request cache:
 604 | 
 605 | ```bash
 606 | # Show cache status (default action)
 607 | umcp cache
 608 | 
 609 | # Explicitly show status (no shortcut for --status)
 610 | umcp cache --status
 611 | 
 612 | # Clear the cache (no shortcut for --clear, with confirmation prompt)
 613 | umcp cache --clear
 614 | 
 615 | # Show stats and clear the cache in one command
 616 | umcp cache --status --clear
 617 | ```
 618 | 
 619 | Example output:
 620 | ```
 621 | Cache Status:
 622 |   Backend: memory
 623 |   Enabled: True
 624 |   Items: 127
 625 |   Hit rate: 73.2%
 626 |   Estimated savings: $1.47
 627 | ```
 628 | 
 629 | Available options:
 630 | - `--status`: Show cache status (enabled by default if no other flag)
 631 | - `--clear`: Clear the cache (will prompt for confirmation)
 632 | 
 633 | ### 📊 Benchmarking
 634 | 
 635 | The `benchmark` command lets you compare performance and cost across providers:
 636 | 
 637 | ```bash
 638 | # Run default benchmark (3 runs per provider)
 639 | umcp benchmark
 640 | 
 641 | # Benchmark only specific providers
 642 | umcp benchmark --providers openai,anthropic
 643 | 
 644 | # Benchmark with specific models
 645 | umcp benchmark --providers openai,anthropic --models gpt-4o,claude-3-5-sonnet-20241022
 646 | 
 647 | # Use a custom prompt and more runs (-r)
 648 | umcp benchmark --prompt "Explain the process of photosynthesis in detail." -r 5
 649 | ```
 650 | 
 651 | Example output:
 652 | ```
 653 | ┌─ Benchmark Results ───────────────────────────────────────────────────────┐
 654 | │ Provider    Model               Avg Time   Tokens    Cost      Tokens/sec │
 655 | ├──────────────────────────────────────────────────────────────────────────┤
 656 | │ openai      gpt-4.1-mini        0.47s      76 / 213  $0.00023  454        │
 657 | │ anthropic   claude-3-5-haiku    0.52s      76 / 186  $0.00012  358        │
 658 | │ gemini      gemini-2.0-pro      0.64s      76 / 201  $0.00010  314        │
 659 | │ deepseek    deepseek-chat       0.71s      76 / 195  $0.00006  275        │
 660 | └──────────────────────────────────────────────────────────────────────────┘
 661 | ```
 662 | 
 663 | Available options:
 664 | - `--providers`: List of providers to benchmark (default: all configured)
 665 | - `--models`: Model IDs to benchmark (defaults to default model of each provider)
 666 | - `--prompt`: Prompt text to use (default: built-in benchmark prompt)
 667 | - `-r, --runs`: Number of runs per provider/model (default: 3)
 668 | 
 669 | ### 🧰 Tool Management
 670 | 
 671 | The `tools` command lists available tools, optionally filtered by category:
 672 | 
 673 | ```bash
 674 | # List all tools
 675 | umcp tools
 676 | 
 677 | # List tools in a specific category
 678 | umcp tools --category document
 679 | 
 680 | # Show related example scripts
 681 | umcp tools --examples
 682 | ```
 683 | 
 684 | Example output:
 685 | ```
 686 | ┌─ Ultimate MCP Server Tools ──────────────────────────────────────────────┐
 687 | │ Category    Tool                           Example Script                │
 688 | ├──────────────────────────────────────────────────────────────────────────┤
 689 | │ completion  generate_completion            simple_completion_demo.py     │
 690 | │ completion  stream_completion              simple_completion_demo.py     │
 691 | │ completion  chat_completion                claude_integration_demo.py    │
 692 | │ document    summarize_document             document_processing.py        │
 693 | │ document    chunk_document                 document_processing.py        │
 694 | │ extraction  extract_json                   advanced_extraction_demo.py   │
 695 | │ filesystem  read_file                      filesystem_operations_demo.py │
 696 | └──────────────────────────────────────────────────────────────────────────┘
 697 | 
 698 | Tip: Run examples using the command:
 699 |   umcp examples <example_name>
 700 | ```
 701 | 
 702 | Available options:
 703 | - `--category`: Filter tools by category
 704 | - `--examples`: Show example scripts alongside tools
 705 | 
 706 | ### 📚 Example Management
 707 | 
 708 | The `examples` command lets you list and run example scripts:
 709 | 
 710 | ```bash
 711 | # List all example scripts (default action)
 712 | umcp examples
 713 | 
 714 | # Explicitly list example scripts (-l)
 715 | umcp examples -l
 716 | 
 717 | # Run a specific example
 718 | umcp examples rag_example.py
 719 | 
 720 | # Can also run by just the name without extension
 721 | umcp examples rag_example
 722 | ```
 723 | 
 724 | Example output when listing:
 725 | ```
 726 | ┌─ Ultimate MCP Server Example Scripts ─────────────────────────────────┐
 727 | │ Category             Example Script                                   │
 728 | ├────────────────────────────────────────────────────────────────────────┤
 729 | │ text-generation      simple_completion_demo.py                        │
 730 | │ text-generation      claude_integration_demo.py                       │
 731 | │ document-processing  document_processing.py                           │
 732 | │ search-and-retrieval rag_example.py                                   │
 733 | │ browser-automation   browser_automation_demo.py                       │
 734 | └────────────────────────────────────────────────────────────────────────┘
 735 | 
 736 | Run an example:
 737 |   umcp examples <example_name>
 738 | ```
 739 | 
 740 | When running an example:
 741 | ```
 742 | Running example: rag_example.py
 743 | 
 744 | Creating vector knowledge base 'demo_kb'...
 745 | Adding sample documents...
 746 | Retrieving context for query: "What are the benefits of clean energy?"
 747 | Generated response:
 748 | Based on the retrieved context, clean energy offers several benefits:
 749 | ...
 750 | ```
 751 | 
 752 | Available options:
 753 | - `-l, --list`: List example scripts only
 754 | - `--category`: Filter examples by category
 755 | 
 756 | ### 🔎 Getting Help
 757 | 
 758 | Every command has detailed help available:
 759 | 
 760 | ```bash
 761 | # General help
 762 | umcp --help
 763 | 
 764 | # Help for a specific command
 765 | umcp run --help
 766 | umcp providers --help
 767 | umcp complete --help
 768 | ```
 769 | 
 770 | Example output:
 771 | ```
 772 | Usage: umcp [OPTIONS] COMMAND [ARGS]...
 773 | 
 774 |   Ultimate MCP Server: Multi-provider LLM management server
 775 |   Unified CLI to run your server, manage providers, and more.
 776 | 
 777 | Options:
 778 |   --version, -v                   Show the application version and exit.
 779 |   --help                          Show this message and exit.
 780 | 
 781 | Commands:
 782 |   run          Run the Ultimate MCP Server
 783 |   providers    List Available Providers
 784 |   test         Test a Specific Provider
 785 |   complete     Generate Text Completion
 786 |   cache        Cache Management
 787 |   benchmark    Benchmark Providers
 788 |   tools        List Available Tools
 789 |   examples     Run or List Example Scripts
 790 | ```
 791 | 
 792 | Command-specific help:
 793 | ```
 794 | Usage: umcp run [OPTIONS]
 795 | 
 796 |   Run the Ultimate MCP Server
 797 | 
 798 |   Start the server with optional overrides.
 799 | 
 800 |   Examples:
 801 |     umcp run -h 0.0.0.0 -p 8000 -w 4 -t sse
 802 |     umcp run -d
 803 | 
 804 | Options:
 805 |   -h, --host TEXT                 Host or IP address to bind the server to.
 806 |                                   Defaults from config.
 807 |   -p, --port INTEGER              Port to listen on. Defaults from config.
 808 |   -w, --workers INTEGER           Number of worker processes to spawn.
 809 |                                   Defaults from config.
 810 |   -t, --transport-mode [shttp|sse|stdio]
 811 |                                   Transport mode for server communication (-t
 812 |                                   shortcut). Options: 'shttp' (streamable-http, 
 813 |                                   recommended), 'sse', or 'stdio'.
 814 |   -d, --debug                     Enable debug logging for detailed output (-d
 815 |                                   shortcut).
 816 |   --include-tools TEXT            List of tool names to include when running
 817 |                                   the server.
 818 |   --exclude-tools TEXT            List of tool names to exclude when running
 819 |                                   the server.
 820 |   --help                          Show this message and exit.
 821 | ```
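
The `--include-tools` and `--exclude-tools` flags above let you expose only a subset of the registered tools. A minimal sketch of how that might look (the list syntax is assumed here to be one flag per tool name, as is typical for Click/Typer list options; check `umcp run --help` on your install):

```bash
# Start the server exposing only the completion tools
umcp run --include-tools generate_completion --include-tools chat_completion

# Run everything except the browser automation tools
umcp run --exclude-tools browser_init --exclude-tools browser_navigate
```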
 822 | 
 823 | ---
 824 | 
 825 | ## 🧪 Usage Examples
 826 | 
 827 | This section provides Python examples demonstrating how an MCP client (like an application using `mcp-client` or an agent like Claude) would interact with the tools provided by a running Ultimate MCP Server instance.
 828 | 
 829 | *Note: These examples assume you have `mcp-client` installed (`pip install mcp-client`) and the Ultimate MCP Server is running at `http://localhost:8013`.*
 830 | 
 833 | ### Basic Completion
 834 | 
 835 | ```python
 836 | import asyncio
 837 | from mcp.client import Client
 838 | 
 839 | async def basic_completion_example():
 840 |     client = Client("http://localhost:8013")
 841 |     response = await client.tools.completion(
 842 |         prompt="Write a short poem about a robot learning to dream.",
 843 |         provider="openai",
 844 |         model="gpt-4.1-mini",
 845 |         max_tokens=100,
 846 |         temperature=0.7
 847 |     )
 848 |     if response["success"]:
 849 |         print(f"Completion: {response['completion']}")
 850 |         print(f"Cost: ${response['cost']:.6f}")
 851 |     else:
 852 |         print(f"Error: {response['error']}")
 853 |     await client.close()
 854 | 
 855 | # if __name__ == "__main__": asyncio.run(basic_completion_example())
 856 | ```
 857 | 
 858 | ### Claude Using Ultimate MCP Server for Document Analysis (Delegation)
 859 | 
 860 | ```python
 861 | import asyncio
 862 | from mcp.client import Client
 863 | 
 864 | async def document_analysis_example():
 865 |     # Assume Claude identifies a large document needing processing
 866 |     client = Client("http://localhost:8013")
 867 |     document = "... large document content ..." * 100 # Placeholder for large content
 868 | 
 869 |     print("Delegating document chunking...")
 870 |     # Step 1: Claude delegates document chunking (often a local, non-LLM task on server)
 871 |     chunks_response = await client.tools.chunk_document(
 872 |         document=document,
 873 |         chunk_size=1000, # Target tokens per chunk
 874 |         overlap=100,     # Token overlap
 875 |         method="semantic" # Use semantic chunking if available
 876 |     )
 877 |     if not chunks_response["success"]:
 878 |         print(f"Chunking failed: {chunks_response['error']}")
 879 |         await client.close()
 880 |         return
 881 | 
 882 |     print(f"Document divided into {chunks_response['chunk_count']} chunks.")
 883 | 
 884 |     # Step 2: Claude delegates summarization of each chunk to a cheaper model
    summaries = []
    extracted_entities = []  # Initialized up front so the final synthesis prompt below is safe even if extraction fails
    total_cost = 0.0
 887 |     print("Delegating chunk summarization to gemini-2.0-flash-lite...")
 888 |     for i, chunk in enumerate(chunks_response["chunks"]):
 889 |         # Use Gemini Flash (much cheaper than Claude or GPT-4o) via the server
 890 |         summary_response = await client.tools.summarize_document(
 891 |             document=chunk,
 892 |             provider="gemini", # Explicitly delegate to Gemini via server
 893 |             model="gemini-2.0-flash-lite",
 894 |             format="paragraph",
 895 |             max_length=150 # Request a concise summary
 896 |         )
 897 |         if summary_response["success"]:
 898 |             summaries.append(summary_response["summary"])
 899 |             cost = summary_response.get("cost", 0.0)
 900 |             total_cost += cost
 901 |             print(f"  Processed chunk {i+1}/{chunks_response['chunk_count']} summary. Cost: ${cost:.6f}")
 902 |         else:
 903 |             print(f"  Chunk {i+1} summarization failed: {summary_response['error']}")
 904 | 
 905 |     print("\nDelegating entity extraction to gpt-4.1-mini...")
 906 |     # Step 3: Claude delegates entity extraction for the whole document to another cheap model
 907 |     entities_response = await client.tools.extract_entities(
 908 |         document=document, # Process the original document
 909 |         entity_types=["person", "organization", "location", "date", "product"],
 910 |         provider="openai", # Delegate to OpenAI's cheaper model
 911 |         model="gpt-4.1-mini"
 912 |     )
 913 | 
 914 |     if entities_response["success"]:
 915 |         cost = entities_response.get("cost", 0.0)
 916 |         total_cost += cost
 917 |         print(f"Extracted entities. Cost: ${cost:.6f}")
 918 |         extracted_entities = entities_response['entities']
 919 |         # Claude would now process these summaries and entities using its advanced capabilities
 920 |         print(f"\nClaude can now use {len(summaries)} summaries and {len(extracted_entities)} entity groups.")
 921 |     else:
 922 |         print(f"Entity extraction failed: {entities_response['error']}")
 923 | 
 924 |     print(f"\nTotal estimated delegation cost for sub-tasks: ${total_cost:.6f}")
 925 | 
 926 |     # Claude might perform final synthesis using the collected results
 927 |     final_synthesis_prompt = f"""
 928 | Synthesize the key information from the following summaries and entities extracted from a large document.
 929 | Focus on the main topics, key people involved, and significant events mentioned.
 930 | 
 931 | Summaries:
 932 | {' '.join(summaries)}
 933 | 
 934 | Entities:
 935 | {extracted_entities}
 936 | 
 937 | Provide a concise final report.
 938 | """
 939 |     # This final step would likely use Claude itself (not shown here)
 940 | 
 941 |     await client.close()
 942 | 
 943 | # if __name__ == "__main__": asyncio.run(document_analysis_example())
 944 | ```
 945 | 
 946 | ### Browser Automation for Research
 947 | 
 948 | ```python
 949 | import asyncio
 950 | from mcp.client import Client
 951 | 
 952 | async def browser_research_example():
 953 |     client = Client("http://localhost:8013")
 954 |     print("Starting browser-based research task...")
 955 |     # This tool likely orchestrates multiple browser actions (search, navigate, scrape)
 956 |     # and uses an LLM (specified or default) for synthesis.
 957 |     result = await client.tools.research_and_synthesize_report(
 958 |         topic="Latest advances in AI-powered drug discovery using graph neural networks",
 959 |         instructions={
 960 |             "search_query": "graph neural networks drug discovery 2024 research",
 961 |             "search_engines": ["google", "duckduckgo"], # Use multiple search engines
 962 |             "urls_to_include": ["nature.com", "sciencemag.org", "arxiv.org", "pubmed.ncbi.nlm.nih.gov"], # Prioritize these domains
 963 |             "max_urls_to_process": 7, # Limit the number of pages to visit/scrape
 964 |             "min_content_length": 500, # Ignore pages with very little content
 965 |             "focus_areas": ["novel molecular structures", "binding affinity prediction", "clinical trial results"], # Guide the synthesis
 966 |             "report_format": "markdown", # Desired output format
 967 |             "report_length": "detailed", # comprehensive, detailed, summary
 968 |             "llm_model": "anthropic/claude-3-5-sonnet-20241022" # Specify LLM for synthesis
 969 |         }
 970 |     )
 971 | 
 972 |     if result["success"]:
 973 |         print("\nResearch report generated successfully!")
 974 |         print(f"Processed {len(result.get('extracted_data', []))} sources.")
        print(f"Total processing time: {result.get('processing_time', 0.0):.2f}s")
 976 |         print(f"Estimated cost: ${result.get('total_cost', 0.0):.6f}") # Includes LLM synthesis cost
 977 |         print("\n--- Research Report ---")
 978 |         print(result['report'])
 979 |         print("-----------------------")
 980 |     else:
 981 |         print(f"\nBrowser research failed: {result.get('error', 'Unknown error')}")
 982 |         if 'details' in result: print(f"Details: {result['details']}")
 983 | 
 984 |     await client.close()
 985 | 
 986 | # if __name__ == "__main__": asyncio.run(browser_research_example())
 987 | ```
 988 | 
 989 | ### Cognitive Memory System Usage
 990 | 
 991 | ```python
 992 | import asyncio
 993 | from mcp.client import Client
 994 | import uuid
 995 | 
 996 | async def cognitive_memory_example():
 997 |     client = Client("http://localhost:8013")
 998 |     # Generate a unique ID for this session/workflow if not provided
 999 |     workflow_id = str(uuid.uuid4())
1000 |     print(f"Using Workflow ID: {workflow_id}")
1001 | 
1002 |     print("\nCreating a workflow context...")
1003 |     # Create a workflow context to group related memories and actions
1004 |     workflow_response = await client.tools.create_workflow(
1005 |         workflow_id=workflow_id,
1006 |         title="Quantum Computing Investment Analysis",
1007 |         description="Analyzing the impact of quantum computing on financial markets.",
1008 |         goal="Identify potential investment opportunities or risks."
1009 |     )
1010 |     if not workflow_response["success"]: print(f"Error creating workflow: {workflow_response['error']}")
1011 | 
1012 |     print("\nRecording an agent action...")
1013 |     # Record the start of a research action
1014 |     action_response = await client.tools.record_action_start(
1015 |         workflow_id=workflow_id,
1016 |         action_type="research",
1017 |         title="Initial literature review on quantum algorithms in finance",
1018 |         reasoning="Need to understand the current state-of-the-art before assessing impact."
1019 |     )
1020 |     action_id = action_response.get("action_id") if action_response["success"] else None
    if not action_id: print(f"Error starting action: {action_response.get('error', 'Unknown error')}")
1022 | 
1023 |     print("\nStoring facts in semantic memory...")
1024 |     # Store some key facts discovered during research
1025 |     memory1 = await client.tools.store_memory(
1026 |         workflow_id=workflow_id,
1027 |         content="Shor's algorithm can break RSA encryption, posing a threat to current financial security.",
1028 |         memory_type="fact", memory_level="semantic", importance=9.0,
1029 |         tags=["quantum_algorithm", "cryptography", "risk", "shor"]
1030 |     )
1031 |     memory2 = await client.tools.store_memory(
1032 |         workflow_id=workflow_id,
1033 |         content="Quantum annealing (e.g., D-Wave) shows promise for portfolio optimization problems.",
1034 |         memory_type="fact", memory_level="semantic", importance=7.5,
1035 |         tags=["quantum_computing", "finance", "optimization", "annealing"]
1036 |     )
1037 |     if memory1["success"]: print(f"Stored memory ID: {memory1['memory_id']}")
1038 |     if memory2["success"]: print(f"Stored memory ID: {memory2['memory_id']}")
1039 | 
1040 |     print("\nStoring an observation (episodic memory)...")
1041 |     # Store an observation from a specific event/document
1042 |     obs_memory = await client.tools.store_memory(
1043 |         workflow_id=workflow_id,
1044 |         content="Read Nature article (doi:...) suggesting experimental quantum advantage in a specific financial modeling task.",
1045 |         memory_type="observation", memory_level="episodic", importance=8.0,
1046 |         source="Nature Article XYZ", timestamp="2024-07-20T10:00:00Z", # Example timestamp
1047 |         tags=["research_finding", "publication", "finance_modeling"]
1048 |     )
1049 |     if obs_memory["success"]: print(f"Stored episodic memory ID: {obs_memory['memory_id']}")
1050 | 
1051 |     print("\nSearching for relevant memories...")
1052 |     # Search for memories related to financial risks
1053 |     search_results = await client.tools.hybrid_search_memories(
1054 |         workflow_id=workflow_id,
1055 |         query="What are the financial risks associated with quantum computing?",
1056 |         top_k=5, memory_type="fact", # Search for facts first
1057 |         semantic_weight=0.7, keyword_weight=0.3 # Example weighting for hybrid search
1058 |     )
1059 |     if search_results["success"]:
1060 |         print(f"Found {len(search_results['results'])} relevant memories:")
1061 |         for res in search_results["results"]:
1062 |             print(f"  - Score: {res['score']:.4f}, ID: {res['memory_id']}, Content: {res['content'][:80]}...")
1063 |     else:
1064 |         print(f"Memory search failed: {search_results['error']}")
1065 | 
1066 |     print("\nGenerating a reflection based on stored memories...")
1067 |     # Generate insights or reflections based on the accumulated knowledge in the workflow
1068 |     reflection_response = await client.tools.generate_reflection(
1069 |         workflow_id=workflow_id,
1070 |         reflection_type="summary_and_next_steps", # e.g., insights, risks, opportunities
1071 |         context_query="Summarize the key findings about quantum finance impact and suggest next research actions."
1072 |     )
1073 |     if reflection_response["success"]:
1074 |         print("Generated Reflection:")
1075 |         print(reflection_response["reflection"])
1076 |     else:
1077 |         print(f"Reflection generation failed: {reflection_response['error']}")
1078 | 
1079 |     # Mark the action as completed (assuming research phase is done)
1080 |     if action_id:
1081 |         print("\nCompleting the research action...")
1082 |         await client.tools.record_action_end(
1083 |             workflow_id=workflow_id, action_id=action_id, status="completed",
1084 |             outcome="Gathered initial understanding of quantum algorithms in finance and associated risks."
1085 |         )
1086 | 
1087 |     await client.close()
1088 | 
1089 | # if __name__ == "__main__": asyncio.run(cognitive_memory_example())
1090 | ```
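
The `semantic_weight`/`keyword_weight` parameters above suggest a linear blend of the two relevance signals. A minimal sketch of what such a hybrid scorer plausibly computes (the server's actual formula may differ):

```python
def hybrid_score(semantic: float, keyword: float,
                 semantic_weight: float = 0.7, keyword_weight: float = 0.3) -> float:
    """Linearly blend normalized semantic and keyword relevance scores."""
    return semantic_weight * semantic + keyword_weight * keyword

# A memory scoring 0.9 on semantic similarity and 0.4 on keyword match:
print(hybrid_score(0.9, 0.4))  # 0.7 * 0.9 + 0.3 * 0.4 = 0.75
```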
1091 | 
1092 | ### Excel Spreadsheet Automation
1093 | 
1094 | ```python
1095 | import asyncio
1096 | from mcp.client import Client
1097 | import os
1098 | 
1099 | async def excel_automation_example():
1100 |     client = Client("http://localhost:8013")
1101 |     output_dir = "excel_outputs"
1102 |     os.makedirs(output_dir, exist_ok=True)
1103 |     output_path = os.path.join(output_dir, "financial_model.xlsx")
1104 | 
1105 |     print(f"Requesting creation of Excel financial model at {output_path}...")
1106 |     # Example: Create a financial model using natural language instructions
1107 |     create_result = await client.tools.excel_execute(
1108 |         instruction="Create a simple 3-year financial projection.\n"
1109 |                    "Sheet name: 'Projections'.\n"
1110 |                    "Columns: Year 1, Year 2, Year 3.\n"
1111 |                    "Rows: Revenue, COGS, Gross Profit, Operating Expenses, Net Income.\n"
1112 |                    "Data: Start Revenue at $100,000, grows 20% annually.\n"
1113 |                    "COGS is 40% of Revenue.\n"
1114 |                    "Operating Expenses start at $30,000, grow 10% annually.\n"
1115 |                    "Calculate Gross Profit (Revenue - COGS) and Net Income (Gross Profit - OpEx).\n"
1116 |                    "Format currency as $#,##0. Apply bold headers and add a light blue fill to the header row.",
1117 |         file_path=output_path, # Server needs write access to this path/directory if relative
1118 |         operation_type="create", # create, modify, analyze, format
1119 |         # sheet_name="Projections", # Can specify sheet if modifying
1120 |         # cell_range="A1:D6", # Can specify range
1121 |         show_excel=False # Run Excel in the background (if applicable on the server)
1122 |     )
1123 | 
1124 |     if create_result["success"]:
1125 |         print(f"Excel creation successful: {create_result['message']}")
1126 |         print(f"File saved at: {create_result.get('output_file_path', output_path)}") # Confirm output path
1127 | 
1128 |         # Example: Modify the created file - add a chart
1129 |         print("\nRequesting modification: Add a Revenue chart...")
1130 |         modify_result = await client.tools.excel_execute(
1131 |             instruction="Add a column chart showing Revenue for Year 1, Year 2, Year 3. "
1132 |                        "Place it below the table. Title the chart 'Revenue Projection'.",
1133 |             file_path=output_path, # Use the previously created file
1134 |             operation_type="modify",
1135 |             sheet_name="Projections" # Specify the sheet to modify
1136 |         )
        if modify_result["success"]:
            print(f"Excel modification successful: {modify_result['message']}")
            print(f"File updated at: {modify_result.get('output_file_path', output_path)}")
        else:
            print(f"Excel modification failed: {modify_result['error']}")
1142 | 
1143 |     else:
1144 |         print(f"Excel creation failed: {create_result['error']}")
1145 |         if 'details' in create_result: print(f"Details: {create_result['details']}")
1146 | 
1147 |     # Example: Analyze formulas (if the tool supports it)
1148 |     # analysis_result = await client.tools.excel_analyze_formulas(...)
1149 | 
1150 |     await client.close()
1151 | 
1152 | # if __name__ == "__main__": asyncio.run(excel_automation_example())
1153 | ```
1154 | 
1155 | ### Multi-Provider Comparison
1156 | 
1157 | ```python
1158 | import asyncio
1159 | from mcp.client import Client
1160 | 
1161 | async def multi_provider_completion_example():
1162 |     client = Client("http://localhost:8013")
1163 |     prompt = "Explain the concept of 'Chain of Thought' prompting for Large Language Models."
1164 | 
1165 |     print(f"Requesting completions for prompt: '{prompt}' from multiple providers...")
1166 |     # Request the same prompt from different models/providers
1167 |     multi_response = await client.tools.multi_completion(
1168 |         prompt=prompt,
1169 |         providers=[
1170 |             {"provider": "openai", "model": "gpt-4.1-mini", "temperature": 0.5},
1171 |             {"provider": "anthropic", "model": "claude-3-5-sonnet-20241022", "temperature": 0.5},
1172 |             {"provider": "gemini", "model": "gemini-2.0-pro", "temperature": 0.5},
1173 |             # {"provider": "deepseek", "model": "deepseek-chat", "temperature": 0.5}, # Add others if configured
1174 |         ],
1175 |         # Common parameters applied to all if not specified per provider
1176 |         max_tokens=300
1177 |     )
1178 | 
1179 |     if multi_response["success"]:
1180 |         print("\n--- Multi-completion Results ---")
1181 |         total_cost = multi_response.get("total_cost", 0.0)
1182 |         print(f"Total Estimated Cost: ${total_cost:.6f}\n")
1183 | 
1184 |         for provider_key, result in multi_response["results"].items():
1185 |             print(f"--- Provider: {provider_key} ---")
1186 |             if result["success"]:
1187 |                 print(f"  Model: {result.get('model', 'N/A')}")
1188 |                 print(f"  Cost: ${result.get('cost', 0.0):.6f}")
1189 |                 print(f"  Tokens: Input={result.get('input_tokens', 'N/A')}, Output={result.get('output_tokens', 'N/A')}")
1190 |                 print(f"  Completion:\n{result['completion']}\n")
1191 |             else:
1192 |                 print(f"  Error: {result['error']}\n")
1193 |         print("------------------------------")
1194 |         # An agent could now analyze these responses for consistency, detail, accuracy etc.
1195 |     else:
1196 |         print(f"\nMulti-completion request failed: {multi_response['error']}")
1197 | 
1198 |     await client.close()
1199 | 
1200 | # if __name__ == "__main__": asyncio.run(multi_provider_completion_example())
1201 | ```
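
As the closing comment suggests, an agent can post-process the per-provider results. For instance, here is a minimal sketch that picks the cheapest successful completion from the `results` dict shown above (field names follow the response shape in this example):

```python
# Given multi_response from client.tools.multi_completion(...) above:
successful = {key: res for key, res in multi_response["results"].items() if res["success"]}
if successful:
    # Choose the provider whose answer cost the least
    cheapest = min(successful, key=lambda k: successful[k].get("cost", float("inf")))
    print(f"Cheapest successful answer came from {cheapest}:")
    print(successful[cheapest]["completion"])
```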
1202 | 
1203 | ### Cost-Optimized Workflow Execution
1204 | 
1205 | ```python
1206 | import asyncio
1207 | from mcp.client import Client
1208 | 
1209 | async def optimized_workflow_example():
1210 |     client = Client("http://localhost:8013")
1211 |     # Example document to process through the workflow
1212 |     document_content = """
1213 |     Project Alpha Report - Q3 2024
    Lead: Dr. Evelyn Reed (e.reed@example.com)
1215 |     Status: On Track
1216 |     Budget: $50,000 remaining. Spent $25,000 this quarter.
1217 |     Key Findings: Successful prototype development (v0.8). User testing feedback positive.
1218 |     Next Steps: Finalize documentation, prepare for Q4 deployment. Target date: 2024-11-15.
1219 |     Risks: Potential delay due to supplier issues for component X. Mitigation plan in place.
1220 |     """
1221 | 
1222 |     print("Defining a multi-stage workflow...")
1223 |     # Define a workflow with stages, dependencies, and provider preferences
1224 |     # Use ${stage_id.output_key} to pass outputs between stages
1225 |     workflow_definition = [
1226 |         {
1227 |             "stage_id": "summarize_report",
1228 |             "tool_name": "summarize_document",
1229 |             "params": {
1230 |                 "document": document_content,
1231 |                 "format": "bullet_points",
1232 |                 "max_length": 100,
1233 |                 # Let the server choose a cost-effective model for summarization
1234 |                 "provider_preference": "cost", # 'cost', 'quality', 'speed', or specific like 'openai/gpt-4.1-mini'
1235 |             }
1236 |             # No 'depends_on', runs first
1237 |             # Default output key is 'summary' for this tool, access via ${summarize_report.summary}
1238 |         },
1239 |         {
1240 |             "stage_id": "extract_key_info",
1241 |             "tool_name": "extract_json", # Use JSON extraction for structured data
1242 |             "params": {
1243 |                 "document": document_content,
1244 |                 "json_schema": {
1245 |                     "type": "object",
1246 |                     "properties": {
1247 |                         "project_lead": {"type": "string"},
1248 |                         "lead_email": {"type": "string", "format": "email"},
1249 |                         "status": {"type": "string"},
1250 |                         "budget_remaining": {"type": "string"},
1251 |                         "next_milestone_date": {"type": "string", "format": "date"}
1252 |                     },
1253 |                     "required": ["project_lead", "status", "next_milestone_date"]
1254 |                 },
                # Prefer a model known for good structured data extraction
1256 |                 "provider_preference": "quality", # Prioritize quality for extraction
1257 |                 "preferred_models": ["openai/gpt-4o", "anthropic/claude-3-5-sonnet-20241022"] # Suggest specific models
1258 |             }
1259 |         },
1260 |         {
1261 |             "stage_id": "generate_follow_up_questions",
1262 |             "tool_name": "generate_qa", # Assuming a tool that generates questions
1263 |             "depends_on": ["summarize_report"], # Needs the summary first
1264 |             "params": {
1265 |                 # Use the summary from the first stage as input
1266 |                 "document": "${summarize_report.summary}",
1267 |                 "num_questions": 3,
1268 |                 "provider_preference": "speed" # Use a fast model for question generation
1269 |             }
1270 |             # Default output key 'qa_pairs', access via ${generate_follow_up_questions.qa_pairs}
1271 |         }
1272 |     ]
1273 | 
1274 |     print("Executing the optimized workflow...")
1275 |     # Execute the workflow - the server handles dependencies and model selection
1276 |     results = await client.tools.execute_optimized_workflow(
1277 |         workflow=workflow_definition
1278 |         # Can also pass initial documents if workflow steps reference 'original_document'
1279 |         # documents = {"report.txt": document_content}
1280 |     )
1281 | 
1282 |     if results["success"]:
1283 |         print("\nWorkflow executed successfully!")
        print(f"  Total processing time: {results.get('processing_time', 0.0):.2f}s")
1285 |         print(f"  Total estimated cost: ${results.get('total_cost', 0.0):.6f}\n")
1286 | 
1287 |         print("--- Stage Outputs ---")
1288 |         for stage_id, output in results.get("stage_outputs", {}).items():
1289 |             print(f"Stage: {stage_id}")
1290 |             if output["success"]:
1291 |                 print(f"  Provider/Model Used: {output.get('provider', 'N/A')}/{output.get('model', 'N/A')}")
1292 |                 print(f"  Cost: ${output.get('cost', 0.0):.6f}")
1293 |                 print(f"  Output: {output.get('result', 'N/A')}") # Access the primary result
1294 |                 # You might access specific keys like output.get('result', {}).get('summary') etc.
1295 |             else:
1296 |                 print(f"  Error: {output.get('error', 'Unknown error')}")
1297 |             print("-" * 20)
1298 | 
1299 |     else:
1300 |         print(f"\nWorkflow execution failed: {results.get('error', 'Unknown error')}")
1301 |         if 'details' in results: print(f"Details: {results['details']}")
1302 | 
1303 |     await client.close()
1304 | 
1305 | # if __name__ == "__main__": asyncio.run(optimized_workflow_example())
1306 | ```
1307 | 
1308 | ### Entity Relation Graph Example
1309 | 
1310 | ```python
1311 | import asyncio
1312 | from mcp.client import Client
1313 | # import networkx as nx # To process the graph data if needed
1314 | # import matplotlib.pyplot as plt # To visualize
1315 | 
1316 | async def entity_graph_example():
1317 |     client = Client("http://localhost:8013")
1318 |     document_text = """
1319 |     Meta Platforms, Inc., led by CEO Mark Zuckerberg, announced a partnership with IBM
1320 |     on developing new AI hardware accelerators. The collaboration aims to challenge Nvidia's dominance.
1321 |     IBM, headquartered in Armonk, New York, brings its deep expertise in semiconductor design.
1322 |     The project, codenamed 'Synergy', is expected to yield results by late 2025.
1323 |     """
1324 | 
1325 |     print("Extracting entity relationships from text...")
1326 |     # Request extraction of entities and their relationships
1327 |     entity_graph_response = await client.tools.extract_entity_relations(
1328 |         document=document_text,
1329 |         entity_types=["organization", "person", "location", "date", "project"], # Specify desired entity types
1330 |         relationship_types=["led_by", "partnership_with", "aims_to_challenge", "headquartered_in", "expected_by"], # Specify relationship types
1331 |         # Optional parameters:
1332 |         # provider_preference="quality", # Choose model strategy
1333 |         # llm_model="anthropic/claude-3-5-sonnet-20241022", # Suggest a specific model
1334 |         include_visualization=False # Set True to request image data if tool supports it
1335 |     )
1336 | 
1337 |     if entity_graph_response["success"]:
1338 |         print("Entity relationship extraction successful.")
1339 |         print(f"Estimated Cost: ${entity_graph_response.get('cost', 0.0):.6f}")
1340 | 
1341 |         # The graph data might be in various formats (e.g., node-link list, adjacency list)
1342 |         graph_data = entity_graph_response.get("graph_data")
1343 |         print("\n--- Graph Data (Nodes & Edges) ---")
1344 |         print(graph_data)
1345 |         print("------------------------------------")
1346 | 
1347 |         # Example: Query the extracted graph using another tool or LLM call
1348 |         # (Assuming a separate query tool or using a general completion tool)
1349 |         print("\nQuerying the extracted graph (example)...")
1350 |         query_prompt = f"""
1351 | Based on the following graph data representing relationships extracted from a text:
1352 | {graph_data}
1353 | 
1354 | Answer the question: Who is the CEO of Meta Platforms, Inc.?
1355 | """
1356 |         query_response = await client.tools.completion(
1357 |              prompt=query_prompt, provider="openai", model="gpt-4.1-mini", max_tokens=50
1358 |         )
        if query_response["success"]:
            print(f"Graph Query Answer: {query_response['completion']}")
        else:
            print(f"Graph query failed: {query_response['error']}")

1365 |     else:
1366 |         print(f"Entity relationship extraction failed: {entity_graph_response.get('error', 'Unknown error')}")
1367 | 
1368 |     await client.close()
1369 | 
1370 | # if __name__ == "__main__": asyncio.run(entity_graph_example())
1371 | ```
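
If the returned `graph_data` follows the common node-link convention (`{"nodes": [...], "links": [...]}`), the commented-out `networkx` import at the top hints at local graph analysis. A sketch under that assumption (the actual payload shape depends on the tool's output format):

```python
import networkx as nx

# Assumes node-link formatted graph data; adjust keys to the tool's actual output
g = nx.node_link_graph(graph_data, directed=True)
print(f"{g.number_of_nodes()} entities, {g.number_of_edges()} relationships")

# e.g., list every relationship tagged 'led_by' (edge attribute name assumed)
for source, target, attrs in g.edges(data=True):
    if attrs.get("type") == "led_by":
        print(f"{source} is led by {target}")
```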
1372 | 
1373 | ### Document Chunking
1374 | 
1375 | ```python
1376 | import asyncio
1377 | from mcp.client import Client
1378 | 
1379 | async def document_chunking_example():
1380 |     client = Client("http://localhost:8013")
1381 |     large_document = """
1382 |     This is the first paragraph of a potentially very long document. It discusses various concepts.
1383 |     The second paragraph continues the discussion, adding more details and nuances. Proper chunking
1384 |     is crucial for processing large texts with Large Language Models, especially those with limited
1385 |     context windows. Different strategies exist, such as fixed token size, sentence splitting,
1386 |     or more advanced semantic chunking that tries to keep related ideas together. Overlap between
1387 |     chunks helps maintain context across boundaries. This paragraph is intentionally made longer
1388 |     to demonstrate how chunking might split it. It keeps going and going, describing the benefits
1389 |     of effective text splitting for downstream tasks like summarization, question answering, and
1390 |     retrieval-augmented generation (RAG). The goal is to create manageable pieces of text that
1391 |     still retain coherence. Semantic chunking often uses embedding models to find natural breakpoints
1392 |     in the text's meaning, potentially leading to better results than simple fixed-size chunks.
1393 |     The final sentence of this example paragraph.
1394 |     """ * 5 # Make it a bit longer for demonstration
1395 | 
1396 |     print("Requesting document chunking...")
1397 |     # Request chunking using a specific method and size
1398 |     chunking_response = await client.tools.chunk_document(
1399 |         document=large_document,
1400 |         chunk_size=100,     # Target size in tokens (approximate)
1401 |         overlap=20,         # Token overlap between consecutive chunks
1402 |         method="semantic"   # Options: "token", "sentence", "semantic", "structural" (if available)
1403 |     )
1404 | 
1405 |     if chunking_response["success"]:
1406 |         print(f"Document successfully divided into {chunking_response['chunk_count']} chunks.")
1407 |         print(f"Method Used: {chunking_response.get('method_used', 'N/A')}") # Confirm method if returned
1408 | 
1409 |         print("\n--- Example Chunks ---")
1410 |         for i, chunk in enumerate(chunking_response['chunks'][:3]): # Show first 3 chunks
1411 |             print(f"Chunk {i+1} (Length: {len(chunk)} chars):")
1412 |             print(f"'{chunk}'\n")
1413 |         if chunking_response['chunk_count'] > 3: print("...")
1414 |         print("----------------------")
1415 | 
1416 |         # These chunks can now be passed individually to other tools (e.g., summarize_document)
1417 |     else:
1418 |         print(f"Document chunking failed: {chunking_response['error']}")
1419 | 
1420 |     await client.close()
1421 | 
1422 | # if __name__ == "__main__": asyncio.run(document_chunking_example())
1423 | ```
1424 | 
### Multi-Provider Completion (Shared Parameters)

Unlike the earlier comparison example, this variant passes a single `temperature` and `max_tokens` that apply to every listed provider.
1426 | 
1427 | ```python
1428 | import asyncio
1429 | from mcp.client import Client
1430 | 
1431 | async def multi_provider_completion_example():
1432 |     client = Client("http://localhost:8013")
1433 |     prompt = "What are the main benefits of using the Model Context Protocol (MCP)?"
1434 | 
1435 |     print(f"Requesting completions for prompt: '{prompt}' from multiple providers...")
1436 |     multi_response = await client.tools.multi_completion(
1437 |         prompt=prompt,
1438 |         providers=[
1439 |             {"provider": "openai", "model": "gpt-4.1-mini"},
1440 |             {"provider": "anthropic", "model": "claude-3-5-haiku-20241022"},
1441 |             {"provider": "gemini", "model": "gemini-2.0-flash-lite"}
1442 |             # Add more configured providers as needed
1443 |         ],
1444 |         temperature=0.5,
1445 |         max_tokens=250
1446 |     )
1447 | 
1448 |     if multi_response["success"]:
1449 |         print("\n--- Multi-completion Results ---")
1450 |         total_cost = multi_response.get("total_cost", 0.0)
1451 |         print(f"Total Estimated Cost: ${total_cost:.6f}\n")
1452 |         for provider_key, result in multi_response["results"].items():
1453 |             print(f"--- Provider: {provider_key} ---")
1454 |             if result["success"]:
1455 |                 print(f"  Model: {result.get('model', 'N/A')}")
1456 |                 print(f"  Cost: ${result.get('cost', 0.0):.6f}")
1457 |                 print(f"  Completion:\n{result['completion']}\n")
1458 |             else:
1459 |                 print(f"  Error: {result['error']}\n")
1460 |         print("------------------------------")
1461 |     else:
1462 |         print(f"\nMulti-completion request failed: {multi_response['error']}")
1463 | 
1464 |     await client.close()
1465 | 
1466 | # if __name__ == "__main__": asyncio.run(multi_provider_completion_example())
1467 | ```
1468 | 
1469 | ### Structured Data Extraction (JSON)
1470 | 
1471 | ```python
1472 | import asyncio
1473 | from mcp.client import Client
1474 | import json
1475 | 
1476 | async def json_extraction_example():
1477 |     client = Client("http://localhost:8013")
1478 |     text_with_data = """
1479 |     Meeting Minutes - Project Phoenix - 2024-07-21
1480 | 
1481 |     Attendees: Alice (Lead), Bob (Dev), Charlie (QA)
1482 |     Date: July 21, 2024
1483 |     Project ID: PX-001
1484 | 
1485 |     Discussion Points:
1486 |     - Reviewed user feedback from v1.1 testing. Mostly positive.
1487 |     - Identified performance bottleneck in data processing module. Bob to investigate. Assigned High priority.
1488 |     - QA cycle for v1.2 planned to start next Monday (2024-07-29). Charlie confirmed readiness.
1489 | 
1490 |     Action Items:
1491 |     1. Bob: Investigate performance issue. Due: 2024-07-26. Priority: High. Status: Open.
1492 |     2. Alice: Prepare v1.2 release notes. Due: 2024-07-28. Priority: Medium. Status: Open.
1493 |     """
1494 | 
1495 |     # Define the desired JSON structure (schema)
1496 |     desired_schema = {
1497 |         "type": "object",
1498 |         "properties": {
1499 |             "project_name": {"type": "string", "description": "Name of the project"},
1500 |             "meeting_date": {"type": "string", "format": "date", "description": "Date of the meeting"},
1501 |             "attendees": {"type": "array", "items": {"type": "string"}, "description": "List of attendee names"},
1502 |             "action_items": {
1503 |                 "type": "array",
1504 |                 "items": {
1505 |                     "type": "object",
1506 |                     "properties": {
1507 |                         "task": {"type": "string"},
1508 |                         "assigned_to": {"type": "string"},
1509 |                         "due_date": {"type": "string", "format": "date"},
1510 |                         "priority": {"type": "string", "enum": ["Low", "Medium", "High"]},
1511 |                         "status": {"type": "string", "enum": ["Open", "In Progress", "Done"]}
1512 |                     },
1513 |                     "required": ["task", "assigned_to", "due_date", "priority", "status"]
1514 |                 }
1515 |             }
1516 |         },
1517 |         "required": ["project_name", "meeting_date", "attendees", "action_items"]
1518 |     }
1519 | 
1520 |     print("Requesting JSON extraction based on schema...")
1521 |     # Request extraction using a model capable of following JSON schema instructions
1522 |     json_response = await client.tools.extract_json(
1523 |         document=text_with_data,
1524 |         json_schema=desired_schema,
1525 |         provider="openai", # OpenAI models are generally good at this
1526 |         model="gpt-4o", # Use a capable model like GPT-4o or Claude 3.5 Sonnet
1527 |         # provider_preference="quality" # Could also use preference
1528 |     )
1529 | 
1530 |     if json_response["success"]:
1531 |         print("JSON extraction successful.")
1532 |         print(f"Estimated Cost: ${json_response.get('cost', 0.0):.6f}")
1533 | 
1534 |         # The extracted data should conform to the schema
1535 |         extracted_json_data = json_response.get('json_data')
1536 |         print("\n--- Extracted JSON Data ---")
1537 |         # Pretty print the JSON
1538 |         print(json.dumps(extracted_json_data, indent=2))
1539 |         print("---------------------------")
1540 | 
1541 |         # Optionally, validate the output against the schema client-side (requires jsonschema library)
1542 |         # try:
1543 |         #     from jsonschema import validate
1544 |         #     validate(instance=extracted_json_data, schema=desired_schema)
1545 |         #     print("\nClient-side validation successful: Output matches schema.")
1546 |         # except ImportError:
1547 |         #     print("\n(Install jsonschema to perform client-side validation)")
1548 |         # except Exception as e:
1549 |         #     print(f"\nClient-side validation failed: {e}")
1550 | 
1551 |     else:
1552 |         print(f"JSON Extraction Error: {json_response.get('error', 'Unknown error')}")
1553 |         if 'details' in json_response: print(f"Details: {json_response['details']}")
1554 | 
1555 |     await client.close()
1556 | 
1557 | # if __name__ == "__main__": asyncio.run(json_extraction_example())
1558 | ```
1559 | 
1560 | ### Retrieval-Augmented Generation (RAG) Query
1561 | 
1562 | ```python
1563 | import asyncio
1564 | from mcp.client import Client
1565 | 
1566 | async def rag_query_example():
1567 |     # This example assumes the Ultimate MCP Server has been configured with a RAG pipeline,
1568 |     # including a vector store/index containing relevant documents.
1569 |     client = Client("http://localhost:8013")
1570 |     query = "What are the latest treatment options for mitigating Alzheimer's disease according to recent studies?"
1571 | 
1572 |     print(f"Performing RAG query: '{query}'...")
1573 |     # Call the RAG tool, which handles retrieval and generation
1574 |     rag_response = await client.tools.rag_query( # Assuming the tool name is 'rag_query'
1575 |         query=query,
1576 |         # Optional parameters to control the RAG process:
1577 |         index_name="medical_research_papers", # Specify the index/collection to search
1578 |         top_k=3, # Retrieve top 3 most relevant documents/chunks
1579 |         # filter={"year": {"$gte": 2023}}, # Example filter (syntax depends on vector store)
1580 |         # generation_model={"provider": "anthropic", "model": "claude-3-5-sonnet-20241022"}, # Specify generation model
1581 |         # instruction_prompt="Based on the provided context, answer the user's query concisely." # Customize generation prompt
1582 |     )
1583 | 
1584 |     if rag_response["success"]:
1585 |         print("\nRAG query successful.")
1586 |         print(f"Estimated Cost: ${rag_response.get('cost', 0.0):.6f}") # Includes retrieval + generation cost
1587 | 
1588 |         print("\n--- Generated Answer ---")
1589 |         print(rag_response.get('answer', 'No answer generated.'))
1590 |         print("------------------------")
1591 | 
1592 |         # The response might also include details about the retrieved sources
1593 |         retrieved_sources = rag_response.get('sources', [])
1594 |         if retrieved_sources:
1595 |             print("\n--- Retrieved Sources ---")
1596 |             for i, source in enumerate(retrieved_sources):
1597 |                 print(f"Source {i+1}:")
1598 |                 print(f"  ID: {source.get('id', 'N/A')}")
                print(f"  Score: {source.get('score', 0.0):.4f}")
1600 |                 # Depending on RAG setup, might include metadata or text snippet
1601 |                 print(f"  Content Snippet: {source.get('text', '')[:150]}...")
1602 |                 print("-" * 15)
1603 |             print("-----------------------")
1604 |         else:
1605 |             print("\nNo sources information provided in the response.")
1606 | 
1607 |     else:
1608 |         print(f"\nRAG Query Error: {rag_response.get('error', 'Unknown error')}")
1609 |         if 'details' in rag_response: print(f"Details: {rag_response['details']}")
1610 | 
1611 |     await client.close()
1612 | 
1613 | # if __name__ == "__main__": asyncio.run(rag_query_example())
1614 | ```
1615 | 
1616 | ### Fused Search (Keyword + Semantic)
1617 | 
1618 | ```python
1619 | import asyncio
1620 | from mcp.client import Client
1621 | 
1622 | async def fused_search_example():
1623 |     # This example assumes the server is configured with a hybrid search provider like Marqo.
1624 |     client = Client("http://localhost:8013")
1625 |     query = "impact of AI on software development productivity and code quality"
1626 | 
1627 |     print(f"Performing fused search for: '{query}'...")
1628 |     # Call the fused search tool
1629 |     fused_search_response = await client.tools.fused_search( # Assuming tool name is 'fused_search'
1630 |         query=query,
1631 |         # --- Parameters specific to the hybrid search backend (e.g., Marqo) ---
1632 |         index_name="tech_articles_index", # Specify the target index
1633 |         searchable_attributes=["title", "content"], # Fields to search within
1634 |         limit=5, # Number of results to return
1635 |         # Tunable weights for keyword vs. semantic relevance (example)
1636 |         hybrid_factors={"keyword_weight": 0.4, "semantic_weight": 0.6},
1637 |         # Optional filter string (syntax depends on backend)
1638 |         filter_string="publication_year >= 2023 AND source_type='journal'"
1639 |         # --------------------------------------------------------------------
1640 |     )
1641 | 
1642 |     if fused_search_response["success"]:
1643 |         print("\nFused search successful.")
1644 |         results = fused_search_response.get("results", [])
1645 |         print(f"Found {len(results)} hits.")
1646 | 
1647 |         if results:
1648 |             print("\n--- Search Results ---")
1649 |             for i, hit in enumerate(results):
1650 |                 print(f"Result {i+1}:")
1651 |                 # Fields depend on Marqo index structure and what's returned
1652 |                 print(f"  ID: {hit.get('_id', 'N/A')}")
                print(f"  Score: {hit.get('_score', 0.0):.4f}") # Combined score
1654 |                 print(f"  Title: {hit.get('title', 'N/A')}")
1655 |                 print(f"  Content Snippet: {hit.get('content', '')[:150]}...")
1656 |                 # Print highlight info if available
1657 |                 highlights = hit.get('_highlights', {})
1658 |                 if highlights: print(f"  Highlights: {highlights}")
1659 |                 print("-" * 15)
1660 |             print("--------------------")
1661 |         else:
1662 |             print("No results found matching the criteria.")
1663 | 
1664 |     else:
1665 |         print(f"\nFused Search Error: {fused_search_response.get('error', 'Unknown error')}")
1666 |         if 'details' in fused_search_response: print(f"Details: {fused_search_response['details']}")
1667 | 
1668 |     await client.close()
1669 | 
1670 | # if __name__ == "__main__": asyncio.run(fused_search_example())
1671 | ```
1672 | 
1673 | ### Local Text Processing
1674 | 
1675 | ```python
1676 | import asyncio
1677 | from mcp.client import Client
1678 | 
1679 | async def local_text_processing_example():
1680 |     client = Client("http://localhost:8013")
1681 |     # Example assumes a tool named 'process_local_text' exists on the server
1682 |     # that bundles various non-LLM text operations.
1683 |     raw_text = "  This text has   EXTRA whitespace,\n\nmultiple newlines, \t tabs, and needs Case Normalization.  "
1684 | 
1685 |     print("Requesting local text processing operations...")
1686 |     local_process_response = await client.tools.process_local_text(
1687 |         text=raw_text,
1688 |         operations=[
1689 |             {"action": "trim_whitespace"},       # Remove leading/trailing whitespace
1690 |             {"action": "normalize_whitespace"},  # Collapse multiple spaces/tabs to single space
1691 |             {"action": "remove_blank_lines"},    # Remove empty lines
1692 |             {"action": "lowercase"}              # Convert to lowercase
1693 |             # Other potential actions: uppercase, remove_punctuation, normalize_newlines, etc.
1694 |         ]
1695 |     )
1696 | 
1697 |     if local_process_response["success"]:
1698 |         print("\nLocal text processing successful.")
1699 |         print(f"Original Text:\n'{raw_text}'")
1700 |         print(f"\nProcessed Text:\n'{local_process_response['processed_text']}'")
1701 |         # Note: This operation should ideally have zero LLM cost.
1702 |         print(f"Cost: ${local_process_response.get('cost', 0.0):.6f}")
1703 |     else:
1704 |         print(f"\nLocal Text Processing Error: {local_process_response['error']}")
1705 | 
1706 |     await client.close()
1707 | 
1708 | # if __name__ == "__main__": asyncio.run(local_text_processing_example())
1709 | ```
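
For reference, the four operations requested above map to plain string handling; a local Python sketch of the same transformations (useful for sanity-checking the tool's output, not a copy of its implementation):

```python
import re

text = "  This text has   EXTRA whitespace,\n\nmultiple newlines, \t tabs, and needs Case Normalization.  "
text = text.strip()                    # trim_whitespace
text = re.sub(r"[ \t]+", " ", text)    # normalize_whitespace
text = re.sub(r"\n\s*\n", "\n", text)  # remove_blank_lines
text = text.lower()                    # lowercase
print(repr(text))
```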
1710 | 
1711 | ### Browser Automation Example: Getting Started and Basic Interaction
1712 | 
1713 | ```python
1714 | import asyncio
1715 | from mcp.client import Client
1716 | 
1717 | async def browser_basic_interaction_example():
1718 |     # This example shows fundamental browser actions controlled by an agent
1719 |     client = Client("http://localhost:8013")
1720 |     print("--- Browser Automation: Basic Interaction ---")
1721 | 
1722 |     # 1. Initialize the browser (creates a browser instance on the server)
1723 |     print("\nInitializing browser (headless)...")
1724 |     # `headless=True` runs without a visible GUI window (common for automation)
1725 |     init_response = await client.tools.browser_init(headless=True, browser_type="chromium")
1726 |     if not init_response["success"]:
1727 |         print(f"Browser initialization failed: {init_response.get('error', 'Unknown error')}")
1728 |         await client.close()
1729 |         return
1730 |     print("Browser initialized successfully.")
1731 |     # Might return session ID if needed for subsequent calls, depends on tool design
1732 | 
1733 |     # 2. Navigate to a page
1734 |     target_url = "https://example.com/"
1735 |     print(f"\nNavigating to {target_url}...")
1736 |     # `wait_until` controls when navigation is considered complete
1737 |     nav_response = await client.tools.browser_navigate(
1738 |         url=target_url,
1739 |         wait_until="domcontentloaded" # Options: load, domcontentloaded, networkidle, commit
1740 |     )
1741 |     if nav_response["success"]:
        print("Navigation successful.")
1743 |         print(f"  Current URL: {nav_response.get('url', 'N/A')}")
1744 |         print(f"  Page Title: {nav_response.get('title', 'N/A')}")
1745 |         # The 'snapshot' gives the agent context about the page state (accessibility tree)
1746 |         # print(f"  Snapshot: {nav_response.get('snapshot', 'N/A')}")
1747 |     else:
1748 |         print(f"Navigation failed: {nav_response.get('error', 'Unknown error')}")
1749 |         # Attempt to close browser even if navigation failed
1750 |         await client.tools.browser_close()
1751 |         await client.close()
1752 |         return
1753 | 
1754 |     # 3. Extract text content using a CSS selector
1755 |     selector = "h1" # CSS selector for the main heading
1756 |     print(f"\nExtracting text from selector '{selector}'...")
1757 |     text_response = await client.tools.browser_get_text(selector=selector)
1758 |     if text_response["success"]:
1759 |         print(f"Extracted text: '{text_response.get('text', 'N/A')}'")
1760 |     else:
1761 |         print(f"Text extraction failed: {text_response.get('error', 'Unknown error')}")
1762 |         # Optionally check text_response['snapshot'] for page state at time of failure
1763 | 
1764 |     # 4. Take a screenshot (optional)
1765 |     print("\nTaking a screenshot...")
1766 |     screenshot_response = await client.tools.browser_screenshot(
1767 |         file_path="example_com_screenshot.png", # Path where server saves the file
1768 |         full_page=False, # Capture only the viewport
1769 |         image_format="png" # png or jpeg
1770 |     )
1771 |     if screenshot_response["success"]:
1772 |         print(f"Screenshot saved successfully on server at: {screenshot_response.get('saved_path', 'N/A')}")
1773 |         # Agent might use this path with a filesystem tool to retrieve the image if needed
    else:
        print(f"Screenshot failed: {screenshot_response.get('error', 'Unknown error')}")
1776 | 
1777 |     # 5. Close the browser session
1778 |     print("\nClosing the browser...")
1779 |     close_response = await client.tools.browser_close()
1780 |     if close_response["success"]:
1781 |         print("Browser closed successfully.")
1782 |     else:
1783 |         # Log error, but might happen if browser already crashed
1784 |         print(f"Browser close failed (might be expected if previous steps failed): {close_response.get('error', 'Unknown error')}")
1785 | 
1786 |     print("--- Browser Automation Example Complete ---")
1787 |     await client.close()
1788 | 
1789 | # if __name__ == "__main__": asyncio.run(browser_basic_interaction_example())
1791 | ```
1792 | 
1793 | ### Running a Model Tournament
1794 | 
1795 | ```python
1796 | import asyncio
1797 | from mcp.client import Client
1798 | import json
1799 | 
1800 | async def model_tournament_example():
1801 |     client = Client("http://localhost:8013")
1802 |     # Define the task and prompt for the tournament
1803 |     task_prompt = "Write a Python function that takes a list of integers and returns a new list containing only the even numbers."
1804 |     # Optional: Provide ground truth for automated evaluation if the tool supports it
1805 |     ground_truth_code = """
1806 | def get_even_numbers(numbers):
1807 |     \"\"\"Returns a new list containing only the even numbers from the input list.\"\"\"
1808 |     return [num for num in numbers if num % 2 == 0]
1809 | """
1810 | 
1811 |     print("Setting up and running a model tournament for code generation...")
1812 |     # Call the tournament tool
1813 |     tournament_response = await client.tools.run_model_tournament(
1814 |         task_type="code_generation", # Helps select appropriate evaluation metrics
1815 |         prompt=task_prompt,
1816 |         # List of models/providers to compete
1817 |         competitors=[
1818 |             {"provider": "openai", "model": "gpt-4.1-mini", "temperature": 0.2},
1819 |             {"provider": "anthropic", "model": "claude-3-5-sonnet-20241022", "temperature": 0.2},
1820 |             {"provider": "deepseek", "model": "deepseek-coder", "temperature": 0.2}, # Specialized coder model
1821 |             {"provider": "gemini", "model": "gemini-2.0-pro", "temperature": 0.2},
1822 |         ],
1823 |         # Criteria for evaluating the generated code
1824 |         evaluation_criteria=["correctness", "efficiency", "readability", "docstring_quality"],
1825 |         # Provide ground truth if available for automated correctness checks
1826 |         ground_truth=ground_truth_code,
1827 |         # Optional: Specify an LLM to act as the judge for qualitative criteria
        evaluation_model={"provider": "anthropic", "model": "claude-3-opus-20240229"}, # Use a powerful model for judging
1829 |         num_rounds=1 # Run multiple rounds for stability if needed
1830 |     )
1831 | 
1832 |     if tournament_response["success"]:
1833 |         print("\n--- Model Tournament Results ---")
1834 |         print(f"Task Prompt: {task_prompt}")
1835 |         print(f"Total Estimated Cost: ${tournament_response.get('total_cost', 0.0):.6f}\n")
1836 | 
1837 |         # Display the ranking
1838 |         ranking = tournament_response.get("ranking", [])
1839 |         if ranking:
1840 |             print("Overall Ranking:")
1841 |             for i, result in enumerate(ranking):
1842 |                 provider = result.get('provider', 'N/A')
1843 |                 model = result.get('model', 'N/A')
1844 |                 score = result.get('overall_score', 0.0) # Numeric default keeps the :.2f formatting below safe
1845 |                 cost = result.get('cost', 0.0)
1846 |                 print(f"  {i+1}. {provider}/{model} - Score: {score:.2f}/10 - Cost: ${cost:.6f}")
1847 |         else:
1848 |             print("No ranking information available.")
1849 | 
1850 |         # Display detailed results for each competitor
1851 |         detailed_results = tournament_response.get("results", {})
1852 |         if detailed_results:
1853 |             print("\nDetailed Scores per Competitor:")
1854 |             for competitor_key, details in detailed_results.items():
1855 |                  print(f"  Competitor: {competitor_key}")
1856 |                  print(f"    Generated Code:\n```python\n{details.get('output', 'N/A')}\n```")
1857 |                  scores = details.get('scores', {})
1858 |                  if scores:
1859 |                      for criterion, score_value in scores.items():
1860 |                          print(f"    - {criterion}: {score_value}")
1861 |                  print("-" * 10)
1862 |         print("------------------------------")
1863 | 
1864 |     else:
1865 |         print(f"\nModel Tournament Failed: {tournament_response.get('error', 'Unknown error')}")
1866 |         if 'details' in tournament_response: print(f"Details: {tournament_response['details']}")
1867 | 
1868 |     await client.close()
1869 | 
1870 | # if __name__ == "__main__": asyncio.run(model_tournament_example())
1871 | ```
1872 | 
1873 | ### Meta Tools for Tool Discovery
1874 | 
1875 | ```python
1876 | import asyncio
1877 | from mcp.client import Client
1878 | import json
1879 | 
1880 | async def meta_tools_example():
1881 |     client = Client("http://localhost:8013")
1882 |     print("--- Meta Tools Example ---")
1883 | 
1884 |     # 1. List all available tools
1885 |     print("\nFetching list of available tools...")
1886 |     # Assumes a tool named 'list_tools' provides this info
1887 |     list_tools_response = await client.tools.list_tools(include_schemas=False) # Set True for full schemas
1888 | 
1889 |     if list_tools_response["success"]:
1890 |         tools = list_tools_response.get("tools", {})
1891 |         print(f"Found {len(tools)} available tools:")
1892 |         for tool_name, tool_info in tools.items():
1893 |             description = tool_info.get('description', 'No description available.')
1894 |             print(f"  - {tool_name}: {description[:100]}...") # Print truncated description
1895 |     else:
1896 |         print(f"Failed to list tools: {list_tools_response.get('error', 'Unknown error')}")
1897 | 
1898 |     # 2. Get detailed information about a specific tool
1899 |     tool_to_inspect = "extract_json"
1900 |     print(f"\nFetching details for tool: '{tool_to_inspect}'...")
1901 |     # Assumes a tool like 'get_tool_info' or using list_tools with specific name/schema flag
1902 |     tool_info_response = await client.tools.list_tools(tool_names=[tool_to_inspect], include_schemas=True)
1903 | 
1904 |     if tool_info_response["success"] and tool_to_inspect in tool_info_response.get("tools", {}):
1905 |         tool_details = tool_info_response["tools"][tool_to_inspect]
1906 |         print(f"\nDetails for '{tool_to_inspect}':")
1907 |         print(f"  Description: {tool_details.get('description', 'N/A')}")
1908 |         # Print the parameter schema if available
1909 |         schema = tool_details.get('parameters', {}).get('json_schema', {})
1910 |         if schema:
1911 |             print(f"  Parameter Schema:\n{json.dumps(schema, indent=2)}")
1912 |         else:
1913 |             print("  Parameter Schema: Not available.")
1914 |     else:
1915 |         print(f"Failed to get info for tool '{tool_to_inspect}': {tool_info_response.get('error', 'Not found or error')}")
1916 | 
1917 |     # 3. Get tool recommendations for a task (if such a meta tool exists)
1918 |     task_description = "Read data from a PDF file, extract tables, and save them as CSV."
1919 |     print(f"\nGetting tool recommendations for task: '{task_description}'...")
1920 |     # Assumes a tool like 'get_tool_recommendations'
1921 |     recommendations_response = await client.tools.get_tool_recommendations(
1922 |         task=task_description,
1923 |         constraints={"priority": "accuracy", "max_cost_per_doc": 0.10} # Example constraints
1924 |     )
1925 | 
1926 |     if recommendations_response["success"]:
1927 |         print("Recommended Tool Workflow:")
1928 |         recommendations = recommendations_response.get("recommendations", [])
1929 |         if recommendations:
1930 |             for i, step in enumerate(recommendations):
1931 |                 print(f"  Step {i+1}: Tool='{step.get('tool', 'N/A')}' - Reason: {step.get('reason', 'N/A')}")
1932 |         else:
1933 |             print("  No recommendations provided.")
1934 |     else:
1935 |         print(f"Failed to get recommendations: {recommendations_response.get('error', 'Unknown error')}")
1936 | 
1937 |     print("\n--- Meta Tools Example Complete ---")
1938 |     await client.close()
1939 | 
1940 | # if __name__ == "__main__": asyncio.run(meta_tools_example())
1941 | ```
1942 | 
1943 | ### Local Command-Line Text Processing (e.g., jq)
1944 | 
1945 | ```python
1946 | import asyncio
1947 | from mcp.client import Client
1948 | import json
1949 | 
1950 | async def local_cli_tool_example():
1951 |     client = Client("http://localhost:8013")
1952 |     print("--- Local CLI Tool Example (jq) ---")
1953 | 
1954 |     # Example JSON data to be processed by jq
1955 |     json_input_data = json.dumps({
1956 |         "users": [
1957 |             {"id": 1, "name": "Alice", "email": "alice@example.com", "status": "active"},
1958 |             {"id": 2, "name": "Bob", "email": "bob@example.com", "status": "inactive"},
1959 |             {"id": 3, "name": "Charlie", "email": "charlie@example.com", "status": "active"}
1960 |         ],
1961 |         "metadata": {"timestamp": "2024-07-21T12:00:00Z"}
1962 |     })
1963 | 
1964 |     # Define the jq filter to apply
1965 |     # This filter selects active users and outputs their name and email
1966 |     jq_filter = '.users[] | select(.status=="active") | {name: .name, email: .email}'
1967 | 
1968 |     print(f"\nRunning jq with filter: '{jq_filter}' on input JSON...")
1969 |     # Call the server tool that wraps jq (e.g., 'run_jq')
1970 |     jq_result = await client.tools.run_jq(
1971 |         args_str=jq_filter, # Pass the filter expression (check the tool spec for how it expects filters)
1972 |         input_data=json_input_data, # Provide the JSON string as input
1973 |         # Additional options might be available depending on the tool wrapper:
1974 |         # e.g., output_format="json_lines" or "compact_json"
1975 |     )
1976 | 
1977 |     if jq_result["success"]:
1978 |         print("jq execution successful.")
1979 |         # stdout typically contains the result of the jq filter
1980 |         print("\n--- jq Output (stdout) ---")
1981 |         print(jq_result.get("stdout", "No output"))
1982 |         print("--------------------------")
1983 |         # stderr might contain warnings or errors from jq itself
1984 |         stderr_output = jq_result.get("stderr")
1985 |         if stderr_output:
1986 |             print("\n--- jq Stderr ---")
1987 |             print(stderr_output)
1988 |             print("-----------------")
1989 |         # This should have minimal or zero cost as it runs locally on the server
1990 |         print(f"\nCost: ${jq_result.get('cost', 0.0):.6f}")
1991 |     else:
1992 |         print(f"\njq Execution Error: {jq_result.get('error', 'Unknown error')}")
1993 |         print(f"Stderr: {jq_result.get('stderr', 'N/A')}")
1994 | 
1995 |     print("\n--- Local CLI Tool Example Complete ---")
1996 |     await client.close()
1997 | 
1998 | # if __name__ == "__main__": asyncio.run(local_cli_tool_example())
1999 | ```
2000 | 
2001 | ### Dynamic API Integration
2002 | 
2003 | ```python
2004 | import asyncio
2005 | from mcp.client import Client
2006 | import json
2007 | 
2008 | async def dynamic_api_example():
2009 |     # This example assumes the server has tools like 'register_api', 'list_registered_apis',
2010 |     # 'call_dynamic_tool', and 'unregister_api'.
2011 |     client = Client("http://localhost:8013")
2012 |     print("--- Dynamic API Integration Example ---")
2013 | 
2014 |     # 1. Register an external API using its OpenAPI (Swagger) specification URL
2015 |     api_name_to_register = "public_cat_facts"
2016 |     openapi_spec_url = "https://catfact.ninja/docs/api-docs.json" # Example public API spec
2017 |     print(f"\nRegistering API '{api_name_to_register}' from {openapi_spec_url}...")
2018 | 
2019 |     register_response = await client.tools.register_api(
2020 |         api_name=api_name_to_register,
2021 |         openapi_url=openapi_spec_url,
2022 |         # Optional: Provide authentication details if needed (e.g., Bearer token, API Key)
2023 |         # authentication={"type": "bearer", "token": "your_api_token"},
2024 |         # Optional: Set default headers
2025 |         # default_headers={"X-Custom-Header": "value"},
2026 |         # Optional: Cache settings for API responses (if tool supports it)
2027 |         cache_ttl=300 # Cache responses for 5 minutes
2028 |     )
2029 | 
2030 |     if register_response["success"]:
2031 |         print(f"API '{api_name_to_register}' registered successfully.")
2032 |         print(f"  Registered {register_response.get('tools_count', 0)} new MCP tools derived from the API.")
2033 |         print(f"  Tools Registered: {register_response.get('tools_registered', [])}")
2034 |     else:
2035 |         print(f"API registration failed: {register_response.get('error', 'Unknown error')}")
2036 |         await client.close()
2037 |         return
2038 | 
2039 |     # 2. List currently registered dynamic APIs
2040 |     print("\nListing registered dynamic APIs...")
2041 |     list_apis_response = await client.tools.list_registered_apis()
2042 |     if list_apis_response["success"]:
2043 |         registered_apis = list_apis_response.get("apis", {})
2044 |         print(f"Currently registered APIs: {list(registered_apis.keys())}")
2045 |         # print(json.dumps(registered_apis, indent=2)) # Print full details
2046 |     else:
2047 |         print(f"Failed to list registered APIs: {list_apis_response.get('error', 'Unknown error')}")
2048 | 
2049 |     # 3. Call a dynamically created tool corresponding to an API endpoint
2050 |     # The tool name is typically derived from the API name and endpoint's operationId or path.
2051 |     # Check the 'tools_registered' list from step 1 or documentation for the exact name.
2052 |     # Let's assume the tool for GET /fact is 'public_cat_facts_getFact'
2053 |     dynamic_tool_name = "public_cat_facts_getFact" # Adjust based on actual registered name
2054 |     print(f"\nCalling dynamic tool '{dynamic_tool_name}'...")
2055 | 
2056 |     call_response = await client.tools.call_dynamic_tool(
2057 |         tool_name=dynamic_tool_name,
2058 |         # Provide inputs matching the API endpoint's parameters
2059 |         inputs={
2060 |             # Example query parameter for GET /fact (check API spec)
2061 |              "max_length": 100
2062 |         }
2063 |     )
2064 | 
2065 |     if call_response["success"]:
2066 |         print("Dynamic tool call successful.")
2067 |         # The result usually contains the API's response body and status code
2068 |         print(f"  Status Code: {call_response.get('status_code', 'N/A')}")
2069 |         print(f"  Response Body:\n{json.dumps(call_response.get('response_body', {}), indent=2)}")
2070 |     else:
2071 |         print(f"Dynamic tool call failed: {call_response.get('error', 'Unknown error')}")
2072 |         print(f"  Status Code: {call_response.get('status_code', 'N/A')}")
2073 |         print(f"  Response Body: {call_response.get('response_body', 'N/A')}")
2074 | 
2075 |     # 4. Unregister the API when no longer needed (optional cleanup)
2076 |     print(f"\nUnregistering API '{api_name_to_register}'...")
2077 |     unregister_response = await client.tools.unregister_api(api_name=api_name_to_register)
2078 |     if unregister_response["success"]:
2079 |         print(f"API unregistered successfully. Removed {unregister_response.get('tools_count', 0)} tools.")
2080 |     else:
2081 |         print(f"API unregistration failed: {unregister_response.get('error', 'Unknown error')}")
2082 | 
2083 |     print("\n--- Dynamic API Integration Example Complete ---")
2084 |     await client.close()
2085 | 
2086 | # if __name__ == "__main__": asyncio.run(dynamic_api_example())
2087 | ```
2088 | 
2089 | ### OCR Usage Example
2090 | 
2091 | ```python
2092 | import asyncio
2093 | from mcp.client import Client
2094 | import os
2095 | 
2096 | async def ocr_example():
2097 |     # Requires 'ocr' extras installed: uv pip install -e ".[ocr]"
2098 |     # Also requires Tesseract OCR engine installed on the server host system.
2099 |     client = Client("http://localhost:8013")
2100 |     print("--- OCR Tool Example ---")
2101 | 
2102 |     # --- Create dummy files for testing ---
2103 |     # In a real scenario, these files would exist on a path accessible by the server.
2104 |     # Ensure the server process has permissions to read these files.
2105 |     dummy_files_dir = "ocr_test_files"
2106 |     os.makedirs(dummy_files_dir, exist_ok=True)
2107 |     dummy_pdf_path = os.path.join(dummy_files_dir, "dummy_document.pdf")
2108 |     dummy_image_path = os.path.join(dummy_files_dir, "dummy_image.png")
2109 | 
2110 |     # Create a simple dummy PDF (requires reportlab - pip install reportlab)
2111 |     try:
2112 |         from reportlab.pdfgen import canvas
2113 |         from reportlab.lib.pagesizes import letter
2114 |         c = canvas.Canvas(dummy_pdf_path, pagesize=letter)
2115 |         c.drawString(100, 750, "This is page 1 of a dummy PDF.")
2116 |         c.drawString(100, 730, "It contains some text for OCR testing.")
2117 |         c.showPage()
2118 |         c.drawString(100, 750, "This is page 2.")
2119 |         c.save()
2120 |         print(f"Created dummy PDF: {dummy_pdf_path}")
2121 |     except ImportError:
2122 |         print("Could not create dummy PDF: reportlab not installed. Skipping PDF test.")
2123 |         dummy_pdf_path = None
2124 |     except Exception as e:
2125 |         print(f"Error creating dummy PDF: {e}. Skipping PDF test.")
2126 |         dummy_pdf_path = None
2127 | 
2128 |     # Create a simple dummy PNG image (requires Pillow - pip install Pillow)
2129 |     try:
2130 |         from PIL import Image, ImageDraw, ImageFont
2131 |         img = Image.new('RGB', (400, 100), color = (255, 255, 255))
2132 |         d = ImageDraw.Draw(img)
2133 |         # Use a default font if possible, otherwise basic text
2134 |         try: font = ImageFont.truetype("arial.ttf", 15)
2135 |         except IOError: font = ImageFont.load_default()
2136 |         d.text((10,10), "Dummy Image Text for OCR\nLine 2 of text.", fill=(0,0,0), font=font)
2137 |         img.save(dummy_image_path)
2138 |         print(f"Created dummy Image: {dummy_image_path}")
2139 |     except ImportError:
2140 |         print("Could not create dummy Image: Pillow not installed. Skipping Image test.")
2141 |         dummy_image_path = None
2142 |     except Exception as e:
2143 |         print(f"Error creating dummy Image: {e}. Skipping Image test.")
2144 |         dummy_image_path = None
2145 |     # --- End of dummy file creation ---
2146 | 
2147 | 
2148 |     # 1. Extract text from the PDF using OCR and LLM correction
2149 |     if dummy_pdf_path:
2150 |         print(f"\nExtracting text from PDF: {dummy_pdf_path} (using hybrid method)...")
2151 |         pdf_text_result = await client.tools.extract_text_from_pdf(
2152 |             file_path=dummy_pdf_path, # Server needs access to this path
2153 |             extraction_method="hybrid", # Try direct extraction, fallback to OCR
2154 |             max_pages=2, # Limit pages to process
2155 |             reformat_as_markdown=True, # Request markdown formatting
2156 |             # Optional: Use an LLM to correct/improve the raw OCR text
2157 |             llm_correction_model={"provider": "openai", "model": "gpt-4.1-mini"}
2158 |         )
2159 |         if pdf_text_result["success"]:
2160 |             print("PDF text extraction successful.")
2161 |             print(f"  Method Used: {pdf_text_result.get('extraction_method_used', 'N/A')}")
2162 |             print(f"  Cost (incl. LLM correction): ${pdf_text_result.get('cost', 0.0):.6f}")
2163 |             print("\n--- Extracted PDF Text (Markdown) ---")
2164 |             print(pdf_text_result.get("text", "No text extracted."))
2165 |             print("-------------------------------------")
2166 |         else:
2167 |             print(f"PDF OCR failed: {pdf_text_result.get('error', 'Unknown error')}")
2168 |             if 'details' in pdf_text_result: print(f"Details: {pdf_text_result['details']}")
2169 |     else:
2170 |         print("\nSkipping PDF OCR test as dummy file could not be created.")
2171 | 
2172 | 
2173 |     # 2. Process the image file with OCR and preprocessing
2174 |     if dummy_image_path:
2175 |         print(f"\nProcessing image OCR: {dummy_image_path} with preprocessing...")
2176 |         image_text_result = await client.tools.process_image_ocr(
2177 |             image_path=dummy_image_path, # Server needs access to this path
2178 |             # Optional preprocessing steps (require OpenCV on server)
2179 |             preprocessing_options={
2180 |                 "grayscale": True,
2181 |                 # "threshold": "otsu", # e.g., otsu, adaptive
2182 |                 # "denoise": True,
2183 |                 # "deskew": True
2184 |             },
2185 |             ocr_language="eng" # Specify language(s) for Tesseract e.g., "eng+fra"
2186 |             # Optional LLM enhancement for image OCR results
2187 |             # llm_enhancement_model={"provider": "gemini", "model": "gemini-2.0-flash-lite"}
2188 |         )
2189 |         if image_text_result["success"]:
2190 |             print("Image OCR successful.")
2191 |             print(f"  Cost (incl. LLM enhancement): ${image_text_result.get('cost', 0.0):.6f}")
2192 |             print("\n--- Extracted Image Text ---")
2193 |             print(image_text_result.get("text", "No text extracted."))
2194 |             print("----------------------------")
2195 |         else:
2196 |             print(f"Image OCR failed: {image_text_result.get('error', 'Unknown error')}")
2197 |             if 'details' in image_text_result: print(f"Details: {image_text_result['details']}")
2198 |     else:
2199 |         print("\nSkipping Image OCR test as dummy file could not be created.")
2200 | 
2201 |     # --- Clean up dummy files ---
2202 |     # try:
2203 |     #     if dummy_pdf_path and os.path.exists(dummy_pdf_path): os.remove(dummy_pdf_path)
2204 |     #     if dummy_image_path and os.path.exists(dummy_image_path): os.remove(dummy_image_path)
2205 |     #     if os.path.exists(dummy_files_dir): os.rmdir(dummy_files_dir) # Only if empty
2206 |     # except Exception as e:
2207 |     #      print(f"\nError cleaning up dummy files: {e}")
2208 |     # --- End cleanup ---
2209 | 
2210 |     print("\n--- OCR Tool Example Complete ---")
2211 |     await client.close()
2212 | 
2213 | # if __name__ == "__main__": asyncio.run(ocr_example())
2214 | ```
2215 | 
2216 | *(Note: Many examples involving file paths assume the server process has access to those paths. For Docker deployments, volume mapping is usually required.)*
2217 | 
2218 | ---
2219 | 
2220 | ## ✨ Autonomous Documentation Refiner
2221 | 
2222 | The Ultimate MCP Server includes a powerful feature for autonomously analyzing, testing, and refining the documentation of registered MCP tools. This feature, implemented in `ultimate_mcp_server/tools/docstring_refiner.py`, helps improve the usability and reliability of tools when invoked by Large Language Models (LLMs) like Claude.
2223 | 
2224 | ### How It Works
2225 | 
2226 | The documentation refiner follows a methodical, iterative approach (a simplified sketch follows the list):
2227 | 
2228 | 1.  **Agent Simulation**: Simulates how an LLM agent would interpret the current documentation (docstring, schema, examples) to identify potential ambiguities or missing information crucial for correct invocation.
2229 | 2.  **Adaptive Test Generation**: Creates diverse test cases based on the tool's input schema (parameter types, constraints, required fields), simulation results, and failures from previous refinement iterations. Aims for good coverage.
2230 | 3.  **Schema-Aware Testing**: Validates generated test inputs against the tool's schema *before* execution. Executes valid tests against the actual tool implementation within the server environment.
2231 | 4.  **Ensemble Failure Analysis**: If a test fails (e.g., wrong output, error thrown), multiple LLMs analyze the failure in the context of the specific documentation version used for that test run to pinpoint the documentation's weaknesses.
2232 | 5.  **Structured Improvement Proposals**: Based on the analysis, the system generates specific, targeted improvements:
2233 |     *   **Description:** Rewording or adding clarity.
2234 |     *   **Schema:** Proposing changes via JSON Patch operations (e.g., adding descriptions to parameters, refining types, adding examples).
2235 |     *   **Usage Examples:** Generating new or refining existing examples.
2236 | 6.  **Validated Schema Patching**: Applies proposed JSON patches to the schema *in-memory* and validates the resulting schema structure before accepting the change for the next iteration.
2237 | 7.  **Iterative Refinement**: Repeats the cycle (generate tests -> execute -> analyze failures -> propose improvements -> patch schema) until tests consistently pass or a maximum iteration count is reached.
2238 | 8.  **Optional Winnowing**: After iterations, performs a final pass to condense and streamline the documentation while ensuring critical information discovered during testing is preserved.
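
Conceptually, the loop described in steps 1-8 can be pictured with the following self-contained sketch. All names here (`Docs`, `generate_tests`, `run_test`, `analyze_and_improve`) are illustrative stand-ins for the refiner's internal machinery, not its actual API:

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class Docs:
    """Stand-in for a tool's documentation bundle (docstring + schema + examples)."""
    description: str
    schema: dict
    examples: list = field(default_factory=list)

async def generate_tests(docs: Docs) -> list:
    # Step 2: the real system uses an LLM to produce schema-aware test inputs.
    return [{"text": "sample input"}]

async def run_test(test: dict) -> bool:
    # Step 3: the real system validates the input and executes the actual tool.
    return True  # Stub: pretend the test passed.

async def analyze_and_improve(docs: Docs, failures: list) -> Docs:
    # Steps 4-6: an LLM ensemble analyzes failures and proposes validated patches.
    return Docs(docs.description + " (clarified)", docs.schema, docs.examples)

async def refine_one_tool(docs: Docs, max_iterations: int = 3) -> Docs:
    for _ in range(max_iterations):  # Step 7: iterate until passing or budget spent
        tests = await generate_tests(docs)
        failures = [t for t in tests if not await run_test(t)]
        if not failures:
            break  # Documentation now passes all generated tests
        docs = await analyze_and_improve(docs, failures)
    return docs

# asyncio.run(refine_one_tool(Docs("Extracts JSON.", {"type": "object"})))
```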
2239 | 
2240 | ### Benefits
2241 | 
2242 | -   **Reduces Manual Effort**: Automates the often tedious process of writing and maintaining high-quality tool documentation for LLM consumption.
2243 | -   **Improves Agent Performance**: Creates clearer, more precise documentation, leading to fewer errors when LLMs try to use the tools.
2244 | -   **Identifies Edge Cases**: The testing process can uncover ambiguities and edge cases that human writers might miss.
2245 | -   **Increases Consistency**: Helps establish a more uniform style and level of detail across documentation for all tools.
2246 | -   **Adapts to Feedback**: Learns directly from simulated agent failures to target specific documentation weaknesses.
2247 | -   **Schema Evolution**: Allows for gradual, validated improvement of tool schemas based on usage simulation.
2248 | -   **Detailed Reporting**: Provides comprehensive logs and reports on the entire refinement process, including tests run, failures encountered, and changes made.
2249 | 
2250 | ### Limitations and Considerations
2251 | 
2252 | -   **Cost & Time**: Can be computationally expensive and time-consuming, as it involves multiple LLM calls (for simulation, test generation, failure analysis, improvement proposal) per tool per iteration.
2253 | -   **Resource Intensive**: May require significant CPU/memory, especially when refining many tools or using large LLMs for analysis.
2254 | -   **LLM Dependency**: The quality of the refinement heavily depends on the capabilities of the LLMs used for the analysis and generation steps.
2255 | -   **Schema Complexity**: Generating correct and meaningful JSON Patches for highly complex or nested schemas can be challenging for the LLM.
2256 | -   **Determinism**: The process involves LLMs, so results might not be perfectly deterministic between runs.
2257 | -   **Maintenance Complexity**: The refiner itself is a complex system with dependencies that require maintenance.
2258 | 
2259 | ### When to Use
2260 | 
2261 | This feature is particularly valuable when:
2262 | 
2263 | -   You have a large number of MCP tools exposed to LLM agents.
2264 | -   You observe frequent tool usage failures potentially caused by agent misinterpretation of documentation.
2265 | -   You are actively developing or expanding your tool ecosystem and need to ensure consistent, high-quality documentation.
2266 | -   You want to proactively improve agent reliability and performance without necessarily modifying the underlying tool code itself.
2267 | -   You have the budget (LLM credits) and time to invest in this automated quality improvement process.
2268 | 
2269 | ### Usage Example (Server-Side Invocation)
2270 | 
2271 | The documentation refiner is typically invoked as a server-side maintenance or administrative task, not directly exposed as an MCP tool for external agents to call.
2272 | 
2273 | ```python
2274 | # This code snippet shows how the refiner might be called from within the
2275 | # server's environment (e.g., via a CLI command or admin interface).
2276 | 
2277 | # Assume necessary imports and context setup:
2278 | # from ultimate_mcp_server.tools.docstring_refiner import refine_tool_documentation
2279 | # from ultimate_mcp_server.core import mcp_context # Represents the server's context
2280 | 
2281 | async def invoke_doc_refiner_task():
2282 |     # Ensure mcp_context is properly initialized with registered tools, config, etc.
2283 |     print("Starting Autonomous Documentation Refinement Task...")
2284 | 
2285 |     # Example: Refine documentation for a specific list of tools
2286 |     refinement_result = await refine_tool_documentation(
2287 |         tool_names=["extract_json", "browser_navigate", "chunk_document"], # Tools to refine
2288 |         max_iterations=3, # Limit refinement cycles per tool
2289 |         refinement_model_config={ # Specify LLM for refinement tasks
2290 |             "provider": "anthropic",
2291 |             "model": "claude-3-5-sonnet-20241022"
2292 |         },
2293 |         testing_model_config={ # Optional: Specify LLM for test generation/simulation
2294 |             "provider": "openai",
2295 |             "model": "gpt-4o"
2296 |         },
2297 |         enable_winnowing=True, # Apply final streamlining pass
2298 |         stop_on_first_error=False, # Continue refining other tools if one fails
2299 |         ctx=mcp_context # Pass the server's MCP context
2300 |     )
2301 | 
2302 |     # Example: Refine all available tools (potentially very long running)
2303 |     # refinement_result = await refine_tool_documentation(
2304 |     #     refine_all_available=True,
2305 |     #     max_iterations=2,
2306 |     #     ctx=mcp_context
2307 |     # )
2308 | 
2309 |     print("\nDocumentation Refinement Task Complete.")
2310 | 
2311 |     # Process the results
2312 |     if refinement_result["success"]:
2313 |         print(f"Successfully processed {len(refinement_result.get('refined_tools', []))} tools.")
2314 |         # The actual docstrings/schemas of the tools in mcp_context might be updated in-memory.
2315 |         # Persisting these changes would require additional logic (e.g., writing back to source files).
2316 |         print("Detailed report available in the result object.")
2317 |         # print(refinement_result.get('report')) # Contains detailed logs and changes
2318 |     else:
2319 |         print(f"Refinement task encountered errors: {refinement_result.get('error', 'Unknown error')}")
2320 |         # Check the report for details on which tools failed and why.
2321 | 
2322 | # To run this, it would need to be integrated into the server's startup sequence,
2323 | # a dedicated CLI command, or an administrative task runner.
2324 | # e.g., await invoke_doc_refiner_task()
2325 | ```
2326 | 
2327 | ---
2328 | 
2329 | ## ✅ Example Library and Testing Framework
2330 | 
2331 | The Ultimate MCP Server includes an extensive collection of **35+ end-to-end examples** located in the `examples/` directory. These serve a dual purpose:
2332 | 
2333 | 1.  **Living Documentation**: They demonstrate practical, real-world usage patterns for nearly every tool and feature.
2334 | 2.  **Integration Test Suite**: They form a comprehensive test suite ensuring all components work together correctly.
2335 | 
2336 | ### Example Structure and Organization
2337 | 
2338 | -   **Categorized**: Examples are grouped by functionality (e.g., `model_integration`, `tool_specific`, `workflows`, `advanced_features`).
2339 | -   **Standalone**: Each example (`*.py`) is a runnable Python script using the MCP client library to interact with a running server instance.
2340 | -   **Clear Output**: They utilize the `Rich` library for formatted, color-coded console output, clearly showing requests, responses, costs, timings, and results.
2341 | -   **Error Handling**: Examples include basic error checking for robust demonstration.
2342 | 
2343 | ### Rich Visual Output
2344 | 
2345 | Expect informative console output, including:
2346 | 
2347 | -   📊 Tables summarizing results and statistics.
2348 | -   🎨 Syntax highlighting for code and JSON.
2349 | -   ⏳ Progress indicators or detailed step logging.
2350 | -   🖼️ Panels organizing output sections.
2351 | 
2352 | *Example output snippet:*
2353 | ```
2354 | ╭────────────────────── Tournament Results ───────────────────────╮
2355 | │ [1] claude-3-5-haiku-20241022: Score 8.7/10                    │
2356 | │     Cost: $0.00013                                             │
2357 | │ ...                                                            │
2358 | ╰────────────────────────────────────────────────────────────────╯
2359 | ```
2360 | 
2361 | ### Customizing and Learning
2362 | 
2363 | -   **Adaptable**: Easily modify examples to use your API keys (via `.env`), different models, custom prompts, or input files.
2364 | -   **Command-Line Args**: Many examples accept arguments for customization (e.g., `--model`, `--input-file`, `--headless`).
2365 | -   **Educational**: Learn best practices for AI application structure, tool selection, parameter tuning, error handling, cost optimization, and integration patterns.
2366 | 
2367 | ### Comprehensive Testing Framework
2368 | 
2369 | The `run_all_demo_scripts_and_check_for_errors.py` script orchestrates the execution of all examples as a test suite:
2370 | 
2371 | -   **Automated Execution**: Discovers and runs `examples/*.py` sequentially.
2372 | -   **Validation**: Checks exit codes and `stderr` against predefined patterns to distinguish real errors from expected messages (e.g., missing API key warnings).
2373 | -   **Reporting**: Generates a summary report of passed, failed, and skipped tests, along with detailed logs.
2374 | 
2375 | *Example test framework configuration snippet:*
2376 | ```python
2377 | "sql_database_demo.py": {
2378 |     "expected_exit_code": 0,
2379 |     "allowed_stderr_patterns": [
2380 |         r"Could not compute statistics...", # Known non-fatal warning
2381 |         r"Connection failed...", # Expected if DB not set up
2382 |         r"Configuration not yet loaded..." # Standard info message
2383 |     ]
2384 | }
2385 | ```
2386 | 
2387 | ### Running the Example Suite
2388 | 
2389 | ```bash
2390 | # Ensure the Ultimate MCP Server is running in a separate terminal
2391 | 
2392 | # Run the entire test suite
2393 | python run_all_demo_scripts_and_check_for_errors.py
2394 | 
2395 | # Run a specific example script directly
2396 | python examples/smart_browser_demo.py
2397 | 
2398 | # Run an example with custom arguments
2399 | python examples/text_redline_demo.py --input-file1 path/to/doc1.txt --input-file2 path/to/doc2.txt
2400 | ```
2401 | 
2402 | This combined example library and testing framework provides invaluable resources for understanding, utilizing, and verifying the functionality of the Ultimate MCP Server.
2403 | 
2404 | ---
2405 | 
2406 | ## 💻 CLI Commands
2407 | 
2408 | Ultimate MCP Server comes with a command-line interface (`umcp`) for server management and tool interaction:
2409 | 
2410 | ```bash
2411 | # Show available commands and global options
2412 | umcp --help
2413 | 
2414 | # --- Server Management ---
2415 | # Start the server (loads .env, registers tools)
2416 | umcp run [--host HOST] [--port PORT] [--include-tools tool1 tool2] [--exclude-tools tool3 tool4]
2417 | 
2418 | # --- Information ---
2419 | # List configured LLM providers
2420 | umcp providers [--check] [--models]
2421 | 
2422 | # List available tools
2423 | umcp tools [--category CATEGORY] [--examples]
2424 | 
2425 | # --- Testing & Interaction ---
2426 | # Test connection and basic generation for a specific provider
2427 | umcp test <provider_name> [--model MODEL_NAME] [--prompt TEXT]
2428 | 
2429 | # Generate a completion directly from the CLI
2430 | umcp complete --provider <provider_name> --model <model_name> --prompt "Your prompt here" [--temperature N] [--max-tokens N] [--system TEXT] [--stream]
2431 | 
2432 | # --- Cache Management ---
2433 | # View or clear the request cache
2434 | umcp cache [--status] [--clear]
2435 | 
2436 | # --- Benchmark ---
2437 | umcp benchmark [--providers P1 P2] [--models M1 M2] [--prompt TEXT] [--runs N]
2438 | 
2439 | # --- Examples ---
2440 | umcp examples [--list] [<example_name>] [--category CATEGORY]
2441 | ```
2442 | 
2443 | Each command typically has additional options. Use `umcp COMMAND --help` to see options for a specific command (e.g., `umcp complete --help`).
2444 | 
2445 | ---
2446 | 
2447 | ## 🛠️ Advanced Configuration
2448 | 
2449 | Configuration is primarily managed through **environment variables**, often loaded from a `.env` file in the project root upon startup.
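
As an illustration of the pattern (a generic sketch, not the server's actual configuration module), components read these variables roughly like this:

```python
import os
from dotenv import load_dotenv  # pip install python-dotenv

# Generic sketch of .env-based configuration loading; the server's real
# config loader may differ, but the variables behave as documented below.
load_dotenv()  # Read key=value pairs from ./.env into the process environment

SERVER_HOST = os.getenv("SERVER_HOST", "127.0.0.1")
SERVER_PORT = int(os.getenv("SERVER_PORT", "8013"))
CACHE_ENABLED = os.getenv("CACHE_ENABLED", "true").lower() == "true"
CACHE_TTL = int(os.getenv("CACHE_TTL", "86400"))  # Seconds (24 hours)
```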
2450 | 
2451 | ### Server Configuration
2452 | -   `SERVER_HOST`: (Default: `127.0.0.1`) Network interface to bind to. Use `0.0.0.0` to listen on all interfaces (necessary for Docker containers or external access).
2453 | -   `SERVER_PORT`: (Default: `8013`) Port the server listens on.
2454 | -   `API_PREFIX`: (Default: `/`) URL prefix for all API endpoints (e.g., set to `/mcp/v1` to serve under that path).
2455 | -   `WORKERS`: (Optional, e.g., `4`) Number of worker processes for the web server (e.g., Uvicorn). Adjust based on CPU cores.
2456 | 
2457 | ### Tool Filtering (Startup Control)
2458 | Control which tools are registered when the server starts using CLI flags:
2459 | -   `--include-tools tool1,tool2,...`: Only register the specified tools.
2460 | -   `--exclude-tools tool3,tool4,...`: Register all tools *except* those specified.
2461 |     ```bash
2462 |     # Example: Start with only filesystem and basic completion tools
2463 |     umcp run --include-tools read_file,write_file,list_directory,completion
2464 |     # Example: Start with all tools except browser automation
2465 |     umcp run --exclude-tools browser_init,browser_navigate,browser_click
2466 |     ```
2467 |     This is useful for creating lightweight instances, managing dependencies, or restricting agent capabilities.
2468 | 
2469 | ### Logging Configuration
2470 | -   `LOG_LEVEL`: (Default: `INFO`) Controls log verbosity (`DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`). `DEBUG` is very verbose.
2471 | -   `USE_RICH_LOGGING`: (Default: `true`) Enables colorful, structured console logs via the Rich library. Set to `false` for plain text logs (better for file redirection or some logging systems).
2472 | -   `LOG_FORMAT`: (Optional) Specify a Python `logging` format string for custom log formats (if `USE_RICH_LOGGING=false`).
2473 | -   `LOG_TO_FILE`: (Optional, e.g., `/var/log/ultimate_mcp_server.log`) Path to a file where logs should *also* be written (in addition to console). Ensure the server process has write permissions.
2474 | 
2475 | ### Cache Configuration
2476 | -   `CACHE_ENABLED`: (Default: `true`) Globally enable or disable response caching.
2477 | -   `CACHE_TTL`: (Default: `86400` seconds = 24 hours) Default Time-To-Live for cached items. Specific tools might have overrides.
2478 | -   `CACHE_TYPE`: (Default: `memory`) Backend storage. Check implementation for supported types (e.g., `memory`, `redis`, `diskcache`). `diskcache` provides persistence.
2479 | -   `CACHE_DIR`: (Default: `./.cache`) Directory used if `CACHE_TYPE=diskcache`. Ensure write permissions.
2480 | -   `CACHE_MAX_SIZE`: (Optional, e.g., `1000` items, or `536870912` bytes = 512MB for `diskcache`) Sets size limits for the cache.
2481 | -   `REDIS_URL`: (Required if `CACHE_TYPE=redis`) Connection URL for Redis server (e.g., `redis://localhost:6379/0`).
2482 | 
2483 | ### Provider Timeouts & Retries
2484 | -   `PROVIDER_TIMEOUT`: (Default: `120`) Default timeout in seconds for waiting for a response from an LLM provider API.
2485 | -   `PROVIDER_MAX_RETRIES`: (Default: `3`) Default number of times to retry a failed request to a provider (for retryable errors like rate limits or temporary server issues). Uses exponential backoff.
2486 | -   Specific provider overrides might exist via dedicated variables (e.g., `OPENAI_TIMEOUT`, `ANTHROPIC_MAX_RETRIES`). Check configuration loading logic or documentation.
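
The retry-with-exponential-backoff behavior can be pictured with the sketch below. It is illustrative only; the server's actual implementation may differ in details such as jitter strategy or which errors count as retryable:

```python
import asyncio
import random

async def call_with_retries(make_request, max_retries=3, base_delay=1.0):
    """Illustrative exponential-backoff retry wrapper for provider API calls."""
    for attempt in range(max_retries + 1):
        try:
            return await make_request()
        except Exception as exc:  # Real code would catch only retryable errors
            if attempt == max_retries:
                raise  # Retries exhausted; surface the last error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)  # Jitter
            print(f"Attempt {attempt + 1} failed ({exc}); retrying in {delay:.1f}s")
            await asyncio.sleep(delay)
```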
2487 | 
2488 | ### Tool-Specific Configuration
2489 | Individual tools might load their own configuration from environment variables. Examples:
2490 | -   `ALLOWED_DIRS`: Comma-separated list of base directories filesystem tools are restricted to. **Crucially for security.**
2491 | -   `PLAYWRIGHT_BROWSER_TYPE`: (Default: `chromium`) Browser used by Playwright tools (`chromium`, `firefox`, `webkit`).
2492 | -   `PLAYWRIGHT_TIMEOUT`: Default timeout for Playwright actions.
2493 | -   `DATABASE_URL`: Connection string for the SQL Database Interaction tools (uses SQLAlchemy).
2494 | -   `MARQO_URL`: URL for the Marqo instance used by the fused search tool.
2495 | -   `TESSERACT_CMD`: Path to the Tesseract executable if not in standard system PATH (for OCR).
2496 | 
2497 | *Always ensure environment variables are set correctly **before** starting the server. Changes typically require a server restart to take effect.*
2498 | 
2499 | ---
2500 | 
2501 | ## ☁️ Deployment Considerations
2502 | 
2503 | While `umcp run` or `docker compose up` are fine for development, consider these for more robust deployments:
2504 | 
2505 | ### 1. Running as a Background Service
2506 | Ensure the server runs continuously and restarts automatically.
2507 | -   **`systemd` (Linux):** Create a service unit file (`.service`) to manage the process with `systemctl start|stop|restart|status`. Provides robust control and logging integration.
2508 | -   **`supervisor`:** A process control system written in Python. Configure `supervisord` to monitor and manage the server process.
2509 | -   **Docker Restart Policies:** Use `--restart unless-stopped` or `--restart always` in your `docker run` command or in `docker-compose.yml` to have Docker manage restarts.
2510 | 
2511 | ### 2. Using a Reverse Proxy (Nginx, Caddy, Apache, Traefik)
2512 | Placing a reverse proxy in front of the Ultimate MCP Server is **highly recommended**:
2513 | -   🔒 **HTTPS/SSL Termination:** Handles SSL certificates (e.g., via Let's Encrypt with Caddy/Certbot) encrypting external traffic.
2514 | -   ⚖️ **Load Balancing:** Distribute traffic if running multiple instances of the server for high availability or scaling.
2515 | -   🗺️ **Path Routing:** Map a clean external URL (e.g., `https://api.yourdomain.com/mcp/`) to the internal server (`http://localhost:8013`). Configure `API_PREFIX` if needed.
2516 | -   🛡️ **Security Headers:** Add important headers like `Strict-Transport-Security` (HSTS), `Content-Security-Policy` (CSP).
2517 | -   🚦 **Access Control:** Implement IP allow-listing, basic authentication, or integrate with OAuth2 proxies.
2518 | -   ⏳ **Buffering/Caching:** May offer additional request/response buffering or caching layers.
2519 | -   ⏱️ **Timeouts:** Manage connection timeouts independently from the application server.
2520 | 
2521 | *Example Nginx `location` block (simplified):*
2522 | ```nginx
2523 | location /mcp/ { # Match your desired public path (corresponds to API_PREFIX if set)
2524 |     proxy_pass http://127.0.0.1:8013/; # Point to the internal server (note trailing /)
2525 |     proxy_set_header Host $host;
2526 |     proxy_set_header X-Real-IP $remote_addr;
2527 |     proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
2528 |     proxy_set_header X-Forwarded-Proto $scheme;
2529 | 
2530 |     # Increase timeouts for potentially long-running AI tasks
2531 |     proxy_connect_timeout 60s;
2532 |     proxy_send_timeout 300s;
2533 |     proxy_read_timeout 300s;
2534 | 
2535 |     # Optional: Add basic authentication
2536 |     # auth_basic "Restricted Access";
2537 |     # auth_basic_user_file /etc/nginx/.htpasswd;
2538 | }
2539 | ```
2540 | 
2541 | ### 3. Container Orchestration (Kubernetes, Docker Swarm)
2542 | For scalable, managed deployments:
2543 | -   ❤️ **Health Checks:** Implement and configure liveness and readiness probes using the server's `/healthz` endpoint (or similar) in your deployment manifests.
2544 | -   🔑 **Configuration:** Use ConfigMaps and Secrets (Kubernetes) or Docker Secrets/Configs to manage environment variables and API keys securely, rather than baking them into images or relying solely on `.env` files.
2545 | -   ⚙️ **Resource Limits:** Define appropriate CPU and memory requests/limits for the container(s) to ensure stable performance and avoid resource starvation on the node.
2546 | -   🌐 **Service Discovery:** Utilize the orchestrator's built-in service discovery instead of hardcoding IPs or hostnames. Expose the service internally (e.g., ClusterIP) and use an Ingress controller for external access.
2547 | -   💾 **Persistent Storage:** If using features requiring persistence (e.g., `diskcache`, persistent memory, file storage), configure persistent volumes (PVs/PVCs).
2548 | 
2549 | ### 4. Resource Allocation
2550 | -   **RAM:** Ensure sufficient memory, especially if using large models, in-memory caching, processing large documents, or running memory-intensive tools (like browser automation or certain data processing tasks). Monitor usage.
2551 | -   **CPU:** Monitor CPU load. LLM inference itself might not be CPU-bound (often GPU/TPU), but other tools (OCR, local processing, web server handling requests) can be. Consider the number of workers (`WORKERS` env var).
2552 | -   **Disk I/O:** Can be a bottleneck if using persistent caching (`diskcache`) or extensive filesystem operations. Use fast storage (SSDs) if needed.
2553 | -   **Network:** Ensure adequate bandwidth, especially if handling large documents, images, or frequent/large API responses.
2554 | 
2555 | ---
2556 | 
2557 | ## 💸 Cost Savings With Delegation
2558 | 
2559 | Using Ultimate MCP Server for intelligent delegation can yield significant cost savings compared to using only a high-end model like Claude 3.7 Sonnet or GPT-4o for every task.
2560 | 
2561 | | Task Scenario                   | High-End Model Only (Est.) | Delegated via MCP Server (Est.) | Estimated Savings | Notes                                        |
2562 | | :------------------------------ | :------------------------- | :------------------------------ | :---------------- | :------------------------------------------- |
2563 | | Summarize 100-page document   | ~$4.50 - $6.00             | ~$0.45 - $0.70 (Gemini Flash)   | **~90%**          | Chunking + parallel cheap summaries          |
2564 | | Extract data from 50 records  | ~$2.25 - $3.00             | ~$0.35 - $0.50 (GPT-4.1 Mini)   | **~84%**          | Batch processing with cost-effective model |
2565 | | Generate 20 content ideas     | ~$0.90 - $1.20             | ~$0.12 - $0.20 (DeepSeek/Haiku) | **~87%**          | Simple generation task on cheaper model    |
2566 | | Process 1,000 customer queries| ~$45.00 - $60.00           | ~$7.50 - $12.00 (Mixed Models)  | **~83%**          | Routing based on query complexity          |
2567 | | OCR & Extract from 10 Scans   | ~$1.50 - $2.50 (If LLM OCR)| ~$0.20 - $0.40 (OCR + LLM Fix)  | **~85%**          | Using dedicated OCR + cheap LLM correction |
2568 | | Basic Web Scrape & Summarize  | ~$0.50 - $1.00             | ~$0.10 - $0.20 (Browser + Haiku)| **~80%**          | Browser tool + cheap LLM for summary       |
2569 | 
2570 | *(Costs are highly illustrative, based on typical token counts and approximate 2024 pricing. Actual costs depend heavily on document size, complexity, specific models used, and current provider pricing.)*
2571 | 
2572 | **How savings are achieved:**
2573 | 
2574 | -   **Matching Model to Task:** Using expensive models only for tasks requiring deep reasoning, creativity, or complex instruction following.
2575 | -   **Leveraging Cheaper Models:** Delegating summarization, extraction, simple Q&A, formatting, etc., to significantly cheaper models (like Gemini Flash, Claude Haiku, GPT-4.1 Mini, DeepSeek Chat).
2576 | -   **Using Specialized Tools:** Employing non-LLM tools (Filesystem, OCR, Browser, CLI utils, Database) where appropriate, avoiding LLM API calls entirely for those operations.
2577 | -   **Caching:** Reducing redundant API calls for identical or semantically similar requests.
2578 | 
2579 | Ultimate MCP Server acts as the intelligent routing layer to make these cost optimizations feasible within a sophisticated agent architecture.
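
A toy sketch of the underlying idea (this is not the server's actual routing logic, and the tier-to-model mapping below is purely illustrative):

```python
# Toy cost-aware router: match task complexity to the cheapest adequate tier.
MODEL_TIERS = {
    "simple":  {"provider": "gemini", "model": "gemini-2.0-flash-lite"},
    "medium":  {"provider": "openai", "model": "gpt-4.1-mini"},
    "complex": {"provider": "anthropic", "model": "claude-3-5-sonnet-20241022"},
}

def route_task(task_description: str) -> dict:
    """Crude keyword heuristic; real routing weighs cost, quality, and context."""
    text = task_description.lower()
    if any(word in text for word in ("summarize", "extract", "classify", "format")):
        return MODEL_TIERS["simple"]
    if any(word in text for word in ("analyze", "compare", "draft")):
        return MODEL_TIERS["medium"]
    return MODEL_TIERS["complex"]

print(route_task("Summarize this 100-page document"))  # -> cheap "simple" tier
```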
2580 | 
2581 | ---
2582 | 
2583 | ## 🧠 Why AI-to-AI Delegation Matters
2584 | 
2585 | The strategic importance of AI-to-AI delegation, facilitated by systems like the Ultimate MCP Server, extends beyond simple cost savings:
2586 | 
2587 | ### Democratizing Advanced AI Capabilities
2588 | -   Makes the power of cutting-edge reasoning models (like Claude 3.7, GPT-4o) practically accessible for a wider range of applications by offloading routine work.
2589 | -   Allows organizations with budget constraints to leverage top-tier AI capabilities for critical reasoning steps, while managing overall costs effectively.
2590 | -   Enables more efficient and widespread use of AI resources across the industry.
2591 | 
2592 | ### Economic Resource Optimization
2593 | -   Represents a fundamental economic optimization in AI usage: applying the most expensive resource (top-tier LLM inference) only where its unique value is required.
2594 | -   Complex reasoning, creativity, nuanced understanding, and orchestration are reserved for high-capability models.
2595 | -   Routine data processing, extraction, formatting, and simpler Q&A are handled by cost-effective models.
2596 | -   Specialized, non-LLM tasks (web scraping, file I/O, DB queries) are handled by purpose-built tools, avoiding unnecessary LLM calls.
2597 | -   The overall system aims for near-top-tier performance and capability at a significantly reduced blended cost.
2598 | -   Transforms potentially unpredictable LLM API costs into a more controlled expenditure through intelligent routing and caching.
2599 | 
2600 | ### Sustainable AI Architecture
2601 | -   Promotes more sustainable AI usage by reducing the computational demand associated with using the largest models for every single task.
2602 | -   Creates a tiered, capability-matched approach to AI resource allocation.
2603 | -   Allows for more extensive experimentation and development, as many iterations can utilize cheaper models or tools.
2604 | -   Provides a scalable approach to integrating AI that can grow with business needs without costs spiraling uncontrollably.
2605 | 
2606 | ### Technical Evolution Path
2607 | -   Represents an important evolution in AI application architecture, moving beyond monolithic calls to single models towards distributed, multi-agent, multi-model workflows.
2608 | -   Enables sophisticated, AI-driven orchestration of complex processing pipelines involving diverse tools and models.
2609 | -   Creates a foundation for AI systems that can potentially reason about their own resource usage and optimize dynamically.
2610 | -   Builds towards more autonomous, self-optimizing AI systems capable of making intelligent delegation decisions based on context, cost, and required quality.
2611 | 
2612 | ### The Future of AI Efficiency
2613 | -   Ultimate MCP Server points toward a future where AI systems actively manage and optimize their own operational costs and resource usage.
2614 | -   Higher-capability models act as intelligent orchestrators or "managers" for ecosystems of specialized tools and more cost-effective "worker" models.
2615 | -   AI workflows become increasingly sophisticated, potentially self-organizing and resilient.
2616 | -   Organizations can leverage the full spectrum of AI capabilities – from basic processing to advanced reasoning – in a financially viable and scalable manner.
2617 | 
2618 | This vision of efficient, intelligently delegated, self-optimizing AI systems represents the next frontier in practical AI deployment, moving beyond the current paradigm of often using a single, powerful (and expensive) model for almost everything.
2619 | 
2620 | ---
2621 | 
2622 | ## 🧱 Architecture
2623 | 
2624 | ### How MCP Integration Works
2625 | 
2626 | The Ultimate MCP Server is built natively on the Model Context Protocol (MCP):
2627 | 
2628 | 1.  **MCP Server Core**: Implements a web server (e.g., using FastAPI) that listens for incoming HTTP requests conforming to the MCP specification (typically POST requests to a specific endpoint).
2629 | 2.  **Tool Registration**: During startup, the server discovers and registers all available tool implementations. Each tool provides metadata including its name, description, and input/output schemas (often Pydantic models converted to JSON Schema). This registry allows the server (and potentially agents) to know what tools are available and how to use them.
2630 | 3.  **Tool Invocation**: When an MCP client (like Claude or another application) sends a valid MCP request specifying a tool name and parameters, the server core routes the request to the appropriate registered tool's execution logic.
2631 | 4.  **Context Passing & Execution**: The tool receives the validated input parameters. It performs its action (calling an LLM, interacting with Playwright, querying a DB, manipulating a file, etc.).
2632 | 5.  **Structured Response**: The tool's execution result (or error) is packaged into a standard MCP response format, typically including status (success/failure), output data (conforming to the tool's output schema), cost information, and potentially other metadata.
2633 | 6.  **Return to Client**: The MCP server core sends the structured MCP response back to the originating client over HTTP.
2634 | 
2635 | This adherence to the MCP standard ensures seamless, predictable integration with any MCP-compatible agent or client application.
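
As a rough illustration of steps 3-6, a tool invocation over HTTP might look like the sketch below. The exact endpoint and field names depend on the MCP version and transport in use, so treat this shape as indicative only:

```python
import httpx  # pip install httpx

# Indicative shape of an MCP tool invocation; the endpoint and field names
# here are assumptions for illustration, not the exact wire format.
request_payload = {
    "tool_name": "extract_json",
    "inputs": {
        "document": "Name: Alice. Age: 30.",
        "json_schema": {"type": "object", "properties": {"name": {"type": "string"}}},
    },
}

response = httpx.post("http://localhost:8013/", json=request_payload, timeout=120)
mcp_response = response.json()
# A structured MCP response typically carries status, outputs, error, and cost.
print(mcp_response.get("status"), mcp_response.get("outputs"))
```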
2636 | 
2637 | ### Component Diagram
2638 | 
2639 | ```plaintext
2640 | +---------------------+       MCP Request        +------------------------------------+       API Request       +-----------------+
2641 | |   MCP Agent/Client  | ----------------------> |        Ultimate MCP Server         | ----------------------> |  LLM Providers  |
2642 | | (e.g., Claude 3.7)  | <---------------------- | (FastAPI + MCP Core + Tool Logic)  | <---------------------- | (OpenAI, Anthro.)|
2643 | +---------------------+      MCP Response       +------------------+-----------------+      API Response       +--------+--------+
2644 |                                                             |                                       |
2645 |                                                             | Tool Invocation                       | External API Call
2646 |                                                             ▼                                       ▼
2647 | +-----------------------------------------------------------+------------------------------------------------------------+
2648 | | Internal Services & Tool Implementations                                                                               |
2649 | | +-------------------+  +-------------------+  +-------------------+  +-------------------+  +-------------------+       |
2650 | | | Completion/LLM    |  | Document Proc.    |  | Data Extraction   |  | Browser Automation|  | Excel Automation  |       |
2651 | | | (Routing/Provider)|  | (Chunking, Sum.)  |  | (JSON, Table)     |  | (Playwright)      |  | (OpenPyXL/COM)    |       |
2652 | | +---------+---------+  +-------------------+  +-------------------+  +-------------------+  +-------------------+       |
2653 | |           |                                                                                                            |
2654 | | +---------+---------+  +-------------------+  +-------------------+  +-------------------+  +-------------------+       |
2655 | | | Cognitive Memory  |  | Filesystem Ops    |  | SQL Database      |  | Entity/Graph      |  | Vector/RAG        |       |
2656 | | | (Storage/Query)   |  | (Secure Access)   |  | (SQLAlchemy)      |  | (NetworkX)        |  | (Vector Stores)   |       |
2657 | | +-------------------+  +-------------------+  +-------------------+  +-------------------+  +---------+---------+       |
2658 | |                                                                                                        |                 |
2659 | | +-------------------+  +-------------------+  +-------------------+  +-------------------+  +---------+---------+       |
2660 | | | Audio Transcription|  | OCR Tools         |  | Text Classify     |  | CLI Tools         |  | Dynamic API       |       |
2661 | | | (Whisper, etc.)   |  | (Tesseract+LLM)   |  |                   |  | (jq, rg, awk)     |  | (OpenAPI->Tool)   |       |
2662 | | +-------------------+  +-------------------+  +-------------------+  +-------------------+  +-------------------+       |
2663 | |                                                                                                                        |
2664 | | +-------------------+  +-------------------+  +-------------------+  +-------------------+  +-------------------+       |
2665 | | | Caching Service   |  | Analytics/Metrics |  | Prompt Management |  | Config Service    |  | Meta Tools/Refiner|       |
2666 | | | (Memory/Disk/Redis|  | (Cost/Usage Track)|  | (Jinja2/Repo)     |  | (Loads .env)      |  | (list_tools etc.) |       |
2667 | | +-------------------+  +-------------------+  +-------------------+  +-------------------+  +-------------------+       |
2668 | +------------------------------------------------------------------------------------------------------------------------+
2669 | ```
2670 | 
2671 | ### Request Flow for Delegation (Detailed)
2672 | 
2673 | 1.  **Agent Decision**: An MCP agent determines a need for a specific capability (e.g., summarize a large text, extract JSON, browse a URL) potentially suited for delegation.
2674 | 2.  **MCP Request Formulation**: The agent constructs an MCP tool invocation request, specifying the `tool_name` and required `inputs` according to the tool's schema (which it might have discovered via `list_tools`).
2675 | 3.  **HTTP POST to Server**: The agent sends this request (typically as JSON in the body) via HTTP POST to the Ultimate MCP Server's designated endpoint.
2676 | 4.  **Request Reception & Parsing**: The server's web framework (FastAPI) receives the request. The MCP Core parses the JSON body, validating it against the general MCP request structure.
2677 | 5.  **Tool Dispatch**: The MCP Core looks up the requested `tool_name` in its registry of registered tools.
2678 | 6.  **Input Validation**: The server uses the specific tool's input schema (Pydantic model) to validate the `inputs` provided in the request. If validation fails, an MCP error response is generated immediately.
2679 | 7.  **Tool Execution Context**: A context object might be created, potentially containing configuration, access to shared services (like logging, caching, analytics), etc.
2680 | 8.  **Caching Check**: The Caching Service is consulted. It generates a cache key based on the `tool_name` and validated `inputs`. If a valid, non-expired cache entry exists for this key, the cached response is retrieved and returned (skipping to step 14).
2681 | 9.  **Tool Logic Execution**: If not cached, the tool's main execution logic runs:
2682 |     *   **LLM Task**: If the tool involves calling an LLM (e.g., `completion`, `summarize_document`, `extract_json`):
2683 |         *   The Optimization/Routing logic selects the provider/model based on parameters (`provider`, `model`, `provider_preference`) and server configuration.
2684 |         *   The Prompt Management service might format the final prompt using templates.
2685 |         *   The Provider Abstraction layer constructs the specific API request for the chosen provider.
2686 |         *   The API call is made, handling potential retries and timeouts.
2687 |         *   The LLM response is received and parsed.
2688 |     *   **Specialized Tool Task**: If it's a non-LLM tool (e.g., `read_file`, `browser_navigate`, `run_sql_query`, `run_ripgrep`):
2689 |         *   The tool interacts directly with the relevant system (filesystem, Playwright browser instance, database connection, subprocess execution).
2690 |         *   Security checks (e.g., allowed-directory validation, parameterized SQL placeholders) are performed.
2691 |         *   The result of the operation is obtained.
2692 | 10. **Cost Calculation**: For LLM tasks, the Analytics Service calculates the estimated cost based on input/output tokens and provider pricing. For other tasks, the cost is typically zero unless they consume specific metered resources.
2693 | 11. **Result Formatting**: The tool formats its result (data or error message) according to its defined output schema.
2694 | 12. **Analytics Recording**: The Analytics Service logs the request, response (or error), execution time, cost, provider/model used, cache status (hit/miss), etc.
2695 | 13. **Caching Update**: If the operation was successful and caching is enabled for this tool/request, the Caching Service stores the formatted response with its calculated TTL.
2696 | 14. **MCP Response Formulation**: The MCP Core packages the final result (either from cache or from execution) into a standard MCP response structure, including `status`, `outputs`, `error` (if any), and potentially `cost`, `usage_metadata`.
2697 | 15. **HTTP Response to Agent**: The server sends the MCP response back to the agent as the HTTP response, typically with a 200 OK status even if the *tool operation* failed, since the MCP request itself succeeded. The agent then parses this response to determine the outcome of the tool call; a client-side sketch of this round trip follows.
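     | 
     | As a minimal client-side sketch of this flow, the snippet below mirrors the bundled `test_connection.py`: it uses the `fastmcp` client against a streamable-http server on port 8013 and calls the `echo` tool. The endpoint URL and tool name are assumptions taken from that script; any registered tool can be substituted.
     | 
     | ```python
     | import asyncio
     | import json
     | 
     | from fastmcp import Client
     | 
     | async def delegate_call() -> None:
     |     # Steps 2-3: formulate the tool invocation and POST it to the server endpoint.
     |     async with Client("http://127.0.0.1:8013/mcp") as client:
     |         # Optional discovery (step 2): fetch tool names and input schemas.
     |         tools = await client.list_tools()
     |         print(f"Server exposes {len(tools)} tools")
     | 
     |         # Steps 4-13 run server-side; we simply await the MCP response (steps 14-15).
     |         result = await client.call_tool("echo", {"message": "ping"})
     | 
     |         # The HTTP status is 200 even when the tool operation fails, so inspect
     |         # the MCP payload itself for the outcome.
     |         print(json.loads(result[0].text))
     | 
     | asyncio.run(delegate_call())
     | ```
     | 
     | Because tool-level failures still arrive as successful HTTP responses, agents must check the MCP `status`/`error` fields rather than the HTTP status code.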
2698 | 
2699 | ---
2700 | 
2701 | ## 🌍 Real-World Use Cases
2702 | 
2703 | ### Advanced AI Agent Capabilities
2704 | Empower agents like Claude or custom-built autonomous agents to perform complex, multi-modal tasks by giving them tools for:
2705 | -   **Persistent Memory & Learning:** Maintain context across long conversations or tasks using the Cognitive Memory system.
2706 | -   **Web Interaction & Research:** Automate browsing, data extraction from websites, form submissions, and synthesize information from multiple online sources.
2707 | -   **Data Analysis & Reporting:** Create, manipulate, and analyze data within Excel spreadsheets; generate charts and reports.
2708 | -   **Database Operations:** Access and query enterprise databases to retrieve or update information based on agent goals.
2709 | -   **Document Understanding:** Process PDFs, images (OCR), extract key information, summarize long reports, answer questions based on documents (RAG).
2710 | -   **Knowledge Graph Management:** Build and query internal knowledge graphs about specific domains, projects, or entities.
2711 | -   **Multimedia Processing:** Transcribe audio recordings from meetings or voice notes.
2712 | -   **Code Execution & Analysis:** Use CLI tools or specialized code tools (if added) for development or data tasks.
2713 | -   **External Service Integration:** Interact with other company APIs or public APIs dynamically registered via OpenAPI.
2714 | 
2715 | ### Enterprise Workflow Automation
2716 | Build sophisticated automated processes that leverage AI reasoning and specialized tools:
2717 | -   **Intelligent Document Processing Pipeline:** Ingest scans/PDFs -> OCR -> Extract structured data (JSON) -> Validate data -> Classify document type -> Route to appropriate system or summarize for human review (sketched in code after this list).
2718 | -   **Automated Research Assistant:** Given a topic -> Search academic databases (via Browser/API tool) -> Download relevant papers (Browser/Filesystem) -> Chunk & Summarize papers (Document tools) -> Extract key findings (Extraction tools) -> Store in Cognitive Memory -> Generate synthesized report.
2719 | -   **Financial Reporting Automation:** Connect to database (SQL tool) -> Extract financial data -> Populate Excel template (Excel tool) -> Generate charts & variance analysis -> Email report (if an email tool is added).
2720 | -   **Customer Support Ticket Enrichment:** Receive ticket text -> Classify issue type (Classification tool) -> Search internal knowledge base & documentation (RAG tool) -> Draft suggested response -> Augment with customer details from CRM (via DB or API tool).
2721 | -   **Competitor Monitoring:** Schedule browser automation task -> Visit competitor websites/news feeds -> Extract key announcements/pricing changes -> Summarize findings -> Alert relevant team.
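     | 
     | As a concrete illustration, the sketch below chains MCP tool calls for the document pipeline above. The tool names (`ocr_image`, `extract_json`, `text_classification`) and their parameter shapes are illustrative assumptions, not the server's actual registry; discover real names and schemas via `list_tools`.
     | 
     | ```python
     | import asyncio
     | 
     | from fastmcp import Client
     | 
     | async def process_scanned_doc(path: str) -> dict:
     |     # Each pipeline stage is one MCP tool call; names and params are hypothetical.
     |     async with Client("http://127.0.0.1:8013/mcp") as client:
     |         ocr = await client.call_tool("ocr_image", {"file_path": path})
     |         text = ocr[0].text
     | 
     |         fields = await client.call_tool(
     |             "extract_json",
     |             {"text": text, "json_schema": {"vendor": "string", "total": "number"}},
     |         )
     |         doc_type = await client.call_tool("text_classification", {"text": text})
     |         return {"type": doc_type[0].text, "fields": fields[0].text}
     | 
     | print(asyncio.run(process_scanned_doc("scans/invoice_001.png")))
     | ```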
2722 | 
2723 | ### Data Processing and Integration
2724 | Handle complex data tasks beyond simple ETL:
2725 | -   **Unstructured to Structured:** Extract specific information (JSON, tables) from emails, reports, chat logs, product reviews.
2726 | -   **Knowledge Graph Creation:** Process a corpus of documents (e.g., company wiki, research papers) to build an entity relationship graph for querying insights.
2727 | -   **Data Transformation & Cleansing:** Use SQL tools, Excel automation, or local text processing (awk, sed) for complex data manipulation guided by LLM instructions.
2728 | -   **Automated Data Categorization:** Apply text classification tools to large datasets (e.g., categorizing user feedback, tagging news articles).
2729 | -   **Semantic Data Search:** Build searchable vector indexes over internal documents, enabling users or agents to find information based on meaning, not just keywords (RAG).
2730 | 
2731 | ### Research and Analysis (Scientific, Market, etc.)
2732 | Support research teams with AI-powered tools:
2733 | -   **Automated Literature Search & Review:** Use browser/API tools to search databases (PubMed, ArXiv, etc.), download papers, chunk, summarize, and extract key methodologies or results.
2734 | -   **Comparative Analysis:** Use multi-provider completion or tournament tools to compare how different models interpret or generate hypotheses based on research data.
2735 | -   **Data Extraction from Studies:** Automatically pull structured data (participant numbers, p-values, outcomes) from published papers or reports into a database or spreadsheet.
2736 | -   **Budget Tracking:** Utilize the analytics features to monitor LLM API costs associated with research tasks.
2737 | -   **Persistent Research Log:** Use the Cognitive Memory system to store findings, hypotheses, observations, and reasoning steps throughout a research project.
2738 | 
2739 | ### Document Intelligence
2740 | Create comprehensive systems for understanding document collections:
2741 | -   **End-to-End Pipeline:** OCR scanned documents -> Enhance text with LLMs -> Extract predefined fields (Extraction tools) -> Classify document types -> Identify key entities/relationships -> Generate summaries -> Index text and metadata into a searchable system (Vector/SQL DB).
2742 | 
2743 | ### Financial Analysis and Modeling
2744 | Equip financial professionals with advanced tools:
2745 | -   **AI-Assisted Model Building:** Use natural language to instruct the Excel automation tool to create complex financial models, projections, or valuation analyses.
2746 | -   **Data Integration:** Pull market data via browser automation or APIs, combine it with internal data from databases (SQL tools).
2747 | -   **Report Analysis:** Use RAG or summarization tools to quickly understand long financial reports or filings.
2748 | -   **Scenario Testing:** Programmatically modify inputs in Excel models to run sensitivity analyses.
2749 | -   **Decision Tracking:** Use Cognitive Memory to log the reasoning behind investment decisions or analyses.
2750 | 
2751 | ---
2752 | 
2753 | ## 🔐 Security Considerations
2754 | 
2755 | When deploying and operating the Ultimate MCP Server, security must be a primary concern. Consider the following aspects:
2756 | 
2757 | 1.  🔑 **API Key Management:**
2758 |     *   **Never hardcode API keys** in source code or commit them to version control.
2759 |     *   Use **environment variables** (`.env` file for local dev, system environment variables, or preferably secrets management tools like HashiCorp Vault, AWS Secrets Manager, GCP Secret Manager for production); a minimal loading sketch follows this list.
2760 |     *   Ensure the `.env` file (if used locally) has **strict file permissions** (e.g., `chmod 600 .env`) readable only by the user running the server.
2761 |     *   Use **separate keys** for development and production environments.
2762 |     *   Implement **key rotation** policies and revoke suspected compromised keys immediately.
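     | 
     |     A minimal loading sketch, assuming the conventional `OPENAI_API_KEY` variable name and the optional `python-dotenv` package for local development:
     | 
     |     ```python
     |     import os
     | 
     |     from dotenv import load_dotenv  # pip install python-dotenv (local dev convenience)
     | 
     |     load_dotenv()  # merges a local .env file into os.environ, if one exists
     | 
     |     api_key = os.environ.get("OPENAI_API_KEY")
     |     if not api_key:
     |         raise RuntimeError("OPENAI_API_KEY is not set; refusing to start")
     |     # Pass api_key to the provider client; never print or log it.
     |     ```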
2763 | 
2764 | 2.  🌐 **Network Exposure & Access Control:**
2765 |     *   **Bind to `127.0.0.1` (`SERVER_HOST`)** by default to only allow local connections. Only change to `0.0.0.0` if you intend to expose it, and *only* behind appropriate network controls.
2766 |     *   **Use a Reverse Proxy:** A reverse proxy (Nginx, Caddy, Traefik, etc.) in front of the server is **highly recommended**. It handles SSL/TLS termination, can enforce access controls (IP allow-listing, client certificate auth, Basic Auth, OAuth2 proxy integration), and provides a layer of separation.
2767 |     *   **Firewall Rules:** Configure host-based or network firewalls to restrict access to the `SERVER_PORT` only from trusted sources (e.g., the reverse proxy's IP, specific application server IPs, VPN ranges).
2768 | 
2769 | 3.  👤 **Authentication & Authorization:**
2770 |     *   The Ultimate MCP Server itself does not focus on built-in user/agent authentication; handle authentication at a layer *before* the server (e.g., the reverse proxy or an API gateway).
2771 |     *   Ensure that only **authorized clients** (trusted AI agents, specific backend services) can send requests to the server endpoint. Consider using mutual TLS (mTLS) or API keys/tokens managed by the proxy/gateway if needed.
2772 |     *   If tools provide different levels of access (e.g., read-only vs. read-write filesystem), consider if authorization logic is needed *within* the server or managed externally.
2773 | 
2774 | 4.  🚦 **Rate Limiting & Abuse Prevention:**
2775 |     *   Implement **rate limiting** at the reverse proxy or API gateway level based on source IP, API key, or other identifiers. This prevents denial-of-service (DoS) attacks and helps control costs from excessive API usage (both LLM and potentially tool usage).
2776 |     *   Monitor usage patterns for signs of abuse.
2777 | 
2778 | 5.  🛡️ **Input Validation & Sanitization:**
2779 |     *   While MCP provides a structured format, pay close attention to tools that interact with external systems based on user/agent input:
2780 |         *   **Filesystem Tools:** **Crucially**, configure `ALLOWED_DIRS` strictly. Validate and normalize all path inputs rigorously to prevent directory traversal (`../`), as in the sketch after this list. Ensure the server process runs with least privilege.
2781 |         *   **SQL Tools:** Use parameterized queries or ORMs (like SQLAlchemy) correctly to prevent SQL injection vulnerabilities. Avoid constructing SQL strings directly from agent input.
2782 |         *   **Browser Tools:** Be cautious with tools that execute arbitrary JavaScript (`browser_evaluate_script`). Avoid running scripts based directly on untrusted agent input if possible. Playwright's sandboxing helps but isn't foolproof.
2783 |         *   **CLI Tools:** Sanitize arguments passed to tools like `run_ripgrep`, `run_jq`, etc., to prevent command injection, especially if constructing complex command strings. Use safe methods for passing input data (e.g., stdin).
2784 |     *   Validate input data types and constraints using Pydantic schemas for all tool inputs.
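     | 
     |     Two minimal sketches of these defenses: strict path containment for filesystem inputs, and bound parameters for SQL. `ALLOWED_DIRS` here is a local stand-in for the server's real configuration, and the table and query are illustrative only.
     | 
     |     ```python
     |     from pathlib import Path
     | 
     |     from sqlalchemy import create_engine, text
     | 
     |     ALLOWED_DIRS = [Path("/srv/mcp/data").resolve()]  # stand-in for server config
     | 
     |     def safe_resolve(user_path: str) -> Path:
     |         """Resolve an agent-supplied path; reject anything outside ALLOWED_DIRS."""
     |         candidate = Path(user_path).resolve()  # collapses ../ and follows symlinks
     |         for root in ALLOWED_DIRS:
     |             if candidate == root or root in candidate.parents:
     |                 return candidate
     |         raise PermissionError(f"path escapes allowed directories: {user_path}")
     | 
     |     # Bound parameters: the driver binds :uid, so agent input never enters the SQL text.
     |     engine = create_engine("sqlite:///:memory:")
     |     with engine.begin() as conn:
     |         conn.execute(text("CREATE TABLE users (id INTEGER, name TEXT)"))
     |         conn.execute(text("INSERT INTO users VALUES (:uid, :name)"), {"uid": 1, "name": "a"})
     |         print(conn.execute(text("SELECT name FROM users WHERE id = :uid"), {"uid": 1}).one())
     |     ```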
2785 | 
2786 | 6.  📦 **Dependency Security:**
2787 |     *   Regularly **update dependencies** using `uv pip install --upgrade ...` or `uv sync` to patch known vulnerabilities in third-party libraries (FastAPI, Pydantic, Playwright, database drivers, etc.).
2788 |     *   Use security scanning tools (`pip-audit`, GitHub Dependabot, Snyk) to automatically identify vulnerable dependencies in your `pyproject.toml` or `requirements.txt`.
2789 | 
2790 | 7.  📄 **Logging Security:**
2791 |     *   Be aware that `DEBUG` level logging might log sensitive information, including full prompts, API responses, file contents, or keys present in data. Configure `LOG_LEVEL` appropriately for production (`INFO` or `WARNING` is usually safer); a minimal configuration sketch follows this list.
2792 |     *   Ensure log files (if `LOG_TO_FILE` is used) have appropriate permissions and consider log rotation and retention policies. Avoid logging raw API keys.
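     | 
     |     A minimal sketch of honoring `LOG_LEVEL` with Python's standard `logging` module (the server ships its own logging utilities; this only illustrates the principle):
     | 
     |     ```python
     |     import logging
     |     import os
     | 
     |     level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
     |     logging.basicConfig(level=getattr(logging, level_name, logging.INFO))
     |     logging.getLogger(__name__).info("Log level set to %s", level_name)
     |     ```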
2793 | 
2794 | 8.  ⚙️ **Tool-Specific Security:**
2795 |     *   Review the security implications of each specific tool enabled. Does it allow writing files? Executing code? Accessing databases? Ensure configurations (like `ALLOWED_DIRS`, database credentials with limited permissions) follow the principle of least privilege. Disable tools that are not needed or cannot be secured adequately for your environment.
2796 | 
2797 | ---
2798 | 
2799 | ## 📃 License
2800 | 
2801 | This project is licensed under the MIT License - see the `LICENSE` file for details.
2802 | 
2803 | ---
2804 | 
2805 | ## 🙏 Acknowledgements
2806 | 
2807 | This project builds upon the work of many fantastic open-source projects and services. Special thanks to:
2808 | 
2809 | -   [Model Context Protocol (MCP)](https://github.com/modelcontextprotocol) for providing the foundational concepts and protocol specification.
2810 | -   [FastAPI](https://fastapi.tiangolo.com/) team for the high-performance web framework.
2811 | -   [Pydantic](https://docs.pydantic.dev/) developers for robust data validation and settings management.
2812 | -   [Rich](https://github.com/Textualize/rich) library for beautiful and informative terminal output.
2813 | -   [uv](https://github.com/astral-sh/uv) from Astral for blazing-fast Python package installation and resolution.
2814 | -   [Playwright](https://playwright.dev/) team at Microsoft for the powerful browser automation framework.
2815 | -   [OpenPyXL](https://openpyxl.readthedocs.io/en/stable/) maintainers for Excel file manipulation.
2816 | -   [SQLAlchemy](https://www.sqlalchemy.org/) developers for the database toolkit.
2817 | -   Developers of integrated tools like `Tesseract`, `ripgrep`, `jq`, `awk`, `sed`.
2818 | -   All the LLM providers (OpenAI, Anthropic, Google, DeepSeek, xAI, etc.) for making their powerful models accessible via APIs.
2819 | -   The broader Python and open-source communities.
2820 | 
2821 | ---
2822 | 
2823 | > _This README provides a comprehensive overview. For specific tool parameters, advanced configuration options, and detailed implementation notes, please refer to the source code and individual tool documentation within the project._
2824 | 
2825 | ### Running the Server
2826 | 
2827 | Start the server using the CLI:
2828 | 
2829 | ```bash
2830 | # Start in default stdio mode
2831 | umcp run
2832 | 
2833 | # Start in streamable-http mode for web interfaces or remote clients (recommended)
2834 | umcp run --transport-mode shttp
2835 | # Or use the shortcut:
2836 | umcp run -t shttp
2837 | 
2838 | # Run on a specific host and port (streamable-http mode)
2839 | umcp run -t shttp --host 0.0.0.0 --port 8080
2840 | ```
```

--------------------------------------------------------------------------------
/examples/sample/contract_link.txt:
--------------------------------------------------------------------------------

```
1 | legal_contract.txt
```

--------------------------------------------------------------------------------
/ultimate_mcp_server/__init__.py:
--------------------------------------------------------------------------------

```python
1 | """Ultimate MCP Server package."""
2 | __version__ = "0.1.0"
```

--------------------------------------------------------------------------------
/examples/__init__.py:
--------------------------------------------------------------------------------

```python
1 | """Example scripts demonstrating Ultimate MCP Server functionality."""
```

--------------------------------------------------------------------------------
/ultimate_mcp_server/__main__.py:
--------------------------------------------------------------------------------

```python
1 | """Main entry point for Ultimate MCP Server CLI."""
2 | from ultimate_mcp_server.cli import cli
3 | 
4 | if __name__ == "__main__":
5 |     cli() 
```

--------------------------------------------------------------------------------
/tests/unit/__init__.py:
--------------------------------------------------------------------------------

```python
1 | """Unit tests for Ultimate MCP Server."""
2 | # This file is intentionally left empty
3 | # It marks the unit tests directory as a Python package
```

--------------------------------------------------------------------------------
/TODO.md:
--------------------------------------------------------------------------------

```markdown
1 | * Add CLI options to enable/disable tool loading.
2 | * Debug the SQL and Playwright tools.
3 | * Improve docstrings for better tool usage by Claude.
```

--------------------------------------------------------------------------------
/tests/integration/__init__.py:
--------------------------------------------------------------------------------

```python
1 | """Integration tests for Ultimate MCP Server."""
2 | # This file is intentionally left empty
3 | # It marks the integration tests directory as a Python package
```

--------------------------------------------------------------------------------
/ultimate_mcp_server/cli/__main__.py:
--------------------------------------------------------------------------------

```python
1 | """Entry point for running the Ultimate MCP Server CLI as a module."""
2 | 
3 | if __name__ == "__main__":
4 |     from ultimate_mcp_server.cli.typer_cli import app
5 |     app() 
```

--------------------------------------------------------------------------------
/ultimate_mcp_server/cli/__init__.py:
--------------------------------------------------------------------------------

```python
1 | """Command-line interface for Ultimate MCP Server."""
2 | # Modern CLI implementation using typer
3 | from ultimate_mcp_server.cli.typer_cli import app, cli
4 | 
5 | __all__ = ["app", "cli"]
```

--------------------------------------------------------------------------------
/ultimate_mcp_server/clients/__init__.py:
--------------------------------------------------------------------------------

```python
1 | """Client classes for the Ultimate MCP Server."""
2 | 
3 | from ultimate_mcp_server.clients.completion_client import CompletionClient
4 | from ultimate_mcp_server.clients.rag_client import RAGClient
5 | 
6 | __all__ = [
7 |     "CompletionClient",
8 |     "RAGClient"
9 | ] 
```

--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------

```python
1 | """Test package for Ultimate MCP Server."""
2 | # This file is intentionally left mostly empty
3 | # It marks the tests directory as a Python package
4 | 
5 | import os
6 | import sys
7 | 
8 | # Add parent directory to path to allow imports
9 | sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))
```

--------------------------------------------------------------------------------
/examples/sample/sample_data.json:
--------------------------------------------------------------------------------

```json
1 | 
2 | [
3 |   {"user": "Alice", "dept": "Sales", "region": "North", "value": 100, "tags": ["active", "pipeline"]},
4 |   {"user": "Bob", "dept": "IT", "region": "South", "value": 150, "tags": ["active", "support"]},
5 |   {"user": "Charlie", "dept": "Sales", "region": "North", "value": 120, "tags": ["inactive", "pipeline"]},
6 |   {"user": "David", "dept": "IT", "region": "West", "value": 200, "tags": ["active", "admin"]}
7 | ]
8 | 
```

--------------------------------------------------------------------------------
/ultimate_mcp_server/services/vector/__init__.py:
--------------------------------------------------------------------------------

```python
 1 | """Vector database and embedding operations for Ultimate MCP Server."""
 2 | from ultimate_mcp_server.services.vector.embeddings import (
 3 |     EmbeddingService,
 4 |     get_embedding_service,
 5 | )
 6 | from ultimate_mcp_server.services.vector.vector_service import (
 7 |     VectorCollection,
 8 |     VectorDatabaseService,
 9 |     get_vector_db_service,
10 | )
11 | 
12 | # Create alias for compatibility
13 | get_vector_database_service = get_vector_db_service
14 | 
15 | __all__ = [
16 |     "EmbeddingService",
17 |     "get_embedding_service",
18 |     "VectorCollection",
19 |     "VectorDatabaseService",
20 |     "get_vector_db_service",
21 |     "get_vector_database_service",
22 | ]
```

--------------------------------------------------------------------------------
/ultimate_mcp_server/utils/__init__.py:
--------------------------------------------------------------------------------

```python
 1 | """Utility functions for Ultimate MCP Server."""
 2 | from ultimate_mcp_server.utils.logging.console import console
 3 | from ultimate_mcp_server.utils.logging.logger import (
 4 |     critical,
 5 |     debug,
 6 |     error,
 7 |     get_logger,
 8 |     info,
 9 |     logger,
10 |     section,
11 |     success,
12 |     warning,
13 | )
14 | from ultimate_mcp_server.utils.parsing import parse_result, process_mcp_result
15 | 
16 | __all__ = [
17 |     # Logging utilities
18 |     "logger",
19 |     "console",
20 |     "debug",
21 |     "info",
22 |     "success",
23 |     "warning",
24 |     "error",
25 |     "critical",
26 |     "section",
27 |     "get_logger",
28 |     
29 |     # Parsing utilities
30 |     "parse_result",
31 |     "process_mcp_result",
32 | ]
33 | 
```

--------------------------------------------------------------------------------
/ultimate_mcp_server/services/cache/__init__.py:
--------------------------------------------------------------------------------

```python
 1 | """Caching service for Ultimate MCP Server."""
 2 | from ultimate_mcp_server.services.cache.cache_service import (
 3 |     CacheService,
 4 |     CacheStats,
 5 |     get_cache_service,
 6 |     with_cache,
 7 | )
 8 | from ultimate_mcp_server.services.cache.persistence import CachePersistence
 9 | from ultimate_mcp_server.services.cache.strategies import (
10 |     CacheStrategy,
11 |     ExactMatchStrategy,
12 |     SemanticMatchStrategy,
13 |     TaskBasedStrategy,
14 |     get_strategy,
15 | )
16 | from ultimate_mcp_server.services.cache.utils import run_completion_with_cache
17 | 
18 | __all__ = [
19 |     "CacheService",
20 |     "CacheStats",
21 |     "get_cache_service",
22 |     "with_cache",
23 |     "CachePersistence",
24 |     "CacheStrategy",
25 |     "ExactMatchStrategy",
26 |     "SemanticMatchStrategy",
27 |     "TaskBasedStrategy",
28 |     "get_strategy",
29 |     "run_completion_with_cache",
30 | ]
```

--------------------------------------------------------------------------------
/examples/data/sample_event.txt:
--------------------------------------------------------------------------------

```
 1 | 
 2 |         Tech Conference 2024
 3 |         Location: San Francisco Convention Center, 123 Tech Blvd, San Francisco, CA 94103
 4 |         Date: June 15-17, 2024
 5 |         Time: 9:00 AM - 6:00 PM daily
 6 |         
 7 |         Registration Fee: $599 (Early Bird: $499 until March 31)
 8 |         
 9 |         Keynote Speakers:
10 |         - Dr. Sarah Johnson, AI Research Director at TechCorp
11 |         - Mark Williams, CTO of FutureTech Industries
12 |         - Prof. Emily Chen, MIT Computer Science Department
13 |         
14 |         Special Events:
15 |         - Networking Reception: June 15, 7:00 PM - 10:00 PM
16 |         - Hackathon: June 16, 9:00 PM - 9:00 AM (overnight)
17 |         - Career Fair: June 17, 1:00 PM - 5:00 PM
18 |         
19 |         For more information, contact [email protected] or call (555) 123-4567.
20 |         
```

--------------------------------------------------------------------------------
/ultimate_mcp_server/services/analytics/__init__.py:
--------------------------------------------------------------------------------

```python
 1 | """Analytics service for Ultimate MCP Server."""
 2 | # Analytics implementation is handled separately
 3 | 
 4 | from typing import Optional
 5 | 
 6 | 
 7 | class AnalyticsService:
 8 |     """Service for tracking analytics."""
 9 |     
10 |     def __init__(self):
11 |         """Initialize analytics service."""
12 |         pass
13 |         
14 |     async def track_event(self, event_name: str, properties: Optional[dict] = None):
15 |         """Track an event.
16 |         
17 |         Args:
18 |             event_name: Name of the event
19 |             properties: Event properties
20 |         """
21 |         # Analytics tracking implementation would go here
22 |         pass
23 | 
24 | def get_analytics_service() -> AnalyticsService:
25 |     """Get analytics service.
26 |     
27 |     Returns:
28 |         Analytics service instance
29 |     """
30 |     return AnalyticsService()
31 | 
32 | __all__ = ["get_analytics_service"]
```

--------------------------------------------------------------------------------
/test_connection.py:
--------------------------------------------------------------------------------

```python
 1 | #!/usr/bin/env python3
 2 | """
 3 | Test script to verify Ultimate MCP Server is working correctly
 4 | """
 5 | 
 6 | import asyncio
 7 | import json
 8 | from fastmcp import Client
 9 | 
10 | async def test_streamable_http():
11 |     """Test connection to streamable-http server"""
12 |     server_url = "http://127.0.0.1:8013/mcp"
13 |     
14 |     print("🧪 Testing Streamable-HTTP Connection")
15 |     print("=" * 40)
16 |     
17 |     try:
18 |         async with Client(server_url) as client:
19 |             print("✅ Connected successfully!")
20 |             
21 |             # Test basic functionality
22 |             tools = await client.list_tools()
23 |             print(f"📋 Found {len(tools)} tools")
24 |             
25 |             # Test echo
26 |             echo_result = await client.call_tool("echo", {"message": "Connection test successful!"})
27 |             print(f"📢 Echo: {json.loads(echo_result[0].text)['message']}")
28 |             
29 |             print("🎉 All tests passed!")
30 |             
31 |     except Exception as e:
32 |         print(f"❌ Connection failed: {e}")
33 | 
34 | if __name__ == "__main__":
35 |     asyncio.run(test_streamable_http())
```

--------------------------------------------------------------------------------
/ultimate_mcp_server/core/providers/__init__.py:
--------------------------------------------------------------------------------

```python
 1 | """Provider module for Ultimate MCP Server.
 2 | 
 3 | This module provides access to LLM providers and provider-specific functionality.
 4 | """
 5 | 
 6 | from typing import Dict, Type
 7 | 
 8 | from ultimate_mcp_server.constants import Provider
 9 | from ultimate_mcp_server.core.providers.anthropic import AnthropicProvider
10 | from ultimate_mcp_server.core.providers.base import BaseProvider
11 | from ultimate_mcp_server.core.providers.deepseek import DeepSeekProvider
12 | from ultimate_mcp_server.core.providers.gemini import GeminiProvider
13 | from ultimate_mcp_server.core.providers.grok import GrokProvider
14 | from ultimate_mcp_server.core.providers.ollama import OllamaProvider
15 | from ultimate_mcp_server.core.providers.openai import OpenAIProvider
16 | from ultimate_mcp_server.core.providers.openrouter import OpenRouterProvider
17 | 
18 | # Provider registry
19 | PROVIDER_REGISTRY: Dict[str, Type[BaseProvider]] = {
20 |     Provider.OPENAI.value: OpenAIProvider,
21 |     Provider.ANTHROPIC.value: AnthropicProvider,
22 |     Provider.DEEPSEEK.value: DeepSeekProvider,
23 |     Provider.GEMINI.value: GeminiProvider,
24 |     Provider.OPENROUTER.value: OpenRouterProvider,
25 |     Provider.GROK.value: GrokProvider,
26 |     Provider.OLLAMA.value: OllamaProvider,
27 | }
28 | 
29 | __all__ = ["PROVIDER_REGISTRY"]
```