# Directory Structure ``` ├── .dockerignore ├── .env.template ├── .github │ ├── ISSUE_TEMPLATE │ │ ├── bug_report.yml │ │ ├── config.yml │ │ ├── feature_request.yml │ │ └── question.yml │ ├── TRIAGE_AUTOMATION.md │ ├── VALIDATION_SUMMARY.md │ └── workflows │ ├── bug-triage.yml │ ├── ci.yml │ ├── claude-code-review.yml │ ├── claude.yml │ ├── issue-management.yml │ ├── label-management.yml │ ├── security.yml │ └── triage-metrics.yml ├── .gitignore ├── Dockerfile ├── docs │ ├── api_reference.md │ ├── configuration.md │ ├── contributing.md │ ├── deploying_with_toolhive.md │ ├── docker_deployment.md │ ├── installation.md │ └── usage.md ├── LICENSE ├── pyproject.toml ├── README.md ├── server.json ├── src │ └── prometheus_mcp_server │ ├── __init__.py │ ├── logging_config.py │ ├── main.py │ └── server.py ├── tests │ ├── test_docker_integration.py │ ├── test_logging_config.py │ ├── test_main.py │ ├── test_mcp_protocol_compliance.py │ ├── test_server.py │ └── test_tools.py └── uv.lock ``` # Files -------------------------------------------------------------------------------- /.dockerignore: -------------------------------------------------------------------------------- ``` 1 | # Git 2 | .git 3 | .gitignore 4 | .github 5 | 6 | # CI 7 | .codeclimate.yml 8 | .travis.yml 9 | .taskcluster.yml 10 | 11 | # Docker 12 | docker-compose.yml 13 | .docker 14 | 15 | # Byte-compiled / optimized / DLL files 16 | **/__pycache__/ 17 | **/*.py[cod] 18 | **/*$py.class 19 | **/*.so 20 | **/.pytest_cache 21 | **/.coverage 22 | **/htmlcov 23 | 24 | # Virtual environment 25 | .env 26 | .venv/ 27 | venv/ 28 | ENV/ 29 | 30 | # IDE 31 | .idea 32 | .vscode 33 | 34 | # macOS 35 | .DS_Store 36 | 37 | # Windows 38 | Thumbs.db 39 | 40 | # Config 41 | .env 42 | 43 | # Distribution / packaging 44 | *.egg-info/ 45 | ``` 
-------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- ``` 1 | # Python 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | *.so 6 | .Python 7 | build/ 8 | develop-eggs/ 9 | dist/ 10 | downloads/ 11 | eggs/ 12 | .eggs/ 13 | lib/ 14 | lib64/ 15 | parts/ 16 | sdist/ 17 | var/ 18 | wheels/ 19 | *.egg-info/ 20 | .installed.cfg 21 | *.egg 22 | PYTHONPATH 23 | 24 | # Environment 25 | .env 26 | .venv 27 | venv/ 28 | ENV/ 29 | env/ 30 | 31 | # IDE 32 | .idea/ 33 | .vscode/ 34 | *.swp 35 | *.swo 36 | 37 | # Logging 38 | *.log 39 | 40 | # OS specific 41 | .DS_Store 42 | Thumbs.db 43 | 44 | # pytest 45 | .pytest_cache/ 46 | .coverage 47 | htmlcov/ 48 | 49 | # Claude Code 50 | CLAUDE.md 51 | 52 | # Claude Flow temporary files 53 | .claude-flow/ 54 | .swarm/ 55 | 56 | # Security scan results 57 | trivy*.json 58 | trivy-*.json 59 | ``` -------------------------------------------------------------------------------- /.env.template: -------------------------------------------------------------------------------- ``` 1 | # Prometheus configuration 2 | PROMETHEUS_URL=http://your-prometheus-server:9090 3 | 4 | # Authentication (if needed) 5 | # Choose one of the following authentication methods (if required): 6 | 7 | # For basic auth 8 | PROMETHEUS_USERNAME=your_username 9 | PROMETHEUS_PASSWORD=your_password 10 | 11 | # For bearer token auth 12 | PROMETHEUS_TOKEN=your_token 13 | 14 | # Optional: Custom MCP configuration 15 | # PROMETHEUS_MCP_SERVER_TRANSPORT=stdio # Choose between http, stdio, sse. If undefined, stdio is set as the default transport. 16 | 17 | # Optional: Only relevant for non-stdio transports 18 | # PROMETHEUS_MCP_BIND_HOST=localhost # if undefined, 127.0.0.1 is set by default. 19 | # PROMETHEUS_MCP_BIND_PORT=8080 # if undefined, 8080 is set by default. 
``` -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- ```markdown 1 | # Prometheus MCP Server 2 | [](https://github.com/users/pab1it0/packages/container/package/prometheus-mcp-server) 3 | [](https://github.com/pab1it0/prometheus-mcp-server/releases) 4 | [](https://codecov.io/gh/pab1it0/prometheus-mcp-server) 5 |  6 | [](https://github.com/pab1it0/prometheus-mcp-server/blob/main/LICENSE) 7 | 8 | A [Model Context Protocol][mcp] (MCP) server for Prometheus. 9 | 10 | This provides access to your Prometheus metrics and queries through standardized MCP interfaces, allowing AI assistants to execute PromQL queries and analyze your metrics data. 11 | 12 | [mcp]: https://modelcontextprotocol.io 13 | 14 | ## Features 15 | 16 | - [x] Execute PromQL queries against Prometheus 17 | - [x] Discover and explore metrics 18 | - [x] List available metrics 19 | - [x] Get metadata for specific metrics 20 | - [x] View instant query results 21 | - [x] View range query results with different step intervals 22 | - [x] Authentication support 23 | - [x] Basic auth from environment variables 24 | - [x] Bearer token auth from environment variables 25 | - [x] Docker containerization support 26 | 27 | - [x] Provide interactive tools for AI assistants 28 | 29 | The list of tools is configurable, so you can choose which tools you want to make available to the MCP client. 30 | This is useful if you don't use certain functionality or if you don't want to take up too much of the context window. 31 | 32 | ## Getting Started 33 | 34 | ### Prerequisites 35 | 36 | - Prometheus server accessible from your environment 37 | - Docker Desktop (recommended) or Docker CLI 38 | - MCP-compatible client (Claude Desktop, VS Code, Cursor, Windsurf, etc.) 
39 | 40 | ### Installation Methods 41 | 42 | <details> 43 | <summary><b>Claude Desktop</b></summary> 44 | 45 | Add to your Claude Desktop configuration: 46 | 47 | ```json 48 | { 49 | "mcpServers": { 50 | "prometheus": { 51 | "command": "docker", 52 | "args": [ 53 | "run", 54 | "-i", 55 | "--rm", 56 | "-e", 57 | "PROMETHEUS_URL", 58 | "ghcr.io/pab1it0/prometheus-mcp-server:latest" 59 | ], 60 | "env": { 61 | "PROMETHEUS_URL": "<your-prometheus-url>" 62 | } 63 | } 64 | } 65 | } 66 | ``` 67 | </details> 68 | 69 | <details> 70 | <summary><b>Claude Code</b></summary> 71 | 72 | Install via the Claude Code CLI: 73 | 74 | ```bash 75 | claude mcp add prometheus --env PROMETHEUS_URL=http://your-prometheus:9090 -- docker run -i --rm -e PROMETHEUS_URL ghcr.io/pab1it0/prometheus-mcp-server:latest 76 | ``` 77 | </details> 78 | 79 | <details> 80 | <summary><b>VS Code / Cursor / Windsurf</b></summary> 81 | 82 | Add to your MCP settings in the respective IDE: 83 | 84 | ```json 85 | { 86 | "prometheus": { 87 | "command": "docker", 88 | "args": [ 89 | "run", 90 | "-i", 91 | "--rm", 92 | "-e", 93 | "PROMETHEUS_URL", 94 | "ghcr.io/pab1it0/prometheus-mcp-server:latest" 95 | ], 96 | "env": { 97 | "PROMETHEUS_URL": "<your-prometheus-url>" 98 | } 99 | } 100 | } 101 | ``` 102 | </details> 103 | 104 | <details> 105 | <summary><b>Docker Desktop</b></summary> 106 | 107 | The easiest way to run the Prometheus MCP server is through Docker Desktop: 108 | 109 | <a href="https://hub.docker.com/open-desktop?url=https://open.docker.com/dashboard/mcp/servers/id/prometheus/config?enable=true"> 110 | <img src="https://img.shields.io/badge/+%20Add%20to-Docker%20Desktop-2496ED?style=for-the-badge&logo=docker&logoColor=white" alt="Add to Docker Desktop" /> 111 | </a> 112 | 113 | 1. **Via MCP Catalog**: Visit the [Prometheus MCP Server on Docker Hub](https://hub.docker.com/mcp/server/prometheus/overview) and click the button above 114 | 115 | 2. 
**Via MCP Toolkit**: Use Docker Desktop's MCP Toolkit extension to discover and install the server 116 | 117 | 3. Configure your connection using environment variables (see Configuration Options below) 118 | 119 | </details> 120 | 121 | <details> 122 | <summary><b>Manual Docker Setup</b></summary> 123 | 124 | Run directly with Docker: 125 | 126 | ```bash 127 | # With environment variables 128 | docker run -i --rm \ 129 | -e PROMETHEUS_URL="http://your-prometheus:9090" \ 130 | ghcr.io/pab1it0/prometheus-mcp-server:latest 131 | 132 | # With authentication 133 | docker run -i --rm \ 134 | -e PROMETHEUS_URL="http://your-prometheus:9090" \ 135 | -e PROMETHEUS_USERNAME="admin" \ 136 | -e PROMETHEUS_PASSWORD="password" \ 137 | ghcr.io/pab1it0/prometheus-mcp-server:latest 138 | ``` 139 | </details> 140 | 141 | ### Configuration Options 142 | 143 | | Variable | Description | Required | 144 | |----------|-------------|----------| 145 | | `PROMETHEUS_URL` | URL of your Prometheus server | Yes | 146 | | `PROMETHEUS_USERNAME` | Username for basic authentication | No | 147 | | `PROMETHEUS_PASSWORD` | Password for basic authentication | No | 148 | | `PROMETHEUS_TOKEN` | Bearer token for authentication | No | 149 | | `ORG_ID` | Organization ID for multi-tenant setups | No | 150 | | `PROMETHEUS_MCP_SERVER_TRANSPORT` | Transport mode (stdio, http, sse) | No (default: stdio) | 151 | | `PROMETHEUS_MCP_BIND_HOST` | Host for HTTP transport | No (default: 127.0.0.1) | 152 | | `PROMETHEUS_MCP_BIND_PORT` | Port for HTTP transport | No (default: 8080) | 153 | 154 | 155 | ## Development 156 | 157 | Contributions are welcome! Please open an issue or submit a pull request if you have any suggestions or improvements. 158 | 159 | This project uses [`uv`](https://github.com/astral-sh/uv) to manage dependencies. 
Install `uv` following the instructions for your platform: 160 | 161 | ```bash 162 | curl -LsSf https://astral.sh/uv/install.sh | sh 163 | ``` 164 | 165 | You can then create a virtual environment and install the dependencies with: 166 | 167 | ```bash 168 | uv venv 169 | source .venv/bin/activate # On Unix/macOS 170 | .venv\Scripts\activate # On Windows 171 | uv pip install -e . 172 | ``` 173 | 174 | ### Testing 175 | 176 | The project includes a comprehensive test suite that ensures functionality and helps prevent regressions. 177 | 178 | Run the tests with pytest: 179 | 180 | ```bash 181 | # Install development dependencies 182 | uv pip install -e ".[dev]" 183 | 184 | # Run the tests 185 | pytest 186 | 187 | # Run with coverage report 188 | pytest --cov=src --cov-report=term-missing 189 | ``` 190 | 191 | When adding new features, please also add corresponding tests. 192 | 193 | ### Tools 194 | 195 | | Tool | Category | Description | 196 | | --- | --- | --- | 197 | | `execute_query` | Query | Execute a PromQL instant query against Prometheus | 198 | | `execute_range_query` | Query | Execute a PromQL range query with start time, end time, and step interval | 199 | | `list_metrics` | Discovery | List all available metrics in Prometheus | 200 | | `get_metric_metadata` | Discovery | Get metadata for a specific metric | 201 | | `get_targets` | Discovery | Get information about all scrape targets | 202 | 203 | ## License 204 | 205 | MIT 206 | 207 | --- 208 | 209 | [mcp]: https://modelcontextprotocol.io ``` -------------------------------------------------------------------------------- /docs/contributing.md: -------------------------------------------------------------------------------- ```markdown 1 | # Contributing Guide 2 | 3 | Thank you for your interest in contributing to the Prometheus MCP Server project! This guide will help you get started with contributing to the project. 
4 | 5 | ## Prerequisites 6 | 7 | - Python 3.10 or higher 8 | - [uv](https://github.com/astral-sh/uv) package manager (recommended) 9 | - Git 10 | - A Prometheus server for testing (you can use a local Docker instance for development) 11 | 12 | ## Development Environment Setup 13 | 14 | 1. Fork the repository on GitHub. 15 | 16 | 2. Clone your fork to your local machine: 17 | 18 | ```bash 19 | git clone https://github.com/YOUR_USERNAME/prometheus-mcp-server.git 20 | cd prometheus-mcp-server 21 | ``` 22 | 23 | 3. Create and activate a virtual environment: 24 | 25 | ```bash 26 | # Using uv (recommended) 27 | uv venv 28 | source .venv/bin/activate # On Unix/macOS 29 | .venv\Scripts\activate # On Windows 30 | 31 | # Using venv (alternative) 32 | python -m venv venv 33 | source venv/bin/activate # On Unix/macOS 34 | venv\Scripts\activate # On Windows 35 | ``` 36 | 37 | 4. Install the package in development mode with testing dependencies: 38 | 39 | ```bash 40 | # Using uv (recommended) 41 | uv pip install -e ".[dev]" 42 | 43 | # Using pip (alternative) 44 | pip install -e ".[dev]" 45 | ``` 46 | 47 | 5. Create a local `.env` file for development and testing: 48 | 49 | ```bash 50 | cp .env.template .env 51 | # Edit the .env file with your Prometheus server details 52 | ``` 53 | 54 | ## Running Tests 55 | 56 | The project uses pytest for testing. Run the test suite with: 57 | 58 | ```bash 59 | pytest 60 | ``` 61 | 62 | For more detailed test output with coverage information: 63 | 64 | ```bash 65 | pytest --cov=src --cov-report=term-missing 66 | ``` 67 | 68 | ## Code Style 69 | 70 | This project follows PEP 8 Python coding standards. Some key points: 71 | 72 | - Use 4 spaces for indentation (no tabs) 73 | - Maximum line length of 100 characters 74 | - Use descriptive variable names 75 | - Write docstrings for all functions, classes, and modules 76 | 77 | ### Pre-commit Hooks 78 | 79 | The project uses pre-commit hooks to ensure code quality. 
Install them with: 80 | 81 | ```bash 82 | pip install pre-commit 83 | pre-commit install 84 | ``` 85 | 86 | ## Pull Request Process 87 | 88 | 1. Create a new branch for your feature or bugfix: 89 | 90 | ```bash 91 | git checkout -b feature/your-feature-name 92 | # or 93 | git checkout -b fix/issue-description 94 | ``` 95 | 96 | 2. Make your changes and commit them with clear, descriptive commit messages. 97 | 98 | 3. Write or update tests to cover your changes. 99 | 100 | 4. Ensure all tests pass before submitting your pull request. 101 | 102 | 5. Update documentation to reflect any changes. 103 | 104 | 6. Push your branch to your fork: 105 | 106 | ```bash 107 | git push origin feature/your-feature-name 108 | ``` 109 | 110 | 7. Open a pull request against the main repository. 111 | 112 | ## Adding New Features 113 | 114 | When adding new features to the Prometheus MCP Server, follow these guidelines: 115 | 116 | 1. **Start with tests**: Write tests that describe the expected behavior of the feature. 117 | 118 | 2. **Document thoroughly**: Add docstrings and update relevant documentation files. 119 | 120 | 3. **Maintain compatibility**: Ensure new features don't break existing functionality. 121 | 122 | 4. **Error handling**: Implement robust error handling with clear error messages. 123 | 124 | ### Adding a New Tool 125 | 126 | To add a new tool to the MCP server: 127 | 128 | 1. Add the tool function in `server.py` with the `@mcp.tool` decorator: 129 | 130 | ```python 131 | @mcp.tool(description="Description of your new tool") 132 | async def your_new_tool(param1: str, param2: int = 0) -> Dict[str, Any]: 133 | """Detailed docstring for your tool. 134 | 135 | Args: 136 | param1: Description of param1 137 | param2: Description of param2, with default 138 | 139 | Returns: 140 | Description of the return value 141 | """ 142 | # Implementation 143 | # ... 144 | return result 145 | ``` 146 | 147 | 2. Add tests for your new tool in `tests/test_tools.py`. 148 | 149 | 3. 
Update the documentation to include your new tool. 150 | 151 | ## Reporting Issues 152 | 153 | When reporting issues, please include: 154 | 155 | - A clear, descriptive title 156 | - A detailed description of the issue 157 | - Steps to reproduce the bug, if applicable 158 | - Expected and actual behavior 159 | - Python version and operating system 160 | - Any relevant logs or error messages 161 | 162 | ## Feature Requests 163 | 164 | Feature requests are welcome! When proposing new features: 165 | 166 | - Clearly describe the feature and the problem it solves 167 | - Explain how it aligns with the project's goals 168 | - Consider implementation details and potential challenges 169 | - Indicate if you're willing to work on implementing it 170 | 171 | ## Questions and Discussions 172 | 173 | For questions or discussions about the project, feel free to open a discussion on GitHub. 174 | 175 | Thank you for contributing to the Prometheus MCP Server project! ``` -------------------------------------------------------------------------------- /src/prometheus_mcp_server/__init__.py: -------------------------------------------------------------------------------- ```python 1 | """Prometheus MCP Server. 2 | 3 | A Model Context Protocol (MCP) server that enables AI assistants to query 4 | and analyze Prometheus metrics through standardized interfaces. 
5 | """ 6 | 7 | __version__ = "1.0.0" 8 | ``` -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/config.yml: -------------------------------------------------------------------------------- ```yaml 1 | blank_issues_enabled: false 2 | contact_links: 3 | - name: 📚 Documentation 4 | url: https://github.com/pab1it0/prometheus-mcp-server/blob/main/README.md 5 | about: Read the project documentation and setup guides 6 | - name: 💬 Discussions 7 | url: https://github.com/pab1it0/prometheus-mcp-server/discussions 8 | about: Ask questions, share ideas, and discuss with the community 9 | - name: 🔒 Security Issues 10 | url: mailto:[email protected] 11 | about: Report security vulnerabilities privately via email ``` -------------------------------------------------------------------------------- /src/prometheus_mcp_server/logging_config.py: -------------------------------------------------------------------------------- ```python 1 | #!/usr/bin/env python 2 | 3 | import logging 4 | import sys 5 | from typing import Any, Dict 6 | 7 | import structlog 8 | 9 | 10 | def setup_logging() -> structlog.BoundLogger: 11 | """Configure structured JSON logging for the MCP server. 
12 | 13 | Returns: 14 | Configured structlog logger instance 15 | """ 16 | # Configure structlog to use standard library logging 17 | structlog.configure( 18 | processors=[ 19 | # Add log level and ISO 8601 timestamp to every log record 20 | structlog.stdlib.add_log_level, 21 | structlog.processors.TimeStamper(fmt="iso"), 22 | # Render stack info and exception tracebacks 23 | structlog.processors.StackInfoRenderer(), 24 | structlog.processors.format_exc_info, 25 | # Convert to JSON 26 | structlog.processors.JSONRenderer() 27 | ], 28 | wrapper_class=structlog.stdlib.BoundLogger, 29 | logger_factory=structlog.stdlib.LoggerFactory(), 30 | context_class=dict, 31 | cache_logger_on_first_use=True, 32 | ) 33 | 34 | # Configure standard library logging to output to stderr 35 | logging.basicConfig( 36 | format="%(message)s", 37 | stream=sys.stderr, 38 | level=logging.INFO, 39 | ) 40 | 41 | # Create and return the logger 42 | logger = structlog.get_logger("prometheus_mcp_server") 43 | return logger 44 | 45 | 46 | def get_logger() -> structlog.BoundLogger: 47 | """Get the configured logger instance. 
48 | 49 | Returns: 50 | Configured structlog logger instance 51 | """ 52 | return structlog.get_logger("prometheus_mcp_server") ``` -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- ```toml 1 | [project] 2 | name = "prometheus_mcp_server" 3 | version = "1.3.1" 4 | description = "MCP server for Prometheus integration" 5 | readme = "README.md" 6 | requires-python = ">=3.10" 7 | dependencies = [ 8 | "mcp[cli]", 9 | "prometheus-api-client", 10 | "python-dotenv", 11 | "pyproject-toml>=0.1.0", 12 | "requests", 13 | "structlog>=23.0.0", 14 | "fastmcp>=2.11.3", 15 | ] 16 | 17 | [project.optional-dependencies] 18 | dev = [ 19 | "pytest>=7.0.0", 20 | "pytest-cov>=4.0.0", 21 | "pytest-asyncio>=0.21.0", 22 | "pytest-mock>=3.10.0", 23 | "docker>=7.0.0", 24 | "requests>=2.31.0", 25 | ] 26 | 27 | [project.scripts] 28 | prometheus-mcp-server = "prometheus_mcp_server.main:run_server" 29 | 30 | [tool.setuptools] 31 | packages = ["prometheus_mcp_server"] 32 | package-dir = {"" = "src"} 33 | 34 | [build-system] 35 | requires = ["setuptools>=61.0"] 36 | build-backend = "setuptools.build_meta" 37 | 38 | [tool.pytest.ini_options] 39 | testpaths = ["tests"] 40 | python_files = "test_*.py" 41 | python_functions = "test_*" 42 | python_classes = "Test*" 43 | addopts = "--cov=src --cov-report=term-missing" 44 | 45 | [tool.coverage.run] 46 | source = ["src/prometheus_mcp_server"] 47 | omit = ["*/__pycache__/*", "*/tests/*", "*/.venv/*", "*/venv/*"] 48 | branch = true 49 | 50 | [tool.coverage.report] 51 | exclude_lines = [ 52 | "pragma: no cover", 53 | "def __repr__", 54 | "if self.debug:", 55 | "raise NotImplementedError", 56 | "if __name__ == .__main__.:", 57 | "pass", 58 | "raise ImportError" 59 | ] 60 | precision = 1 61 | show_missing = true 62 | skip_covered = false 63 | fail_under = 89 64 | 65 | [tool.coverage.json] 66 | show_contexts = true 67 | 68 | 
[tool.coverage.xml] 69 | output = "coverage.xml" 70 | ``` -------------------------------------------------------------------------------- /server.json: -------------------------------------------------------------------------------- ```json 1 | { 2 | "$schema": "https://static.modelcontextprotocol.io/schemas/2025-09-29/server.schema.json", 3 | "name": "io.github.pab1it0/prometheus-mcp-server", 4 | "description": "MCP server providing Prometheus metrics access and PromQL query execution for AI assistants", 5 | "version": "1.3.1", 6 | "repository": { 7 | "url": "https://github.com/pab1it0/prometheus-mcp-server", 8 | "source": "github" 9 | }, 10 | "websiteUrl": "https://pab1it0.github.io/prometheus-mcp-server", 11 | "packages": [ 12 | { 13 | "registryType": "oci", 14 | "registryBaseUrl": "https://ghcr.io", 15 | "identifier": "pab1it0/prometheus-mcp-server", 16 | "version": "1.3.1", 17 | "transport": { 18 | "type": "stdio" 19 | }, 20 | "environmentVariables": [ 21 | { 22 | "name": "PROMETHEUS_URL", 23 | "description": "Prometheus server URL (e.g., http://localhost:9090)", 24 | "isRequired": true, 25 | "format": "string", 26 | "isSecret": false 27 | }, 28 | { 29 | "name": "PROMETHEUS_USERNAME", 30 | "description": "Username for Prometheus basic authentication", 31 | "isRequired": false, 32 | "format": "string", 33 | "isSecret": false 34 | }, 35 | { 36 | "name": "PROMETHEUS_PASSWORD", 37 | "description": "Password for Prometheus basic authentication", 38 | "isRequired": false, 39 | "format": "string", 40 | "isSecret": true 41 | }, 42 | { 43 | "name": "PROMETHEUS_TOKEN", 44 | "description": "Bearer token for Prometheus authentication", 45 | "isRequired": false, 46 | "format": "string", 47 | "isSecret": true 48 | }, 49 | { 50 | "name": "ORG_ID", 51 | "description": "Organization ID for multi-tenant Prometheus setups", 52 | "isRequired": false, 53 | "format": "string", 54 | "isSecret": false 55 | } 56 | ] 57 | } 58 | ] 59 | } 60 | ``` 
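The environment variables declared in `server.json` above can be checked before launching the server. What follows is a minimal sketch, not code from this repository: the helper name `resolve_auth` and its return shape are assumptions made for illustration. The precedence (basic auth first, then bearer token) follows the order in which `setup_environment` in `main.py` reports the authentication method.

```python
from typing import Mapping, Optional, Tuple


def resolve_auth(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return (auth_method, org_id) for a mapping of environment variables.

    Hypothetical helper: it only inspects the variables listed in server.json.
    """
    # PROMETHEUS_URL is the only variable marked isRequired in server.json.
    if not env.get("PROMETHEUS_URL"):
        raise ValueError("PROMETHEUS_URL is required")

    # Basic auth needs both username and password; otherwise fall back to a token.
    if env.get("PROMETHEUS_USERNAME") and env.get("PROMETHEUS_PASSWORD"):
        method = "basic_auth"
    elif env.get("PROMETHEUS_TOKEN"):
        method = "bearer_token"
    else:
        method = "none"

    # Treat an empty ORG_ID the same as an unset one.
    return method, env.get("ORG_ID") or None
```

Passing `os.environ` as `env` would check the current process environment; a plain dict works equally well in tests.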
-------------------------------------------------------------------------------- /tests/test_logging_config.py: -------------------------------------------------------------------------------- ```python 1 | """Tests for the logging configuration module.""" 2 | 3 | import json 4 | import logging 5 | import sys 6 | from io import StringIO 7 | from unittest.mock import patch 8 | 9 | import pytest 10 | import structlog 11 | 12 | from prometheus_mcp_server.logging_config import setup_logging, get_logger 13 | 14 | 15 | def test_setup_logging_returns_logger(): 16 | """Test that setup_logging returns a structlog logger.""" 17 | logger = setup_logging() 18 | # Check that it has the methods we expect from a structlog logger 19 | assert hasattr(logger, 'info') 20 | assert hasattr(logger, 'error') 21 | assert hasattr(logger, 'warning') 22 | assert hasattr(logger, 'debug') 23 | 24 | 25 | def test_get_logger_returns_logger(): 26 | """Test that get_logger returns a structlog logger.""" 27 | logger = get_logger() 28 | # Check that it has the methods we expect from a structlog logger 29 | assert hasattr(logger, 'info') 30 | assert hasattr(logger, 'error') 31 | assert hasattr(logger, 'warning') 32 | assert hasattr(logger, 'debug') 33 | 34 | 35 | def test_structured_logging_outputs_json(): 36 | """Test that the logger can be configured and used.""" 37 | # Just test that the logger can be created and called without errors 38 | logger = setup_logging() 39 | 40 | # These should not raise exceptions 41 | logger.info("Test message", test_field="test_value", number=42) 42 | logger.warning("Warning message") 43 | logger.error("Error message") 44 | 45 | # Test that we can create multiple loggers 46 | logger2 = get_logger() 47 | logger2.info("Another test message") 48 | 49 | 50 | def test_logging_levels(): 51 | """Test that different logging levels work correctly.""" 52 | logger = setup_logging() 53 | 54 | # Test that all logging levels can be called without errors 55 | logger.debug("Debug 
message") 56 | logger.info("Info message") 57 | logger.warning("Warning message") 58 | logger.error("Error message") 59 | 60 | # Test with structured data 61 | logger.info("Structured message", user_id=123, action="test") 62 | logger.error("Error with context", error_code=500, module="test") ``` -------------------------------------------------------------------------------- /.github/workflows/claude.yml: -------------------------------------------------------------------------------- ```yaml 1 | name: Claude Code 2 | 3 | on: 4 | issue_comment: 5 | types: [created] 6 | pull_request_review_comment: 7 | types: [created] 8 | issues: 9 | types: [opened, assigned] 10 | pull_request_review: 11 | types: [submitted] 12 | 13 | jobs: 14 | claude: 15 | if: | 16 | (github.event_name == 'issue_comment' && contains(github.event.comment.body, '@claude')) || 17 | (github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude')) || 18 | (github.event_name == 'pull_request_review' && contains(github.event.review.body, '@claude')) || 19 | (github.event_name == 'issues' && (contains(github.event.issue.body, '@claude') || contains(github.event.issue.title, '@claude'))) 20 | runs-on: ubuntu-latest 21 | permissions: 22 | contents: read 23 | pull-requests: read 24 | issues: read 25 | id-token: write 26 | actions: read # Required for Claude to read CI results on PRs 27 | steps: 28 | - name: Checkout repository 29 | uses: actions/checkout@v4 30 | with: 31 | fetch-depth: 1 32 | 33 | - name: Run Claude Code 34 | id: claude 35 | uses: anthropics/claude-code-action@beta 36 | with: 37 | claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }} 38 | 39 | # This is an optional setting that allows Claude to read CI results on PRs 40 | additional_permissions: | 41 | actions: read 42 | 43 | # Optional: Specify model (defaults to Claude Sonnet 4, uncomment for Claude Opus 4.1) 44 | # model: "claude-opus-4-1-20250805" 45 | 46 | # Optional: Customize the 
trigger phrase (default: @claude) 47 | # trigger_phrase: "/claude" 48 | 49 | # Optional: Trigger when specific user is assigned to an issue 50 | # assignee_trigger: "claude-bot" 51 | 52 | # Optional: Allow Claude to run specific commands 53 | # allowed_tools: "Bash(npm install),Bash(npm run build),Bash(npm run test:*),Bash(npm run lint:*)" 54 | 55 | # Optional: Add custom instructions for Claude to customize its behavior for your project 56 | # custom_instructions: | 57 | # Follow our coding standards 58 | # Ensure all new code has tests 59 | # Use TypeScript for new files 60 | 61 | # Optional: Custom environment variables for Claude 62 | # claude_env: | 63 | # NODE_ENV: test 64 | 65 | ``` -------------------------------------------------------------------------------- /.github/workflows/security.yml: -------------------------------------------------------------------------------- ```yaml 1 | name: trivy 2 | 3 | on: 4 | push: 5 | branches: [ "main" ] 6 | pull_request: 7 | # The branches below must be a subset of the branches above 8 | branches: [ "main" ] 9 | schedule: 10 | - cron: '36 8 * * 3' 11 | 12 | permissions: 13 | contents: read 14 | 15 | jobs: 16 | # Security scan with failure on CRITICAL vulnerabilities 17 | security-scan: 18 | permissions: 19 | contents: read 20 | security-events: write 21 | actions: read 22 | name: Security Scan 23 | runs-on: ubuntu-latest 24 | steps: 25 | - name: Checkout code 26 | uses: actions/checkout@v4 27 | 28 | - name: Build Docker image for scanning 29 | run: | 30 | docker build -t ghcr.io/pab1it0/prometheus-mcp-server:${{ github.sha }} . 
31 | 32 | - name: Run Trivy vulnerability scanner (fail on CRITICAL Python packages only) 33 | uses: aquasecurity/trivy-action@7b7aa264d83dc58691451798b4d117d53d21edfe 34 | with: 35 | image-ref: 'ghcr.io/pab1it0/prometheus-mcp-server:${{ github.sha }}' 36 | format: 'table' 37 | severity: 'CRITICAL' 38 | exit-code: '1' 39 | scanners: 'vuln' 40 | vuln-type: 'library' 41 | 42 | - name: Run Trivy vulnerability scanner (SARIF output) 43 | uses: aquasecurity/trivy-action@7b7aa264d83dc58691451798b4d117d53d21edfe 44 | if: always() 45 | with: 46 | image-ref: 'ghcr.io/pab1it0/prometheus-mcp-server:${{ github.sha }}' 47 | format: 'template' 48 | template: '@/contrib/sarif.tpl' 49 | output: 'trivy-results.sarif' 50 | severity: 'CRITICAL,HIGH,MEDIUM' 51 | 52 | - name: Upload Trivy scan results to GitHub Security tab 53 | uses: github/codeql-action/upload-sarif@v3 54 | if: always() 55 | with: 56 | sarif_file: 'trivy-results.sarif' 57 | 58 | # Additional filesystem scan for source code vulnerabilities 59 | filesystem-scan: 60 | permissions: 61 | contents: read 62 | security-events: write 63 | name: Filesystem Security Scan 64 | runs-on: ubuntu-latest 65 | steps: 66 | - name: Checkout code 67 | uses: actions/checkout@v4 68 | 69 | - name: Run Trivy filesystem scanner 70 | uses: aquasecurity/trivy-action@7b7aa264d83dc58691451798b4d117d53d21edfe 71 | with: 72 | scan-type: 'fs' 73 | scan-ref: '.' 
74 | format: 'template' 75 | template: '@/contrib/sarif.tpl' 76 | output: 'trivy-fs-results.sarif' 77 | severity: 'CRITICAL,HIGH' 78 | 79 | - name: Upload filesystem scan results to GitHub Security tab 80 | uses: github/codeql-action/upload-sarif@v3 81 | if: always() 82 | with: 83 | sarif_file: 'trivy-fs-results.sarif' ``` -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- ```dockerfile 1 | FROM python:3.12-slim-bookworm AS builder 2 | 3 | COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv 4 | 5 | WORKDIR /app 6 | 7 | ENV UV_COMPILE_BYTECODE=1 \ 8 | UV_LINK_MODE=copy 9 | 10 | COPY pyproject.toml ./ 11 | COPY uv.lock ./ 12 | 13 | COPY src ./src/ 14 | 15 | RUN uv venv && \ 16 | uv sync --frozen --no-dev && \ 17 | uv pip install -e . --no-deps && \ 18 | uv pip install --upgrade pip setuptools 19 | 20 | FROM python:3.12-slim-bookworm 21 | 22 | WORKDIR /app 23 | 24 | RUN apt-get update && \ 25 | apt-get upgrade -y && \ 26 | apt-get install -y --no-install-recommends \ 27 | curl \ 28 | procps \ 29 | ca-certificates && \ 30 | rm -rf /var/lib/apt/lists/* && \ 31 | apt-get clean && \ 32 | apt-get autoremove -y 33 | 34 | RUN groupadd -r -g 1000 app && \ 35 | useradd -r -g app -u 1000 -d /app -s /bin/false app && \ 36 | chown -R app:app /app && \ 37 | chmod 755 /app && \ 38 | chmod -R go-w /app 39 | 40 | COPY --from=builder --chown=app:app /app/.venv /app/.venv 41 | COPY --from=builder --chown=app:app /app/src /app/src 42 | COPY --chown=app:app pyproject.toml /app/ 43 | 44 | ENV PATH="/app/.venv/bin:$PATH" \ 45 | PYTHONUNBUFFERED=1 \ 46 | PYTHONDONTWRITEBYTECODE=1 \ 47 | PYTHONPATH="/app" \ 48 | PYTHONFAULTHANDLER=1 \ 49 | PROMETHEUS_MCP_BIND_HOST=0.0.0.0 \ 50 | PROMETHEUS_MCP_BIND_PORT=8080 51 | 52 | USER app 53 | 54 | EXPOSE 8080 55 | 56 | HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \ 57 | CMD if [ 
"$PROMETHEUS_MCP_SERVER_TRANSPORT" = "http" ] || [ "$PROMETHEUS_MCP_SERVER_TRANSPORT" = "sse" ]; then \ 58 | curl -f http://localhost:${PROMETHEUS_MCP_BIND_PORT}/ >/dev/null 2>&1 || exit 1; \ 59 | else \ 60 | pgrep -f prometheus-mcp-server >/dev/null 2>&1 || exit 1; \ 61 | fi 62 | 63 | CMD ["/app/.venv/bin/prometheus-mcp-server"] 64 | 65 | LABEL org.opencontainers.image.title="Prometheus MCP Server" \ 66 | org.opencontainers.image.description="Model Context Protocol server for Prometheus integration, enabling AI assistants to query metrics and monitor system health" \ 67 | org.opencontainers.image.version="1.3.1" \ 68 | org.opencontainers.image.authors="Pavel Shklovsky <[email protected]>" \ 69 | org.opencontainers.image.source="https://github.com/pab1it0/prometheus-mcp-server" \ 70 | org.opencontainers.image.licenses="MIT" \ 71 | org.opencontainers.image.url="https://github.com/pab1it0/prometheus-mcp-server" \ 72 | org.opencontainers.image.documentation="https://github.com/pab1it0/prometheus-mcp-server/blob/main/docs/" \ 73 | org.opencontainers.image.vendor="Pavel Shklovsky" \ 74 | org.opencontainers.image.base.name="python:3.12-slim-bookworm" \ 75 | org.opencontainers.image.created="" \ 76 | org.opencontainers.image.revision="" \ 77 | io.modelcontextprotocol.server.name="io.github.pab1it0/prometheus-mcp-server" \ 78 | mcp.server.name="prometheus-mcp-server" \ 79 | mcp.server.category="monitoring" \ 80 | mcp.server.tags="prometheus,monitoring,metrics,observability" \ 81 | mcp.server.transport.stdio="true" \ 82 | mcp.server.transport.http="true" \ 83 | mcp.server.transport.sse="true" ``` -------------------------------------------------------------------------------- /src/prometheus_mcp_server/main.py: -------------------------------------------------------------------------------- ```python 1 | #!/usr/bin/env python 2 | import sys 3 | import dotenv 4 | from prometheus_mcp_server.server import mcp, config, TransportType 5 | from prometheus_mcp_server.logging_config 
import setup_logging 6 | 7 | # Initialize structured logging 8 | logger = setup_logging() 9 | 10 | def setup_environment(): 11 | if dotenv.load_dotenv(): 12 | logger.info("Environment configuration loaded", source=".env file") 13 | else: 14 | logger.info("Environment configuration loaded", source="environment variables", note="No .env file found") 15 | 16 | if not config.url: 17 | logger.error( 18 | "Missing required configuration", 19 | error="PROMETHEUS_URL environment variable is not set", 20 | suggestion="Please set it to your Prometheus server URL", 21 | example="http://your-prometheus-server:9090" 22 | ) 23 | return False 24 | 25 | # MCP Server configuration validation 26 | mcp_config = config.mcp_server_config 27 | if mcp_config: 28 | if str(mcp_config.mcp_server_transport).lower() not in TransportType.values(): 29 | logger.error( 30 | "Invalid mcp transport", 31 | error="PROMETHEUS_MCP_SERVER_TRANSPORT environment variable is invalid", 32 | suggestion="Please define one of these acceptable transports (http/sse/stdio)", 33 | example="http" 34 | ) 35 | return False 36 | 37 | try: 38 | if mcp_config.mcp_bind_port: 39 | int(mcp_config.mcp_bind_port) 40 | except (TypeError, ValueError): 41 | logger.error( 42 | "Invalid mcp port", 43 | error="PROMETHEUS_MCP_BIND_PORT environment variable is invalid", 44 | suggestion="Please define an integer", 45 | example="8080" 46 | ) 47 | return False 48 | 49 | # Determine authentication method 50 | auth_method = "none" 51 | if config.username and config.password: 52 | auth_method = "basic_auth" 53 | elif config.token: 54 | auth_method = "bearer_token" 55 | 56 | logger.info( 57 | "Prometheus configuration validated", 58 | server_url=config.url, 59 | authentication=auth_method, 60 | org_id=config.org_id if config.org_id else None 61 | ) 62 | 63 | return True 64 | 65 | def run_server(): 66 | """Main entry point for the Prometheus MCP Server""" 67 | # Setup environment 68 | if not setup_environment(): 69 | 
logger.error("Environment setup failed, exiting") 70 | sys.exit(1) 71 | 72 | mcp_config = config.mcp_server_config 73 | transport = mcp_config.mcp_server_transport 74 | # Log before calling mcp.run(): it blocks until the server stops 75 | http_transports = [TransportType.HTTP.value, TransportType.SSE.value] 76 | if transport in http_transports: 77 | logger.info("Starting Prometheus MCP Server", 78 | transport=transport, 79 | host=mcp_config.mcp_bind_host, 80 | port=mcp_config.mcp_bind_port) 81 | mcp.run(transport=transport, host=mcp_config.mcp_bind_host, port=mcp_config.mcp_bind_port) 82 | else: 83 | logger.info("Starting Prometheus MCP Server", transport=transport) 84 | mcp.run(transport=transport) 85 | 86 | if __name__ == "__main__": 87 | run_server() 88 | ``` -------------------------------------------------------------------------------- /docs/installation.md: -------------------------------------------------------------------------------- ```markdown 1 | # Installation Guide 2 | 3 | This guide will help you install and set up the Prometheus MCP Server. 4 | 5 | ## Prerequisites 6 | 7 | - Python 3.10 or higher 8 | - Access to a Prometheus server 9 | - [uv](https://github.com/astral-sh/uv) package manager (recommended) 10 | 11 | ## Installation Options 12 | 13 | ### Option 1: Direct Installation 14 | 15 | 1. Clone the repository: 16 | 17 | ```bash 18 | git clone https://github.com/pab1it0/prometheus-mcp-server.git 19 | cd prometheus-mcp-server 20 | ``` 21 | 22 | 2. Create and activate a virtual environment: 23 | 24 | ```bash 25 | # Using uv (recommended) 26 | uv venv 27 | source .venv/bin/activate # On Unix/macOS 28 | .venv\Scripts\activate # On Windows 29 | 30 | # Using venv (alternative) 31 | python -m venv venv 32 | source venv/bin/activate # On Unix/macOS 33 | venv\Scripts\activate # On Windows 34 | ``` 35 | 36 | 3. Install the package: 37 | 38 | ```bash 39 | # Using uv (recommended) 40 | uv pip install -e . 41 | 42 | # Using pip (alternative) 43 | pip install -e . 44 | ``` 45 | 46 | ### Option 2: Using Docker 47 | 48 | 1. 
Clone the repository: 49 | 50 | ```bash 51 | git clone https://github.com/pab1it0/prometheus-mcp-server.git 52 | cd prometheus-mcp-server 53 | ``` 54 | 55 | 2. Build the Docker image: 56 | 57 | ```bash 58 | docker build -t prometheus-mcp-server . 59 | ``` 60 | 61 | ## Configuration 62 | 63 | 1. Create a `.env` file in the root directory (you can copy from `.env.template`): 64 | 65 | ```bash 66 | cp .env.template .env 67 | ``` 68 | 69 | 2. Edit the `.env` file with your Prometheus server details: 70 | 71 | ```env 72 | # Required: Prometheus configuration 73 | PROMETHEUS_URL=http://your-prometheus-server:9090 74 | 75 | # Optional: Authentication credentials (if needed) 76 | # Choose one of the following authentication methods if required: 77 | 78 | # For basic auth 79 | PROMETHEUS_USERNAME=your_username 80 | PROMETHEUS_PASSWORD=your_password 81 | 82 | # For bearer token auth 83 | PROMETHEUS_TOKEN=your_token 84 | 85 | # Optional: Custom MCP configuration 86 | # PROMETHEUS_MCP_SERVER_TRANSPORT=stdio # Choose between http, stdio, sse. If undefined, stdio is set as the default transport. 87 | # Optional: Only relevant for non-stdio transports 88 | # PROMETHEUS_MCP_BIND_HOST=localhost # if undefined, 127.0.0.1 is set by default. 89 | # PROMETHEUS_MCP_BIND_PORT=8080 # if undefined, 8080 is set by default. 
90 | ``` 91 | 92 | ## Running the Server 93 | 94 | ### Option 1: Directly from Python 95 | 96 | After installation and configuration, you can run the server with: 97 | 98 | ```bash 99 | # If installed with -e flag 100 | python -m prometheus_mcp_server.main 101 | 102 | # If installed as a package 103 | prometheus-mcp-server 104 | ``` 105 | 106 | ### Option 2: Using Docker 107 | 108 | ```bash 109 | # Using environment variables directly 110 | docker run -it --rm \ 111 | -e PROMETHEUS_URL=http://your-prometheus-server:9090 \ 112 | -e PROMETHEUS_USERNAME=your_username \ 113 | -e PROMETHEUS_PASSWORD=your_password \ 114 | prometheus-mcp-server 115 | 116 | # Using .env file 117 | docker run -it --rm \ 118 | --env-file .env \ 119 | prometheus-mcp-server 120 | 121 | # Using docker-compose 122 | docker-compose up 123 | ``` 124 | 125 | ## Verifying Installation 126 | 127 | When the server starts successfully, you should see output similar to: 128 | 129 | ``` 130 | Loaded environment variables from .env file 131 | Prometheus configuration: 132 | Server URL: http://your-prometheus-server:9090 133 | Authentication: Using basic auth 134 | 135 | Starting Prometheus MCP Server... 136 | Running server in standard mode... 137 | ``` 138 | 139 | The server is now ready to receive MCP requests from clients like Claude Desktop. 
``` -------------------------------------------------------------------------------- /.github/workflows/claude-code-review.yml: -------------------------------------------------------------------------------- ```yaml 1 | name: Claude Code Review 2 | 3 | on: 4 | pull_request: 5 | types: [opened, synchronize] 6 | # Optional: Only run on specific file changes 7 | # paths: 8 | # - "src/**/*.ts" 9 | # - "src/**/*.tsx" 10 | # - "src/**/*.js" 11 | # - "src/**/*.jsx" 12 | 13 | jobs: 14 | claude-review: 15 | # Optional: Filter by PR author 16 | # if: | 17 | # github.event.pull_request.user.login == 'external-contributor' || 18 | # github.event.pull_request.user.login == 'new-developer' || 19 | # github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR' 20 | 21 | runs-on: ubuntu-latest 22 | permissions: 23 | contents: read 24 | pull-requests: read 25 | issues: read 26 | id-token: write 27 | actions: read # Required for Claude to read CI results on PRs 28 | 29 | steps: 30 | - name: Checkout repository 31 | uses: actions/checkout@v4 32 | with: 33 | fetch-depth: 1 34 | 35 | - name: Run Claude Code Review 36 | id: claude-review 37 | uses: anthropics/claude-code-action@beta 38 | with: 39 | claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }} 40 | 41 | # This is an optional setting that allows Claude to read CI results on PRs 42 | additional_permissions: | 43 | actions: read 44 | 45 | # Optional: Specify model (defaults to Claude Sonnet 4, uncomment for Claude Opus 4.1) 46 | # model: "claude-opus-4-1-20250805" 47 | 48 | # Direct prompt for automated review (no @claude mention needed) 49 | direct_prompt: | 50 | Please review this pull request and provide feedback on: 51 | - Code quality and best practices 52 | - Potential bugs or issues 53 | - Performance considerations 54 | - Security concerns 55 | - Test coverage 56 | 57 | Be constructive and helpful in your feedback. 
58 | 59 | # Optional: Use sticky comments to make Claude reuse the same comment on subsequent pushes to the same PR 60 | # use_sticky_comment: true 61 | 62 | # Optional: Customize review based on file types 63 | # direct_prompt: | 64 | # Review this PR focusing on: 65 | # - For TypeScript files: Type safety and proper interface usage 66 | # - For API endpoints: Security, input validation, and error handling 67 | # - For React components: Performance, accessibility, and best practices 68 | # - For tests: Coverage, edge cases, and test quality 69 | 70 | # Optional: Different prompts for different authors 71 | # direct_prompt: | 72 | # ${{ github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR' && 73 | # 'Welcome! Please review this PR from a first-time contributor. Be encouraging and provide detailed explanations for any suggestions.' || 74 | # 'Please provide a thorough code review focusing on our coding standards and best practices.' }} 75 | 76 | # Optional: Add specific tools for running tests or linting 77 | # allowed_tools: "Bash(npm run test),Bash(npm run lint),Bash(npm run typecheck)" 78 | 79 | # Optional: Skip review for certain conditions 80 | # if: | 81 | # !contains(github.event.pull_request.title, '[skip-review]') && 82 | # !contains(github.event.pull_request.title, '[WIP]') 83 | 84 | ``` -------------------------------------------------------------------------------- /docs/configuration.md: -------------------------------------------------------------------------------- ```markdown 1 | # Configuration Guide 2 | 3 | This guide details all available configuration options for the Prometheus MCP Server. 4 | 5 | ## Environment Variables 6 | 7 | The server is configured primarily through environment variables. These can be set directly in your environment or through a `.env` file in the project root directory. 
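As a rough illustration of how these variables and their documented defaults fit together, the following is a minimal sketch, not the server's actual configuration code. The function name `read_settings` and the keys of the returned dictionary are hypothetical; the variable names and defaults (`stdio`, `127.0.0.1`, `8080`) are the ones listed in `.env.template`.

```python
import os
from typing import Mapping, Optional

# Defaults as documented in .env.template.
DEFAULT_TRANSPORT = "stdio"
DEFAULT_BIND_HOST = "127.0.0.1"
DEFAULT_BIND_PORT = 8080


def read_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the documented variables from a mapping (os.environ by default).

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    url = env.get("PROMETHEUS_URL")
    if not url:
        # PROMETHEUS_URL is the only required variable.
        raise ValueError("PROMETHEUS_URL is required")
    return {
        "url": url,
        "username": env.get("PROMETHEUS_USERNAME"),
        "password": env.get("PROMETHEUS_PASSWORD"),
        "token": env.get("PROMETHEUS_TOKEN"),
        # Transport names are compared in lowercase.
        "transport": env.get("PROMETHEUS_MCP_SERVER_TRANSPORT", DEFAULT_TRANSPORT).lower(),
        "bind_host": env.get("PROMETHEUS_MCP_BIND_HOST", DEFAULT_BIND_HOST),
        # Raises ValueError if the port is not an integer.
        "bind_port": int(env.get("PROMETHEUS_MCP_BIND_PORT", DEFAULT_BIND_PORT)),
    }
```

Passing a plain dictionary instead of relying on `os.environ` keeps the sketch easy to try out, for example `read_settings({"PROMETHEUS_URL": "http://prometheus:9090"})`.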
8 | 9 | ### Required Variables 10 | 11 | | Variable | Description | Example | 12 | |----------|-------------|--------| 13 | | `PROMETHEUS_URL` | URL of your Prometheus server | `http://prometheus:9090` | 14 | 15 | ### Authentication Variables 16 | 17 | Prometheus MCP Server supports multiple authentication methods. Choose the appropriate one for your Prometheus setup: 18 | 19 | #### Basic Authentication 20 | 21 | | Variable | Description | Example | 22 | |----------|-------------|--------| 23 | | `PROMETHEUS_USERNAME` | Username for basic authentication | `admin` | 24 | | `PROMETHEUS_PASSWORD` | Password for basic authentication | `secure_password` | 25 | 26 | #### Token Authentication 27 | 28 | | Variable | Description | Example | 29 | |----------|-------------|--------| 30 | | `PROMETHEUS_TOKEN` | Bearer token for authentication | `eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...` | 31 | 32 | ## Authentication Priority 33 | 34 | If multiple authentication methods are configured, the server will prioritize them in the following order: 35 | 36 | 1. Bearer token authentication (if `PROMETHEUS_TOKEN` is set) 37 | 2. Basic authentication (if both `PROMETHEUS_USERNAME` and `PROMETHEUS_PASSWORD` are set) 38 | 3. No authentication (if no credentials are provided) 39 | 40 | ## MCP Server Configuration 41 | 42 | | Variable | Description | Example | 43 | |----------|-------------|--------| 44 | | `PROMETHEUS_MCP_SERVER_TRANSPORT` | Transport to use: `http`, `stdio`, or `sse`. If undefined, `stdio` is used as the default transport. | `http` | 45 | | `PROMETHEUS_MCP_BIND_HOST` | Host the MCP server binds to. If undefined, `127.0.0.1` is used by default. | `localhost` | 46 | | `PROMETHEUS_MCP_BIND_PORT` | Port on which the MCP server is exposed. If undefined, `8080` is used by default.
| `8080` | 47 | 48 | ## MCP Client Configuration 49 | 50 | ### Claude Desktop Configuration 51 | 52 | To use the Prometheus MCP Server with Claude Desktop, you'll need to add configuration to the Claude Desktop settings: 53 | 54 | ```json 55 | { 56 | "mcpServers": { 57 | "prometheus": { 58 | "command": "uv", 59 | "args": [ 60 | "--directory", 61 | "<full path to prometheus-mcp-server directory>", 62 | "run", 63 | "src/prometheus_mcp_server/main.py" 64 | ], 65 | "env": { 66 | "PROMETHEUS_URL": "http://your-prometheus-server:9090", 67 | "PROMETHEUS_USERNAME": "your_username", 68 | "PROMETHEUS_PASSWORD": "your_password" 69 | } 70 | } 71 | } 72 | } 73 | ``` 74 | 75 | ### Docker Configuration with Claude Desktop 76 | 77 | If you're using the Docker container with Claude Desktop: 78 | 79 | ```json 80 | { 81 | "mcpServers": { 82 | "prometheus": { 83 | "command": "docker", 84 | "args": [ 85 | "run", 86 | "--rm", 87 | "-i", 88 | "-e", "PROMETHEUS_URL", 89 | "-e", "PROMETHEUS_USERNAME", 90 | "-e", "PROMETHEUS_PASSWORD", 91 | "prometheus-mcp-server" 92 | ], 93 | "env": { 94 | "PROMETHEUS_URL": "http://your-prometheus-server:9090", 95 | "PROMETHEUS_USERNAME": "your_username", 96 | "PROMETHEUS_PASSWORD": "your_password" 97 | } 98 | } 99 | } 100 | } 101 | ``` 102 | 103 | ## Network Connectivity 104 | 105 | Ensure that the environment where the Prometheus MCP Server runs has network access to your Prometheus server. If running in Docker, you might need to adjust network settings or use host networking depending on your setup. 106 | 107 | ## Troubleshooting 108 | 109 | ### Connection Issues 110 | 111 | If you encounter connection issues: 112 | 113 | 1. Verify that the `PROMETHEUS_URL` is correct and accessible from the environment where the MCP server runs 114 | 2. Check that authentication credentials are correct 115 | 3. Ensure no network firewalls are blocking access 116 | 4. 
Verify that your Prometheus server is running and healthy 117 | 118 | ### Authentication Issues 119 | 120 | If you experience authentication problems: 121 | 122 | 1. Double-check your username and password or token 123 | 2. Ensure the authentication method matches what your Prometheus server expects 124 | 3. Check Prometheus server logs for authentication failures ``` -------------------------------------------------------------------------------- /docs/usage.md: -------------------------------------------------------------------------------- ```markdown 1 | # Usage Guide 2 | 3 | This guide explains how to use the Prometheus MCP Server with AI assistants like Claude. 4 | 5 | ## Available Tools 6 | 7 | The Prometheus MCP Server provides several tools that AI assistants can use to interact with your Prometheus data: 8 | 9 | ### Query Tools 10 | 11 | #### `execute_query` 12 | 13 | Executes an instant PromQL query and returns the current value(s). 14 | 15 | **Parameters:** 16 | - `query`: PromQL query string (required) 17 | - `time`: Optional RFC3339 or Unix timestamp (defaults to current time) 18 | 19 | **Example Claude prompt:** 20 | ``` 21 | Use the execute_query tool to check the current value of the 'up' metric. 22 | ``` 23 | 24 | #### `execute_range_query` 25 | 26 | Executes a PromQL range query to return values over a time period. 27 | 28 | **Parameters:** 29 | - `query`: PromQL query string (required) 30 | - `start`: Start time as RFC3339 or Unix timestamp (required) 31 | - `end`: End time as RFC3339 or Unix timestamp (required) 32 | - `step`: Query resolution step width (e.g., '15s', '1m', '1h') (required) 33 | 34 | **Example Claude prompt:** 35 | ``` 36 | Use the execute_range_query tool to show me the CPU usage over the last hour with 5-minute intervals. Use the query 'rate(node_cpu_seconds_total{mode="user"}[5m])'. 37 | ``` 38 | 39 | ### Discovery Tools 40 | 41 | #### `list_metrics` 42 | 43 | Retrieves a list of all available metric names. 
44 | 45 | **Example Claude prompt:** 46 | ``` 47 | Use the list_metrics tool to show me all available metrics in my Prometheus server. 48 | ``` 49 | 50 | #### `get_metric_metadata` 51 | 52 | Retrieves metadata about a specific metric. 53 | 54 | **Parameters:** 55 | - `metric`: The name of the metric (required) 56 | 57 | **Example Claude prompt:** 58 | ``` 59 | Use the get_metric_metadata tool to get information about the 'http_requests_total' metric. 60 | ``` 61 | 62 | #### `get_targets` 63 | 64 | Retrieves information about all Prometheus scrape targets. 65 | 66 | **Example Claude prompt:** 67 | ``` 68 | Use the get_targets tool to check the health of all monitoring targets. 69 | ``` 70 | 71 | ## Example Workflows 72 | 73 | ### Basic Monitoring Check 74 | 75 | ``` 76 | Can you check if all my monitored services are up? Also, show me the top 5 CPU-consuming pods if we're monitoring Kubernetes. 77 | ``` 78 | 79 | Claude might use: 80 | 1. `execute_query` with `up` to check service health 81 | 2. `execute_query` with a more complex PromQL query to find CPU usage 82 | 83 | ### Performance Analysis 84 | 85 | ``` 86 | Analyze the memory usage pattern of my application over the last 24 hours. Are there any concerning spikes? 87 | ``` 88 | 89 | Claude might use: 90 | 1. `execute_range_query` with appropriate time parameters 91 | 2. Analyze the data for patterns and anomalies 92 | 93 | ### Metric Exploration 94 | 95 | ``` 96 | I'm not sure what metrics are available. Can you help me discover metrics related to HTTP requests and then show me their current values? 97 | ``` 98 | 99 | Claude might use: 100 | 1. `list_metrics` to get all metrics 101 | 2. Filter for HTTP-related metrics 102 | 3. `get_metric_metadata` to understand what each metric represents 103 | 4. `execute_query` to fetch current values 104 | 105 | ## Tips for Effective Use 106 | 107 | 1. **Be specific about time ranges** when asking for historical data 108 | 2. 
**Specify step intervals** appropriate to your time range (e.g., use smaller steps for shorter periods) 109 | 3. **Use metric discovery tools** if you're unsure what metrics are available 110 | 4. **Start with simple queries** and gradually build more complex ones 111 | 5. **Ask for explanations** if you don't understand the returned data 112 | 113 | ## PromQL Query Examples 114 | 115 | Here are some useful PromQL queries you can use with the tools: 116 | 117 | ### Basic Queries 118 | 119 | - Check if targets are up: `up` 120 | - HTTP request rate: `rate(http_requests_total[5m])` 121 | - CPU usage: `sum(rate(node_cpu_seconds_total{mode!="idle"}[5m])) by (instance)` 122 | - Memory usage: `node_memory_MemTotal_bytes - node_memory_MemFree_bytes - node_memory_Buffers_bytes - node_memory_Cached_bytes` 123 | 124 | ### Kubernetes-specific Queries 125 | 126 | - Pod CPU usage: `sum(rate(container_cpu_usage_seconds_total{container!="POD",container!=""}[5m])) by (pod)` 127 | - Pod memory usage: `sum(container_memory_working_set_bytes{container!="POD",container!=""}) by (pod)` 128 | - Pod restart count: `kube_pod_container_status_restarts_total` 129 | 130 | ## Limitations 131 | 132 | - The MCP server queries your live Prometheus instance, so it only has access to metrics retained in your Prometheus server's storage 133 | - Complex PromQL queries might take longer to execute, especially over large time ranges 134 | - Authentication is passed through from your environment variables, so ensure you're using credentials with appropriate access rights ``` -------------------------------------------------------------------------------- /.github/workflows/ci.yml: -------------------------------------------------------------------------------- ```yaml 1 | name: CI/CD 2 | 3 | on: 4 | push: 5 | branches: [ "main" ] 6 | tags: 7 | - 'v*' 8 | pull_request: 9 | branches: [ "main" ] 10 | 11 | env: 12 | REGISTRY: ghcr.io 13 | IMAGE_NAME: ${{ github.repository }} 14 | 15 | jobs: 16 | ci: 17 | 
name: CI 18 | runs-on: ubuntu-latest 19 | timeout-minutes: 10 20 | permissions: 21 | contents: read 22 | packages: write 23 | 24 | steps: 25 | - name: Checkout repository 26 | uses: actions/checkout@v4 27 | 28 | - name: Set up Python 3.12 29 | uses: actions/setup-python@v5 30 | with: 31 | python-version: "3.12" 32 | 33 | - name: Install uv 34 | uses: astral-sh/setup-uv@v4 35 | with: 36 | enable-cache: true 37 | 38 | - name: Create virtual environment 39 | run: uv venv 40 | 41 | - name: Install dependencies 42 | run: | 43 | source .venv/bin/activate 44 | uv pip install -e ".[dev]" 45 | 46 | - name: Run tests with coverage 47 | run: | 48 | source .venv/bin/activate 49 | pytest --cov --junitxml=junit.xml -o junit_family=legacy --cov-report=xml --cov-fail-under=89 50 | 51 | - name: Upload coverage to Codecov 52 | uses: codecov/codecov-action@v4 53 | with: 54 | file: ./coverage.xml 55 | fail_ci_if_error: false 56 | 57 | - name: Upload test results to Codecov 58 | if: ${{ !cancelled() }} 59 | uses: codecov/test-results-action@v1 60 | with: 61 | file: ./junit.xml 62 | token: ${{ secrets.CODECOV_TOKEN }} 63 | 64 | - name: Build Python distribution 65 | run: | 66 | python3 -m pip install build --user 67 | python3 -m build 68 | 69 | - name: Store the distribution packages 70 | uses: actions/upload-artifact@v4 71 | with: 72 | name: python-package-distributions 73 | path: dist/ 74 | 75 | - name: Set up QEMU 76 | uses: docker/setup-qemu-action@v3 77 | 78 | - name: Set up Docker Buildx 79 | uses: docker/setup-buildx-action@v3 80 | 81 | - name: Log in to the Container registry 82 | if: github.event_name != 'pull_request' 83 | uses: docker/login-action@v3 84 | with: 85 | registry: ${{ env.REGISTRY }} 86 | username: ${{ github.actor }} 87 | password: ${{ secrets.GITHUB_TOKEN }} 88 | 89 | - name: Extract metadata (tags, labels) for Docker 90 | id: meta 91 | uses: docker/metadata-action@v5 92 | with: 93 | images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }} 94 | tags: | 95 | 
type=ref,event=branch 96 | type=ref,event=pr 97 | type=semver,pattern={{version}} 98 | type=semver,pattern={{major}}.{{minor}} 99 | type=semver,pattern={{major}} 100 | type=sha,format=long 101 | 102 | - name: Build and push Docker image 103 | uses: docker/build-push-action@v5 104 | with: 105 | context: . 106 | push: ${{ github.event_name != 'pull_request' }} 107 | tags: ${{ steps.meta.outputs.tags }} 108 | labels: ${{ steps.meta.outputs.labels }} 109 | platforms: linux/amd64,linux/arm64 110 | cache-from: type=gha 111 | cache-to: type=gha,mode=max 112 | 113 | deploy: 114 | name: Deploy 115 | if: startsWith(github.ref, 'refs/tags/v') && github.event_name == 'push' 116 | needs: ci 117 | runs-on: ubuntu-latest 118 | timeout-minutes: 15 119 | environment: 120 | name: pypi 121 | url: https://pypi.org/p/prometheus_mcp_server 122 | permissions: 123 | contents: write # Required for creating GitHub releases 124 | id-token: write # Required for PyPI publishing and MCP registry OIDC authentication 125 | 126 | steps: 127 | - name: Checkout repository 128 | uses: actions/checkout@v4 129 | 130 | - name: Download all the dists 131 | uses: actions/download-artifact@v4 132 | with: 133 | name: python-package-distributions 134 | path: dist/ 135 | 136 | - name: Publish distribution to PyPI 137 | uses: pypa/gh-action-pypi-publish@release/v1 138 | 139 | - name: Sign the dists with Sigstore 140 | uses: sigstore/[email protected] 141 | with: 142 | inputs: >- 143 | ./dist/*.tar.gz 144 | ./dist/*.whl 145 | 146 | - name: Create GitHub Release 147 | env: 148 | GITHUB_TOKEN: ${{ github.token }} 149 | run: >- 150 | gh release create 151 | "$GITHUB_REF_NAME" 152 | --repo "$GITHUB_REPOSITORY" 153 | --generate-notes 154 | 155 | - name: Upload artifact signatures to GitHub Release 156 | env: 157 | GITHUB_TOKEN: ${{ github.token }} 158 | run: >- 159 | gh release upload 160 | "$GITHUB_REF_NAME" dist/** 161 | --repo "$GITHUB_REPOSITORY" 162 | 163 | - name: Install MCP Publisher 164 | run: | 165 | curl 
-L "https://github.com/modelcontextprotocol/registry/releases/download/v1.2.3/mcp-publisher_1.2.3_$(uname -s | tr '[:upper:]' '[:lower:]')_$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/').tar.gz" | tar xz mcp-publisher 166 | 167 | - name: Login to MCP Registry 168 | run: ./mcp-publisher login github-oidc 169 | 170 | - name: Publish to MCP Registry 171 | run: ./mcp-publisher publish ``` -------------------------------------------------------------------------------- /tests/test_tools.py: -------------------------------------------------------------------------------- ```python 1 | """Tests for the MCP tools functionality.""" 2 | 3 | import pytest 4 | import json 5 | from unittest.mock import patch, MagicMock 6 | from fastmcp import Client 7 | from prometheus_mcp_server.server import mcp, execute_query, execute_range_query, list_metrics, get_metric_metadata, get_targets 8 | 9 | @pytest.fixture 10 | def mock_make_request(): 11 | """Mock the make_prometheus_request function.""" 12 | with patch("prometheus_mcp_server.server.make_prometheus_request") as mock: 13 | yield mock 14 | 15 | @pytest.mark.asyncio 16 | async def test_execute_query(mock_make_request): 17 | """Test the execute_query tool.""" 18 | # Setup 19 | mock_make_request.return_value = { 20 | "resultType": "vector", 21 | "result": [{"metric": {"__name__": "up"}, "value": [1617898448.214, "1"]}] 22 | } 23 | 24 | async with Client(mcp) as client: 25 | # Execute 26 | result = await client.call_tool("execute_query", {"query":"up"}) 27 | 28 | # Verify 29 | mock_make_request.assert_called_once_with("query", params={"query": "up"}) 30 | assert result.data["resultType"] == "vector" 31 | assert len(result.data["result"]) == 1 32 | 33 | @pytest.mark.asyncio 34 | async def test_execute_query_with_time(mock_make_request): 35 | """Test the execute_query tool with a specified time.""" 36 | # Setup 37 | mock_make_request.return_value = { 38 | "resultType": "vector", 39 | "result": [{"metric": {"__name__": "up"}, 
"value": [1617898448.214, "1"]}] 40 | } 41 | 42 | async with Client(mcp) as client: 43 | # Execute 44 | result = await client.call_tool("execute_query", {"query":"up", "time":"2023-01-01T00:00:00Z"}) 45 | 46 | # Verify 47 | mock_make_request.assert_called_once_with("query", params={"query": "up", "time": "2023-01-01T00:00:00Z"}) 48 | assert result.data["resultType"] == "vector" 49 | 50 | @pytest.mark.asyncio 51 | async def test_execute_range_query(mock_make_request): 52 | """Test the execute_range_query tool.""" 53 | # Setup 54 | mock_make_request.return_value = { 55 | "resultType": "matrix", 56 | "result": [{ 57 | "metric": {"__name__": "up"}, 58 | "values": [ 59 | [1617898400, "1"], 60 | [1617898415, "1"] 61 | ] 62 | }] 63 | } 64 | 65 | async with Client(mcp) as client: 66 | # Execute 67 | result = await client.call_tool( 68 | "execute_range_query",{ 69 | "query": "up", 70 | "start": "2023-01-01T00:00:00Z", 71 | "end": "2023-01-01T01:00:00Z", 72 | "step": "15s" 73 | }) 74 | 75 | # Verify 76 | mock_make_request.assert_called_once_with("query_range", params={ 77 | "query": "up", 78 | "start": "2023-01-01T00:00:00Z", 79 | "end": "2023-01-01T01:00:00Z", 80 | "step": "15s" 81 | }) 82 | assert result.data["resultType"] == "matrix" 83 | assert len(result.data["result"]) == 1 84 | assert len(result.data["result"][0]["values"]) == 2 85 | 86 | @pytest.mark.asyncio 87 | async def test_list_metrics(mock_make_request): 88 | """Test the list_metrics tool.""" 89 | # Setup 90 | mock_make_request.return_value = ["up", "go_goroutines", "http_requests_total"] 91 | 92 | async with Client(mcp) as client: 93 | # Execute 94 | result = await client.call_tool("list_metrics", {}) 95 | 96 | # Verify 97 | mock_make_request.assert_called_once_with("label/__name__/values") 98 | assert result.data == ["up", "go_goroutines", "http_requests_total"] 99 | 100 | @pytest.mark.asyncio 101 | async def test_get_metric_metadata(mock_make_request): 102 | """Test the get_metric_metadata tool.""" 103 | # 
Setup 104 | mock_make_request.return_value = [ 105 | {"metric": "up", "type": "gauge", "help": "Up indicates if the scrape was successful", "unit": ""} 106 | ] 107 | 108 | async with Client(mcp) as client: 109 | # Execute 110 | result = await client.call_tool("get_metric_metadata", {"metric":"up"}) 111 | 112 | payload = result.content[0].text 113 | json_data = json.loads(payload) 114 | 115 | # Verify 116 | mock_make_request.assert_called_once_with("metadata", params={"metric": "up"}) 117 | assert len(json_data) == 1 118 | assert json_data[0]["metric"] == "up" 119 | assert json_data[0]["type"] == "gauge" 120 | 121 | @pytest.mark.asyncio 122 | async def test_get_targets(mock_make_request): 123 | """Test the get_targets tool.""" 124 | # Setup 125 | mock_make_request.return_value = { 126 | "activeTargets": [ 127 | {"discoveredLabels": {"__address__": "localhost:9090"}, "labels": {"job": "prometheus"}, "health": "up"} 128 | ], 129 | "droppedTargets": [] 130 | } 131 | 132 | async with Client(mcp) as client: 133 | # Execute 134 | result = await client.call_tool("get_targets",{}) 135 | 136 | payload = result.content[0].text 137 | json_data = json.loads(payload) 138 | 139 | # Verify 140 | mock_make_request.assert_called_once_with("targets") 141 | assert len(json_data["activeTargets"]) == 1 142 | assert json_data["activeTargets"][0]["health"] == "up" 143 | assert len(json_data["droppedTargets"]) == 0 144 | ``` -------------------------------------------------------------------------------- /docs/api_reference.md: -------------------------------------------------------------------------------- ```markdown 1 | # API Reference 2 | 3 | This document provides detailed information about the API endpoints and functions provided by the Prometheus MCP Server. 4 | 5 | ## MCP Tools 6 | 7 | ### Query Tools 8 | 9 | #### `execute_query` 10 | 11 | Executes a PromQL instant query against Prometheus. 12 | 13 | **Description**: Retrieves current values for a given PromQL expression. 
14 | 15 | **Parameters**: 16 | 17 | | Parameter | Type | Required | Description | 18 | |-----------|------|----------|-------------| 19 | | `query` | string | Yes | The PromQL query expression | 20 | | `time` | string | No | Evaluation timestamp (RFC3339 or Unix timestamp) | 21 | 22 | **Returns**: Object with `resultType` and `result` fields. 23 | 24 | ```json 25 | { 26 | "resultType": "vector", 27 | "result": [ 28 | { 29 | "metric": { "__name__": "up", "job": "prometheus", "instance": "localhost:9090" }, 30 | "value": [1617898448.214, "1"] 31 | } 32 | ] 33 | } 34 | ``` 35 | 36 | #### `execute_range_query` 37 | 38 | Executes a PromQL range query with start time, end time, and step interval. 39 | 40 | **Description**: Retrieves values for a given PromQL expression over a time range. 41 | 42 | **Parameters**: 43 | 44 | | Parameter | Type | Required | Description | 45 | |-----------|------|----------|-------------| 46 | | `query` | string | Yes | The PromQL query expression | 47 | | `start` | string | Yes | Start time (RFC3339 or Unix timestamp) | 48 | | `end` | string | Yes | End time (RFC3339 or Unix timestamp) | 49 | | `step` | string | Yes | Query resolution step (e.g., "15s", "1m", "1h") | 50 | 51 | **Returns**: Object with `resultType` and `result` fields. 52 | 53 | ```json 54 | { 55 | "resultType": "matrix", 56 | "result": [ 57 | { 58 | "metric": { "__name__": "up", "job": "prometheus", "instance": "localhost:9090" }, 59 | "values": [ 60 | [1617898400, "1"], 61 | [1617898415, "1"], 62 | [1617898430, "1"] 63 | ] 64 | } 65 | ] 66 | } 67 | ``` 68 | 69 | ### Discovery Tools 70 | 71 | #### `list_metrics` 72 | 73 | List all available metrics in Prometheus. 74 | 75 | **Description**: Retrieves a list of all metric names available in the Prometheus server. 76 | 77 | **Parameters**: None 78 | 79 | **Returns**: Array of metric names. 80 | 81 | ```json 82 | ["up", "go_goroutines", "http_requests_total", ...] 
83 | ``` 84 | 85 | #### `get_metric_metadata` 86 | 87 | Get metadata for a specific metric. 88 | 89 | **Description**: Retrieves metadata information about a specific metric. 90 | 91 | **Parameters**: 92 | 93 | | Parameter | Type | Required | Description | 94 | |-----------|------|----------|-------------| 95 | | `metric` | string | Yes | The name of the metric | 96 | 97 | **Returns**: Array of metadata objects. 98 | 99 | ```json 100 | [ 101 | { 102 | "metric": "up", 103 | "type": "gauge", 104 | "help": "Up indicates if the scrape was successful", 105 | "unit": "" 106 | } 107 | ] 108 | ``` 109 | 110 | #### `get_targets` 111 | 112 | Get information about all scrape targets. 113 | 114 | **Description**: Retrieves the current state of all Prometheus scrape targets. 115 | 116 | **Parameters**: None 117 | 118 | **Returns**: Object with `activeTargets` and `droppedTargets` arrays. 119 | 120 | ```json 121 | { 122 | "activeTargets": [ 123 | { 124 | "discoveredLabels": { 125 | "__address__": "localhost:9090", 126 | "__metrics_path__": "/metrics", 127 | "__scheme__": "http", 128 | "job": "prometheus" 129 | }, 130 | "labels": { 131 | "instance": "localhost:9090", 132 | "job": "prometheus" 133 | }, 134 | "scrapePool": "prometheus", 135 | "scrapeUrl": "http://localhost:9090/metrics", 136 | "lastError": "", 137 | "lastScrape": "2023-04-08T12:00:45.123Z", 138 | "lastScrapeDuration": 0.015, 139 | "health": "up" 140 | } 141 | ], 142 | "droppedTargets": [] 143 | } 144 | ``` 145 | 146 | ## Prometheus API Endpoints 147 | 148 | The MCP server interacts with the following Prometheus API endpoints: 149 | 150 | ### `/api/v1/query` 151 | 152 | Used by `execute_query` to perform instant queries. 153 | 154 | ### `/api/v1/query_range` 155 | 156 | Used by `execute_range_query` to perform range queries. 157 | 158 | ### `/api/v1/label/__name__/values` 159 | 160 | Used by `list_metrics` to retrieve all metric names. 
161 | 162 | ### `/api/v1/metadata` 163 | 164 | Used by `get_metric_metadata` to retrieve metadata about metrics. 165 | 166 | ### `/api/v1/targets` 167 | 168 | Used by `get_targets` to retrieve information about scrape targets. 169 | 170 | ## Error Handling 171 | 172 | All tools return standardized error responses when problems occur: 173 | 174 | 1. **Connection errors**: When the server cannot connect to Prometheus 175 | 2. **Authentication errors**: When credentials are invalid or insufficient 176 | 3. **Query errors**: When a PromQL query is invalid or fails to execute 177 | 4. **Not found errors**: When requested metrics or data don't exist 178 | 179 | Error messages are descriptive and include the specific issue that occurred. 180 | 181 | ## Result Types 182 | 183 | Prometheus returns different result types depending on the query: 184 | 185 | ### Instant Query Result Types 186 | 187 | - **Vector**: A set of time series, each with a single sample (most common for instant queries) 188 | - **Scalar**: A single numeric value 189 | - **String**: A string value 190 | 191 | ### Range Query Result Types 192 | 193 | - **Matrix**: A set of time series, each with multiple samples over time (most common for range queries) 194 | 195 | ## Time Formats 196 | 197 | Time parameters accept either: 198 | 199 | 1. **RFC3339 timestamps**: `2023-04-08T12:00:00Z` 200 | 2. **Unix timestamps**: `1617869245.324` 201 | 202 | If not specified, the current time is used for instant queries. ``` -------------------------------------------------------------------------------- /docs/deploying_with_toolhive.md: -------------------------------------------------------------------------------- ```markdown 1 | # Deploying Prometheus MCP Server with Toolhive in Kubernetes 2 | 3 | This guide explains how to deploy the Prometheus MCP server in a Kubernetes cluster using the Toolhive operator. 4 | 5 | ## Overview 6 | 7 | The Toolhive operator provides a Kubernetes-native way to manage MCP servers. 
It automates the deployment, configuration, and lifecycle management of MCP servers in your Kubernetes cluster. This guide focuses specifically on deploying the Prometheus MCP server, which allows AI agents to query Prometheus metrics. 8 | 9 | ## Prerequisites 10 | 11 | Before you begin, make sure you have: 12 | 13 | - A Kubernetes cluster 14 | - Helm (v3.10 minimum, v3.14+ recommended) 15 | - kubectl 16 | - A Prometheus instance running in your cluster 17 | 18 | For detailed instructions on setting up a Kubernetes cluster and installing the Toolhive operator, refer to the [Toolhive Kubernetes Operator Tutorial](https://codegate-docs-git-website-refactor-stacklok.vercel.app/toolhive/tutorials/toolhive-operator). 19 | 20 | ## Deploying the Prometheus MCP Server 21 | 22 | ### Step 1: Install the Toolhive Operator 23 | 24 | Follow the instructions in the [Toolhive Kubernetes Operator Tutorial](https://codegate-docs-git-website-refactor-stacklok.vercel.app/toolhive/tutorials/toolhive-operator) to install the Toolhive operator in your Kubernetes cluster. 
25 | 26 | ### Step 2: Create the Prometheus MCP Server Resource 27 | 28 | Create a YAML file named `mcpserver_prometheus.yaml` with the following content: 29 | 30 | ```yaml 31 | apiVersion: toolhive.stacklok.dev/v1alpha1 32 | kind: MCPServer 33 | metadata: 34 | name: prometheus 35 | namespace: toolhive-system 36 | spec: 37 | image: ghcr.io/pab1it0/prometheus-mcp-server:latest 38 | transport: stdio 39 | port: 8080 40 | permissionProfile: 41 | type: builtin 42 | name: network 43 | podTemplateSpec: 44 | spec: 45 | containers: 46 | - name: mcp 47 | securityContext: 48 | allowPrivilegeEscalation: false 49 | runAsNonRoot: false 50 | runAsUser: 0 51 | runAsGroup: 0 52 | capabilities: 53 | drop: 54 | - ALL 55 | resources: 56 | limits: 57 | cpu: "500m" 58 | memory: "512Mi" 59 | requests: 60 | cpu: "100m" 61 | memory: "128Mi" 62 | env: 63 | - name: PROMETHEUS_URL 64 | value: "http://prometheus-server.monitoring.svc.cluster.local:80" # Default value, can be overridden 65 | securityContext: 66 | runAsNonRoot: false 67 | runAsUser: 0 68 | runAsGroup: 0 69 | seccompProfile: 70 | type: RuntimeDefault 71 | resources: 72 | limits: 73 | cpu: "100m" 74 | memory: "128Mi" 75 | requests: 76 | cpu: "50m" 77 | memory: "64Mi" 78 | ``` 79 | 80 | > **Important**: Make sure to update the `PROMETHEUS_URL` environment variable to point to your Prometheus server's URL in your Kubernetes cluster. 
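
To make the note above concrete: a Kubernetes Service is usually reachable inside the cluster at `<service>.<namespace>.svc.<cluster-domain>`, where the cluster domain defaults to `cluster.local`. The sketch below composes a candidate `PROMETHEUS_URL` value from those parts. The function name `in_cluster_prometheus_url` is hypothetical, and the service name, namespace, and port are assumptions that must match your own Prometheus installation.

```python
def in_cluster_prometheus_url(
    service: str,
    namespace: str,
    port: int,
    scheme: str = "http",
    cluster_domain: str = "cluster.local",
) -> str:
    """Compose an in-cluster Service URL (hypothetical helper, for illustration only)."""
    return f"{scheme}://{service}.{namespace}.svc.{cluster_domain}:{port}"


if __name__ == "__main__":
    # Assumed values mirroring the example manifest; adjust to your cluster.
    print(in_cluster_prometheus_url("prometheus-server", "monitoring", 80))
```

With the assumed values shown, the result matches the default `PROMETHEUS_URL` in the manifest above. `kubectl get svc -n <namespace>` shows the actual service name and port in your cluster.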
81 | 82 | ### Step 3: Apply the MCP Server Resource 83 | 84 | Apply the YAML file to your cluster: 85 | 86 | ```bash 87 | kubectl apply -f mcpserver_prometheus.yaml 88 | ``` 89 | 90 | ### Step 4: Verify the Deployment 91 | 92 | Check that the MCP server is running: 93 | 94 | ```bash 95 | kubectl get mcpservers -n toolhive-system 96 | ``` 97 | 98 | You should see output similar to: 99 | 100 | ``` 101 | NAME STATUS URL AGE 102 | prometheus Running http://prometheus-mcp-proxy.toolhive-system.svc.cluster.local:8080 30s 103 | ``` 104 | 105 | ## Using the Prometheus MCP Server with Copilot 106 | 107 | Once the Prometheus MCP server is deployed, you can use it with GitHub Copilot or other AI agents that support the Model Context Protocol. 108 | 109 | ### Example: Querying Prometheus Metrics 110 | 111 | When asking Copilot about Prometheus metrics, you might see responses like: 112 | 113 | **Query**: "Is the Prometheus server up?" 114 | 115 | **Response**: 116 | 117 | ```json 118 | { 119 | "resultType": "vector", 120 | "result": [ 121 | { 122 | "metric": { 123 | "__name__": "up", 124 | "instance": "localhost:9090", 125 | "job": "prometheus" 126 | }, 127 | "value": [ 128 | 1749034117.048, 129 | "1" 130 | ] 131 | } 132 | ] 133 | } 134 | ``` 135 | 136 | This shows that the Prometheus server is up and running (value "1"). 137 | 138 | ## Troubleshooting 139 | 140 | If you encounter issues with the Prometheus MCP server: 141 | 142 | 1. Check the MCP server status: 143 | ```bash 144 | kubectl get mcpservers -n toolhive-system 145 | ``` 146 | 147 | 2. Check the MCP server logs: 148 | ```bash 149 | kubectl logs -n toolhive-system deployment/prometheus-mcp 150 | ``` 151 | 152 | 3. Verify the Prometheus URL is correct in the MCP server configuration. 153 | 154 | 4. Ensure your Prometheus server is accessible from the MCP server pod. 
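
The last two troubleshooting checks can be approximated with a small reachability probe. The sketch below is a minimal example, assuming it is run from somewhere with the same network view as the MCP server pod; it requests Prometheus's `/-/healthy` management endpoint and reports whether a successful response came back. The names `health_url` and `is_prometheus_reachable` are hypothetical and not part of this project.

```python
import urllib.error
import urllib.request


def health_url(base_url: str) -> str:
    """Return the Prometheus health-check URL for a base URL (hypothetical helper)."""
    return f"{base_url.rstrip('/')}/-/healthy"


def is_prometheus_reachable(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, DNS failures, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    # Assumed in-cluster address from the example manifest; replace with your own.
    url = "http://prometheus-server.monitoring.svc.cluster.local:80"
    print(f"{url} reachable: {is_prometheus_reachable(url)}")
```

A `False` result only says that the probe failed from where it ran; it does not distinguish between a wrong URL, a network policy, and a Prometheus instance that is down, so the MCP server logs remain the first place to look.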
155 | 156 | ## Configuration Options 157 | 158 | The Prometheus MCP server can be configured with the following environment variables: 159 | 160 | - `PROMETHEUS_URL`: The URL of your Prometheus server (required) 161 | - `PORT`: The port on which the MCP server listens (default: 8080) 162 | 163 | ## Available Metrics and Queries 164 | 165 | The Prometheus MCP server provides access to all metrics available in your Prometheus instance. Some common queries include: 166 | 167 | - `up`: Check if targets are up 168 | - `rate(http_requests_total[5m])`: Request rate over the last 5 minutes 169 | - `sum by (job) (rate(http_requests_total[5m]))`: Request rate by job 170 | 171 | For more information on PromQL (Prometheus Query Language), refer to the [Prometheus documentation](https://prometheus.io/docs/prometheus/latest/querying/basics/). 172 | 173 | ## Conclusion 174 | 175 | By following this guide, you've deployed a Prometheus MCP server in your Kubernetes cluster using the Toolhive operator. This server allows AI agents like GitHub Copilot to query your Prometheus metrics, enabling powerful observability and monitoring capabilities through natural language. 176 | ``` -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/question.yml: -------------------------------------------------------------------------------- ```yaml 1 | name: ❓ Question or Support 2 | description: Ask a question or get help with configuration/usage 3 | title: "[Question]: " 4 | labels: ["type: question", "status: needs-triage"] 5 | assignees: [] 6 | body: 7 | - type: markdown 8 | attributes: 9 | value: | 10 | Thank you for your question! Please provide as much detail as possible so we can help you effectively. 11 | 12 | **Note**: For general discussions, feature brainstorming, or community chat, consider using [Discussions](https://github.com/pab1it0/prometheus-mcp-server/discussions) instead. 
13 | 14 | - type: checkboxes 15 | id: checklist 16 | attributes: 17 | label: Pre-submission Checklist 18 | description: Please complete the following before asking your question 19 | options: 20 | - label: I have searched existing issues and discussions for similar questions 21 | required: true 22 | - label: I have checked the documentation and README 23 | required: true 24 | - label: I have tried basic troubleshooting steps 25 | required: false 26 | 27 | - type: dropdown 28 | id: question-type 29 | attributes: 30 | label: Question Type 31 | description: What type of help do you need? 32 | options: 33 | - Configuration Help (setup, environment variables, MCP client config) 34 | - Usage Help (how to use tools, execute queries) 35 | - Troubleshooting (something not working as expected) 36 | - Integration Help (connecting to Prometheus, MCP clients) 37 | - Authentication Help (setting up auth, credentials) 38 | - Performance Question (optimization, best practices) 39 | - Deployment Help (Docker, production setup) 40 | - General Question (understanding concepts, how things work) 41 | validations: 42 | required: true 43 | 44 | - type: textarea 45 | id: question 46 | attributes: 47 | label: Question 48 | description: What would you like to know or what help do you need? 49 | placeholder: Please describe your question or the help you need in detail 50 | validations: 51 | required: true 52 | 53 | - type: textarea 54 | id: context 55 | attributes: 56 | label: Context and Background 57 | description: Provide context about what you're trying to accomplish 58 | placeholder: | 59 | - What are you trying to achieve? 60 | - What is your use case? 61 | - What have you tried so far? 62 | - Where are you getting stuck? 63 | validations: 64 | required: true 65 | 66 | - type: dropdown 67 | id: experience-level 68 | attributes: 69 | label: Experience Level 70 | description: How familiar are you with the relevant technologies? 
71 | options: 72 | - Beginner (new to Prometheus, MCP, or similar tools) 73 | - Intermediate (some experience with related technologies) 74 | - Advanced (experienced user looking for specific guidance) 75 | validations: 76 | required: true 77 | 78 | - type: textarea 79 | id: current-setup 80 | attributes: 81 | label: Current Setup 82 | description: Describe your current setup and configuration 83 | placeholder: | 84 | - Operating System: 85 | - Python Version: 86 | - Prometheus MCP Server Version: 87 | - Prometheus Version: 88 | - MCP Client (Claude Desktop, etc.): 89 | - Transport Mode (stdio/HTTP/SSE): 90 | render: markdown 91 | validations: 92 | required: false 93 | 94 | - type: textarea 95 | id: configuration 96 | attributes: 97 | label: Configuration 98 | description: Share your current configuration (remove sensitive information) 99 | placeholder: | 100 | Environment variables: 101 | PROMETHEUS_URL=... 102 | 103 | MCP Client configuration: 104 | { 105 | "mcpServers": { 106 | ... 107 | } 108 | } 109 | render: bash 110 | validations: 111 | required: false 112 | 113 | - type: textarea 114 | id: attempted-solutions 115 | attributes: 116 | label: What Have You Tried? 117 | description: What troubleshooting steps or solutions have you already attempted? 118 | placeholder: | 119 | - Checked documentation sections: ... 120 | - Tried different configurations: ... 121 | - Searched for similar issues: ... 122 | - Tested with different versions: ... 123 | validations: 124 | required: false 125 | 126 | - type: textarea 127 | id: error-messages 128 | attributes: 129 | label: Error Messages or Logs 130 | description: Include any error messages, logs, or unexpected behavior 131 | placeholder: Paste any relevant error messages or log output here 132 | render: text 133 | validations: 134 | required: false 135 | 136 | - type: textarea 137 | id: expected-outcome 138 | attributes: 139 | label: Expected Outcome 140 | description: What result or behavior are you hoping to achieve? 
141 | placeholder: Describe what you expect to happen or what success looks like 142 | validations: 143 | required: false 144 | 145 | - type: dropdown 146 | id: urgency 147 | attributes: 148 | label: Urgency 149 | description: How urgent is this question for you? 150 | options: 151 | - Low - General curiosity or learning 152 | - Medium - Helpful for current project 153 | - High - Blocking current work 154 | - Critical - Production issue or deadline-critical 155 | default: 1 156 | validations: 157 | required: true 158 | 159 | - type: textarea 160 | id: additional-info 161 | attributes: 162 | label: Additional Information 163 | description: Any other details that might be helpful 164 | placeholder: | 165 | - Screenshots or diagrams 166 | - Links to relevant documentation you've already read 167 | - Specific Prometheus metrics or queries you're working with 168 | - Network or infrastructure details 169 | - Timeline or constraints 170 | validations: 171 | required: false ``` -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/bug_report.yml: -------------------------------------------------------------------------------- ```yaml 1 | name: 🐛 Bug Report 2 | description: Report a bug or unexpected behavior 3 | title: "[Bug]: " 4 | labels: ["type: bug", "status: needs-triage"] 5 | assignees: [] 6 | body: 7 | - type: markdown 8 | attributes: 9 | value: | 10 | Thank you for taking the time to report this bug! Please provide as much detail as possible to help us resolve the issue quickly. 
11 | 12 | - type: checkboxes 13 | id: checklist 14 | attributes: 15 | label: Pre-submission Checklist 16 | description: Please complete the following checklist before submitting your bug report 17 | options: 18 | - label: I have searched existing issues to ensure this bug hasn't been reported before 19 | required: true 20 | - label: I have checked the documentation and this appears to be a bug, not a configuration issue 21 | required: true 22 | - label: I can reproduce this issue consistently 23 | required: false 24 | 25 | - type: dropdown 26 | id: priority 27 | attributes: 28 | label: Priority Level 29 | description: How critical is this bug to your use case? 30 | options: 31 | - Low - Minor issue, workaround available 32 | - Medium - Moderate impact on functionality 33 | - High - Significant impact, blocks important functionality 34 | - Critical - System unusable, data loss, or security issue 35 | default: 0 36 | validations: 37 | required: true 38 | 39 | - type: textarea 40 | id: bug-description 41 | attributes: 42 | label: Bug Description 43 | description: A clear and concise description of the bug 44 | placeholder: Describe what happened and what you expected to happen instead 45 | validations: 46 | required: true 47 | 48 | - type: textarea 49 | id: reproduction-steps 50 | attributes: 51 | label: Steps to Reproduce 52 | description: Detailed steps to reproduce the bug 53 | placeholder: | 54 | 1. Configure the MCP server with... 55 | 2. Execute the following command... 56 | 3. Observe the following behavior... 57 | value: | 58 | 1. 59 | 2. 60 | 3. 61 | validations: 62 | required: true 63 | 64 | - type: textarea 65 | id: expected-behavior 66 | attributes: 67 | label: Expected Behavior 68 | description: What should happen instead of the bug? 
69 | placeholder: Describe the expected behavior 70 | validations: 71 | required: true 72 | 73 | - type: textarea 74 | id: actual-behavior 75 | attributes: 76 | label: Actual Behavior 77 | description: What actually happens when you follow the reproduction steps? 78 | placeholder: Describe what actually happens 79 | validations: 80 | required: true 81 | 82 | - type: dropdown 83 | id: component 84 | attributes: 85 | label: Affected Component 86 | description: Which component is affected by this bug? 87 | options: 88 | - Prometheus Integration (queries, metrics, API calls) 89 | - MCP Server (transport, protocols, tools) 90 | - Authentication (basic auth, token auth, credentials) 91 | - Configuration (environment variables, setup) 92 | - Docker/Deployment (containerization, deployment) 93 | - Logging (error messages, debug output) 94 | - Documentation (README, guides, API docs) 95 | - Other (please specify in description) 96 | validations: 97 | required: true 98 | 99 | - type: dropdown 100 | id: environment-os 101 | attributes: 102 | label: Operating System 103 | description: On which operating system does this bug occur? 104 | options: 105 | - Linux 106 | - macOS 107 | - Windows 108 | - Docker Container 109 | - Other (please specify) 110 | validations: 111 | required: true 112 | 113 | - type: input 114 | id: environment-python 115 | attributes: 116 | label: Python Version 117 | description: What version of Python are you using? 118 | placeholder: "e.g., 3.11.5, 3.12.0" 119 | validations: 120 | required: true 121 | 122 | - type: input 123 | id: environment-mcp-version 124 | attributes: 125 | label: Prometheus MCP Server Version 126 | description: What version of the Prometheus MCP Server are you using? 
127 | placeholder: "e.g., 1.2.0, latest, commit hash" 128 | validations: 129 | required: true 130 | 131 | - type: input 132 | id: environment-prometheus 133 | attributes: 134 | label: Prometheus Version 135 | description: What version of Prometheus are you connecting to? 136 | placeholder: "e.g., 2.45.0, latest" 137 | validations: 138 | required: false 139 | 140 | - type: dropdown 141 | id: transport-mode 142 | attributes: 143 | label: Transport Mode 144 | description: Which transport mode are you using? 145 | options: 146 | - stdio (default) 147 | - HTTP 148 | - SSE 149 | - Unknown 150 | default: 0 151 | validations: 152 | required: true 153 | 154 | - type: textarea 155 | id: configuration 156 | attributes: 157 | label: Configuration 158 | description: Please share your configuration (remove sensitive information like passwords/tokens) 159 | placeholder: | 160 | Environment variables: 161 | PROMETHEUS_URL=http://localhost:9090 162 | PROMETHEUS_USERNAME=... 163 | 164 | MCP Client configuration: 165 | { 166 | "mcpServers": { 167 | ... 168 | } 169 | } 170 | render: bash 171 | validations: 172 | required: false 173 | 174 | - type: textarea 175 | id: logs 176 | attributes: 177 | label: Error Logs 178 | description: Please include any relevant error messages or logs 179 | placeholder: Paste error messages, stack traces, or relevant log output here 180 | render: text 181 | validations: 182 | required: false 183 | 184 | - type: textarea 185 | id: prometheus-query 186 | attributes: 187 | label: PromQL Query (if applicable) 188 | description: If this bug is related to a specific query, please include it 189 | placeholder: "e.g., up, rate(prometheus_http_requests_total[5m])" 190 | render: promql 191 | validations: 192 | required: false 193 | 194 | - type: textarea 195 | id: workaround 196 | attributes: 197 | label: Workaround 198 | description: Have you found any temporary workaround for this issue? 
199 | placeholder: Describe any workaround you've discovered 200 | validations: 201 | required: false 202 | 203 | - type: textarea 204 | id: additional-context 205 | attributes: 206 | label: Additional Context 207 | description: Any other information that might be helpful 208 | placeholder: | 209 | - Screenshots 210 | - Related issues 211 | - Links to relevant documentation 212 | - Network configuration details 213 | - Prometheus server setup details 214 | validations: 215 | required: false 216 | 217 | - type: checkboxes 218 | id: contribution 219 | attributes: 220 | label: Contribution 221 | options: 222 | - label: I would be willing to submit a pull request to fix this issue 223 | required: false ``` -------------------------------------------------------------------------------- /tests/test_main.py: -------------------------------------------------------------------------------- ```python 1 | """Tests for the main module.""" 2 | 3 | import os 4 | import pytest 5 | from unittest.mock import patch, MagicMock 6 | from prometheus_mcp_server.server import MCPServerConfig 7 | from prometheus_mcp_server.main import setup_environment, run_server 8 | 9 | @patch("prometheus_mcp_server.main.config") 10 | def test_setup_environment_success(mock_config): 11 | """Test successful environment setup.""" 12 | # Setup 13 | mock_config.url = "http://test:9090" 14 | mock_config.username = None 15 | mock_config.password = None 16 | mock_config.token = None 17 | mock_config.org_id = None 18 | mock_config.mcp_server_config = None 19 | 20 | # Execute 21 | result = setup_environment() 22 | 23 | # Verify 24 | assert result is True 25 | 26 | @patch("prometheus_mcp_server.main.config") 27 | def test_setup_environment_missing_url(mock_config): 28 | """Test environment setup with missing URL.""" 29 | # Setup - mock config with no URL 30 | mock_config.url = "" 31 | mock_config.username = None 32 | mock_config.password = None 33 | mock_config.token = None 34 | mock_config.org_id = None 35 | 
mock_config.mcp_server_config = None 36 | 37 | # Execute 38 | result = setup_environment() 39 | 40 | # Verify 41 | assert result is False 42 | 43 | @patch("prometheus_mcp_server.main.config") 44 | def test_setup_environment_with_auth(mock_config): 45 | """Test environment setup with authentication.""" 46 | # Setup 47 | mock_config.url = "http://test:9090" 48 | mock_config.username = "user" 49 | mock_config.password = "pass" 50 | mock_config.token = None 51 | mock_config.org_id = None 52 | mock_config.mcp_server_config = None 53 | 54 | # Execute 55 | result = setup_environment() 56 | 57 | # Verify 58 | assert result is True 59 | 60 | @patch("prometheus_mcp_server.main.config") 61 | def test_setup_environment_with_custom_mcp_config(mock_config): 62 | """Test environment setup with custom mcp config.""" 63 | # Setup 64 | mock_config.url = "http://test:9090" 65 | mock_config.username = "user" 66 | mock_config.password = "pass" 67 | mock_config.token = None 68 | mock_config.mcp_server_config = MCPServerConfig( 69 | mcp_server_transport="http", 70 | mcp_bind_host="localhost", 71 | mcp_bind_port=5000 72 | ) 73 | 74 | # Execute 75 | result = setup_environment() 76 | 77 | # Verify 78 | assert result is True 79 | 80 | @patch("prometheus_mcp_server.main.config") 81 | def test_setup_environment_with_custom_mcp_config_caps(mock_config): 82 | """Test environment setup with custom mcp config.""" 83 | # Setup 84 | mock_config.url = "http://test:9090" 85 | mock_config.username = "user" 86 | mock_config.password = "pass" 87 | mock_config.token = None 88 | mock_config.mcp_server_config = MCPServerConfig( 89 | mcp_server_transport="HTTP", 90 | mcp_bind_host="localhost", 91 | mcp_bind_port=5000 92 | ) 93 | 94 | 95 | # Execute 96 | result = setup_environment() 97 | 98 | # Verify 99 | assert result is True 100 | 101 | @patch("prometheus_mcp_server.main.config") 102 | def test_setup_environment_with_undefined_mcp_server_transports(mock_config): 103 | """Test environment setup with 
undefined mcp_server_transport.""" 104 | with pytest.raises(ValueError, match="MCP SERVER TRANSPORT is required"): 105 | mock_config.mcp_server_config = MCPServerConfig( 106 | mcp_server_transport=None, 107 | mcp_bind_host="localhost", 108 | mcp_bind_port=5000 109 | ) 110 | 111 | @patch("prometheus_mcp_server.main.config") 112 | def test_setup_environment_with_undefined_mcp_bind_host(mock_config): 113 | """Test environment setup with undefined mcp_bind_host.""" 114 | with pytest.raises(ValueError, match="MCP BIND HOST is required"): 115 | mock_config.mcp_server_config = MCPServerConfig( 116 | mcp_server_transport="http", 117 | mcp_bind_host=None, 118 | mcp_bind_port=5000 119 | ) 120 | 121 | @patch("prometheus_mcp_server.main.config") 122 | def test_setup_environment_with_undefined_mcp_bind_port(mock_config): 123 | """Test environment setup with undefined mcp_bind_port.""" 124 | with pytest.raises(ValueError, match="MCP BIND PORT is required"): 125 | mock_config.mcp_server_config = MCPServerConfig( 126 | mcp_server_transport="http", 127 | mcp_bind_host="localhost", 128 | mcp_bind_port=None 129 | ) 130 | 131 | @patch("prometheus_mcp_server.main.config") 132 | def test_setup_environment_with_bad_mcp_config_transport(mock_config): 133 | """Test environment setup with bad transport in mcp config.""" 134 | # Setup 135 | mock_config.url = "http://test:9090" 136 | mock_config.username = "user" 137 | mock_config.password = "pass" 138 | mock_config.token = None 139 | mock_config.org_id = None 140 | mock_config.mcp_server_config = MCPServerConfig( 141 | mcp_server_transport="wrong_transport", 142 | mcp_bind_host="localhost", 143 | mcp_bind_port=5000 144 | ) 145 | 146 | # Execute 147 | result = setup_environment() 148 | 149 | # Verify 150 | assert result is False 151 | 152 | @patch("prometheus_mcp_server.main.config") 153 | def test_setup_environment_with_bad_mcp_config_port(mock_config): 154 | """Test environment setup with bad port in mcp config.""" 155 | # Setup 156 | 
mock_config.url = "http://test:9090" 157 | mock_config.username = "user" 158 | mock_config.password = "pass" 159 | mock_config.token = None 160 | mock_config.org_id = None 161 | mock_config.mcp_server_config = MCPServerConfig( 162 | mcp_server_transport="http", 163 | mcp_bind_host="localhost", 164 | mcp_bind_port="some_string" 165 | ) 166 | 167 | # Execute 168 | result = setup_environment() 169 | 170 | # Verify 171 | assert result is False 172 | 173 | @patch("prometheus_mcp_server.main.setup_environment") 174 | @patch("prometheus_mcp_server.main.mcp.run") 175 | @patch("prometheus_mcp_server.main.sys.exit") 176 | def test_run_server_success(mock_exit, mock_run, mock_setup): 177 | """Test successful server run.""" 178 | # Setup 179 | mock_setup.return_value = True 180 | 181 | # Execute 182 | run_server() 183 | 184 | # Verify 185 | mock_setup.assert_called_once() 186 | mock_exit.assert_not_called() 187 | 188 | @patch("prometheus_mcp_server.main.setup_environment") 189 | @patch("prometheus_mcp_server.main.mcp.run") 190 | @patch("prometheus_mcp_server.main.sys.exit") 191 | def test_run_server_setup_failure(mock_exit, mock_run, mock_setup): 192 | """Test server run with setup failure.""" 193 | # Setup 194 | mock_setup.return_value = False 195 | # Make sys.exit actually stop execution 196 | mock_exit.side_effect = SystemExit(1) 197 | 198 | # Execute - should raise SystemExit 199 | with pytest.raises(SystemExit): 200 | run_server() 201 | 202 | # Verify 203 | mock_setup.assert_called_once() 204 | mock_run.assert_not_called() 205 | 206 | @patch("prometheus_mcp_server.main.config") 207 | @patch("prometheus_mcp_server.main.dotenv.load_dotenv") 208 | def test_setup_environment_bearer_token_auth(mock_load_dotenv, mock_config): 209 | """Test environment setup with bearer token authentication.""" 210 | # Setup 211 | mock_load_dotenv.return_value = False 212 | mock_config.url = "http://test:9090" 213 | mock_config.username = "" 214 | mock_config.password = "" 215 | 
mock_config.token = "bearer_token_123" 216 | mock_config.org_id = None 217 | mock_config.mcp_server_config = None 218 | 219 | # Execute 220 | result = setup_environment() 221 | 222 | # Verify 223 | assert result is True 224 | 225 | @patch("prometheus_mcp_server.main.setup_environment") 226 | @patch("prometheus_mcp_server.main.mcp.run") 227 | @patch("prometheus_mcp_server.main.config") 228 | def test_run_server_http_transport(mock_config, mock_run, mock_setup): 229 | """Test server run with HTTP transport.""" 230 | # Setup 231 | mock_setup.return_value = True 232 | mock_config.mcp_server_config = MCPServerConfig( 233 | mcp_server_transport="http", 234 | mcp_bind_host="localhost", 235 | mcp_bind_port=8080 236 | ) 237 | 238 | # Execute 239 | run_server() 240 | 241 | # Verify 242 | mock_run.assert_called_once_with(transport="http", host="localhost", port=8080) 243 | 244 | @patch("prometheus_mcp_server.main.setup_environment") 245 | @patch("prometheus_mcp_server.main.mcp.run") 246 | @patch("prometheus_mcp_server.main.config") 247 | def test_run_server_sse_transport(mock_config, mock_run, mock_setup): 248 | """Test server run with SSE transport.""" 249 | # Setup 250 | mock_setup.return_value = True 251 | mock_config.mcp_server_config = MCPServerConfig( 252 | mcp_server_transport="sse", 253 | mcp_bind_host="0.0.0.0", 254 | mcp_bind_port=9090 255 | ) 256 | 257 | # Execute 258 | run_server() 259 | 260 | # Verify 261 | mock_run.assert_called_once_with(transport="sse", host="0.0.0.0", port=9090) 262 | ``` -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.yml: -------------------------------------------------------------------------------- ```yaml 1 | name: ✨ Feature Request 2 | description: Suggest a new feature or enhancement 3 | title: "[Feature]: " 4 | labels: ["type: feature", "status: needs-triage"] 5 | assignees: [] 6 | body: 7 | - type: markdown 8 | attributes: 9 | value: | 10 | Thank you for 
suggesting a new feature! Please provide detailed information to help us understand and evaluate your request.

  - type: checkboxes
    id: checklist
    attributes:
      label: Pre-submission Checklist
      description: Please complete the following checklist before submitting your feature request
      options:
        - label: I have searched existing issues and discussions for similar feature requests
          required: true
        - label: I have checked the documentation to ensure this feature doesn't already exist
          required: true
        - label: This feature request is related to the Prometheus MCP Server project
          required: true

  - type: dropdown
    id: feature-type
    attributes:
      label: Feature Type
      description: What type of feature are you requesting?
      options:
        - New MCP Tool (new functionality for AI assistants)
        - Prometheus Integration Enhancement (better Prometheus support)
        - Authentication Enhancement (new auth methods, security)
        - Configuration Option (new settings, customization)
        - Performance Improvement (optimization, caching)
        - Developer Experience (tooling, debugging, logging)
        - Documentation Improvement (guides, examples, API docs)
        - Deployment Feature (Docker, cloud, packaging)
        - Other (please specify in description)
    validations:
      required: true

  - type: dropdown
    id: priority
    attributes:
      label: Priority Level
      description: How important is this feature to your use case?
      options:
        - Low - Nice to have, not critical
        - Medium - Would improve workflow significantly
        - High - Important for broader adoption
        - Critical - Blocking critical functionality
      default: 1
    validations:
      required: true

  - type: textarea
    id: feature-summary
    attributes:
      label: Feature Summary
      description: A clear and concise description of the feature you'd like to see
      placeholder: Briefly describe the feature in 1-2 sentences
    validations:
      required: true

  - type: textarea
    id: problem-statement
    attributes:
      label: Problem Statement
      description: What problem does this feature solve? What pain point are you experiencing?
      placeholder: |
        Describe the current limitation or problem:
        - What are you trying to accomplish?
        - What obstacles are preventing you from achieving your goal?
        - How does this impact your workflow?
    validations:
      required: true

  - type: textarea
    id: proposed-solution
    attributes:
      label: Proposed Solution
      description: Describe your ideal solution to the problem
      placeholder: |
        Describe your proposed solution:
        - How would this feature work?
        - What would the user interface/API look like?
        - How would users interact with this feature?
    validations:
      required: true

  - type: textarea
    id: use-cases
    attributes:
      label: Use Cases
      description: Provide specific use cases and scenarios where this feature would be beneficial
      placeholder: |
        1. Use case: As a DevOps engineer, I want to...
           - Steps: ...
           - Expected outcome: ...

        2. Use case: As an AI assistant user, I want to...
           - Steps: ...
           - Expected outcome: ...
    validations:
      required: true

  - type: dropdown
    id: component
    attributes:
      label: Affected Component
      description: Which component would this feature primarily affect?
      options:
        - Prometheus Integration (queries, metrics, API)
        - MCP Server (tools, transport, protocol)
        - Authentication (auth methods, security)
        - Configuration (settings, environment vars)
        - Docker/Deployment (containers, packaging)
        - Logging/Monitoring (observability, debugging)
        - Documentation (guides, examples)
        - Testing (test framework, CI/CD)
        - Multiple Components
        - New Component
    validations:
      required: true

  - type: textarea
    id: technical-details
    attributes:
      label: Technical Implementation Ideas
      description: If you have technical ideas about implementation, share them here
      placeholder: |
        - Suggested API changes
        - New configuration options
        - Integration points
        - Technical considerations
        - Dependencies that might be needed
    validations:
      required: false

  - type: textarea
    id: examples
    attributes:
      label: Examples and Mockups
      description: Provide examples, mockups, or pseudo-code of how this feature would work
      placeholder: |
        Example configuration:
        ```json
        {
          "new_feature": {
            "enabled": true,
            "settings": "..."
          }
        }
        ```

        Example usage:
        ```bash
        prometheus-mcp-server --new-feature-option
        ```
      render: markdown
    validations:
      required: false

  - type: textarea
    id: alternatives
    attributes:
      label: Alternatives Considered
      description: Have you considered any alternative solutions or workarounds?
      placeholder: |
        - Alternative approach 1: ...
        - Alternative approach 2: ...
        - Current workarounds: ...
        - Why these alternatives are not sufficient: ...
    validations:
      required: false

  - type: dropdown
    id: breaking-changes
    attributes:
      label: Breaking Changes
      description: Would implementing this feature require breaking changes?
      options:
        - No breaking changes expected
        - Minor breaking changes (with migration path)
        - Major breaking changes required
        - Unknown/Need to investigate
      default: 0
    validations:
      required: true

  - type: textarea
    id: compatibility
    attributes:
      label: Compatibility Considerations
      description: What compatibility concerns should be considered?
      placeholder: |
        - Prometheus version compatibility
        - Python version requirements
        - MCP client compatibility
        - Operating system considerations
        - Dependencies that might conflict
    validations:
      required: false

  - type: textarea
    id: success-criteria
    attributes:
      label: Success Criteria
      description: How would we know this feature is successfully implemented?
      placeholder: |
        - Specific metrics or behaviors that indicate success
        - User experience improvements
        - Performance benchmarks
        - Integration test scenarios
    validations:
      required: false

  - type: textarea
    id: related-work
    attributes:
      label: Related Work
      description: Are there related features in other tools or projects?
      placeholder: |
        - Similar features in other MCP servers
        - Prometheus ecosystem tools that do something similar
        - References to relevant documentation or standards
    validations:
      required: false

  - type: textarea
    id: additional-context
    attributes:
      label: Additional Context
      description: Any other information that might be helpful
      placeholder: |
        - Links to relevant documentation
        - Screenshots or diagrams
        - Community discussions
        - Business justification
        - Timeline constraints
    validations:
      required: false

  - type: checkboxes
    id: contribution
    attributes:
      label: Contribution
      options:
        - label: I would be willing to contribute to the implementation of this feature
          required: false
        - label: I would be willing to help with testing this feature
          required: false
        - label: I would be willing to help with documentation for this feature
          required: false
```

--------------------------------------------------------------------------------
/.github/VALIDATION_SUMMARY.md:
--------------------------------------------------------------------------------

```markdown
# GitHub Workflow Automation - Validation Summary

## ✅ Successfully Created Files

### GitHub Actions Workflows
- ✅ `bug-triage.yml` - Core triage automation (23KB)
- ✅ `issue-management.yml` - Advanced issue management (16KB)
- ✅ `label-management.yml` - Label schema management (8KB)
- ✅ `triage-metrics.yml` - Metrics and reporting (15KB)

### Issue Templates
- ✅ `bug_report.yml` - Comprehensive bug report template (6.4KB)
- ✅ `feature_request.yml` - Feature request template (8.2KB)
- ✅ `question.yml` - Support/question template (5.5KB)
- ✅ `config.yml` - Issue template configuration (506B)

### Documentation
- ✅ `TRIAGE_AUTOMATION.md` - Complete system documentation (15KB)

## 🔍 Validation Results

### Workflow Structure ✅
- All workflows have proper YAML structure
- Correct event triggers configured
- Proper job definitions and steps
- GitHub Actions syntax validated

### Permissions ✅
- Appropriate permissions set for each workflow
- Read access to contents and pull requests
- Write access to issues for automation

### Integration Points ✅
- Workflows coordinate properly with each other
- No conflicting automation rules
- Proper event handling to avoid infinite loops

## 🎯 Key Features Implemented

### 1. Intelligent Auto-Triage
- **Pattern-based labeling**: Analyzes issue content for automatic categorization
- **Priority detection**: Identifies critical, high, medium, and low priority issues
- **Component classification**: Routes issues to appropriate maintainers
- **Environment detection**: Identifies OS and platform-specific issues

### 2. Smart Assignment System
- **Component-based routing**: Auto-assigns based on affected components
- **Priority escalation**: Critical issues get immediate attention and notification
- **Load balancing**: Future-ready for multiple maintainers

### 3. Comprehensive Issue Templates
- **Structured data collection**: Consistent information gathering
- **Validation requirements**: Ensures quality submissions
- **Multiple issue types**: Bug reports, feature requests, questions
- **Pre-submission checklists**: Reduces duplicate and low-quality issues

### 4. Advanced Label Management
- **Hierarchical schema**: Priority, status, component, type, environment labels
- **Automatic synchronization**: Keeps labels consistent across repository
- **Migration support**: Handles deprecated label transitions
- **Audit capabilities**: Reports on label usage and health

### 5. Stale Issue Management
- **Automated cleanup**: Marks stale after 30 days, closes after 37 days
- **Smart detection**: Avoids marking active discussions as stale
- **Reactivation support**: Activity removes stale status automatically

### 6. PR Integration
- **Issue linking**: Automatically links PRs to referenced issues
- **Status updates**: Updates issue status during PR lifecycle
- **Resolution tracking**: Marks issues resolved when PRs merge

### 7. Metrics and Reporting
- **Daily metrics**: Tracks triage performance and health
- **Weekly reports**: Comprehensive analysis and recommendations
- **Health monitoring**: Identifies issues needing attention
- **Performance tracking**: Response times, resolution rates, quality metrics

### 8. Duplicate Detection
- **Smart matching**: Identifies potential duplicates based on title similarity
- **Automatic notification**: Alerts users to check existing issues
- **Manual override**: Maintainers can confirm or dismiss duplicate flags

## 🚦 Workflow Triggers

### Real-time Triggers
- Issue opened/edited/labeled/assigned
- Comments created/edited
- Pull requests opened/closed/merged

### Scheduled Triggers
- **Every 6 hours**: Core triage maintenance
- **Daily at 9 AM UTC**: Issue health checks
- **Daily at 8 AM UTC**: Metrics collection
- **Weekly on Mondays**: Detailed reporting
- **Weekly on Sundays**: Label synchronization

### Manual Triggers
- All workflows support manual dispatch
- Customizable parameters for different operations
- Emergency triage and cleanup operations

## 📊 Expected Performance Metrics

### Triage Efficiency
- **Target**: <24 hours for initial triage
- **Measurement**: Time from issue creation to first label assignment
- **Automation**: 80%+ of issues auto-labeled correctly

### Response Times
- **Target**: <48 hours for first maintainer response
- **Measurement**: Time from issue creation to first maintainer comment
- **Tracking**: Automated measurement and reporting

### Quality Improvements
- **Template adoption**: Expect >90% of issues using templates
- **Complete information**: Reduced requests for additional details
- **Reduced duplicates**: Better duplicate detection and prevention

### Issue Health
- **Stale rate**: Target <10% of open issues marked stale
- **Resolution rate**: Track monthly resolved vs. new issues
- **Backlog management**: Automated cleanup of inactive issues

## ⚙️ Configuration Management

### Environment Variables
- No additional environment variables required
- Uses GitHub's built-in GITHUB_TOKEN for authentication
- Repository settings control permissions

### Customization Points
- Assignee mappings in workflow scripts (currently set to @pab1it0)
- Stale issue timeouts (30 days stale, 7 days to close)
- Pattern matching keywords for auto-labeling
- Metric collection intervals and retention

## 🔧 Manual Override Capabilities

### Workflow Control
- All automated actions can be manually overridden
- Manual workflow dispatch with custom parameters
- Emergency stop capabilities for problematic automations

### Issue Management
- Manual label addition/removal takes precedence
- Manual assignment overrides automation
- Stale status can be cleared by commenting
- Critical issues can be manually escalated

## 🚀 Production Readiness

### Security
- ✅ Minimal required permissions
- ✅ No sensitive data exposure
- ✅ Rate limiting considerations
- ✅ Error handling for API failures

### Reliability
- ✅ Graceful degradation on failures
- ✅ Idempotent operations
- ✅ No infinite loop potential
- ✅ Proper error logging

### Scalability
- ✅ Efficient API usage patterns
- ✅ Pagination for large datasets
- ✅ Configurable batch sizes
- ✅ Async operation support

### Maintainability
- ✅ Well-documented workflows
- ✅ Modular job structure
- ✅ Clear separation of concerns
- ✅ Comprehensive logging

## 🏃‍♂️ Next Steps

### Immediate Actions
1. **Test workflows**: Create test issues to validate automation
2. **Monitor metrics**: Review initial triage performance
3. **Adjust patterns**: Fine-tune auto-labeling based on actual issues
4. **Train team**: Ensure maintainers understand the system

### Weekly Tasks
1. Review weekly triage reports
2. Check workflow execution logs
3. Adjust assignment rules if needed
4. Update documentation based on learnings

### Monthly Tasks
1. Audit label usage and clean deprecated labels
2. Review automation effectiveness metrics
3. Update workflow patterns based on issue trends
4. Plan system improvements and optimizations

## 🔍 Testing Recommendations

### Manual Testing
1. **Create test issues** with different types and priorities
2. **Test label synchronization** via manual workflow dispatch
3. **Verify assignment rules** by creating component-specific issues
4. **Test stale issue handling** with old test issues
5. **Validate metrics collection** after several days of operation

### Integration Testing
1. **PR workflow integration** - test issue linking and status updates
2. **Cross-workflow coordination** - ensure workflows don't conflict
3. **Performance under load** - test with multiple simultaneous issues
4. **Error handling** - test with malformed inputs and API failures

## ⚠️ Known Limitations

1. **Single maintainer setup**: Currently configured for one maintainer (@pab1it0)
2. **English-only pattern matching**: Auto-labeling works best with English content
3. **GitHub API rate limits**: May need adjustment for high-volume repositories
4. **Manual review required**: Some edge cases will still need human judgment

## 📈 Success Metrics

Track these metrics to measure automation success:

- **Triage time reduction**: Compare before/after automation
- **Response time consistency**: More predictable maintainer responses
- **Issue quality improvement**: Better structured, complete issue reports
- **Maintainer satisfaction**: Less manual triage work, focus on solutions
- **Contributor experience**: Faster feedback, clearer communication

---

**Status**: ✅ **READY FOR PRODUCTION**

All workflows are production-ready and can be safely deployed. The system will begin operating automatically once the files are committed to the main branch.
```

--------------------------------------------------------------------------------
/tests/test_server.py:
--------------------------------------------------------------------------------

```python
"""Tests for the Prometheus MCP server functionality."""

import pytest
import requests
from unittest.mock import patch, MagicMock
import asyncio
from prometheus_mcp_server.server import make_prometheus_request, get_prometheus_auth, config

@pytest.fixture
def mock_response():
    """Create a mock response object for requests."""
    mock = MagicMock()
    mock.raise_for_status = MagicMock()
    mock.json.return_value = {
        "status": "success",
        "data": {
            "resultType": "vector",
            "result": []
        }
    }
    return mock

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_no_auth(mock_get, mock_response):
    """Test making a request to Prometheus with no authentication."""
    # Setup
    mock_get.return_value = mock_response
    config.url = "http://test:9090"
    config.username = ""
    config.password = ""
    config.token = ""

    # Execute
    result = make_prometheus_request("query", {"query": "up"})

    # Verify
    mock_get.assert_called_once()
    assert result == {"resultType": "vector", "result": []}

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_with_basic_auth(mock_get, mock_response):
    """Test making a request to Prometheus with basic authentication."""
    # Setup
    mock_get.return_value = mock_response
    config.url = "http://test:9090"
    config.username = "user"
    config.password = "pass"
    config.token = ""

    # Execute
    result = make_prometheus_request("query", {"query": "up"})

    # Verify
    mock_get.assert_called_once()
    assert result == {"resultType": "vector", "result": []}

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_with_token_auth(mock_get, mock_response):
    """Test making a request to Prometheus with token authentication."""
    # Setup
    mock_get.return_value = mock_response
    config.url = "http://test:9090"
    config.username = ""
    config.password = ""
    config.token = "token123"

    # Execute
    result = make_prometheus_request("query", {"query": "up"})

    # Verify
    mock_get.assert_called_once()
    assert result == {"resultType": "vector", "result": []}

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_error(mock_get):
    """Test handling of an error response from Prometheus."""
    # Setup
    mock_response = MagicMock()
    mock_response.raise_for_status = MagicMock()
    mock_response.json.return_value = {"status": "error", "error": "Test error"}
    mock_get.return_value = mock_response
    config.url = "http://test:9090"

    # Execute and verify
    with pytest.raises(ValueError, match="Prometheus API error: Test error"):
        make_prometheus_request("query", {"query": "up"})

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_connection_error(mock_get):
    """Test handling of connection errors."""
    # Setup
    mock_get.side_effect = requests.ConnectionError("Connection failed")
    config.url = "http://test:9090"

    # Execute and verify
    with pytest.raises(requests.ConnectionError):
        make_prometheus_request("query", {"query": "up"})

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_timeout(mock_get):
    """Test handling of timeout errors."""
    # Setup
    mock_get.side_effect = requests.Timeout("Request timeout")
    config.url = "http://test:9090"

    # Execute and verify
    with pytest.raises(requests.Timeout):
        make_prometheus_request("query", {"query": "up"})

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_http_error(mock_get):
    """Test handling of HTTP errors."""
    # Setup
    mock_response = MagicMock()
    mock_response.raise_for_status.side_effect = requests.HTTPError("HTTP 500 Error")
    mock_get.return_value = mock_response
    config.url = "http://test:9090"

    # Execute and verify
    with pytest.raises(requests.HTTPError):
        make_prometheus_request("query", {"query": "up"})

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_json_error(mock_get):
    """Test handling of JSON decode errors."""
    # Setup
    mock_response = MagicMock()
    mock_response.raise_for_status = MagicMock()
    mock_response.json.side_effect = requests.exceptions.JSONDecodeError("Invalid JSON", "", 0)
    mock_get.return_value = mock_response
    config.url = "http://test:9090"

    # Execute and verify
    with pytest.raises(requests.exceptions.JSONDecodeError):
        make_prometheus_request("query", {"query": "up"})

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_pure_json_decode_error(mock_get):
    """Test handling of pure json.JSONDecodeError."""
    import json
    # Setup
    mock_response = MagicMock()
    mock_response.raise_for_status = MagicMock()
    mock_response.json.side_effect = json.JSONDecodeError("Invalid JSON", "", 0)
    mock_get.return_value = mock_response
    config.url = "http://test:9090"

    # Execute and verify - should be converted to ValueError
    with pytest.raises(ValueError, match="Invalid JSON response from Prometheus"):
        make_prometheus_request("query", {"query": "up"})

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_missing_url(mock_get):
    """Test make_prometheus_request with missing URL configuration."""
    # Setup
    original_url = config.url
    config.url = ""  # Simulate missing URL

    # Execute and verify
    with pytest.raises(ValueError, match="Prometheus configuration is missing"):
        make_prometheus_request("query", {"query": "up"})

    # Cleanup
    config.url = original_url

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_with_org_id(mock_get, mock_response):
    """Test making a request with org_id header."""
    # Setup
    mock_get.return_value = mock_response
    config.url = "http://test:9090"
    original_org_id = config.org_id
    config.org_id = "test-org"

    # Execute
    result = make_prometheus_request("query", {"query": "up"})

    # Verify
    mock_get.assert_called_once()
    assert result == {"resultType": "vector", "result": []}

    # Check that org_id header was included
    call_args = mock_get.call_args
    headers = call_args[1]['headers']
    assert 'X-Scope-OrgID' in headers
    assert headers['X-Scope-OrgID'] == 'test-org'

    # Cleanup
    config.org_id = original_org_id

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_request_exception(mock_get):
    """Test handling of generic request exceptions."""
    # Setup
    mock_get.side_effect = requests.exceptions.RequestException("Generic request error")
    config.url = "http://test:9090"

    # Execute and verify
    with pytest.raises(requests.exceptions.RequestException):
        make_prometheus_request("query", {"query": "up"})

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_response_error(mock_get):
    """Test handling of response errors from Prometheus."""
    # Setup - mock HTTP error response
    mock_response = MagicMock()
    mock_response.raise_for_status.side_effect = requests.HTTPError("HTTP 500 Server Error")
    mock_response.status_code = 500
    mock_get.return_value = mock_response
    config.url = "http://test:9090"

    # Execute and verify
    with pytest.raises(requests.HTTPError):
        make_prometheus_request("query", {"query": "up"})

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_generic_exception(mock_get):
    """Test handling of unexpected exceptions."""
    # Setup
    mock_get.side_effect = Exception("Unexpected error")
    config.url = "http://test:9090"

    # Execute and verify
    with pytest.raises(Exception, match="Unexpected error"):
        make_prometheus_request("query", {"query": "up"})

@patch("prometheus_mcp_server.server.requests.get")
def test_make_prometheus_request_list_data_format(mock_get):
    """Test make_prometheus_request with list data format."""
    # Setup - mock response with list data format
    mock_response = MagicMock()
    mock_response.raise_for_status = MagicMock()
    mock_response.json.return_value = {
        "status": "success",
        "data": [{"metric": {}, "value": [1609459200, "1"]}]  # List format instead of dict
    }
    mock_get.return_value = mock_response
    config.url = "http://test:9090"

    # Execute
    result = make_prometheus_request("query", {"query": "up"})

    # Verify
    assert result == [{"metric": {}, "value": [1609459200, "1"]}]
```

--------------------------------------------------------------------------------
/.github/TRIAGE_AUTOMATION.md:
--------------------------------------------------------------------------------

```markdown
# Bug Triage Automation Documentation

This document describes the automated bug triage system implemented for the Prometheus MCP Server repository using GitHub Actions.

## Overview

The automated triage system helps maintain issue quality, improve response times, and ensure consistent handling of bug reports and feature requests through intelligent automation.

## System Components

### 1. Automated Workflows

#### `bug-triage.yml` - Core Triage Automation
- **Triggers**: Issue events (opened, edited, labeled, unlabeled, assigned, unassigned), issue comments, scheduled runs (every 6 hours), manual dispatch
- **Functions**:
  - Auto-labels new issues based on content analysis
  - Assigns issues to maintainers based on component labels
  - Updates triage status when issues are assigned
  - Welcomes new contributors
  - Manages stale issues (marks stale after 30 days, closes after 7 additional days)
  - Links PRs to issues and updates status on PR merge

#### `issue-management.yml` - Advanced Issue Management
- **Triggers**: Issue events, comments, daily scheduled runs, manual dispatch
- **Functions**:
  - Enhanced auto-triage with pattern matching
  - Smart assignment based on content and labels
  - Issue health monitoring and escalation
  - Comment-based automated responses
  - Duplicate detection for new issues

#### `label-management.yml` - Label Consistency
- **Triggers**: Manual dispatch, weekly scheduled runs
- **Functions**:
  - Synchronizes label schema across the repository
  - Creates missing labels with proper colors and descriptions
  - Audits and reports on unused labels
  - Migrates deprecated labels to new schema

#### `triage-metrics.yml` - Reporting and Analytics
- **Triggers**: Daily and weekly scheduled runs, manual dispatch
- **Functions**:
  - Collects comprehensive triage metrics
  - Generates detailed markdown reports
  - Tracks response times and resolution rates
  - Monitors triage efficiency and quality
  - Creates weekly summary issues

### 2. Issue Templates

#### Bug Report Template (`bug_report.yml`)
Comprehensive template for bug reports including:
- Pre-submission checklist
- Priority level classification
- Detailed reproduction steps
- Environment information (OS, Python version, Prometheus version)
- Configuration and log collection
- Component classification

#### Feature Request Template (`feature_request.yml`)
Structured template for feature requests including:
- Feature type classification
- Problem statement and proposed solution
- Use cases and technical implementation ideas
- Breaking change assessment
- Success criteria and compatibility considerations

#### Question/Support Template (`question.yml`)
Template for questions and support requests including:
- Question type classification
- Experience level indication
- Current setup and attempted solutions
- Urgency level assessment

### 3. Label Schema

The system uses a hierarchical label structure:

#### Priority Labels
- `priority: critical` - Immediate attention required
- `priority: high` - Should be addressed soon
- `priority: medium` - Normal timeline
- `priority: low` - Can be addressed when convenient

#### Status Labels
- `status: needs-triage` - Issue needs initial triage
- `status: in-progress` - Actively being worked on
- `status: waiting-for-response` - Waiting for issue author
- `status: stale` - Marked as stale due to inactivity
- `status: in-review` - Has associated PR under review
- `status: blocked` - Blocked by external dependencies

#### Component Labels
- `component: prometheus` - Prometheus integration issues
- `component: mcp-server` - MCP server functionality
- `component: deployment` - Deployment and containerization
- `component: authentication` - Authentication mechanisms
- `component: configuration` - Configuration and setup
- `component: logging` - Logging and monitoring

#### Type Labels
- `type: bug` - Something isn't working as expected
- `type: feature` - New feature or enhancement
- `type: documentation` - Documentation improvements
- `type: performance` - Performance-related issues
- `type: testing` - Testing and QA related
- `type: maintenance` - Maintenance and technical debt

#### Environment Labels
- `env: windows` - Windows-specific issues
- `env: macos` - macOS-specific issues
- `env: linux` - Linux-specific issues
- `env: docker` - Docker deployment issues

#### Difficulty Labels
- `difficulty: beginner` - Good for newcomers
- `difficulty: intermediate` - Requires moderate experience
- `difficulty: advanced` - Requires deep codebase knowledge

## Automation Rules

### Auto-Labeling Rules

1. **Priority Detection**:
   - `critical`: Keywords like "critical", "crash", "data loss", "security"
   - `high`: Keywords like "urgent", "blocking"
   - `low`: Keywords like "minor", "cosmetic"
   - `medium`: Default for other issues

2. **Component Detection**:
   - `prometheus`: Keywords related to Prometheus, metrics, PromQL
   - `mcp-server`: Keywords related to MCP, server, transport
   - `deployment`: Keywords related to Docker, containers, deployment
   - `authentication`: Keywords related to auth, tokens, credentials

3. **Type Detection**:
   - `feature`: Keywords like "feature", "enhancement", "improvement"
   - `documentation`: Keywords related to docs, documentation
   - `performance`: Keywords like "performance", "slow"
   - `bug`: Default for issues not matching other types

### Assignment Rules

Issues are automatically assigned based on:
- Component labels (all components currently assign to @pab1it0)
- Priority levels (critical issues get immediate assignment with notification)
- Special handling for performance and authentication issues

### Stale Issue Management

1. Issues with no activity for 30 days are marked as `stale`
2. A comment is added explaining the stale status
3. Issues remain stale for 7 days before being automatically closed
4. Stale issues that receive activity have the stale label removed

### PR Integration

1. PRs that reference issues with "closes #X" syntax automatically:
   - Add a comment to the linked issue
   - Apply `status: in-review` label to the issue
2. When PRs are merged:
   - Add resolution comment to linked issues
   - Remove `status: in-review` label

## Metrics and Reporting

### Daily Metrics Collection
- Total open/closed issues
- Triage status distribution
- Response time averages
- Label distribution analysis

### Weekly Reporting
Comprehensive reports include:
- Overview statistics
- Triage efficiency metrics
- Response time analysis
- Label distribution
- Contributor activity
- Quality metrics
- Actionable recommendations

### Health Monitoring
The system monitors:
- Issues needing attention (>3 days without triage)
- Stale issues (>30 days without activity)
- Missing essential labels
- High-priority unassigned issues
- Potential duplicate issues

## Manual Controls

### Workflow Dispatch Options

#### Bug Triage Workflow
- `triage_all`: Re-triage all open issues

#### Label Management Workflow
- `sync`: Create/update all labels
- `create-missing`: Only create missing labels
- `audit`: Report on unused/deprecated labels
- `cleanup`: Migrate deprecated labels on issues

#### Issue Management Workflow
- `health-check`: Run issue health analysis
- `close-stale`: Process stale issue closure
- `update-metrics`: Refresh metric calculations
- `sync-labels`: Synchronize label schema

#### Metrics Workflow
- `daily`/`weekly`/`monthly`: Generate period reports
- `custom`: Custom date range analysis

## Best Practices

### For Maintainers

1. **Regular Monitoring**:
   - Check weekly triage reports
   - Review health check notifications
   - Act on escalated high-priority issues

2. **Label Hygiene**:
   - Use consistent labeling patterns
   - Run label sync weekly
   - Audit unused labels monthly

3. **Response Times**:
   - Aim to respond to new issues within 48 hours
   - Prioritize critical and high-priority issues
   - Use template responses for common questions

### For Contributors

1. **Issue Creation**:
   - Use appropriate issue templates
   - Provide complete information requested in templates
   - Check for existing similar issues before creating new ones

2. **Issue Updates**:
   - Respond promptly to requests for additional information
   - Update issues when circumstances change
   - Close issues when resolved independently

## Troubleshooting

### Common Issues

1. **Labels Not Applied**: Check if issue content matches pattern keywords
2. **Assignment Not Working**: Verify component labels are correctly applied
3. **Stale Issues**: Issues marked stale can be reactivated by adding comments
4. **Duplicate Detection**: May flag similar but distinct issues - review carefully

### Manual Overrides

All automated actions can be manually overridden:
- Add/remove labels manually
- Change assignments
- Remove stale status by commenting
- Close/reopen issues as needed

## Configuration

### Environment Variables
No additional environment variables required - system uses GitHub tokens automatically.

### Permissions
Workflows require:
- `issues: write` - For label and assignment management
- `contents: read` - For repository access
- `pull-requests: read` - For PR integration

## Monitoring and Maintenance

### Regular Tasks
1. **Weekly**: Review triage reports and health metrics
2. **Monthly**: Audit label usage and clean up deprecated labels
3. **Quarterly**: Review automation rules and adjust based on repository needs

### Performance Metrics
- Triage time: Target <24 hours for initial triage
- Response time: Target <48 hours for first maintainer response
- Resolution time: Varies by issue complexity and priority
- Stale rate: Target <10% of open issues marked as stale

## Future Enhancements

Potential improvements to consider:
1. **AI-Powered Classification**: Use GitHub Copilot or similar for smarter issue categorization
2. **Integration with External Tools**: Connect to project management tools or monitoring systems
3. **Advanced Duplicate Detection**: Implement semantic similarity matching
4. **Automated Testing**: Trigger relevant tests based on issue components
5. **Community Health Metrics**: Track contributor engagement and satisfaction

---

For questions about the triage automation system, please create an issue with the `type: documentation` label.
```

--------------------------------------------------------------------------------
/src/prometheus_mcp_server/server.py:
--------------------------------------------------------------------------------

```python
#!/usr/bin/env python

import os
import json
from typing import Any, Dict, List, Optional, Union
from dataclasses import dataclass
import time
from datetime import datetime, timedelta
from enum import Enum

import dotenv
import requests
from fastmcp import FastMCP
from prometheus_mcp_server.logging_config import get_logger

dotenv.load_dotenv()
mcp = FastMCP("Prometheus MCP")

# Get logger instance
logger = get_logger()

# Health check tool for Docker containers and monitoring
@mcp.tool(description="Health check endpoint for container monitoring and status verification")
async def health_check() -> Dict[str, Any]:
    """Return health status of the MCP server and Prometheus connection.

    Returns:
        Health status including service information, configuration, and connectivity
    """
    try:
        health_status = {
            "status": "healthy",
            "service": "prometheus-mcp-server",
            "version": "1.2.3",
            "timestamp": datetime.utcnow().isoformat(),
            "transport": config.mcp_server_config.mcp_server_transport if config.mcp_server_config else "stdio",
            "configuration": {
                "prometheus_url_configured": bool(config.url),
                "authentication_configured": bool(config.username or config.token),
                "org_id_configured": bool(config.org_id)
            }
        }

        # Test Prometheus connectivity if configured
        if config.url:
            try:
                # Quick connectivity test
                make_prometheus_request("query", params={"query": "up", "time": str(int(time.time()))})
                health_status["prometheus_connectivity"] = "healthy"
                health_status["prometheus_url"] = config.url
            except Exception as e:
                health_status["prometheus_connectivity"] = "unhealthy"
                health_status["prometheus_error"] = str(e)
                health_status["status"] = "degraded"
        else:
            health_status["status"] = "unhealthy"
            health_status["error"] = "PROMETHEUS_URL not configured"

        logger.info("Health check completed", status=health_status["status"])
        return health_status

    except Exception as e:
        logger.error("Health check failed", error=str(e))
        return {
            "status": "unhealthy",
            "service": "prometheus-mcp-server",
            "error": str(e),
            "timestamp": datetime.utcnow().isoformat()
        }


class TransportType(str, Enum):
    """Supported MCP server transport types."""

    STDIO = "stdio"
    HTTP = "http"
    SSE = "sse"

    @classmethod
    def values(cls) -> list[str]:
        """Get all valid transport values."""
        return [transport.value for transport in cls]

@dataclass
class MCPServerConfig:
    """Global Configuration for MCP."""
    mcp_server_transport: TransportType = None
    mcp_bind_host: str = None
    mcp_bind_port: int = None

    def __post_init__(self):
        """Validate mcp configuration."""
        if not self.mcp_server_transport:
            raise ValueError("MCP SERVER TRANSPORT is required")
        if not self.mcp_bind_host:
            raise ValueError("MCP BIND HOST is required")
        if not self.mcp_bind_port:
            raise ValueError("MCP BIND PORT is required")

@dataclass
class PrometheusConfig:
    url: str
    # Optional credentials
    username: Optional[str] = None
    password: Optional[str] = None
    token: Optional[str] = None
    # Optional Org ID for multi-tenant setups
    org_id: Optional[str] = None
    # Optional Custom MCP Server Configuration
    mcp_server_config: Optional[MCPServerConfig] = None

config = PrometheusConfig(
    url=os.environ.get("PROMETHEUS_URL", ""),
    username=os.environ.get("PROMETHEUS_USERNAME", ""),
    password=os.environ.get("PROMETHEUS_PASSWORD", ""),
    token=os.environ.get("PROMETHEUS_TOKEN", ""),
    org_id=os.environ.get("ORG_ID", ""),
    mcp_server_config=MCPServerConfig(
        mcp_server_transport=os.environ.get("PROMETHEUS_MCP_SERVER_TRANSPORT", "stdio").lower(),
        mcp_bind_host=os.environ.get("PROMETHEUS_MCP_BIND_HOST", "127.0.0.1"),
        mcp_bind_port=int(os.environ.get("PROMETHEUS_MCP_BIND_PORT", "8080"))
    )
)

def get_prometheus_auth():
    """Get authentication for Prometheus based on provided credentials."""
    if config.token:
        return {"Authorization": f"Bearer {config.token}"}
    elif config.username and config.password:
        return requests.auth.HTTPBasicAuth(config.username, config.password)
    return None

def make_prometheus_request(endpoint, params=None):
    """Make a request to the Prometheus API with proper authentication and headers."""
    if not config.url:
        logger.error("Prometheus configuration missing", error="PROMETHEUS_URL not set")
        raise ValueError("Prometheus configuration is missing. Please set PROMETHEUS_URL environment variable.")

    url = f"{config.url.rstrip('/')}/api/v1/{endpoint}"
    auth = get_prometheus_auth()
    headers = {}

    if isinstance(auth, dict):  # Token auth is passed via headers
        headers.update(auth)
        auth = None  # Clear auth for requests.get if it's already in headers

    # Add OrgID header if specified
    if config.org_id:
        headers["X-Scope-OrgID"] = config.org_id

    try:
        logger.debug("Making Prometheus API request", endpoint=endpoint, url=url, params=params)

        # Make the request with appropriate headers and auth
        response = requests.get(url, params=params, auth=auth, headers=headers)

        response.raise_for_status()
        result = response.json()

        if result["status"] != "success":
            error_msg = result.get('error', 'Unknown error')
            logger.error("Prometheus API returned error", endpoint=endpoint, error=error_msg, status=result["status"])
            raise ValueError(f"Prometheus API error: {error_msg}")

        data_field = result.get("data", {})
        if isinstance(data_field, dict):
            result_type = data_field.get("resultType")
        else:
            result_type = "list"
        logger.debug("Prometheus API request successful", endpoint=endpoint, result_type=result_type)
        return result["data"]

    except requests.exceptions.RequestException as e:
        logger.error("HTTP request to Prometheus failed", endpoint=endpoint, url=url, error=str(e), error_type=type(e).__name__)
        raise
    except json.JSONDecodeError as e:
        logger.error("Failed to parse Prometheus response as JSON", endpoint=endpoint, url=url, error=str(e))
        raise ValueError(f"Invalid JSON response from Prometheus: {str(e)}")
    except Exception as e:
        logger.error("Unexpected error during Prometheus request", endpoint=endpoint, url=url, error=str(e), error_type=type(e).__name__)
        raise

@mcp.tool(description="Execute a PromQL instant query against Prometheus")
async def execute_query(query: str, time: Optional[str] = None) -> Dict[str, Any]:
    """Execute an instant query against Prometheus.

    Args:
        query: PromQL query string
        time: Optional RFC3339 or Unix timestamp (default: current time)

    Returns:
        Query result with type (vector, matrix, scalar, string) and values
    """
    params = {"query": query}
    if time:
        params["time"] = time

    logger.info("Executing instant query", query=query, time=time)
    data = make_prometheus_request("query", params=params)

    result = {
        "resultType": data["resultType"],
        "result": data["result"]
    }

    logger.info("Instant query completed",
                query=query,
                result_type=data["resultType"],
                result_count=len(data["result"]) if isinstance(data["result"], list) else 1)

    return result

@mcp.tool(description="Execute a PromQL range query with start time, end time, and step interval")
async def execute_range_query(query: str, start: str, end: str, step: str) -> Dict[str, Any]:
    """Execute a range query against Prometheus.

    Args:
        query: PromQL query string
        start: Start time as RFC3339 or Unix timestamp
        end: End time as RFC3339 or Unix timestamp
        step: Query resolution step width (e.g., '15s', '1m', '1h')

    Returns:
        Range query result with type (usually matrix) and values over time
    """
    params = {
        "query": query,
        "start": start,
        "end": end,
        "step": step
    }

    logger.info("Executing range query", query=query, start=start, end=end, step=step)
    data = make_prometheus_request("query_range", params=params)

    result = {
        "resultType": data["resultType"],
        "result": data["result"]
    }

    logger.info("Range query completed",
                query=query,
                result_type=data["resultType"],
                result_count=len(data["result"]) if isinstance(data["result"], list) else 1)

    return result

@mcp.tool(description="List all available metrics in Prometheus")
async def list_metrics() -> List[str]:
    """Retrieve a list of all metric names available in Prometheus.

    Returns:
        List of metric names as strings
    """
    logger.info("Listing available metrics")
    data = make_prometheus_request("label/__name__/values")
    logger.info("Metrics list retrieved", metric_count=len(data))
    return data

@mcp.tool(description="Get metadata for a specific metric")
async def get_metric_metadata(metric: str) -> List[Dict[str, Any]]:
    """Get metadata about a specific metric.

    Args:
        metric: The name of the metric to retrieve metadata for

    Returns:
        List of metadata entries for the metric
    """
    logger.info("Retrieving metric metadata", metric=metric)
    params = {"metric": metric}
    data = make_prometheus_request("metadata", params=params)
    logger.info("Metric metadata retrieved", metric=metric, metadata_count=len(data))
    return data

@mcp.tool(description="Get information about all scrape targets")
async def get_targets() -> Dict[str, List[Dict[str, Any]]]:
    """Get information about all Prometheus scrape targets.

    Returns:
        Dictionary with active and dropped targets information
    """
    logger.info("Retrieving scrape targets information")
    data = make_prometheus_request("targets")

    result = {
        "activeTargets": data["activeTargets"],
        "droppedTargets": data["droppedTargets"]
    }

    logger.info("Scrape targets retrieved",
                active_targets=len(data["activeTargets"]),
                dropped_targets=len(data["droppedTargets"]))

    return result

if __name__ == "__main__":
    logger.info("Starting Prometheus MCP Server", mode="direct")
    mcp.run()

```

--------------------------------------------------------------------------------
/.github/workflows/label-management.yml:
--------------------------------------------------------------------------------

```yaml
name: Label Management

on:
  workflow_dispatch:
    inputs:
      action:
        description: 'Action to perform'
        required: true
        default: 'sync'
        type: choice
        options:
          - sync
          - create-missing
          - audit
          - cleanup
  schedule:
    # Sync labels weekly
    - cron: '0 2 * * 0'

jobs:
  label-sync:
    runs-on: ubuntu-latest
    permissions:
      issues: write
      contents: read

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Create/Update Labels
        uses: actions/github-script@v7
        with:
          script: |
            // Define the complete label schema for bug triage
            const labels = [
              // Priority Labels
              { name: 'priority: critical', color: 'B60205', description: 'Critical priority - immediate attention required' },
              { name: 'priority: high', color: 'D93F0B', description: 'High priority - should be addressed soon' },
              { name: 'priority: medium', color: 'FBCA04', description: 'Medium priority - normal timeline' },
              { name: 'priority: low', color: '0E8A16', description: 'Low priority - can be addressed when convenient' },

              // Status Labels
              { name: 'status: needs-triage', color: 'E99695', description: 'Issue needs initial triage and labeling' },
              { name: 'status: in-progress', color: '0052CC', description: 'Issue is actively being worked on' },
              { name: 'status: waiting-for-response', color: 'F9D0C4', description: 'Waiting for response from issue author' },
              { name: 'status: stale', color: '795548', description: 'Issue marked as stale due to inactivity' },
              { name: 'status: in-review', color: '6F42C1', description: 'Issue has an associated PR under review' },
              { name: 'status: blocked', color: 'D73A4A', description: 'Issue is blocked by external dependencies' },

              // Component Labels
              { name: 'component: prometheus', color: 'E6522C', description: 'Issues related to Prometheus integration' },
              { name: 'component: mcp-server', color: '1F77B4', description: 'Issues related to MCP server functionality' },
              { name: 'component: deployment', color: '2CA02C', description: 'Issues related to deployment and containerization' },
              { name: 'component: authentication', color: 'FF7F0E', description: 'Issues related to authentication mechanisms' },
              { name: 'component: configuration', color: '9467BD', description: 'Issues related to configuration and setup' },
              { name: 'component: logging', color: '8C564B', description: 'Issues related to logging and monitoring' },

              // Type Labels
              { name: 'type: bug', color: 'D73A4A', description: 'Something isn\'t working as expected' },
              { name: 'type: feature', color: 'A2EEEF', description: 'New feature or enhancement request' },
              { name: 'type: documentation', color: '0075CA', description: 'Documentation improvements or additions' },
              { name: 'type: performance', color: 'FF6B6B', description: 'Performance related issues or optimizations' },
              { name: 'type: testing', color: 'BFD4F2', description: 'Issues related to testing and QA' },
              { name: 'type: maintenance', color: 'CFCFCF', description: 'Maintenance and technical debt issues' },

              // Environment Labels
              { name: 'env: windows', color: '0078D4', description: 'Issues specific to Windows environment' },
              { name: 'env: macos', color: '000000', description: 'Issues specific to macOS environment' },
              { name: 'env: linux', color: 'FCC624', description: 'Issues specific to Linux environment' },
              { name: 'env: docker', color: '2496ED', description: 'Issues related to Docker deployment' },

              // Difficulty Labels
              { name: 'difficulty: beginner', color: '7057FF', description: 'Good for newcomers to the project' },
              { name: 'difficulty: intermediate', color: 'F39C12', description: 'Requires moderate experience with the codebase' },
              { name: 'difficulty: advanced', color: 'E67E22', description: 'Requires deep understanding of the codebase' },

              // Special Labels
              { name: 'help wanted', color: '008672', description: 'Community help is welcome on this issue' },
              { name: 'security', color: 'B60205', description: 'Security related issues - handle with priority' },
              { name: 'breaking-change', color: 'B60205', description: 'Changes that would break existing functionality' },
              { name: 'needs-investigation', color: '795548', description: 'Issue requires investigation to understand root cause' },
              { name: 'wontfix', color: 'FFFFFF', description: 'This will not be worked on' },
              { name: 'duplicate', color: 'CFD3D7', description: 'This issue or PR already exists' }
            ];

            // Get existing labels
            const existingLabels = await github.rest.issues.listLabelsForRepo({
              owner: context.repo.owner,
              repo: context.repo.repo,
              per_page: 100
            });

            const existingLabelMap = new Map(
              existingLabels.data.map(label => [label.name, label])
            );

            const action = '${{ github.event.inputs.action }}' || 'sync';
            console.log(`Performing action: ${action}`);

            for (const label of labels) {
              const existing = existingLabelMap.get(label.name);

              if (existing) {
                // Update existing label if color or description changed
                if (existing.color !== label.color || existing.description !== label.description) {
                  console.log(`Updating label: ${label.name}`);
                  if (action === 'sync' || action === 'create-missing') {
                    try {
                      await github.rest.issues.updateLabel({
                        owner: context.repo.owner,
                        repo: context.repo.repo,
                        name: label.name,
                        color: label.color,
                        description: label.description
                      });
                    } catch (error) {
                      console.log(`Failed to update label ${label.name}: ${error.message}`);
                    }
                  }
                } else {
                  console.log(`Label ${label.name} is up to date`);
                }
              } else {
                // Create new label
                console.log(`Creating label: ${label.name}`);
                if (action === 'sync' || action === 'create-missing') {
                  try {
                    await github.rest.issues.createLabel({
                      owner: context.repo.owner,
                      repo: context.repo.repo,
                      name: label.name,
                      color: label.color,
                      description: label.description
                    });
                  } catch (error) {
                    console.log(`Failed to create label ${label.name}: ${error.message}`);
                  }
                }
              }
            }

            // Audit mode: report on unused or outdated labels
            if (action === 'audit') {
              const definedLabelNames = new Set(labels.map(l => l.name));
              const unusedLabels = existingLabels.data.filter(
                label => !definedLabelNames.has(label.name) && !label.default
              );

              if (unusedLabels.length > 0) {
                console.log('\n=== AUDIT: Unused Labels ===');
                unusedLabels.forEach(label => {
                  console.log(`- ${label.name} (${label.color}): ${label.description || 'No description'}`);
                });
              }

              // Check for issues with deprecated labels
              const { data: issues } = await github.rest.issues.listForRepo({
                owner: context.repo.owner,
                repo: context.repo.repo,
                state: 'open',
                per_page: 100
              });

              const deprecatedLabelUsage = new Map();
              for (const issue of issues) {
                if (issue.pull_request) continue;

                for (const label of issue.labels) {
                  if (!definedLabelNames.has(label.name) && !label.default) {
                    if (!deprecatedLabelUsage.has(label.name)) {
                      deprecatedLabelUsage.set(label.name, []);
                    }
                    deprecatedLabelUsage.get(label.name).push(issue.number);
                  }
                }
              }

              if (deprecatedLabelUsage.size > 0) {
                console.log('\n=== AUDIT: Issues with Deprecated Labels ===');
                for (const [labelName, issueNumbers] of deprecatedLabelUsage) {
                  console.log(`${labelName}: Issues ${issueNumbers.join(', ')}`);
                }
              }
            }

            console.log('\nLabel management completed successfully!');

  label-cleanup:
    runs-on: ubuntu-latest
    if: github.event.inputs.action == 'cleanup'
    permissions:
      issues: write
      contents: read

    steps:
      - name: Cleanup deprecated labels from issues
        uses: actions/github-script@v7
        with:
          script: |
            // Define mappings for deprecated labels to new ones
            const labelMigrations = {
              'bug': 'type: bug',
              'enhancement': 'type: feature',
              'documentation': 'type: documentation',
              'good first issue': 'difficulty: beginner',
              'question': 'status: needs-triage'
            };

            const { data: issues } = await github.rest.issues.listForRepo({
              owner: context.repo.owner,
              repo: context.repo.repo,
              state: 'all',
              per_page: 100
            });

            for (const issue of issues) {
              if (issue.pull_request) continue;

              let needsUpdate = false;
              const labelsToRemove = [];
              const labelsToAdd = [];

              for (const label of issue.labels) {
                if (labelMigrations[label.name]) {
                  labelsToRemove.push(label.name);
                  labelsToAdd.push(labelMigrations[label.name]);
                  needsUpdate = true;
                }
              }

              if (needsUpdate) {
                console.log(`Updating labels for issue #${issue.number}`);

                // Remove old labels
                for (const labelToRemove of labelsToRemove) {
                  try {
                    await github.rest.issues.removeLabel({
                      owner: context.repo.owner,
                      repo: context.repo.repo,
                      issue_number: issue.number,
                      name: labelToRemove
                    });
                  } catch (error) {
                    console.log(`Could not remove label ${labelToRemove}: ${error.message}`);
                  }
                }

                // Add new labels
                if (labelsToAdd.length > 0) {
                  try {
                    await github.rest.issues.addLabels({
                      owner: context.repo.owner,
                      repo: context.repo.repo,
                      issue_number: issue.number,
                      labels: labelsToAdd
                    });
                  } catch (error) {
                    console.log(`Could not add labels to #${issue.number}: ${error.message}`);
                  }
                }
              }
            }

            console.log('Label cleanup completed!');
```
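
The cleanup job in the workflow above decides, per issue, which deprecated labels to remove and which replacements to add. As a rough illustration only, that decision can be sketched in Python as a pure function. The names `LABEL_MIGRATIONS` and `migrate_labels` are hypothetical and do not exist in this repository; the mapping is copied from `labelMigrations`, and no GitHub API call is made.

```python
from typing import Dict, List, Tuple

# Same mapping as `labelMigrations` in the label-cleanup job.
LABEL_MIGRATIONS: Dict[str, str] = {
    "bug": "type: bug",
    "enhancement": "type: feature",
    "documentation": "type: documentation",
    "good first issue": "difficulty: beginner",
    "question": "status: needs-triage",
}


def migrate_labels(issue_labels: List[str]) -> Tuple[List[str], List[str]]:
    """Return (labels_to_remove, labels_to_add) for one issue.

    Hypothetical helper: it mirrors the per-issue loop of the cleanup job
    but only computes the change, leaving the API calls to the caller.
    """
    to_remove: List[str] = []
    to_add: List[str] = []
    for name in issue_labels:
        replacement = LABEL_MIGRATIONS.get(name)
        if replacement is not None:
            # Deprecated label: drop it and queue its replacement.
            to_remove.append(name)
            to_add.append(replacement)
    return to_remove, to_add


if __name__ == "__main__":
    # Labels outside the mapping (here "help wanted") are left untouched.
    print(migrate_labels(["bug", "help wanted", "question"]))
```

Keeping the decision separate from the API calls makes the mapping easy to check without touching a live repository.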