# Directory Structure ``` ├── .bumpversion.cfg ├── .codecov.yml ├── .compile-venv-py3.11 │ ├── bin │ │ ├── activate │ │ ├── activate.csh │ │ ├── activate.fish │ │ ├── Activate.ps1 │ │ ├── coverage │ │ ├── coverage-3.11 │ │ ├── coverage3 │ │ ├── pip │ │ ├── pip-compile │ │ ├── pip-sync │ │ ├── pip3 │ │ ├── pip3.11 │ │ ├── py.test │ │ ├── pyproject-build │ │ ├── pytest │ │ ├── python │ │ ├── python3 │ │ ├── python3.11 │ │ └── wheel │ └── pyvenv.cfg ├── .env.example ├── .github │ └── workflows │ ├── build-verification.yml │ ├── publish.yml │ └── tdd-verification.yml ├── .gitignore ├── async_fixture_wrapper.py ├── CHANGELOG.md ├── CLAUDE.md ├── codebase_structure.txt ├── component_test_runner.py ├── CONTRIBUTING.md ├── core_workflows.txt ├── debug_tests.md ├── Dockerfile ├── docs │ ├── adrs │ │ └── 001_use_docker_for_qdrant.md │ ├── api.md │ ├── components │ │ └── README.md │ ├── cookbook.md │ ├── development │ │ ├── CODE_OF_CONDUCT.md │ │ ├── CONTRIBUTING.md │ │ └── README.md │ ├── documentation_map.md │ ├── documentation_summary.md │ ├── features │ │ ├── adr-management.md │ │ ├── code-analysis.md │ │ └── documentation.md │ ├── getting-started │ │ ├── configuration.md │ │ ├── docker-setup.md │ │ ├── installation.md │ │ ├── qdrant_setup.md │ │ └── quickstart.md │ ├── qdrant_setup.md │ ├── README.md │ ├── SSE_INTEGRATION.md │ ├── system_architecture │ │ └── README.md │ ├── templates │ │ └── adr.md │ ├── testing_guide.md │ ├── troubleshooting │ │ ├── common-issues.md │ │ └── faq.md │ ├── vector_store_best_practices.md │ └── workflows │ └── README.md ├── error_logs.txt ├── examples │ └── use_with_claude.py ├── github-actions-documentation.md ├── Makefile ├── module_summaries │ ├── backend_summary.txt │ ├── database_summary.txt │ └── frontend_summary.txt ├── output.txt ├── package-lock.json ├── package.json ├── PLAN.md ├── prepare_codebase.sh ├── 
PULL_REQUEST.md ├── pyproject.toml ├── pytest.ini ├── README.md ├── requirements-3.11.txt ├── requirements-3.11.txt.backup ├── requirements-dev.txt ├── requirements.in ├── requirements.txt ├── run_build_verification.sh ├── run_fixed_tests.sh ├── run_test_with_path_fix.sh ├── run_tests.py ├── scripts │ ├── check_qdrant_health.sh │ ├── compile_requirements.sh │ ├── load_example_patterns.py │ ├── macos_install.sh │ ├── README.md │ ├── setup_qdrant.sh │ ├── start_mcp_server.sh │ ├── store_code_relationships.py │ ├── store_report_in_mcp.py │ ├── validate_knowledge_base.py │ ├── validate_poc.py │ ├── validate_vector_store.py │ └── verify_build.py ├── server.py ├── setup_qdrant_collection.py ├── setup.py ├── src │ └── mcp_codebase_insight │ ├── __init__.py │ ├── __main__.py │ ├── asgi.py │ ├── core │ │ ├── __init__.py │ │ ├── adr.py │ │ ├── cache.py │ │ ├── component_status.py │ │ ├── config.py │ │ ├── debug.py │ │ ├── di.py │ │ ├── documentation.py │ │ ├── embeddings.py │ │ ├── errors.py │ │ ├── health.py │ │ ├── knowledge.py │ │ ├── metrics.py │ │ ├── prompts.py │ │ ├── sse.py │ │ ├── state.py │ │ ├── task_tracker.py │ │ ├── tasks.py │ │ └── vector_store.py │ ├── models.py │ ├── server_test_isolation.py │ ├── server.py │ ├── utils │ │ ├── __init__.py │ │ └── logger.py │ └── version.py ├── start-mcpserver.sh ├── summary_document.txt ├── system-architecture.md ├── system-card.yml ├── test_fix_helper.py ├── test_fixes.md ├── test_function.txt ├── test_imports.py ├── tests │ ├── components │ │ ├── conftest.py │ │ ├── test_core_components.py │ │ ├── test_embeddings.py │ │ ├── test_knowledge_base.py │ │ ├── test_sse_components.py │ │ ├── test_stdio_components.py │ │ ├── test_task_manager.py │ │ └── test_vector_store.py │ ├── config │ │ └── test_config_and_env.py │ ├── conftest.py │ ├── integration │ │ ├── fixed_test2.py │ │ ├── test_api_endpoints.py │ │ ├── test_api_endpoints.py-e │ │ ├── test_communication_integration.py │ │ └── test_server.py │ ├── README.md │ ├── 
README.test.md │ ├── test_build_verifier.py │ └── test_file_relationships.py └── trajectories └── tosinakinosho ├── anthropic_filemap__claude-3-sonnet-20240229__t-0.00__p-1.00__c-3.00___db62b9 │ └── db62b9 │ └── config.yaml ├── default__claude-3-5-sonnet-20240620__t-0.00__p-1.00__c-3.00___03565e │ └── 03565e │ ├── 03565e.traj │ └── config.yaml └── default__openrouter └── anthropic └── claude-3.5-sonnet-20240620:beta__t-0.00__p-1.00__c-3.00___03565e └── 03565e ├── 03565e.pred ├── 03565e.traj └── config.yaml ``` # Files -------------------------------------------------------------------------------- /.codecov.yml: -------------------------------------------------------------------------------- ```yaml 1 | codecov: 2 | require_ci_to_pass: yes 3 | notify: 4 | wait_for_ci: yes 5 | 6 | coverage: 7 | precision: 2 8 | round: down 9 | range: "70...100" 10 | status: 11 | project: 12 | default: 13 | target: 80% 14 | threshold: 2% 15 | base: auto 16 | if_ci_failed: error 17 | informational: false 18 | only_pulls: false 19 | patch: 20 | default: 21 | target: 80% 22 | threshold: 2% 23 | base: auto 24 | if_ci_failed: error 25 | informational: false 26 | only_pulls: false 27 | 28 | parsers: 29 | gcov: 30 | branch_detection: 31 | conditional: yes 32 | loop: yes 33 | method: no 34 | macro: no 35 | 36 | comment: 37 | layout: "reach,diff,flags,files,footer" 38 | behavior: default 39 | require_changes: false 40 | require_base: no 41 | require_head: yes 42 | branches: 43 | - main 44 | 45 | ignore: 46 | - "tests/**/*" 47 | - "setup.py" 48 | - "docs/**/*" 49 | - "examples/**/*" 50 | - "scripts/**/*" 51 | - "**/version.py" 52 | - "**/__init__.py" 53 | ``` -------------------------------------------------------------------------------- /.bumpversion.cfg: -------------------------------------------------------------------------------- ``` 1 | [bumpversion] 2 | current_version = 0.1.0 3 | commit = True 4 | tag = True 5 | parse = 
(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)((?P<release>[a-z]+)(?P<build>\d+))? 6 | serialize = 7 | {major}.{minor}.{patch}{release}{build} 8 | {major}.{minor}.{patch} 9 | 10 | [bumpversion:part:release] 11 | optional_value = prod 12 | first_value = dev 13 | values = 14 | dev 15 | prod 16 | 17 | [bumpversion:part:build] 18 | first_value = 1 19 | 20 | [bumpversion:file:pyproject.toml] 21 | search = version = "{current_version}" 22 | replace = version = "{new_version}" 23 | 24 | [bumpversion:file:src/mcp_codebase_insight/version.py] 25 | search = __version__ = "{current_version}" 26 | replace = __version__ = "{new_version}" 27 | 28 | [bumpversion:file:src/mcp_codebase_insight/version.py] 29 | search = VERSION_MAJOR = {current_version.split(".")[0]} 30 | replace = VERSION_MAJOR = {new_version.split(".")[0]} 31 | 32 | [bumpversion:file:src/mcp_codebase_insight/version.py] 33 | search = VERSION_MINOR = {current_version.split(".")[1]} 34 | replace = VERSION_MINOR = {new_version.split(".")[1]} 35 | 36 | [bumpversion:file:src/mcp_codebase_insight/version.py] 37 | search = VERSION_PATCH = {current_version.split(".")[2]} 38 | replace = VERSION_PATCH = {new_version.split(".")[2]} 39 | ``` -------------------------------------------------------------------------------- /.env.example: -------------------------------------------------------------------------------- ``` 1 | # Server configuration 2 | MCP_HOST=127.0.0.1 3 | MCP_PORT=3000 4 | MCP_LOG_LEVEL=INFO 5 | MCP_DEBUG=false 6 | 7 | # Qdrant configuration 8 | QDRANT_URL=http://localhost:6333 9 | QDRANT_API_KEY=your-qdrant-api-key-here 10 | 11 | # Directory configuration 12 | MCP_DOCS_CACHE_DIR=docs 13 | MCP_ADR_DIR=docs/adrs 14 | MCP_KB_STORAGE_DIR=knowledge 15 | MCP_DISK_CACHE_DIR=cache 16 | 17 | # Model configuration 18 | MCP_EMBEDDING_MODEL=all-MiniLM-L6-v2 19 | MCP_COLLECTION_NAME=codebase_patterns 20 | 21 | # Feature flags 22 | MCP_METRICS_ENABLED=true 23 | MCP_CACHE_ENABLED=true 24 | MCP_MEMORY_CACHE_SIZE=1000 25 | 
26 | # Optional: Authentication (if needed) 27 | # MCP_AUTH_ENABLED=false 28 | # MCP_AUTH_SECRET_KEY=your-secret-key 29 | # MCP_AUTH_TOKEN_EXPIRY=3600 30 | 31 | # Optional: Rate limiting (if needed) 32 | # MCP_RATE_LIMIT_ENABLED=false 33 | # MCP_RATE_LIMIT_REQUESTS=100 34 | # MCP_RATE_LIMIT_WINDOW=60 35 | 36 | # Optional: SSL/TLS configuration (if needed) 37 | # MCP_SSL_ENABLED=false 38 | # MCP_SSL_CERT_FILE=path/to/cert.pem 39 | # MCP_SSL_KEY_FILE=path/to/key.pem 40 | 41 | # Optional: Proxy configuration (if needed) 42 | # MCP_PROXY_URL=http://proxy:8080 43 | # MCP_NO_PROXY=localhost,127.0.0.1 44 | 45 | # Optional: External services (if needed) 46 | # MCP_GITHUB_TOKEN=your-github-token 47 | # MCP_JIRA_URL=https://your-jira-instance 48 | # MCP_JIRA_TOKEN=your-jira-token 49 | 50 | # Optional: Monitoring (if needed) 51 | # MCP_SENTRY_DSN=your-sentry-dsn 52 | # MCP_DATADOG_API_KEY=your-datadog-api-key 53 | # MCP_PROMETHEUS_ENABLED=false 54 | 55 | # Test Configuration 56 | # These variables are used when running tests 57 | MCP_TEST_MODE=1 58 | MCP_TEST_QDRANT_URL=http://localhost:6333 59 | MCP_TEST_COLLECTION_NAME=test_collection 60 | MCP_TEST_EMBEDDING_MODEL=all-MiniLM-L6-v2 61 | 62 | # Event Loop Debug Mode 63 | # Uncomment to enable asyncio debug mode for testing 64 | # PYTHONASYNCIODEBUG=1 65 | ``` -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- ``` 1 | # Python 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | *.so 6 | .Python 7 | build/ 8 | develop-eggs/ 9 | dist/ 10 | downloads/ 11 | eggs/ 12 | .eggs/ 13 | lib/ 14 | lib64/ 15 | parts/ 16 | sdist/ 17 | var/ 18 | wheels/ 19 | *.egg-info/ 20 | .installed.cfg 21 | *.egg 22 | MANIFEST 23 | 24 | # Virtual Environment 25 | .env 26 | .venv 27 | env/ 28 | venv/ 29 | ENV/ 30 | env.bak/ 31 | venv.bak/ 32 | 33 | # IDE 34 | .idea/ 35 | .vscode/ 36 | *.swp 37 | *.swo 38 | *~ 39 | .project 40 | 
.pydevproject 41 | .settings/ 42 | 43 | # Testing 44 | .tox/ 45 | .coverage 46 | .coverage.* 47 | .cache 48 | nosetests.xml 49 | coverage.xml 50 | *.cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | htmlcov/ 54 | 55 | # Documentation 56 | docs/_build/ 57 | docs/api/ 58 | 59 | # Project specific 60 | docs/adrs/* 61 | !docs/adrs/001_use_docker_for_qdrant.md 62 | !docs/adrs/README.md 63 | knowledge/* 64 | !knowledge/README.md 65 | cache/* 66 | !cache/README.md 67 | logs/* 68 | !logs/README.md 69 | .test_cache/ 70 | test_knowledge/ 71 | build_output.txt 72 | testreport.txt 73 | test_env/ 74 | codebase_stats.txt 75 | dependency_map.txt 76 | vector_relationship_graph.* 77 | verification-config.json 78 | *.dot 79 | *.json.tmp 80 | 81 | # Jupyter Notebook 82 | .ipynb_checkpoints 83 | 84 | # Distribution / packaging 85 | .Python 86 | env/ 87 | build/ 88 | develop-eggs/ 89 | dist/ 90 | downloads/ 91 | eggs/ 92 | .eggs/ 93 | lib/ 94 | lib64/ 95 | parts/ 96 | sdist/ 97 | var/ 98 | wheels/ 99 | *.egg-info/ 100 | .installed.cfg 101 | *.egg 102 | 103 | # Installer logs 104 | pip-log.txt 105 | pip-delete-this-directory.txt 106 | 107 | # Unit test / coverage reports 108 | htmlcov/ 109 | .tox/ 110 | .coverage 111 | .coverage.* 112 | .cache 113 | nosetests.xml 114 | coverage.xml 115 | *.cover 116 | .hypothesis/ 117 | .pytest_cache/ 118 | 119 | # Translations 120 | *.mo 121 | *.pot 122 | 123 | # Django stuff: 124 | *.log 125 | local_settings.py 126 | db.sqlite3 127 | db.sqlite3-journal 128 | 129 | # Flask stuff: 130 | instance/ 131 | .webassets-cache 132 | 133 | # Scrapy stuff: 134 | .scrapy 135 | 136 | # Sphinx documentation 137 | docs/_build/ 138 | 139 | # PyBuilder 140 | target/ 141 | 142 | # Jupyter Notebook 143 | .ipynb_checkpoints 144 | 145 | # pyenv 146 | .python-version 147 | 148 | # celery beat schedule file 149 | celerybeat-schedule 150 | 151 | # SageMath parsed files 152 | *.sage.py 153 | 154 | # Environments 155 | .env 156 | .venv 157 | env/ 158 | venv/ 159 | ENV/ 160 | 
env.bak/ 161 | venv.bak/ 162 | 163 | # Spyder project settings 164 | .spyderproject 165 | .spyproject 166 | 167 | # Rope project settings 168 | .ropeproject 169 | 170 | # mkdocs documentation 171 | /site 172 | 173 | # mypy 174 | .mypy_cache/ 175 | .dmypy.json 176 | dmypy.json 177 | 178 | # Pyre type checker 179 | .pyre/ 180 | 181 | # pytype static type analyzer 182 | .pytype/ 183 | 184 | # Cython debug symbols 185 | cython_debug/ 186 | 187 | # macOS 188 | .DS_Store 189 | .AppleDouble 190 | .LSOverride 191 | Icon 192 | ._* 193 | .DocumentRevisions-V100 194 | .fseventsd 195 | .Spotlight-V100 196 | .TemporaryItems 197 | .Trashes 198 | .VolumeIcon.icns 199 | .com.apple.timemachine.donotpresent 200 | 201 | # Windows 202 | Thumbs.db 203 | ehthumbs.db 204 | Desktop.ini 205 | $RECYCLE.BIN/ 206 | *.cab 207 | *.msi 208 | *.msm 209 | *.msp 210 | *.lnk 211 | 212 | # Linux 213 | *~ 214 | .fuse_hidden* 215 | .directory 216 | .Trash-* 217 | .nfs* 218 | 219 | # Project specific 220 | .env 221 | .env.* 222 | !.env.example 223 | *.log 224 | logs/ 225 | cache/ 226 | knowledge/ 227 | docs/adrs/* 228 | !docs/adrs/001_use_docker_for_qdrant.md 229 | 230 | # Documentation and ADRs (temporary private) 231 | docs/adrs/ 232 | docs/private/ 233 | docs/internal/ 234 | 235 | # Cache and Temporary Files 236 | cache/ 237 | .cache/ 238 | tmp/ 239 | temp/ 240 | *.tmp 241 | *.bak 242 | *.log 243 | 244 | # Sensitive Configuration 245 | .env* 246 | !.env.example 247 | *.key 248 | *.pem 249 | *.crt 250 | secrets/ 251 | private/ 252 | 253 | # Vector Database 254 | qdrant_storage/ 255 | 256 | # Knowledge Base (private for now) 257 | knowledge/patterns/ 258 | knowledge/tasks/ 259 | knowledge/private/ 260 | 261 | # Build and Distribution 262 | dist/ 263 | build/ 264 | *.pyc 265 | *.pyo 266 | *.pyd 267 | .Python 268 | *.so 269 | 270 | # Misc 271 | .DS_Store 272 | Thumbs.db 273 | *.swp 274 | *.swo 275 | *~ 276 | 277 | # Project Specific 278 | mcp.json 279 | .cursor/rules/ 280 | module_summaries/ 281 | logs/ 
282 | references/private/ 283 | prompts/ 284 | 285 | # Ignore Qdrant data storage directory 286 | qdrant_data/ 287 | .aider* 288 | ``` -------------------------------------------------------------------------------- /tests/README.test.md: -------------------------------------------------------------------------------- ```markdown 1 | import pytest 2 | from pathlib import Path 3 | 4 | @pytest.fixture 5 | def readme_content(): 6 | readme_path = Path(__file__).parent / "README.md" 7 | with open(readme_path, "r") as f: 8 | return f.read() 9 | 10 | ``` -------------------------------------------------------------------------------- /docs/components/README.md: -------------------------------------------------------------------------------- ```markdown 1 | # Core Components 2 | 3 | > 🚧 **Documentation In Progress** 4 | > 5 | > This documentation is being actively developed. More details will be added soon. 6 | 7 | ## Overview 8 | 9 | This document details the core components of the MCP Codebase Insight system. For workflow information, please see the [Workflows Documentation](../workflows/README.md). 
10 | 11 | ## Components 12 | 13 | ### Server Framework 14 | - API endpoint management 15 | - Request validation 16 | - Response formatting 17 | - Server lifecycle management 18 | 19 | ### Testing Framework 20 | - Test environment management 21 | - Component-level testing 22 | - Integration test support 23 | - Performance testing tools 24 | 25 | ### Documentation Tools 26 | - Documentation generation 27 | - Relationship analysis 28 | - Validation tools 29 | - Integration with code analysis 30 | 31 | ## Implementation Details 32 | 33 | See the [System Architecture](../system_architecture/README.md) for more details on how these components interact. ``` -------------------------------------------------------------------------------- /scripts/README.md: -------------------------------------------------------------------------------- ```markdown 1 | # Utility Scripts 2 | 3 | This directory contains utility scripts for the MCP Codebase Insight project. 4 | 5 | ## Available Scripts 6 | 7 | ### check_qdrant_health.sh 8 | 9 | **Purpose**: Checks if the Qdrant vector database service is available and healthy. 10 | 11 | **Usage**: 12 | ```bash 13 | ./check_qdrant_health.sh [qdrant_url] [max_retries] [sleep_seconds] 14 | ``` 15 | 16 | **Parameters**: 17 | - `qdrant_url` - URL of the Qdrant service (default: "http://localhost:6333") 18 | - `max_retries` - Maximum number of retry attempts (default: 20) 19 | - `sleep_seconds` - Seconds to wait between retries (default: 5) 20 | 21 | **Example**: 22 | ```bash 23 | ./check_qdrant_health.sh "http://localhost:6333" 30 2 24 | ``` 25 | 26 | > Note: This script uses `apt-get` and may require `sudo` privileges on Linux systems. Ensure `curl` and `jq` are pre-installed or run with proper permissions. 
27 | 28 | **Exit Codes**: 29 | - 0: Qdrant service is accessible and healthy 30 | - 1: Qdrant service is not accessible or not healthy 31 | 32 | ### compile_requirements.sh 33 | 34 | **Purpose**: Compiles and generates version-specific requirements files for different Python versions. 35 | 36 | **Usage**: 37 | ```bash 38 | ./compile_requirements.sh <python-version> 39 | ``` 40 | 41 | **Example**: 42 | ```bash 43 | ./compile_requirements.sh 3.11 44 | ``` 45 | 46 | ### load_example_patterns.py 47 | 48 | **Purpose**: Loads example patterns and ADRs into the knowledge base for demonstration or testing. 49 | 50 | **Usage**: 51 | ```bash 52 | python load_example_patterns.py [--help] 53 | ``` 54 | 55 | ### verify_build.py 56 | 57 | **Purpose**: Verifies the build status and generates a build verification report. 58 | 59 | **Usage**: 60 | ```bash 61 | python verify_build.py [--config <file>] [--output <report-file>] 62 | ``` 63 | 64 | ## Usage in GitHub Actions 65 | 66 | These scripts are used in our GitHub Actions workflows to automate and standardize common tasks. For example, `check_qdrant_health.sh` is used in both the build verification and TDD verification workflows to ensure the Qdrant service is available before running tests. 67 | 68 | ## Adding New Scripts 69 | 70 | When adding new scripts to this directory: 71 | 72 | 1. Make them executable: `chmod +x scripts/your_script.sh` 73 | 2. Include a header comment explaining the purpose and usage 74 | 3. Add error handling and sensible defaults 75 | 4. Update this README with information about the script 76 | 5. Use parameter validation and help text when appropriate ``` -------------------------------------------------------------------------------- /docs/development/README.md: -------------------------------------------------------------------------------- ```markdown 1 | # Development Guide 2 | 3 | > 🚧 **Documentation In Progress** 4 | > 5 | > This documentation is being actively developed. 
More details will be added soon. 6 | 7 | ## Overview 8 | 9 | This guide covers development setup, contribution guidelines, and best practices for the MCP Codebase Insight project. 10 | 11 | ## Development Setup 12 | 13 | 1. **Clone Repository** 14 | ```bash 15 | git clone https://github.com/modelcontextprotocol/mcp-codebase-insight 16 | cd mcp-codebase-insight 17 | ``` 18 | 19 | 2. **Create Virtual Environment** 20 | ```bash 21 | python -m venv venv 22 | source venv/bin/activate # On Windows: venv\Scripts\activate 23 | ``` 24 | 25 | 3. **Install Development Dependencies** 26 | ```bash 27 | pip install -e ".[dev]" 28 | ``` 29 | 30 | 4. **Setup Pre-commit Hooks** 31 | ```bash 32 | pre-commit install 33 | ``` 34 | 35 | ## Project Structure 36 | 37 | ``` 38 | mcp-codebase-insight/ 39 | ├── src/ 40 | │ └── mcp_codebase_insight/ 41 | │ ├── analysis/ # Code analysis modules 42 | │ ├── documentation/ # Documentation management 43 | │ ├── kb/ # Knowledge base operations 44 | │ └── server/ # FastAPI server 45 | ├── tests/ 46 | │ ├── integration/ # Integration tests 47 | │ └── unit/ # Unit tests 48 | ├── docs/ # Documentation 49 | └── examples/ # Example usage 50 | ``` 51 | 52 | ## Testing 53 | 54 | ```bash 55 | # Run unit tests 56 | pytest tests/unit 57 | 58 | # Run integration tests 59 | pytest tests/integration 60 | 61 | # Run with coverage 62 | pytest --cov=src tests/ 63 | ``` 64 | 65 | ## Code Style 66 | 67 | - Follow PEP 8 68 | - Use type hints 69 | - Document functions and classes 70 | - Keep functions focused and small 71 | - Write tests for new features 72 | 73 | ## Git Workflow 74 | 75 | 1. Create feature branch 76 | 2. Make changes 77 | 3. Run tests 78 | 4. 
Submit pull request 79 | 80 | ## Documentation 81 | 82 | - Update docs for new features 83 | - Include docstrings 84 | - Add examples when relevant 85 | 86 | ## Debugging 87 | 88 | ### Server Debugging 89 | ```python 90 | import debugpy 91 | 92 | debugpy.listen(("0.0.0.0", 5678)) 93 | debugpy.wait_for_client() 94 | ``` 95 | 96 | ### VSCode Launch Configuration 97 | ```json 98 | { 99 | "version": "0.2.0", 100 | "configurations": [ 101 | { 102 | "name": "Python: Remote Attach", 103 | "type": "python", 104 | "request": "attach", 105 | "port": 5678, 106 | "host": "localhost" 107 | } 108 | ] 109 | } 110 | ``` 111 | 112 | ## Performance Profiling 113 | 114 | ```bash 115 | python -m cProfile -o profile.stats your_script.py 116 | python -m snakeviz profile.stats 117 | ``` 118 | 119 | ## Next Steps 120 | 121 | - [Contributing Guidelines](CONTRIBUTING.md) 122 | - [Code of Conduct](CODE_OF_CONDUCT.md) 123 | - [API Reference](../api/rest-api.md) ``` -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- ```markdown 1 | # MCP Codebase Insight - WIP 2 | 3 | > 🚧 **Development in Progress** 4 | > 5 | > This project is actively under development. Features and documentation are being continuously updated. 6 | 7 | ## Overview 8 | 9 | MCP Codebase Insight is a system for analyzing and understanding codebases through semantic analysis, pattern detection, and documentation management. 
10 | 11 | ## Current Development Status 12 | 13 | ### Completed Features 14 | - ✅ Core Vector Store System 15 | - ✅ Basic Knowledge Base 16 | - ✅ SSE Integration 17 | - ✅ Testing Framework 18 | - ✅ TDD and Debugging Framework (rules_template integration) 19 | 20 | ### In Progress 21 | - 🔄 Documentation Management System 22 | - 🔄 Advanced Pattern Detection 23 | - 🔄 Performance Optimization 24 | - 🔄 Integration Testing 25 | - 🔄 Debugging Utilities Enhancement 26 | 27 | ### Planned 28 | - 📋 Extended API Documentation 29 | - 📋 Custom Pattern Plugins 30 | - 📋 Advanced Caching Strategies 31 | - 📋 Deployment Guides 32 | - 📋 Comprehensive Error Tracking System 33 | 34 | ## Quick Start 35 | 36 | 1. **Installation** 37 | ```bash 38 | pip install mcp-codebase-insight 39 | ``` 40 | 41 | 2. **Basic Usage** 42 | ```python 43 | from mcp_codebase_insight import CodebaseAnalyzer 44 | 45 | analyzer = CodebaseAnalyzer() 46 | results = analyzer.analyze_code("path/to/code") 47 | ``` 48 | 49 | 3. **Running Tests** 50 | ```bash 51 | # Run all tests 52 | pytest tests/ 53 | 54 | # Run integration tests 55 | pytest tests/integration/ 56 | 57 | # Run component tests 58 | pytest tests/components/ 59 | 60 | # Run tests with coverage 61 | pytest tests/ --cov=src --cov-report=term-missing 62 | ``` 63 | 64 | 4. 
**Debugging Utilities** 65 | ```python 66 | from mcp_codebase_insight.utils.debug_utils import debug_trace, DebugContext, get_error_tracker 67 | 68 | # Use debug trace decorator 69 | @debug_trace 70 | def my_function(): 71 | pass # Implementation goes here 72 | 73 | # Use debug context 74 | with DebugContext("operation_name"): 75 | pass # Code to debug goes here 76 | 77 | # Track errors 78 | try: 79 | pass # Risky operation goes here 80 | except Exception as e: 81 | error_id = get_error_tracker().record_error(e, context={"operation": "description"}) 82 | print(f"Error recorded with ID: {error_id}") 83 | ``` 84 | 85 | ## Testing and Debugging 86 | 87 | ### Test-Driven Development 88 | 89 | This project follows Test-Driven Development (TDD) principles: 90 | 91 | 1. Write a failing test first (Red) 92 | 2. Write minimal code to make the test pass (Green) 93 | 3. Refactor for clean code while keeping tests passing (Refactor) 94 | 95 | Our TDD documentation can be found in [docs/tdd/workflow.md](docs/tdd/workflow.md). 96 | 97 | ### Debugging Framework 98 | 99 | We use Agans' 9 Rules of Debugging: 100 | 101 | 1. Understand the System 102 | 2. Make It Fail 103 | 3. Quit Thinking and Look 104 | 4. Divide and Conquer 105 | 5. Change One Thing at a Time 106 | 6. Keep an Audit Trail 107 | 7. Check the Plug 108 | 8. Get a Fresh View 109 | 9. If You Didn't Fix It, It Ain't Fixed 110 | 111 | Learn more about our debugging approach in [docs/debuggers/agans_9_rules.md](docs/debuggers/agans_9_rules.md). 112 | 113 | ## Documentation 114 | 115 | - [System Architecture](docs/system_architecture/README.md) 116 | - [Core Components](docs/components/README.md) 117 | - [API Reference](docs/api/README.md) 118 | - [Development Guide](docs/development/README.md) 119 | - [Workflows](docs/workflows/README.md) 120 | - [TDD Workflow](docs/tdd/workflow.md) 121 | - [Debugging Practices](docs/debuggers/best_practices.md) 122 | 123 | ## Contributing 124 | 125 | We welcome contributions! 
Please see our [Contributing Guide](CONTRIBUTING.md) for details. 126 | 127 | ## License 128 | 129 | This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. 130 | 131 | ## Support 132 | 133 | - [Issue Tracker](https://github.com/modelcontextprotocol/mcp-codebase-insight/issues) 134 | - [Discussions](https://github.com/modelcontextprotocol/mcp-codebase-insight/discussions) 135 | ``` -------------------------------------------------------------------------------- /tests/README.md: -------------------------------------------------------------------------------- ```markdown 1 | # Test Structure 2 | 3 | This directory contains the test suite for the MCP Codebase Insight project. The tests are organized into the following structure: 4 | 5 | ## Directory Structure 6 | 7 | ``` 8 | tests/ 9 | ├── components/ # Component-level tests 10 | │ ├── test_vector_store.py 11 | │ ├── test_knowledge_base.py 12 | │ ├── test_task_manager.py 13 | │ └── ... 14 | ├── integration/ # Integration and API tests 15 | │ ├── test_api_endpoints.py 16 | │ └── test_server.py 17 | ├── config/ # Configuration tests 18 | │ └── test_config_and_env.py 19 | ├── conftest.py # Shared test fixtures 20 | └── README.md # This file 21 | ``` 22 | 23 | ## Test Categories 24 | 25 | 1. **Component Tests** (`components/`) 26 | - Unit tests for individual components 27 | - Tests component initialization, methods, and cleanup 28 | - Isolated from other components where possible 29 | 30 | 2. **Integration Tests** (`integration/`) 31 | - Tests for API endpoints 32 | - Server lifecycle tests 33 | - Component interaction tests 34 | 35 | 3. 
**Configuration Tests** (`config/`) 36 | - Environment variable handling 37 | - Configuration file parsing 38 | - Directory setup and permissions 39 | 40 | ## API Test Coverage 41 | 42 | The following API endpoints are tested in the integration tests: 43 | 44 | | Endpoint | Test Status | Test File | 45 | |----------|-------------|-----------| 46 | | `/health` | ✅ Tested | `test_api_endpoints.py` | 47 | | `/api/vector-store/search` | ✅ Tested | `test_api_endpoints.py` | 48 | | `/api/docs/adrs` | ✅ Tested | `test_api_endpoints.py` | 49 | | `/api/docs/adrs/{adr_id}` | ✅ Tested | `test_api_endpoints.py` | 50 | | `/api/docs/patterns` | ✅ Tested | `test_api_endpoints.py` | 51 | | `/api/docs/patterns/{pattern_id}` | ✅ Tested | `test_api_endpoints.py` | 52 | | `/api/analyze` | ✅ Tested | `test_api_endpoints.py` | 53 | | `/api/tasks/create` | ✅ Tested | `test_api_endpoints.py` | 54 | | `/api/tasks` | ✅ Tested | `test_api_endpoints.py` | 55 | | `/api/tasks/{task_id}` | ✅ Tested | `test_api_endpoints.py` | 56 | | `/api/debug/issues` | ✅ Tested | `test_api_endpoints.py` | 57 | | `/api/debug/issues/{issue_id}` | ✅ Tested | `test_api_endpoints.py` | 58 | | `/api/debug/issues/{issue_id}/analyze` | ✅ Tested | `test_api_endpoints.py` | 59 | | `/tools/*` | ✅ Tested | `test_api_endpoints.py` | 60 | 61 | Each test verifies: 62 | - Successful responses with valid input 63 | - Error handling with invalid input 64 | - Response structure and content validation 65 | - Edge cases where applicable 66 | 67 | ## Running Tests 68 | 69 | To run all tests: 70 | ```bash 71 | python -m pytest tests/ 72 | ``` 73 | 74 | To run specific test categories: 75 | ```bash 76 | # Run component tests 77 | python -m pytest tests/components/ 78 | 79 | # Run integration tests 80 | python -m pytest tests/integration/ 81 | 82 | # Run config tests 83 | python -m pytest tests/config/ 84 | 85 | # Run API endpoint tests only 86 | python -m pytest tests/integration/test_api_endpoints.py 87 | 88 | # Run tests for a 
specific API endpoint 89 | python -m pytest tests/integration/test_api_endpoints.py::test_health_check 90 | ``` 91 | 92 | ## Test Fixtures 93 | 94 | Shared test fixtures are defined in `conftest.py` and include: 95 | 96 | - `temp_dir`: Temporary directory for test files 97 | - `test_config`: Server configuration for testing 98 | - `embedder`: Sentence transformer embedder 99 | - `vector_store`: Vector store instance 100 | - `test_server`: Server instance for testing 101 | - `test_client`: FastAPI test client 102 | - `test_code`: Sample code for testing 103 | - `test_adr`: Sample ADR data 104 | - `env_vars`: Environment variables for testing 105 | 106 | ## Writing New Tests 107 | 108 | 1. Place new tests in the appropriate directory based on what they're testing 109 | 2. Use the shared fixtures from `conftest.py` 110 | 3. Follow the existing patterns for async tests and cleanup 111 | 4. Add proper docstrings and comments 112 | 5. Ensure proper cleanup in fixtures that create resources 113 | 114 | ## Test Dependencies 115 | 116 | The test suite has the following dependencies: 117 | - pytest 118 | - pytest-asyncio 119 | - httpx 120 | - fastapi 121 | - sentence-transformers 122 | 123 | Make sure these are installed before running tests. ``` -------------------------------------------------------------------------------- /docs/README.md: -------------------------------------------------------------------------------- ```markdown 1 | # MCP Codebase Insight Documentation 2 | 3 | Welcome to the MCP Codebase Insight documentation. This directory contains detailed information about installation, configuration, usage, and development of the MCP Codebase Insight tool. 
4 | 5 | ## Documentation Structure 6 | 7 | ### Getting Started 8 | - [Installation Guide](getting-started/installation.md) - Complete installation instructions 9 | - [Configuration Guide](getting-started/configuration.md) - Configuration options and environment setup 10 | - [Quick Start Tutorial](getting-started/quickstart.md) - Get up and running quickly 11 | - [Qdrant Setup](getting-started/qdrant_setup.md) - Vector database setup and configuration 12 | 13 | ### Core Features 14 | - [Code Analysis](features/code-analysis.md) - Understanding code patterns and insights 15 | - [ADR Management](features/adr-management.md) - Managing architectural decisions 16 | - [Documentation Management](features/documentation.md) - Auto-generation and maintenance 17 | - [Knowledge Base](features/knowledge-base.md) - Pattern storage and retrieval 18 | - [Debug System](features/debug-system.md) - Intelligent debugging assistance 19 | - [Build Verification](features/build-verification.md) - Automated build checks 20 | 21 | ### API Reference 22 | - [REST API](api/rest-api.md) - Complete API endpoint documentation 23 | - [SSE Integration](SSE_INTEGRATION.md) - Server-Sent Events integration guide 24 | - [Vector Store API](api/vector-store-api.md) - Vector database interaction 25 | - [Client Libraries](api/client-libraries.md) - Available client SDKs 26 | 27 | ### Development 28 | - [Contributing Guide](development/contributing.md) - How to contribute to the project 29 | - [Architecture Overview](development/architecture.md) - System architecture and design 30 | - [Testing Guide](testing_guide.md) - Writing and running tests 31 | - [Best Practices](development/best-practices.md) - Coding standards and guidelines 32 | 33 | ### Deployment 34 | - [Production Deployment](deployment/production.md) - Production setup guide 35 | - [Docker Deployment](deployment/docker.md) - Container-based deployment 36 | - [Scaling Guide](deployment/scaling.md) - Handling increased load 37 | - 
[Monitoring](deployment/monitoring.md) - System monitoring and alerts

### Troubleshooting
- [Common Issues](troubleshooting/common-issues.md) - Frequently encountered problems
- [FAQ](troubleshooting/faq.md) - Frequently asked questions
- [Debug Guide](troubleshooting/debug-guide.md) - Advanced debugging techniques
- [Support](troubleshooting/support.md) - Getting help and support

## Quick Links

- [GitHub Repository](https://github.com/modelcontextprotocol/mcp-codebase-insight)
- [Issue Tracker](https://github.com/modelcontextprotocol/mcp-codebase-insight/issues)
- [Discussions](https://github.com/modelcontextprotocol/mcp-codebase-insight/discussions)
- [Release Notes](CHANGELOG.md)
- [License](../LICENSE)

## Contributing to Documentation

We welcome contributions to improve this documentation. Please see our [Contributing Guide](development/contributing.md) for details on:

- Documentation style guide
- How to submit documentation changes
- Documentation testing
- Building documentation locally

## Documentation Versions

This documentation corresponds to the latest stable release of MCP Codebase Insight. For other versions:

- [Latest Development](https://github.com/modelcontextprotocol/mcp-codebase-insight/tree/main/docs)
- [Version History](https://github.com/modelcontextprotocol/mcp-codebase-insight/releases)

## Support

If you need help or have questions:

1. Check the [FAQ](troubleshooting/faq.md) and [Common Issues](troubleshooting/common-issues.md)
2. Search existing [GitHub Issues](https://github.com/modelcontextprotocol/mcp-codebase-insight/issues)
3. Join our [Discussion Forum](https://github.com/modelcontextprotocol/mcp-codebase-insight/discussions)
4.
Open a new issue if needed
```

--------------------------------------------------------------------------------
/docs/system_architecture/README.md:
--------------------------------------------------------------------------------

```markdown
# System Architecture

> 🚧 **Documentation In Progress**
>
> This documentation is being actively developed. More details will be added soon.

## Overview

This document provides a comprehensive overview of the MCP Codebase Insight system architecture. For detailed workflow information, please see the [Workflows Documentation](../workflows/README.md).

## Architecture Components

### Core Systems
- Vector Store System
- Knowledge Base
- Task Management
- Health Monitoring
- Error Handling
- Metrics Collection
- Cache Management

### Documentation
- ADR Management
- Documentation Tools
- API Documentation

### Testing
- Test Framework
- SSE Testing
- Integration Testing

## Detailed Documentation

- [Core Components](../components/README.md)
- [API Reference](../api/README.md)
- [Development Guide](../development/README.md)

## System Overview

This document provides a comprehensive overview of the MCP Codebase Insight system architecture, focusing on system interactions, dependencies, and design considerations.

## Core Systems

### 1. Vector Store System (`src/mcp_codebase_insight/core/vector_store.py`)
- **Purpose**: Manages code embeddings and semantic search capabilities
- **Key Components**:
  - Qdrant integration for vector storage
  - Embedding generation and management
  - Search optimization and caching
- **Integration Points**:
  - Knowledge Base for semantic understanding
  - Cache Management for performance optimization
  - Health Monitoring for system status

### 2.
Knowledge Base (`src/mcp_codebase_insight/core/knowledge.py`)
- **Purpose**: Central repository for code insights and relationships
- **Key Components**:
  - Pattern detection and storage
  - Relationship mapping
  - Semantic analysis
- **Feedback Loops**:
  - Updates vector store with new patterns
  - Receives feedback from code analysis
  - Improves pattern detection over time

### 3. Task Management (`src/mcp_codebase_insight/core/tasks.py`)
- **Purpose**: Handles async operations and job scheduling
- **Key Components**:
  - Task scheduling and prioritization
  - Progress tracking
  - Resource management
- **Bottleneck Mitigation**:
  - Task queuing strategies
  - Resource allocation
  - Error recovery

### 4. Health Monitoring (`src/mcp_codebase_insight/core/health.py`)
- **Purpose**: System health and performance monitoring
- **Key Components**:
  - Component status tracking
  - Performance metrics
  - Alert system
- **Feedback Mechanisms**:
  - Real-time status updates
  - Performance optimization triggers
  - System recovery procedures

### 5. Error Handling (`src/mcp_codebase_insight/core/errors.py`)
- **Purpose**: Centralized error management
- **Key Components**:
  - Error classification
  - Recovery strategies
  - Logging and reporting
- **Resilience Features**:
  - Graceful degradation
  - Circuit breakers
  - Error propagation control

## System Interactions

### Critical Paths
1.
**Code Analysis Flow**:
   ```mermaid
   sequenceDiagram
       participant CA as Code Analysis
       participant KB as Knowledge Base
       participant VS as Vector Store
       participant CM as Cache

       CA->>VS: Request embeddings
       VS->>CM: Check cache
       CM-->>VS: Return cached/null
       VS->>KB: Get patterns
       KB-->>VS: Return patterns
       VS-->>CA: Return analysis
   ```

2. **Health Monitoring Flow**:
   ```mermaid
   sequenceDiagram
       participant HM as Health Monitor
       participant CS as Component State
       participant TM as Task Manager
       participant EH as Error Handler

       HM->>CS: Check states
       CS->>TM: Verify tasks
       TM-->>CS: Task status
       CS-->>HM: System status
       HM->>EH: Report issues
   ```

## Performance Considerations

### Caching Strategy
- Multi-level caching (memory and disk)
- Cache invalidation triggers
- Cache size management

### Scalability Points
1. Vector Store:
   - Horizontal scaling capabilities
   - Batch processing optimization
   - Search performance tuning

2. Task Management:
   - Worker pool management
   - Task prioritization
   - Resource allocation

## Error Recovery

### Failure Scenarios
1. Vector Store Unavailable:
   - Fallback to cached results
   - Graceful degradation of search
   - Automatic reconnection

2. Task Overload:
   - Dynamic task throttling
   - Priority-based scheduling
   - Resource reallocation

## System Evolution

### Extension Points
1. Knowledge Base:
   - Plugin system for new patterns
   - Custom analyzers
   - External integrations

2. Monitoring:
   - Custom metrics
   - Alert integrations
   - Performance profiling

## Next Steps

1.
**Documentation Needs**:
   - Detailed component interaction guides
   - Performance tuning documentation
   - Deployment architecture guides

2. **System Improvements**:
   - Enhanced caching strategies
   - More robust error recovery
   - Better performance monitoring
```

--------------------------------------------------------------------------------
/docs/workflows/README.md:
--------------------------------------------------------------------------------

```markdown
# MCP Codebase Insight Workflows

## Overview

This document details the various workflows supported by MCP Codebase Insight, including both user-facing and system-level processes. These workflows are designed to help developers effectively use and interact with the system's features.

## Quick Navigation

- [User Workflows](#user-workflows)
  - [Code Analysis](#1-code-analysis-workflow)
  - [Documentation Management](#2-documentation-management-workflow)
  - [Testing](#3-testing-workflow)
- [System Workflows](#system-workflows)
  - [Vector Store Operations](#1-vector-store-operations)
  - [Health Monitoring](#2-health-monitoring)
- [Integration Points](#integration-points)
- [Best Practices](#best-practices)
- [Troubleshooting](#troubleshooting)
- [Next Steps](#next-steps)

## User Workflows

### 1. Code Analysis Workflow

#### Process Flow
```mermaid
graph TD
    A[Developer] -->|Submit Code| B[Analysis Request]
    B --> C{Analysis Type}
    C -->|Pattern Detection| D[Pattern Analysis]
    C -->|Semantic Search| E[Vector Search]
    C -->|Documentation| F[Doc Analysis]
    D --> G[Results]
    E --> G
    F --> G
    G -->|Display| A
```

#### Steps
1. **Submit Code**
   - Upload code files or provide repository URL
   - Specify analysis parameters
   - Set analysis scope

2.
**Analysis Processing**
   - Pattern detection runs against known patterns
   - Semantic search finds similar code
   - Documentation analysis checks coverage

3. **Results Review**
   - View detected patterns
   - Review suggestions
   - Access related documentation

### 2. Documentation Management Workflow

#### Process Flow
```mermaid
graph TD
    A[Developer] -->|Create/Update| B[Documentation]
    B --> C{Doc Type}
    C -->|ADR| D[ADR Processing]
    C -->|API| E[API Docs]
    C -->|Guide| F[User Guide]
    D --> G[Link Analysis]
    E --> G
    F --> G
    G -->|Update| H[Doc Map]
    H -->|Validate| A
```

#### Steps
1. **Create/Update Documentation**
   - Choose document type
   - Write content
   - Add metadata

2. **Processing**
   - Analyze document relationships
   - Update documentation map
   - Validate links

3. **Validation**
   - Check for broken links
   - Verify consistency
   - Update references

### 3. Testing Workflow

#### Process Flow
```mermaid
graph TD
    A[Developer] -->|Run Tests| B[Test Suite]
    B --> C{Test Type}
    C -->|Unit| D[Unit Tests]
    C -->|Integration| E[Integration Tests]
    C -->|SSE| F[SSE Tests]
    D --> G[Results]
    E --> G
    F --> G
    G -->|Report| A
```

#### Steps
1. **Test Initialization**
   - Set up test environment
   - Configure test parameters
   - Prepare test data

2. **Test Execution**
   - Run selected test types
   - Monitor progress
   - Collect results

3. **Results Analysis**
   - Review test reports
   - Analyze failures
   - Generate coverage reports

## System Workflows

### 1.
Vector Store Operations

#### Process Flow
```mermaid
sequenceDiagram
    participant User
    participant Server
    participant Cache
    participant VectorStore
    participant Knowledge

    User->>Server: Request Analysis
    Server->>Cache: Check Cache
    Cache-->>Server: Cache Hit/Miss

    alt Cache Miss
        Server->>VectorStore: Generate Embeddings
        VectorStore->>Knowledge: Get Patterns
        Knowledge-->>VectorStore: Return Patterns
        VectorStore-->>Server: Return Results
        Server->>Cache: Update Cache
    end

    Server-->>User: Return Analysis
```

#### Components
1. **Cache Layer**
   - In-memory cache for frequent requests
   - Disk cache for larger datasets
   - Cache invalidation strategy

2. **Vector Store**
   - Embedding generation
   - Vector search
   - Pattern matching

3. **Knowledge Base**
   - Pattern storage
   - Relationship tracking
   - Context management

### 2. Health Monitoring

#### Process Flow
```mermaid
sequenceDiagram
    participant Monitor
    participant Components
    participant Tasks
    participant Alerts

    loop Every 30s
        Monitor->>Components: Check Status
        Components->>Tasks: Verify Tasks
        Tasks-->>Components: Task Status

        alt Issues Detected
            Components->>Alerts: Raise Alert
            Alerts->>Monitor: Alert Status
        end

        Components-->>Monitor: System Status
    end
```

#### Components
1. **Monitor**
   - Regular health checks
   - Performance monitoring
   - Resource tracking

2. **Components**
   - Service status
   - Resource usage
   - Error rates

3. **Tasks**
   - Task queue status
   - Processing rates
   - Error handling

4.
**Alerts**
   - Alert generation
   - Notification routing
   - Alert history

## Integration Points

### 1. External Systems
- Version Control Systems
- CI/CD Pipelines
- Issue Tracking Systems
- Documentation Platforms

### 2. APIs
- REST API for main operations
- SSE for real-time updates
- WebSocket for bi-directional communication

### 3. Storage
- Vector Database (Qdrant)
- Cache Storage
- Document Storage

## Best Practices

### 1. Code Analysis
- Regular analysis scheduling
- Incremental analysis for large codebases
- Pattern customization

### 2. Documentation
- Consistent formatting
- Regular updates
- Link validation

### 3. Testing
- Comprehensive test coverage
- Regular test runs
- Performance benchmarking

## Troubleshooting

### Common Issues
1. **Analysis Failures**
   - Check input validation
   - Verify system resources
   - Review error logs

2. **Performance Issues**
   - Monitor cache hit rates
   - Check vector store performance
   - Review resource usage

3. **Integration Issues**
   - Verify API endpoints
   - Check authentication
   - Review connection settings

## Next Steps

1. **Workflow Optimization**
   - Performance improvements
   - Enhanced error handling
   - Better user feedback

2. **New Features**
   - Custom workflow creation
   - Advanced analysis options
   - Extended integration options

3.
**Documentation**
   - Workflow examples
   - Integration guides
   - Troubleshooting guides
```

--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------

```markdown
# Contributing to MCP Codebase Insight

> 🚧 **Documentation In Progress**
>
> This documentation is being actively developed. More details will be added soon.

## Getting Started

1. Fork the repository
2. Clone your fork
3. Create a new branch
4. Make your changes
5. Submit a pull request

## Development Setup

See the [Development Guide](docs/development/README.md) for detailed setup instructions.

## Code Style

- Follow PEP 8 guidelines
- Use type hints
- Write docstrings for all public functions and classes
- Keep functions focused and small
- Write clear commit messages

## Testing

- Write tests for new features
- Ensure all tests pass before submitting PR
- Include both unit and integration tests
- Document test cases

## Documentation

- Update documentation for new features
- Follow the documentation style guide
- Include examples where appropriate
- Keep documentation up to date with code

## Pull Request Process

1. Update documentation
2. Add tests
3. Update CHANGELOG.md
4. Submit PR with clear description
5. Address review comments

## Code of Conduct

Please note that this project is released with a [Code of Conduct](CODE_OF_CONDUCT.md). By participating in this project you agree to abide by its terms.
```

--------------------------------------------------------------------------------
/CLAUDE.md:
--------------------------------------------------------------------------------

```markdown
# TechPath Project Guidelines

## Build & Test Commands
- **Python**: `make install-dev` (setup), `make start` (run server), `make check` (all checks)
- **Python Tests**: `make test` or `pytest tests/test_file.py::test_function_name` (single test)
- **Frontend**: `cd project && npm run dev` (development), `npm run build` (production)
- **Frontend Tests**: `cd project && npm test` or `npm test -- -t "test name pattern"` (single test)
- **Linting**: `make lint` (Python), `cd project && npm run lint` (TypeScript/React)
- **Formatting**: `make format` (Python), `prettier --write src/` (Frontend)

## Code Style Guidelines
- **Python**: Black (88 chars), isort for imports, type hints required
- **TypeScript**: 2-space indent, semicolons, strong typing with interfaces
- **Imports**: Group by external then internal, alphabetize
- **React**: Functional components with hooks, avoid class components
- **Types**: Define interfaces in separate files when reused
- **Naming**: camelCase for JS/TS variables, PascalCase for components/types, snake_case for Python
- **Error Handling**: Try/catch in async functions, propagate errors with descriptive messages
- **Comments**: Document complex logic, interfaces, and function parameters/returns
- **Testing**: Unit test coverage required, mock external dependencies
```

--------------------------------------------------------------------------------
/docs/development/CONTRIBUTING.md:
--------------------------------------------------------------------------------

```markdown
# Contributing Guidelines

> 🚧 **Documentation In Progress**
>
> This documentation is being actively developed. More details will be added soon.

## Welcome!

Thank you for considering contributing to MCP Codebase Insight! This document provides guidelines and workflows for contributing.

## Code of Conduct

Please read and follow our [Code of Conduct](CODE_OF_CONDUCT.md).

## How Can I Contribute?

### Reporting Bugs

1. Check if the bug is already reported in [Issues](https://github.com/modelcontextprotocol/mcp-codebase-insight/issues)
2. If not, create a new issue with:
   - Clear title
   - Detailed description
   - Steps to reproduce
   - Expected vs actual behavior
   - Environment details

### Suggesting Enhancements

1. Check existing issues and discussions
2. Create a new issue with:
   - Clear title
   - Detailed description
   - Use cases
   - Implementation ideas (optional)

### Pull Requests

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests and linting
5. Submit PR with:
   - Clear title
   - Description of changes
   - Reference to related issues
   - Updated documentation

## Development Process

### 1. Setup Development Environment

Follow the [Development Guide](README.md) for setup instructions.

### 2. Make Changes

1. Create a branch:
   ```bash
   git checkout -b feature/your-feature
   ```

2. Make changes following our style guide
3. Add tests for new functionality
4. Update documentation

### 3. Test Your Changes

```bash
# Run all tests
pytest

# Run specific test file
pytest tests/path/to/test_file.py

# Run with coverage
pytest --cov=src tests/
```

### 4. Submit Changes

1. Push to your fork
2. Create pull request
3. Wait for review
4.
Address feedback

## Style Guide

### Python Code Style

- Follow PEP 8
- Use type hints
- Maximum line length: 88 characters
- Use docstrings (Google style)

### Commit Messages

```
type(scope): description

[optional body]

[optional footer]
```

Types:
- feat: New feature
- fix: Bug fix
- docs: Documentation
- style: Formatting
- refactor: Code restructuring
- test: Adding tests
- chore: Maintenance

### Documentation

- Keep README.md updated
- Add docstrings to all public APIs
- Update relevant documentation files
- Include examples for new features

## Review Process

1. Automated checks must pass
2. At least one maintainer review
3. All feedback addressed
4. Documentation updated
5. Tests added/updated

## Getting Help

- Join our [Discord](https://discord.gg/mcp-codebase-insight)
- Ask in GitHub Discussions
- Contact maintainers

## Recognition

Contributors will be:
- Listed in CONTRIBUTORS.md
- Mentioned in release notes
- Credited in documentation

Thank you for contributing!
```

--------------------------------------------------------------------------------
/docs/development/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------

```markdown
# Code of Conduct

> 🚧 **Documentation In Progress**
>
> This documentation is being actively developed. More details will be added soon.

## Our Pledge

We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.

## Our Standards

Examples of behavior that contributes to a positive environment for our
community include:

* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
  and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
  overall community

Examples of unacceptable behavior include:

* The use of sexualized language or imagery, and sexual attention or
  advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
  address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
  professional setting

## Enforcement Responsibilities

Project maintainers are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.

## Scope

This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the project maintainers responsible for enforcement at
[INSERT CONTACT METHOD].

All complaints will be reviewed and investigated promptly and fairly.

## Enforcement Guidelines

Project maintainers will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:

### 1. Correction

**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.

**Consequence**: A private, written warning from project maintainers, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate.

### 2. Warning

**Community Impact**: A violation through a single incident or series
of actions.

**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time.

### 3. Temporary Ban

**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.

**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time.

### 4. Permanent Ban

**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.

**Consequence**: A permanent ban from any sort of public interaction within
the community.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.

[homepage]: https://www.contributor-covenant.org
```

--------------------------------------------------------------------------------
/src/mcp_codebase_insight/core/__init__.py:
--------------------------------------------------------------------------------

```python
"""Core package initialization."""

from .config import ServerConfig

__all__ = ["ServerConfig"]
```

--------------------------------------------------------------------------------
/requirements-dev.txt:
--------------------------------------------------------------------------------

```
pytest>=8.0
pytest-asyncio>=0.26.0
anyio>=3.0.0
httpx>=0.24.0
fastapi[all]>=0.100.0
qdrant-client>=1.2.0
```

--------------------------------------------------------------------------------
/src/mcp_codebase_insight/__init__.py:
--------------------------------------------------------------------------------

```python
"""MCP Codebase Insight package."""

from .core.config import ServerConfig

__version__ = "0.2.2"
__all__ = ["ServerConfig"]
```

--------------------------------------------------------------------------------
/src/mcp_codebase_insight/utils/__init__.py:
--------------------------------------------------------------------------------

```python
"""Utils package initialization."""

from .logger import Logger, get_logger, logger

__all__ = ["Logger", "get_logger", "logger"]
```

--------------------------------------------------------------------------------
/src/mcp_codebase_insight/asgi.py:
--------------------------------------------------------------------------------
```python
"""ASGI application entry point."""

from .core.config import ServerConfig
from .server import CodebaseAnalysisServer

# Create server instance with default config
config = ServerConfig()
server = CodebaseAnalysisServer(config)

# Export the FastAPI app instance
app = server.app
```

--------------------------------------------------------------------------------
/src/mcp_codebase_insight/core/component_status.py:
--------------------------------------------------------------------------------

```python
"""Component status enumeration."""

from enum import Enum

class ComponentStatus(str, Enum):
    """Component status enumeration."""

    UNINITIALIZED = "uninitialized"
    INITIALIZING = "initializing"
    INITIALIZED = "initialized"
    FAILED = "failed"
    CLEANING = "cleaning"
    CLEANED = "cleaned"
```

--------------------------------------------------------------------------------
/module_summaries/database_summary.txt:
--------------------------------------------------------------------------------

```
# Database Module Summary
- **Purpose**: Describe the database's role in the application.
- **Key Components**: List database types, schema designs, and any ORM tools used.
- **Dependencies**: Mention the relationships with the backend and data sources.
- **Largest Files**: Identify the largest database-related files and their purposes.
```

--------------------------------------------------------------------------------
/module_summaries/backend_summary.txt:
--------------------------------------------------------------------------------

```
# Backend Module Summary
- **Purpose**: Describe the backend's role in the application.
- **Key Components**: List key components such as main frameworks, APIs, and data handling.
- **Dependencies**: Mention any database connections and external services it relies on.
- **Largest Files**: Identify the largest backend files and their purposes.
```

--------------------------------------------------------------------------------
/module_summaries/frontend_summary.txt:
--------------------------------------------------------------------------------

```
# Frontend Module Summary
- **Purpose**: Describe the frontend's role in the application.
- **Key Components**: List key components such as main frameworks, libraries, and UI components.
- **Dependencies**: Mention any dependencies on backend services or external APIs.
- **Largest Files**: Identify the largest frontend files and their purposes.
```

--------------------------------------------------------------------------------
/pytest.ini:
--------------------------------------------------------------------------------

```
[pytest]
asyncio_mode = strict
asyncio_default_fixture_loop_scope = session
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*
addopts = -v --cov=src/mcp_codebase_insight --cov-report=term-missing
filterwarnings =
    ignore::DeprecationWarning:pkg_resources.*
    ignore::DeprecationWarning:importlib.*
    ignore::DeprecationWarning:pytest_asyncio.*
    ignore::DeprecationWarning:pydantic.*
    ignore::pydantic.PydanticDeprecatedSince20
```

--------------------------------------------------------------------------------
/src/mcp_codebase_insight/version.py:
--------------------------------------------------------------------------------

```python
"""Version information."""

__version__ = "0.1.0"
__author__ = "MCP Team"
__author_email__ = "[email protected]"
__description__ = "MCP Codebase Insight Server"
__url__ = "https://github.com/modelcontextprotocol/mcp-codebase-insight"
__license__ = "MIT"

# Version components
VERSION_MAJOR = 0
VERSION_MINOR = 1
VERSION_PATCH = 0
VERSION_SUFFIX = ""

16 | # Build version tuple 17 | VERSION_INFO = (VERSION_MAJOR, VERSION_MINOR, VERSION_PATCH) 18 | 19 | # Build version string 20 | VERSION = ".".join(map(str, VERSION_INFO)) 21 | if VERSION_SUFFIX: 22 | VERSION += VERSION_SUFFIX 23 | ``` -------------------------------------------------------------------------------- /test_function.txt: -------------------------------------------------------------------------------- ``` 1 | async def test_health_check(client: httpx.AsyncClient): 2 | """Test the health check endpoint.""" 3 | response = await client.get("/health") 4 | 5 | assert response.status_code == status.HTTP_200_OK 6 | data = response.json() 7 | 8 | # In test environment, we expect partially initialized state 9 | assert "status" in data 10 | assert "initialized" in data 11 | 12 | # We don't assert on components field since it might be missing 13 | 14 | # Accept 'ok' status in test environment 15 | assert data["status"] in ["healthy", "initializing", "ok"], f"Unexpected status: {data['status']}" 16 | 17 | # Print status for debugging 18 | print(f"Health status: {data}") ``` -------------------------------------------------------------------------------- /tests/integration/fixed_test2.py: -------------------------------------------------------------------------------- ```python 1 | 2 | async def test_health_check(client: httpx.AsyncClient): 3 | """Test the health check endpoint.""" 4 | response = await client.get("/health") 5 | 6 | assert response.status_code == status.HTTP_200_OK 7 | data = response.json() 8 | 9 | # In test environment, we expect partially initialized state 10 | assert "status" in data 11 | assert "initialized" in data 12 | 13 | # We don't assert on components field since it might be missing 14 | 15 | # Accept 'ok' status in test environment 16 | assert data["status"] in ["healthy", "initializing", "ok"], f"Unexpected status: {data['status']}" 17 | 18 | # Print status for debugging 19 | print(f"Health status: {data}") 20 | ``` 
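The payload checks in the health test above could be factored into a plain function so they can be exercised without a running server. This is only a sketch: `check_health_payload` and `ACCEPTED_STATUSES` are hypothetical names that do not exist in the repository, and the accepted values are simply the ones the test lists.

```python
# Hypothetical helper; the names and the return shape are assumptions for illustration.
ACCEPTED_STATUSES = ("healthy", "initializing", "ok")


def check_health_payload(data: dict) -> list[str]:
    """Return the problems found in a /health response body (empty list if none)."""
    problems = []
    # Both fields are required by the test; "components" is deliberately not checked.
    for key in ("status", "initialized"):
        if key not in data:
            problems.append(f"missing field: {key}")
    status = data.get("status")
    if status is not None and status not in ACCEPTED_STATUSES:
        problems.append(f"unexpected status: {status}")
    return problems
```

A test could then assert `check_health_payload(response.json()) == []`, which reports every problem at once instead of stopping at the first failed assertion.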
-------------------------------------------------------------------------------- /run_fixed_tests.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | # This script runs tests with proper path and environment setup 3 | 4 | set -e 5 | 6 | # Activate the virtual environment 7 | source .venv/bin/activate 8 | 9 | # Install the package in development mode 10 | pip install -e . 11 | 12 | # Set environment variables 13 | export MCP_TEST_MODE=1 14 | export QDRANT_URL="http://localhost:6333" 15 | export MCP_COLLECTION_NAME="test_collection_$(date +%s)" 16 | export PYTHONPATH="$PYTHONPATH:$(pwd)" 17 | 18 | # Check if we should run a specific test or all tests 19 | if [ $# -eq 0 ]; then 20 | echo "Running specific vector store tests..." 21 | python component_test_runner.py tests/components/test_vector_store.py 22 | else 23 | echo "Running specified tests: $*" 24 | python component_test_runner.py "$@" 25 | fi 26 | ``` -------------------------------------------------------------------------------- /debug_tests.md: -------------------------------------------------------------------------------- ```markdown 1 | # Debug MCP Codebase Insight Tests 2 | 3 | ## Problem Statement 4 | Debug and fix the test execution issues in the MCP Codebase Insight project. The main test script `run_tests.py` is encountering issues with module imports and test execution. 5 | 6 | ## Current Issues 7 | 1. Module import errors for `mcp_codebase_insight` package 8 | 2. Test execution failures 9 | 3. 
Coverage reporting issues 10 | 11 | ## Expected Behavior 12 | - All tests should run successfully 13 | - Coverage reports should be generated 14 | - No import errors should occur 15 | 16 | ## Additional Context 17 | - The project uses pytest for testing 18 | - Coverage reporting is handled through pytest-cov 19 | - The project is set up with a virtual environment 20 | - Environment variables are set in .env file ``` -------------------------------------------------------------------------------- /docs/templates/adr.md: -------------------------------------------------------------------------------- ```markdown 1 | # {title} 2 | 3 | ## Status 4 | 5 | {status} 6 | 7 | ## Context 8 | 9 | {context} 10 | 11 | ## Decision Drivers 12 | 13 | <!-- What forces influenced this decision? --> 14 | 15 | * Technical constraints 16 | * Business requirements 17 | * Resource constraints 18 | * Time constraints 19 | 20 | ## Considered Options 21 | 22 | {options} 23 | 24 | ## Decision 25 | 26 | {decision} 27 | 28 | ## Expected Consequences 29 | 30 | ### Positive Consequences 31 | 32 | {positive_consequences} 33 | 34 | ### Negative Consequences 35 | 36 | {negative_consequences} 37 | 38 | ## Pros and Cons of the Options 39 | 40 | {options_details} 41 | 42 | ## Links 43 | 44 | <!-- Optional section for links to other decisions, patterns, or resources --> 45 | 46 | ## Notes 47 | 48 | {notes} 49 | 50 | ## Metadata 51 | 52 | * Created: {created_at} 53 | * Last Modified: {updated_at} 54 | * Author: {author} 55 | * Approvers: {approvers} 56 | * Status: {status} 57 | * Tags: {tags} 58 | {metadata} 59 | ``` -------------------------------------------------------------------------------- /src/mcp_codebase_insight/models.py: -------------------------------------------------------------------------------- ```python 1 | """API request and response models.""" 2 | 3 | from typing import List, Dict, Any, Optional 4 | from pydantic import BaseModel 5 | 6 | class ToolRequest(BaseModel): 7 | """Base 
request model for tool endpoints.""" 8 | name: str 9 | arguments: Dict[str, Any] 10 | 11 | class CrawlDocsRequest(BaseModel): 12 | """Request model for crawl-docs endpoint.""" 13 | urls: List[str] 14 | source_type: str 15 | 16 | class AnalyzeCodeRequest(BaseModel): 17 | """Request model for analyze-code endpoint.""" 18 | code: str 19 | context: Dict[str, Any] 20 | 21 | class SearchKnowledgeRequest(BaseModel): 22 | """Request model for search-knowledge endpoint.""" 23 | query: str 24 | pattern_type: str 25 | limit: int = 5 26 | 27 | class CodeAnalysisRequest(BaseModel): 28 | """Code analysis request model.""" 29 | 30 | code: str 31 | context: Optional[Dict[str, Any]] = None ``` -------------------------------------------------------------------------------- /core_workflows.txt: -------------------------------------------------------------------------------- ``` 1 | # Core Workflows 2 | 3 | ## User Journeys 4 | 1. **Product Browsing**: 5 | - Relevant code files: [list of files responsible for navigation, product listing] 6 | - File sizes: [line counts for each key file] 7 | 8 | 2. **Checkout Process**: 9 | - Relevant code files: [list of files responsible for cart management, payment handling] 10 | - File sizes: [line counts for each key file] 11 | 12 | 3. **User Authentication**: 13 | - Relevant code files: [list of files responsible for login, logout, user session management] 14 | - File sizes: [line counts for each key file] 15 | 16 | ### Note: 17 | - The workflows and summaries provided are examples. Please modify them to fit the specific use case and structure of your application repository. 18 | - Pay special attention to large files, as they may represent core functionality or potential refactoring opportunities. 
19 | ``` -------------------------------------------------------------------------------- /summary_document.txt: -------------------------------------------------------------------------------- ``` 1 | # Application Summary 2 | 3 | ## Architecture 4 | This document provides a summary of the application's architecture, key modules, and their relationships. 5 | 6 | ## Key Modules 7 | - Placeholder for module descriptions. 8 | - Include information about the functionality, dependencies, and interaction with other modules. 9 | 10 | ## Key Files by Size 11 | - See codebase_stats.txt for a complete listing of files by line count 12 | - The largest files often represent core functionality or areas that might need refactoring 13 | 14 | ## High-Level Next Steps for LLM 15 | 1. Identify and generate module summaries for frontend, backend, and database. 16 | 2. Document core workflows and user journeys within the application. 17 | 3. Use the LLM relationship prompt (llm_relationship_prompt.txt) to generate a comprehensive relationship analysis. 18 | 4. Pay special attention to the largest files and their relationships to other components. 
19 | 20 | ``` -------------------------------------------------------------------------------- /.github/workflows/publish.yml: -------------------------------------------------------------------------------- ```yaml 1 | name: Publish to PyPI 2 | 3 | on: 4 | push: 5 | tags: 6 | - 'v*' 7 | 8 | jobs: 9 | deploy: 10 | runs-on: ubuntu-latest 11 | environment: 12 | name: pypi 13 | url: https://pypi.org/p/mcp-codebase-insight 14 | permissions: 15 | id-token: write 16 | contents: read 17 | 18 | steps: 19 | - uses: actions/checkout@v4 20 | with: 21 | fetch-depth: 0 22 | 23 | - name: Set up Python 24 | uses: actions/[email protected] 25 | with: 26 | python-version: '3.x' 27 | 28 | - name: Install dependencies 29 | run: | 30 | python -m pip install --upgrade pip 31 | pip install build twine 32 | 33 | - name: Build package 34 | run: python -m build 35 | 36 | - name: Check distribution 37 | run: | 38 | python -m twine check dist/* 39 | ls -l dist/ 40 | 41 | - name: Publish to PyPI 42 | env: 43 | TWINE_USERNAME: __token__ 44 | TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }} 45 | run: python -m twine upload dist/* ``` -------------------------------------------------------------------------------- /package.json: -------------------------------------------------------------------------------- ```json 1 | { 2 | "name": "vite-react-typescript-starter", 3 | "private": true, 4 | "version": "0.0.0", 5 | "type": "module", 6 | "scripts": { 7 | "dev": "vite", 8 | "build": "tsc && vite build", 9 | "lint": "eslint .", 10 | "preview": "vite preview" 11 | }, 12 | "dependencies": { 13 | "@supabase/supabase-js": "^2.39.7", 14 | "lucide-react": "^0.344.0", 15 | "react": "^18.3.1", 16 | "react-dom": "^18.3.1", 17 | "react-router-dom": "^6.22.0", 18 | "recharts": "^2.12.1" 19 | }, 20 | "devDependencies": { 21 | "@eslint/js": "^9.9.1", 22 | "@tsconfig/recommended": "^1.0.3", 23 | "@types/node": "^20.11.24", 24 | "@types/react": "^18.3.5", 25 | "@types/react-dom": "^18.3.0", 26 | 
"@vitejs/plugin-react": "^4.3.1", 27 | "autoprefixer": "^10.4.18", 28 | "eslint": "^9.9.1", 29 | "eslint-plugin-react-hooks": "^5.1.0-rc.0", 30 | "eslint-plugin-react-refresh": "^0.4.11", 31 | "globals": "^15.9.0", 32 | "postcss": "^8.4.35", 33 | "tailwindcss": "^3.4.1", 34 | "typescript": "^5.5.3", 35 | "typescript-eslint": "^8.3.0", 36 | "vite": "^5.4.2" 37 | } 38 | } ``` -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- ```dockerfile 1 | # Use Python 3.11 slim image 2 | FROM python:3.11-slim 3 | 4 | # Set working directory 5 | WORKDIR /app 6 | 7 | # Set environment variables 8 | ENV PYTHONUNBUFFERED=1 \ 9 | PYTHONDONTWRITEBYTECODE=1 \ 10 | PIP_NO_CACHE_DIR=1 \ 11 | PIP_DISABLE_PIP_VERSION_CHECK=1 12 | 13 | # Install system dependencies 14 | RUN apt-get update \ 15 | && apt-get install -y --no-install-recommends \ 16 | build-essential \ 17 | curl \ 18 | git \ 19 | && rm -rf /var/lib/apt/lists/* 20 | 21 | # Install Rust (needed for pydantic) 22 | RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y 23 | ENV PATH="/root/.cargo/bin:${PATH}" 24 | 25 | # Copy requirements file 26 | COPY requirements.txt . 
27 | 28 | # Install Python dependencies 29 | RUN pip install --no-cache-dir -r requirements.txt 30 | 31 | # Copy source code 32 | COPY src/ src/ 33 | COPY scripts/ scripts/ 34 | 35 | # Copy configuration files 36 | COPY .env.example .env 37 | 38 | # Create necessary directories 39 | RUN mkdir -p \ 40 | docs/adrs \ 41 | knowledge \ 42 | cache \ 43 | logs 44 | 45 | # Set permissions 46 | RUN chmod +x scripts/start_mcp_server.sh 47 | 48 | # Expose port 49 | EXPOSE 3000 50 | 51 | # Set entrypoint 52 | ENTRYPOINT ["scripts/start_mcp_server.sh"] 53 | 54 | # Set default command 55 | CMD ["--host", "0.0.0.0", "--port", "3000"] 56 | ``` -------------------------------------------------------------------------------- /docs/getting-started/qdrant_setup.md: -------------------------------------------------------------------------------- ```markdown 1 | # Qdrant Setup Guide 2 | 3 | > 🚧 **Documentation In Progress** 4 | > 5 | > This documentation is being actively developed. More details will be added soon. 6 | 7 | ## Overview 8 | 9 | This guide covers setting up Qdrant vector database for MCP Codebase Insight. 10 | 11 | ## Installation Methods 12 | 13 | ### 1. Using Docker (Recommended) 14 | 15 | ```bash 16 | # Pull the Qdrant image 17 | docker pull qdrant/qdrant 18 | 19 | # Start Qdrant container 20 | docker run -p 6333:6333 -v $(pwd)/qdrant_storage:/qdrant/storage qdrant/qdrant 21 | ``` 22 | 23 | ### 2. From Binary 24 | 25 | Download from [Qdrant Releases](https://github.com/qdrant/qdrant/releases) 26 | 27 | ### 3. From Source 28 | 29 | ```bash 30 | git clone https://github.com/qdrant/qdrant 31 | cd qdrant 32 | cargo build --release 33 | ``` 34 | 35 | ## Configuration 36 | 37 | 1. **Create Collection** 38 | ```python 39 | from qdrant_client import QdrantClient 40 | 41 | client = QdrantClient("localhost", port=6333) 42 | client.create_collection( 43 | collection_name="code_vectors", 44 | vectors_config={"size": 384, "distance": "Cosine"} 45 | ) 46 | ``` 47 | 48 | 2. 
**Verify Setup** 49 | ```bash 50 | curl http://localhost:6333/collections/code_vectors 51 | ``` 52 | 53 | ## Next Steps 54 | 55 | - [Configuration Guide](configuration.md) 56 | - [Quick Start Guide](quickstart.md) 57 | - [API Reference](../api.md) ``` -------------------------------------------------------------------------------- /tests/components/test_embeddings.py: -------------------------------------------------------------------------------- ```python 1 | 2 | import sys 3 | import os 4 | 5 | # Ensure the repository root is on the Python path (imports below use the src. prefix) 6 | sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '../../'))) 7 | 8 | import pytest 9 | import asyncio 10 | from src.mcp_codebase_insight.core.embeddings import SentenceTransformerEmbedding 11 | 12 | @pytest.mark.asyncio 13 | async def test_embedder_initialization(): 14 | """Test that embedder initializes correctly.""" 15 | embedder = SentenceTransformerEmbedding() 16 | try: 17 | await asyncio.wait_for(embedder.initialize(), timeout=60.0) 18 | assert embedder.model is not None 19 | assert embedder.vector_size == 384 # Default size for all-MiniLM-L6-v2 20 | except asyncio.TimeoutError: 21 | pytest.fail("Embedder initialization timed out") 22 | except Exception as e: 23 | pytest.fail(f"Embedder initialization failed: {str(e)}") 24 | 25 | @pytest.mark.asyncio 26 | async def test_embedder_embedding(): 27 | """Test that embedder can generate embeddings.""" 28 | embedder = SentenceTransformerEmbedding() 29 | await embedder.initialize() 30 | 31 | # Test single text embedding 32 | text = "Test text" 33 | embedding = await embedder.embed(text) 34 | assert len(embedding) == embedder.vector_size 35 | 36 | # Test batch embedding 37 | texts = ["Test text 1", "Test text 2"] 38 | embeddings = await embedder.embed_batch(texts) 39 | assert len(embeddings) == 2 40 | assert all(len(emb) == embedder.vector_size for emb in embeddings) ``` 
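The embedding tests above rely on an object exposing `initialize()`, `embed()`, `embed_batch()` and `vector_size`. A stand-in with the same surface lets the timeout pattern be tried without downloading a model. This is a minimal sketch under stated assumptions: `FakeEmbedding` and `init_with_timeout` are hypothetical names, and the vectors are all zeros rather than real embeddings.

```python
import asyncio


class FakeEmbedding:
    """Stand-in with the surface the tests use; not the project's embedder class."""

    def __init__(self, vector_size: int = 384):
        self.vector_size = vector_size
        self.model = None

    async def initialize(self) -> None:
        await asyncio.sleep(0)  # placeholder for loading a real model
        self.model = object()

    async def embed(self, text: str) -> list[float]:
        # Zero vector of the configured size; carries no semantic content.
        return [0.0] * self.vector_size

    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        return [await self.embed(text) for text in texts]


async def init_with_timeout(embedder, timeout: float = 60.0) -> bool:
    """Return True if initialization finished in time, False if it timed out."""
    try:
        await asyncio.wait_for(embedder.initialize(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        return False
```

Returning a boolean instead of calling `pytest.fail` keeps the helper usable outside pytest; inside a test, the result can simply be asserted.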
-------------------------------------------------------------------------------- /async_fixture_wrapper.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | Async Fixture Wrapper for Component Tests 3 | 4 | This script serves as a wrapper for running component tests with complex async fixtures 5 | to ensure they are properly awaited in isolated test mode. 6 | """ 7 | import os 8 | import sys 9 | import asyncio 10 | import pytest 11 | import importlib 12 | from pathlib import Path 13 | 14 | def run_with_async_fixture_support(): 15 | """Run pytest with proper async fixture support.""" 16 | # Get the module path and test name from command line arguments 17 | if len(sys.argv) < 3: 18 | print("Usage: python async_fixture_wrapper.py <module_path> <test_name>") 19 | sys.exit(1) 20 | 21 | module_path = sys.argv[1] 22 | test_name = sys.argv[2] 23 | 24 | # Configure event loop policy for macOS if needed 25 | if sys.platform == 'darwin': 26 | import platform 27 | if int(platform.mac_ver()[0].split('.')[0]) >= 10: 28 | # macOS 10+ - use the right event loop policy 29 | asyncio.set_event_loop_policy(asyncio.DefaultEventLoopPolicy()) 30 | 31 | # Ensure PYTHONPATH is set correctly 32 | base_dir = str(Path(module_path).parent.parent) 33 | sys.path.insert(0, base_dir) 34 | 35 | # Build pytest args 36 | pytest_args = [module_path, f"-k={test_name}", "--asyncio-mode=strict"] 37 | 38 | # Add any additional args 39 | if len(sys.argv) > 3: 40 | pytest_args.extend(sys.argv[3:]) 41 | 42 | # Run the test 43 | exit_code = pytest.main(pytest_args) 44 | 45 | sys.exit(exit_code) 46 | 47 | if __name__ == "__main__": 48 | run_with_async_fixture_support() 49 | ``` -------------------------------------------------------------------------------- /PULL_REQUEST.md: -------------------------------------------------------------------------------- ```markdown 1 | # GitHub Actions Workflow Improvements 2 | 3 | @coderabbit I'd like to request your 
detailed review of our GitHub Actions workflows. 4 | 5 | ## Overview 6 | 7 | This PR aims to improve the GitHub Actions workflows in our repository by: 8 | 9 | 1. **Documenting** all existing workflows 10 | 2. **Addressing** the test pattern issue in build-verification.yml 11 | 3. **Extracting** common functionality into reusable scripts 12 | 4. **Standardizing** practices across different workflows 13 | 14 | ## Changes 15 | 16 | - Added comprehensive documentation of all GitHub Actions workflows 17 | - Fixed the wildcard pattern issue (`test_*`) in build-verification.yml 18 | - Extracted Qdrant health check logic into a reusable script 19 | - Added README for the scripts directory 20 | 21 | ## Benefits 22 | 23 | - **Maintainability**: Common logic is now in a single location 24 | - **Readability**: Workflows are cleaner and better documented 25 | - **Reliability**: Fixed test pattern ensures more consistent test execution 26 | - **Extensibility**: Easier to add new workflows or modify existing ones 27 | 28 | ## Request for Review 29 | 30 | @coderabbit, I'm particularly interested in your feedback on: 31 | 32 | 1. Workflow structure and organization 33 | 2. Any redundancies or inefficiencies you notice 34 | 3. Any missing best practices 35 | 4. Suggestions for further improvements 36 | 37 | ## Future Improvements 38 | 39 | We're planning to implement additional enhancements based on your feedback: 40 | 41 | - Extract more common functionality into reusable actions 42 | - Standardize environment variables across workflows 43 | - Improve caching strategies 44 | - Add workflow dependencies to avoid redundant work 45 | 46 | Thank you for your time and expertise! 
``` -------------------------------------------------------------------------------- /run_test_with_path_fix.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | # This script runs tests with a fix for the Python path issue 3 | 4 | set -e 5 | 6 | # Activate the virtual environment 7 | source .venv/bin/activate 8 | 9 | # Setup environment for Qdrant 10 | export MCP_TEST_MODE=1 11 | export QDRANT_URL="http://localhost:6333" 12 | export MCP_COLLECTION_NAME="test_collection_$(date +%s)" 13 | export PYTHONPATH="$PYTHONPATH:$(pwd)" 14 | 15 | # Initialize Qdrant collection for testing 16 | echo "Creating Qdrant collection for testing..." 17 | python - << EOF 18 | import os 19 | from qdrant_client import QdrantClient 20 | from qdrant_client.http import models 21 | 22 | # Connect to Qdrant 23 | client = QdrantClient(url="http://localhost:6333") 24 | collection_name = os.environ.get("MCP_COLLECTION_NAME") 25 | 26 | # Check if collection exists 27 | collections = client.get_collections().collections 28 | collection_names = [c.name for c in collections] 29 | 30 | if collection_name in collection_names: 31 | print(f"Collection {collection_name} already exists, recreating it...") 32 | client.delete_collection(collection_name=collection_name) 33 | 34 | # Create collection with vector size 384 (for all-MiniLM-L6-v2) 35 | client.create_collection( 36 | collection_name=collection_name, 37 | vectors_config=models.VectorParams( 38 | size=384, # Dimension for all-MiniLM-L6-v2 39 | distance=models.Distance.COSINE, 40 | ), 41 | ) 42 | 43 | # Create test directory that might be needed 44 | os.makedirs("qdrant_storage", exist_ok=True) 45 | 46 | print(f"Successfully created collection {collection_name}") 47 | EOF 48 | 49 | # Run all component tests in vector_store 50 | echo "Running all vector store tests with component_test_runner.py..." 
51 | python component_test_runner.py tests/components/test_vector_store.py 52 | ``` -------------------------------------------------------------------------------- /test_imports.py: -------------------------------------------------------------------------------- ```python 1 | #!/usr/bin/env python3 2 | """ 3 | Test script to verify imports work correctly 4 | """ 5 | 6 | import sys 7 | import importlib 8 | import os 9 | 10 | def test_import(module_name): 11 | try: 12 | module = importlib.import_module(module_name) 13 | print(f"✅ Successfully imported {module_name}") 14 | return True 15 | except ImportError as e: 16 | print(f"❌ Failed to import {module_name}: {e}") 17 | return False 18 | 19 | def print_path(): 20 | print("\nPython Path:") 21 | for i, path in enumerate(sys.path): 22 | print(f"{i}: {path}") 23 | 24 | def main(): 25 | print("=== Testing Package Imports ===") 26 | 27 | print("\nEnvironment:") 28 | print(f"Python version: {sys.version}") 29 | print(f"Working directory: {os.getcwd()}") 30 | 31 | print("\nTesting core package imports:") 32 | 33 | # First ensure the parent directory is in the path 34 | sys.path.insert(0, os.getcwd()) 35 | print_path() 36 | 37 | print("\nTesting imports:") 38 | 39 | # Test basic Python imports 40 | test_import("os") 41 | test_import("sys") 42 | 43 | # Test ML/NLP packages 44 | test_import("torch") 45 | test_import("numpy") 46 | test_import("transformers") 47 | test_import("sentence_transformers") 48 | 49 | # Test FastAPI and web packages 50 | test_import("fastapi") 51 | test_import("starlette") 52 | test_import("pydantic") 53 | 54 | # Test database packages 55 | test_import("qdrant_client") 56 | 57 | # Test project specific modules 58 | test_import("src.mcp_codebase_insight.core.config") 59 | test_import("src.mcp_codebase_insight.core.embeddings") 60 | test_import("src.mcp_codebase_insight.core.vector_store") 61 | 62 | print("\n=== Testing Complete ===") 63 | 64 | if __name__ == "__main__": 65 | main() 66 | ``` 
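Note that `test_imports.py` above prints each result but discards the return value of `test_import`, so the script exits with status 0 even when imports fail. One way to make it usable in CI is sketched below; `find_failed_imports` and `exit_code_for` are hypothetical names, not functions from the repository.

```python
import importlib


def find_failed_imports(module_names):
    """Try to import each module; return the names that raised ImportError."""
    failed = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:  # also covers ModuleNotFoundError
            failed.append(name)
    return failed


def exit_code_for(module_names) -> int:
    """Return 0 when every module imports and 1 otherwise (suitable for sys.exit)."""
    return 1 if find_failed_imports(module_names) else 0
```

In `main()`, the list of module names could be passed to `exit_code_for` and the result handed to `sys.exit`, so a missing dependency fails the run instead of only printing a warning.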
-------------------------------------------------------------------------------- /scripts/setup_qdrant.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | 3 | # Script to set up Qdrant for MCP Codebase Insight 4 | set -e 5 | 6 | # Colors for output 7 | GREEN='\033[0;32m' 8 | RED='\033[0;31m' 9 | NC='\033[0m' # No Color 10 | 11 | echo "Setting up Qdrant for MCP Codebase Insight..." 12 | 13 | # Check if Docker is running 14 | if ! docker info > /dev/null 2>&1; then 15 | echo -e "${RED}Error: Docker is not running${NC}" 16 | exit 1 17 | fi 18 | 19 | # Check if port 6333 is available 20 | if lsof -Pi :6333 -sTCP:LISTEN -t >/dev/null ; then 21 | echo -e "${RED}Warning: Port 6333 is already in use${NC}" 22 | echo "Checking if it's a Qdrant instance..." 23 | if curl -s http://localhost:6333/health > /dev/null; then 24 | echo -e "${GREEN}Existing Qdrant instance detected and healthy${NC}" 25 | exit 0 26 | else 27 | echo -e "${RED}Port 6333 is in use by another service${NC}" 28 | exit 1 29 | fi 30 | fi 31 | 32 | # Create data directory if it doesn't exist 33 | mkdir -p ./qdrant_data 34 | 35 | # Stop and remove existing container if it exists 36 | if docker ps -a | grep -q mcp-qdrant; then 37 | echo "Removing existing mcp-qdrant container..." 38 | docker stop mcp-qdrant || true 39 | docker rm mcp-qdrant || true 40 | fi 41 | 42 | # Pull latest Qdrant image 43 | echo "Pulling latest Qdrant image..." 44 | docker pull qdrant/qdrant:latest 45 | 46 | # Start Qdrant container 47 | echo "Starting Qdrant container..." 48 | docker run -d \ 49 | --name mcp-qdrant \ 50 | -p 6333:6333 \ 51 | -v "$(pwd)/qdrant_data:/qdrant/storage" \ 52 | qdrant/qdrant 53 | 54 | # Wait for Qdrant to be ready 55 | echo "Waiting for Qdrant to be ready..." 56 | for i in {1..30}; do 57 | if curl -s http://localhost:6333/health > /dev/null; then 58 | echo -e "${GREEN}Qdrant is ready!${NC}" 59 | exit 0 60 | fi 61 | echo "Waiting... 
($i/30)" 62 | sleep 1 63 | done 64 | 65 | echo -e "${RED}Error: Qdrant failed to start within 30 seconds${NC}" 66 | exit 1 67 | ``` -------------------------------------------------------------------------------- /scripts/check_qdrant_health.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | set -euo pipefail 3 | # Script to check if Qdrant service is available and healthy 4 | # Usage: ./check_qdrant_health.sh [qdrant_url] [max_retries] [sleep_seconds] 5 | 6 | # Default values 7 | QDRANT_URL=${1:-"http://localhost:6333"} 8 | MAX_RETRIES=${2:-20} 9 | SLEEP_SECONDS=${3:-5} 10 | 11 | echo "Checking Qdrant health at $QDRANT_URL (max $MAX_RETRIES attempts with $SLEEP_SECONDS seconds delay)" 12 | 13 | # Install dependencies if not present 14 | if ! command -v curl &> /dev/null || ! command -v jq &> /dev/null; then 15 | echo "Installing required dependencies..." 16 | apt-get update &> /dev/null && apt-get install -y curl jq &> /dev/null || true 17 | fi 18 | 19 | # Check if dependencies are available 20 | if ! command -v curl &> /dev/null; then 21 | echo "Error: curl command not found and could not be installed" 22 | exit 1 23 | fi 24 | 25 | if ! command -v jq &> /dev/null; then 26 | echo "Warning: jq command not found and could not be installed. JSON validation will be skipped." 27 | JQ_AVAILABLE=false 28 | else 29 | JQ_AVAILABLE=true 30 | fi 31 | 32 | # Wait for Qdrant to be available 33 | retry_count=0 34 | until [ "$(curl -s -o /dev/null -w "%{http_code}" "$QDRANT_URL/collections")" -eq 200 ] || [ "$retry_count" -eq "$MAX_RETRIES" ] 35 | do 36 | echo "Waiting for Qdrant... 
(attempt $retry_count of $MAX_RETRIES)" 37 | sleep "$SLEEP_SECONDS" 38 | retry_count=$((retry_count+1)) 39 | done 40 | 41 | if [ "$retry_count" -eq "$MAX_RETRIES" ]; then 42 | echo "Qdrant service failed to become available after $((MAX_RETRIES * SLEEP_SECONDS)) seconds" 43 | exit 1 44 | fi 45 | 46 | # Check for valid JSON response if jq is available 47 | if [ "$JQ_AVAILABLE" = true ]; then 48 | if ! curl -s "$QDRANT_URL/collections" | jq . > /dev/null; then 49 | echo "Qdrant did not return valid JSON." 50 | exit 1 51 | fi 52 | fi 53 | 54 | echo "Qdrant service is accessible and healthy." 55 | exit 0 ``` -------------------------------------------------------------------------------- /docs/qdrant_setup.md: -------------------------------------------------------------------------------- ```markdown 1 | # Qdrant Setup Guide 2 | 3 | ## Overview 4 | This document outlines the setup and maintenance procedures for the Qdrant vector database instance required for running tests and development. 5 | 6 | ## Prerequisites 7 | - Docker installed and running 8 | - Port 6333 available on localhost 9 | - Python 3.8+ with pip 10 | 11 | ## Setup Options 12 | 13 | ### Option 1: Docker Container (Recommended for Development) 14 | ```bash 15 | # Pull the latest Qdrant image 16 | docker pull qdrant/qdrant:latest 17 | 18 | # Run Qdrant container 19 | docker run -d \ 20 | --name mcp-qdrant \ 21 | -p 6333:6333 \ 22 | -v $(pwd)/qdrant_data:/qdrant/storage \ 23 | qdrant/qdrant 24 | 25 | # Verify the instance is running 26 | curl http://localhost:6333/health 27 | ``` 28 | 29 | ### Option 2: Pre-existing Instance 30 | If using a pre-existing Qdrant instance: 31 | 1. Ensure it's accessible at `localhost:6333` 32 | 2. Verify health status 33 | 3. 
Configure environment variables if needed: 34 | ```bash 35 | export QDRANT_HOST=localhost 36 | export QDRANT_PORT=6333 37 | ``` 38 | 39 | ## Health Check 40 | ```python 41 | from qdrant_client import QdrantClient 42 | 43 | client = QdrantClient(host="localhost", port=6333) 44 | collections = client.get_collections() # raises if the server is unreachable 45 | print(f"Qdrant is reachable; {len(collections.collections)} collection(s) found") 46 | ``` 47 | 48 | ## Maintenance 49 | - Regular health checks are automated in the CI/CD pipeline 50 | - Qdrant data is persisted in `./qdrant_data` (the mounted storage volume) 51 | - Version updates should be coordinated with the team 52 | 53 | ## Troubleshooting 54 | 1. If the container fails to start: 55 | ```bash 56 | # Check logs 57 | docker logs mcp-qdrant 58 | 59 | # Verify port availability 60 | lsof -i :6333 61 | ``` 62 | 63 | 2. If the connection fails: 64 | ```bash 65 | # Restart container 66 | docker restart mcp-qdrant 67 | 68 | # Check container status 69 | docker ps -a | grep mcp-qdrant 70 | ``` 71 | 72 | ## Responsible Parties 73 | - Primary maintainer: DevOps Team 74 | - Documentation updates: Development Team Lead 75 | - Testing coordination: QA Team Lead 76 | 77 | ## Version Control 78 | - Document version: 1.0 79 | - Last updated: 2025-03-24 80 | - Next review: 2025-06-24 81 | ``` -------------------------------------------------------------------------------- /setup_qdrant_collection.py: -------------------------------------------------------------------------------- ```python 1 | import os 2 | from qdrant_client import QdrantClient 3 | from qdrant_client.http.models import Distance, VectorParams 4 | 5 | def setup_collection(): 6 | # Connect to Qdrant; credentials come from the environment, never from source (QDRANT_API_KEY is an assumed variable name) 7 | client = QdrantClient( 8 | url=os.environ["QDRANT_URL"], 9 | api_key=os.environ.get("QDRANT_API_KEY") 10 | ) 11 | 12 | collection_name = "mcp-codebase-insight" 13 | 14 | try: 15 | # Check if collection exists 
16 | collections = client.get_collections().collections 17 | exists = any(c.name == collection_name for c in collections) 18 | 19 | # If collection exists, recreate it 20 | if exists: 21 | print(f"\nRemoving existing collection '{collection_name}'") 22 | client.delete_collection(collection_name=collection_name) 23 | 24 | # Create a new collection with named vector configurations 25 | print(f"\nCreating collection '{collection_name}' with named vectors") 26 | 27 | # Create named vectors configuration 28 | vectors_config = { 29 | # For the default MCP server embedding model (all-MiniLM-L6-v2) 30 | "fast-all-minilm-l6-v2": VectorParams( 31 | size=384, # all-MiniLM-L6-v2 produces 384-dimensional vectors 32 | distance=Distance.COSINE 33 | ) 34 | } 35 | 36 | client.create_collection( 37 | collection_name=collection_name, 38 | vectors_config=vectors_config 39 | ) 40 | 41 | # Verify the collection was created properly 42 | collection_info = client.get_collection(collection_name=collection_name) 43 | print(f"\nCollection '{collection_name}' created successfully") 44 | print(f"Vector configuration: {collection_info.config.params.vectors}") 45 | 46 | print("\nCollection is ready for the MCP server") 47 | 48 | except Exception as e: 49 | print(f"\nError setting up collection: {e}") 50 | 51 | if __name__ == '__main__': 52 | setup_collection() ``` -------------------------------------------------------------------------------- /docs/vector_store_best_practices.md: -------------------------------------------------------------------------------- ```markdown 1 | # VectorStore Best Practices 2 | 3 | This document outlines best practices for working with the VectorStore component in the MCP Codebase Insight project. 
4 | 5 | ## Metadata Structure 6 | 7 | To ensure consistency and prevent `KeyError` exceptions, always follow these metadata structure guidelines: 8 | 9 | ### Required Fields 10 | 11 | Always include these fields in your metadata when adding vectors: 12 | 13 | - `type`: The type of content (e.g., "code", "documentation", "pattern") 14 | - `language`: Programming language if applicable (e.g., "python", "javascript") 15 | - `title`: Short descriptive title 16 | - `description`: Longer description of the content 17 | 18 | ### Accessing Metadata 19 | 20 | Always use the `.get()` method with a default value when accessing metadata fields: 21 | 22 | ```python 23 | # Good - safe access pattern 24 | result.metadata.get("type", "code") 25 | 26 | # Bad - can cause KeyError 27 | result.metadata["type"] 28 | ``` 29 | 30 | ## Initialization and Cleanup 31 | 32 | Follow these best practices for proper initialization and cleanup: 33 | 34 | 1. Always `await vector_store.initialize()` before using a VectorStore 35 | 2. Always `await vector_store.cleanup()` in test teardown/finally blocks 36 | 3. Use unique collection names in tests to prevent conflicts 37 | 4. Check `vector_store.initialized` status before operations 38 | 39 | Example: 40 | 41 | ```python 42 | store = VectorStore(url, embedder, collection_name=unique_name) # construct outside try so `store` is always bound in finally 43 | try: 44 | await store.initialize() 45 | # Use the store... 46 | finally: 47 | await store.cleanup() 48 | await store.close() 49 | ``` 50 | 51 | ## Vector Names and Dimensions 52 | 53 | - Use consistent vector dimensions (384 for all-MiniLM-L6-v2) 54 | - Be careful when overriding the `vector_name` parameter; it must match a vector name defined on the collection 55 | - Ensure the embedder and the vector store are compatible (same model and dimension) 56 | 57 | ## Error Handling 58 | 59 | - Check for component availability before use 60 | - Handle initialization errors gracefully 61 | - Log failures with meaningful messages 62 | 63 | ## Testing Guidelines 64 | 65 | 1. Use isolated test collections with unique names 66 | 2.
Clean up all test data after tests 67 | 3. Verify metadata structure in tests 68 | 4. Use standardized test data fixtures 69 | 5. Test both positive and negative paths 70 | 71 | By following these guidelines, you can avoid common issues like the "KeyError: 'type'" problem that previously occurred in the codebase. ``` -------------------------------------------------------------------------------- /scripts/macos_install.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | 3 | # Exit on error 4 | set -e 5 | 6 | echo "Installing MCP Codebase Insight development environment..." 7 | 8 | # Check for Homebrew 9 | if ! command -v brew &> /dev/null; then 10 | echo "Installing Homebrew..." 11 | /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)" 12 | else 13 | echo "Homebrew already installed, updating..." 14 | brew update 15 | fi 16 | 17 | # Check for Python 3.11 (the interpreter used for the virtual environment below) 18 | if ! command -v python3.11 &> /dev/null; then 19 | echo "Installing Python..." 20 | brew install python@3.11 21 | else 22 | echo "Python already installed" 23 | fi 24 | 25 | # Check for Docker 26 | if ! command -v docker &> /dev/null; then 27 | echo "Installing Docker..." 28 | brew install --cask docker 29 | 30 | echo "Starting Docker..." 31 | open -a Docker 32 | 33 | # Wait for Docker to start 34 | echo "Waiting for Docker to start..." 35 | while ! docker info &> /dev/null; do 36 | sleep 1 37 | done 38 | else 39 | echo "Docker already installed" 40 | fi 41 | 42 | # Create virtual environment 43 | echo "Creating virtual environment..." 44 | python3.11 -m venv .venv 45 | 46 | # Activate virtual environment 47 | echo "Activating virtual environment..." 48 | source .venv/bin/activate 49 | 50 | # Install dependencies 51 | echo "Installing Python dependencies..." 52 | pip install --upgrade pip 53 | pip install -r requirements.txt 54 | 55 | # Start Qdrant 56 | echo "Starting Qdrant container..." 57 | if !
docker ps | grep -q qdrant; then 58 | docker run -d -p 6333:6333 -p 6334:6334 \ 59 | -v $(pwd)/qdrant_storage:/qdrant/storage \ 60 | qdrant/qdrant 61 | echo "Qdrant container started" 62 | else 63 | echo "Qdrant container already running" 64 | fi 65 | 66 | # Create required directories 67 | echo "Creating project directories..." 68 | mkdir -p docs/adrs 69 | mkdir -p docs/templates 70 | mkdir -p knowledge/patterns 71 | mkdir -p references 72 | mkdir -p logs/debug 73 | 74 | # Copy environment file if it doesn't exist 75 | if [ ! -f .env ]; then 76 | echo "Creating .env file..." 77 | cp .env.example .env 78 | echo "Please update .env with your settings" 79 | fi 80 | 81 | # Load example patterns 82 | echo "Loading example patterns..." 83 | python scripts/load_example_patterns.py 84 | 85 | echo " 86 | Installation complete! 🎉 87 | 88 | To start development: 89 | 1. Update .env with your settings 90 | 2. Activate the virtual environment: 91 | source .venv/bin/activate 92 | 3. Start the server: 93 | make run 94 | 95 | For more information, see the README.md file. 96 | " 97 | ``` -------------------------------------------------------------------------------- /start-mcpserver.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | # This script starts the MCP Qdrant server with SSE transport 3 | set -x 4 | source .venv/bin/activate 5 | # Set the PATH to include the local bin directory 6 | export PATH="$HOME/.local/bin:$PATH" 7 | 8 | # Define environment variables 9 | export COLLECTION_NAME="mcp-codebase-insight" 10 | export EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2" 11 | export QDRANT_URL="${QDRANT_URL:-http://localhost:6333}" 12 | export QDRANT_API_KEY="eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhY2Nlc3MiOiJtIiwiZXhwIjoxNzQ1MTAyNzQ3fQ.3gvK8M7dJxZkSpyzpJtTGVUhjyjgbYEhEvl2aG7JodM" 13 | 14 | # Define tool descriptions 15 | TOOL_STORE_DESCRIPTION="Store reusable code snippets and test results. 
'information' contains a description. 'metadata' is a dictionary with a 'type' key: 'code' for code snippets, 'test_result' for test results. For 'code', 'metadata' includes a 'code' key with the code. For 'test_result', 'metadata' includes 'test_name', 'status' (pass/fail), and 'error_message'." 16 | 17 | TOOL_FIND_DESCRIPTION="Search for code snippets and test results. The 'query' parameter describes what you're looking for. Returned results will have a 'metadata' field with a 'type' key indicating 'code' or 'test_result'. Use this to find code or analyze test failures." 18 | 19 | # Default port for the SSE transport (can be overridden with PORT env var) 20 | PORT="${PORT:-8000}" 21 | 22 | # Determine transport type (default to sse if not specified) 23 | TRANSPORT="${TRANSPORT:-sse}" 24 | 25 | # Check if uvx and mcp-server-qdrant are installed 26 | if ! command -v uvx &> /dev/null; then 27 | echo "Error: uvx is not installed. Please install it with: pip install uvx" 28 | exit 1 29 | fi 30 | 31 | if ! python -c "import importlib.util; print(importlib.util.find_spec('mcp_server_qdrant') is not None)" | grep -q "True"; then 32 | echo "Error: mcp-server-qdrant is not installed. Please install it with: pip install mcp-server-qdrant" 33 | exit 1 34 | fi 35 | 36 | echo "Starting MCP Qdrant server with $TRANSPORT transport on port $PORT..." 
37 | 38 | # Run the MCP Qdrant server with the specified transport 39 | if [ "$TRANSPORT" = "sse" ]; then 40 | # For SSE transport, we need to specify the port 41 | uvx mcp-server-qdrant --transport sse --port $PORT 42 | else 43 | # For other transports (e.g., stdio which is the default) 44 | uvx mcp-server-qdrant 45 | fi 46 | ``` -------------------------------------------------------------------------------- /docs/testing_guide.md: -------------------------------------------------------------------------------- ```markdown 1 | # Testing Guide for MCP Codebase Insight 2 | 3 | ## Asynchronous Testing 4 | 5 | The MCP Codebase Insight project uses asynchronous APIs and should be tested using proper async test clients. Here are guidelines for testing: 6 | 7 | ### Async vs Sync Testing Clients 8 | 9 | The project provides two test client fixtures: 10 | 11 | 1. **`test_client`** - Use for asynchronous tests 12 | - Returns an `AsyncClient` from httpx 13 | - Must be used with `await` for requests 14 | - Must be used with `@pytest.mark.asyncio` decorator 15 | 16 | 2. **`sync_test_client`** - Use for synchronous tests 17 | - Returns a `TestClient` from FastAPI 18 | - Used for simpler tests where async is not needed 19 | - No need for await or asyncio decorators 20 | 21 | ### Example: Async Test 22 | 23 | ```python 24 | import pytest 25 | 26 | @pytest.mark.asyncio 27 | async def test_my_endpoint(test_client): 28 | """Test an endpoint asynchronously.""" 29 | response = await test_client.get("/my-endpoint") 30 | assert response.status_code == 200 31 | data = response.json() 32 | assert "result" in data 33 | ``` 34 | 35 | ### Example: Sync Test 36 | 37 | ```python 38 | def test_simple_endpoint(sync_test_client): 39 | """Test an endpoint synchronously.""" 40 | response = sync_test_client.get("/simple-endpoint") 41 | assert response.status_code == 200 42 | ``` 43 | 44 | ### Common Issues 45 | 46 | 1. 
**Using TestClient with async:** The error `'TestClient' object does not support the asynchronous context manager protocol` occurs when trying to use TestClient in an async context. Always use the `test_client` fixture for async tests. 47 | 48 | 2. **Mixing async/sync:** Don't mix async and sync patterns in the same test. 49 | 50 | 3. **Missing asyncio mark:** Always add `@pytest.mark.asyncio` to async test functions. 51 | 52 | ## Test Isolation 53 | 54 | Tests should be isolated to prevent state interference between tests: 55 | 56 | 1. Each test gets its own server instance with isolated state 57 | 2. Vector store tests use unique collection names 58 | 3. Cleanup is performed automatically after tests 59 | 60 | ## Running Tests 61 | 62 | Run tests using pytest: 63 | 64 | ```bash 65 | # Run all tests 66 | pytest 67 | 68 | # Run specific test file 69 | pytest tests/test_file_relationships.py 70 | 71 | # Run specific test function 72 | pytest tests/test_file_relationships.py::test_create_file_relationship 73 | ``` 74 | 75 | For more advanced test running options, use the `run_tests.py` script in the project root. ``` -------------------------------------------------------------------------------- /.compile-venv-py3.11/bin/activate.fish: -------------------------------------------------------------------------------- ``` 1 | # This file must be used with "source <venv>/bin/activate.fish" *from fish* 2 | # (https://fishshell.com/); you cannot run it directly. 
3 | 4 | function deactivate -d "Exit virtual environment and return to normal shell environment" 5 | # reset old environment variables 6 | if test -n "$_OLD_VIRTUAL_PATH" 7 | set -gx PATH $_OLD_VIRTUAL_PATH 8 | set -e _OLD_VIRTUAL_PATH 9 | end 10 | if test -n "$_OLD_VIRTUAL_PYTHONHOME" 11 | set -gx PYTHONHOME $_OLD_VIRTUAL_PYTHONHOME 12 | set -e _OLD_VIRTUAL_PYTHONHOME 13 | end 14 | 15 | if test -n "$_OLD_FISH_PROMPT_OVERRIDE" 16 | set -e _OLD_FISH_PROMPT_OVERRIDE 17 | # prevents error when using nested fish instances (Issue #93858) 18 | if functions -q _old_fish_prompt 19 | functions -e fish_prompt 20 | functions -c _old_fish_prompt fish_prompt 21 | functions -e _old_fish_prompt 22 | end 23 | end 24 | 25 | set -e VIRTUAL_ENV 26 | set -e VIRTUAL_ENV_PROMPT 27 | if test "$argv[1]" != "nondestructive" 28 | # Self-destruct! 29 | functions -e deactivate 30 | end 31 | end 32 | 33 | # Unset irrelevant variables. 34 | deactivate nondestructive 35 | 36 | set -gx VIRTUAL_ENV /Users/tosinakinosho/workspaces/mcp-codebase-insight/.compile-venv-py3.11 37 | 38 | set -gx _OLD_VIRTUAL_PATH $PATH 39 | set -gx PATH "$VIRTUAL_ENV/"bin $PATH 40 | 41 | # Unset PYTHONHOME if set. 42 | if set -q PYTHONHOME 43 | set -gx _OLD_VIRTUAL_PYTHONHOME $PYTHONHOME 44 | set -e PYTHONHOME 45 | end 46 | 47 | if test -z "$VIRTUAL_ENV_DISABLE_PROMPT" 48 | # fish uses a function instead of an env var to generate the prompt. 49 | 50 | # Save the current fish_prompt function as the function _old_fish_prompt. 51 | functions -c fish_prompt _old_fish_prompt 52 | 53 | # With the original prompt function renamed, we can override with our own. 54 | function fish_prompt 55 | # Save the return status of the last command. 56 | set -l old_status $status 57 | 58 | # Output the venv prompt; color taken from the blue of the Python logo. 59 | printf "%s%s%s" (set_color 4B8BBE) '(.compile-venv-py3.11) ' (set_color normal) 60 | 61 | # Restore the return status of the previous command. 62 | echo "exit $old_status" | . 
63 | # Output the original/"old" prompt. 64 | _old_fish_prompt 65 | end 66 | 67 | set -gx _OLD_FISH_PROMPT_OVERRIDE "$VIRTUAL_ENV" 68 | set -gx VIRTUAL_ENV_PROMPT '(.compile-venv-py3.11) ' 69 | end 70 | ``` -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- ```python 1 | from setuptools import setup, find_packages 2 | import re 3 | import os 4 | 5 | # Read version from __init__.py 6 | with open(os.path.join("src", "mcp_codebase_insight", "__init__.py"), "r") as f: 7 | version_match = re.search(r"^__version__ = ['\"]([^'\"]*)['\"]", f.read(), re.M) 8 | if version_match: 9 | version = version_match.group(1) 10 | else: 11 | raise RuntimeError("Unable to find version string") 12 | 13 | setup( 14 | name="mcp-codebase-insight", 15 | version=version, 16 | description="Model Context Protocol (MCP) server for codebase analysis and insights", 17 | long_description=open("README.md").read(), 18 | long_description_content_type="text/markdown", 19 | author="Model Context Protocol", 20 | author_email="[email protected]", 21 | url="https://github.com/modelcontextprotocol/mcp-codebase-insight", 22 | packages=find_packages(where="src"), 23 | package_dir={"": "src"}, 24 | install_requires=[ 25 | "fastapi>=0.103.2,<0.104.0", 26 | "uvicorn>=0.23.2,<0.24.0", 27 | "pydantic>=2.4.2,<3.0.0", 28 | "starlette>=0.27.0,<0.28.0", 29 | "asyncio>=3.4.3", 30 | "aiohttp>=3.9.0,<4.0.0", 31 | "qdrant-client>=1.13.3", 32 | "sentence-transformers>=2.2.2", 33 | "torch>=2.0.0", 34 | "transformers>=4.34.0,<5.0.0", 35 | "python-frontmatter>=1.0.0", 36 | "markdown>=3.4.4", 37 | "PyYAML>=6.0.1", 38 | "structlog>=23.1.0", 39 | "psutil>=5.9.5", 40 | "python-dotenv>=1.0.0", 41 | "requests>=2.31.0", 42 | "beautifulsoup4>=4.12.0", 43 | "scipy>=1.11.0", 44 | "numpy>=1.24.0", 45 | "python-slugify>=8.0.0", 46 | "slugify>=0.0.1", 47 | # Temporarily commented out for development installation 48 | # 
"uvx>=0.4.0", 49 | "mcp-server-qdrant>=0.2.0", 50 | "mcp==1.5.0", 51 | ], 52 | python_requires=">=3.9", 53 | classifiers=[ 54 | "Development Status :: 3 - Alpha", 55 | "Intended Audience :: Developers", 56 | "License :: OSI Approved :: MIT License", 57 | "Programming Language :: Python :: 3.9", 58 | "Programming Language :: Python :: 3.10", 59 | "Programming Language :: Python :: 3.11", 60 | "Topic :: Software Development :: Libraries :: Python Modules", 61 | ], 62 | entry_points={ 63 | "console_scripts": [ 64 | "mcp-codebase-insight=mcp_codebase_insight.server:run", 65 | ], 66 | }, 67 | ) ``` -------------------------------------------------------------------------------- /scripts/start_mcp_server.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | set -e 3 | 4 | # Function to log messages 5 | log() { 6 | echo "[$(date +'%Y-%m-%d %H:%M:%S')] $1" 7 | } 8 | 9 | # Function to check if Qdrant is available 10 | check_qdrant() { 11 | local url="${QDRANT_URL:-http://localhost:6333}" 12 | local max_attempts=30 13 | local attempt=1 14 | 15 | log "Checking Qdrant connection at $url" 16 | 17 | while [ $attempt -le $max_attempts ]; do 18 | if curl -s -f "$url/health" > /dev/null 2>&1; then 19 | log "Qdrant is available" 20 | return 0 21 | fi 22 | 23 | log "Waiting for Qdrant (attempt $attempt/$max_attempts)..." 24 | sleep 2 25 | attempt=$((attempt + 1)) 26 | done 27 | 28 | log "Error: Could not connect to Qdrant" 29 | return 1 30 | } 31 | 32 | # Function to check Python environment 33 | check_python() { 34 | if ! command -v python3 &> /dev/null; then 35 | log "Error: Python 3 is not installed" 36 | exit 1 37 | fi 38 | 39 | if ! 
python3 -c "import pkg_resources; pkg_resources.require('fastapi>=0.103.2')" &> /dev/null; then 40 | log "Error: Required Python packages are not installed" 41 | exit 1 42 | fi 43 | } 44 | 45 | # Function to setup environment 46 | setup_env() { 47 | # Create required directories if they don't exist 48 | mkdir -p docs/adrs knowledge cache logs 49 | 50 | # Copy example env file if .env doesn't exist 51 | if [ ! -f .env ] && [ -f .env.example ]; then 52 | cp .env.example .env 53 | log "Created .env from example" 54 | fi 55 | 56 | # Set default environment variables if not set 57 | export MCP_HOST=${MCP_HOST:-0.0.0.0} 58 | export MCP_PORT=${MCP_PORT:-3000} 59 | export MCP_LOG_LEVEL=${MCP_LOG_LEVEL:-INFO} 60 | 61 | log "Environment setup complete" 62 | } 63 | 64 | # Main startup sequence 65 | main() { 66 | log "Starting MCP Codebase Insight Server" 67 | 68 | # Perform checks 69 | check_python 70 | setup_env 71 | check_qdrant 72 | 73 | # Parse command line arguments (defaults come from the environment set up in setup_env) 74 | local host="$MCP_HOST" 75 | local port="$MCP_PORT" 76 | 77 | while [[ $# -gt 0 ]]; do 78 | case $1 in 79 | --host) 80 | host="$2" 81 | shift 2 82 | ;; 83 | --port) 84 | port="$2" 85 | shift 2 86 | ;; 87 | *) 88 | log "Unknown option: $1" 89 | exit 1 90 | ;; 91 | esac 92 | done 93 | 94 | # Start server; the Python entry point reads MCP_HOST and MCP_PORT from the environment 95 | log "Starting server on $host:$port" 96 | exec env MCP_HOST="$host" MCP_PORT="$port" python3 -m mcp_codebase_insight 97 | } 98 | 99 | # Run main function with all arguments 100 | main "$@" 101 | ``` -------------------------------------------------------------------------------- /src/mcp_codebase_insight/__main__.py: -------------------------------------------------------------------------------- ```python 1 | """Main entry point for MCP server.""" 2 | 3 | import os 4 | from pathlib import Path 5 | import sys 6 | import logging 7 | 8 | import uvicorn 9 | from dotenv import load_dotenv 10 | 11 | from .core.config import ServerConfig 12 | from .server import create_app 13 | from .utils.logger import get_logger 14 | 15 | # Configure logging 16 |
logger = get_logger(__name__) 17 | 18 | def get_config() -> ServerConfig: 19 | """Get server configuration.""" 20 | try: 21 | # Load environment variables 22 | load_dotenv() 23 | 24 | config = ServerConfig( 25 | host=os.getenv("MCP_HOST", "127.0.0.1"), 26 | port=int(os.getenv("MCP_PORT", "3000")), 27 | log_level=os.getenv("MCP_LOG_LEVEL", "INFO"), 28 | qdrant_url=os.getenv("QDRANT_URL", "http://localhost:6333"), 29 | docs_cache_dir=Path(os.getenv("MCP_DOCS_CACHE_DIR", "docs")), 30 | adr_dir=Path(os.getenv("MCP_ADR_DIR", "docs/adrs")), 31 | kb_storage_dir=Path(os.getenv("MCP_KB_STORAGE_DIR", "knowledge")), 32 | embedding_model=os.getenv("MCP_EMBEDDING_MODEL", "all-MiniLM-L6-v2"), 33 | collection_name=os.getenv("MCP_COLLECTION_NAME", "codebase_patterns"), 34 | debug_mode=os.getenv("MCP_DEBUG", "false").lower() == "true", 35 | metrics_enabled=os.getenv("MCP_METRICS_ENABLED", "true").lower() == "true", 36 | cache_enabled=os.getenv("MCP_CACHE_ENABLED", "true").lower() == "true", 37 | memory_cache_size=int(os.getenv("MCP_MEMORY_CACHE_SIZE", "1000")), 38 | disk_cache_dir=Path(os.getenv("MCP_DISK_CACHE_DIR", "cache")) if os.getenv("MCP_DISK_CACHE_DIR") else None 39 | ) 40 | 41 | logger.info("Configuration loaded successfully") 42 | return config 43 | 44 | except Exception as e: 45 | logger.error(f"Failed to load configuration: {e}", exc_info=True) 46 | raise 47 | 48 | def main(): 49 | """Run the server.""" 50 | try: 51 | # Get configuration 52 | config = get_config() 53 | 54 | # Create FastAPI app 55 | app = create_app(config) 56 | 57 | # Log startup message 58 | logger.info( 59 | f"Starting MCP Codebase Insight Server on {config.host}:{config.port} " 60 | f"(log level: {config.log_level}, debug mode: {config.debug_mode})" 61 | ) 62 | 63 | # Run using Uvicorn directly 64 | uvicorn.run( 65 | app=app, 66 | host=config.host, 67 | port=config.port, 68 | log_level=config.log_level.lower(), 69 | loop="auto", 70 | lifespan="on", 71 | workers=1 72 | ) 73 | 74 | except Exception as 
e: 75 | logger.error(f"Server error: {e}", exc_info=True) 76 | sys.exit(1) 77 | 78 | if __name__ == "__main__": 79 | # Run main directly without asyncio.run() 80 | main() 81 | ``` -------------------------------------------------------------------------------- /scripts/validate_knowledge_base.py: -------------------------------------------------------------------------------- ```python 1 | #!/usr/bin/env python3 2 | """ 3 | Knowledge Base Validation Script 4 | Tests knowledge base operations using Firecrawl MCP. 5 | """ 6 | 7 | import asyncio 8 | import logging 9 | from mcp_firecrawl import ( 10 | test_knowledge_operations, 11 | validate_entity_relations, 12 | verify_query_results 13 | ) 14 | 15 | logging.basicConfig(level=logging.INFO) 16 | logger = logging.getLogger(__name__) 17 | 18 | async def validate_knowledge_base(config: dict) -> bool: 19 | """Validate knowledge base operations.""" 20 | logger.info("Testing knowledge base operations...") 21 | 22 | # Test basic knowledge operations 23 | ops_result = await test_knowledge_operations({ 24 | "url": "http://localhost:8001", 25 | "auth_token": config["API_KEY"], 26 | "test_entities": [ 27 | {"name": "TestClass", "type": "class"}, 28 | {"name": "test_method", "type": "method"}, 29 | {"name": "test_variable", "type": "variable"} 30 | ], 31 | "verify_persistence": True 32 | }) 33 | 34 | # Validate entity relations 35 | relations_result = await validate_entity_relations({ 36 | "url": "http://localhost:8001", 37 | "auth_token": config["API_KEY"], 38 | "test_relations": [ 39 | {"from": "TestClass", "to": "test_method", "type": "contains"}, 40 | {"from": "test_method", "to": "test_variable", "type": "uses"} 41 | ], 42 | "verify_bidirectional": True 43 | }) 44 | 45 | # Verify query functionality 46 | query_result = await verify_query_results({ 47 | "url": "http://localhost:8001", 48 | "auth_token": config["API_KEY"], 49 | "test_queries": [ 50 | "find classes that use test_variable", 51 | "find methods in TestClass", 52 | 
"find variables used by test_method" 53 | ], 54 | "expected_matches": { 55 | "classes": ["TestClass"], 56 | "methods": ["test_method"], 57 | "variables": ["test_variable"] 58 | } 59 | }) 60 | 61 | all_passed = all([ 62 | ops_result.success, 63 | relations_result.success, 64 | query_result.success 65 | ]) 66 | 67 | if all_passed: 68 | logger.info("Knowledge base validation successful") 69 | else: 70 | logger.error("Knowledge base validation failed") 71 | if not ops_result.success: 72 | logger.error("Knowledge operations failed") 73 | if not relations_result.success: 74 | logger.error("Entity relations validation failed") 75 | if not query_result.success: 76 | logger.error("Query validation failed") 77 | 78 | return all_passed 79 | 80 | if __name__ == "__main__": 81 | import sys 82 | from pathlib import Path 83 | sys.path.append(str(Path(__file__).parent.parent)) 84 | 85 | from scripts.config import load_config 86 | config = load_config() 87 | 88 | success = asyncio.run(validate_knowledge_base(config)) 89 | sys.exit(0 if success else 1) ``` -------------------------------------------------------------------------------- /test_fixes.md: -------------------------------------------------------------------------------- ```markdown 1 | # MCP Codebase Insight Test Fixes 2 | 3 | ## Identified Issues 4 | 5 | 1. **Package Import Problems** 6 | - The tests were trying to import from `mcp_codebase_insight` directly, but the package needed to be imported from `src.mcp_codebase_insight` 7 | - The Python path wasn't correctly set up to include the project root directory 8 | 9 | 2. **Missing Dependencies** 10 | - The `sentence-transformers` package was installed in the wrong Python environment (Python 3.13 instead of 3.11) 11 | - Had to explicitly install it in the correct environment 12 | 13 | 3. 
**Test Isolation Problems** 14 | - Tests were failing due to not being properly isolated 15 | - The `component_test_runner.py` script needed fixes to properly load test modules 16 | 17 | 4. **Qdrant Server Issue** 18 | - The `test_vector_store_cleanup` test failed due to permission issues in the Qdrant server 19 | - The server couldn't create a collection directory for the test 20 | 21 | ## Applied Fixes 22 | 23 | 1. **Fixed Import Paths** 24 | - Modified test files to use `from src.mcp_codebase_insight...` instead of `from mcp_codebase_insight...` 25 | - Added code to explicitly set `sys.path` to include the project root directory 26 | 27 | 2. **Fixed Dependency Issues** 28 | - Ran `python3.11 -m pip install sentence-transformers` to install the package in the correct environment 29 | - Verified all dependencies were properly installed 30 | 31 | 3. **Created a Test Runner Script** 32 | - Created `run_test_with_path_fix.sh` to set up the proper environment variables and paths 33 | - Modified `component_test_runner.py` to better handle module loading 34 | 35 | 4. **Fixed Test Module Loading** 36 | - Added a `load_test_module` function to properly handle import paths 37 | - Ensured the correct Python path is set before importing test modules 38 | 39 | ## Results 40 | 41 | - Successfully ran 2 out of 3 vector store tests: 42 | - ✅ `test_vector_store_initialization` 43 | - ✅ `test_vector_store_add_and_search` 44 | - ❌ `test_vector_store_cleanup` (still failing due to Qdrant server issue) 45 | 46 | ## Recommendations for Remaining Issue 47 | 48 | The `test_vector_store_cleanup` test is failing due to the Qdrant server not being able to create a directory for the collection. This could be fixed by: 49 | 50 | 1. Checking the Qdrant server configuration to ensure it has proper permissions to create directories 51 | 2. Creating the necessary directories beforehand 52 | 3. 
Modifying the test to use a collection name that already exists, or mocking the collection creation 53 | 54 | The error message points to a missing storage path (`os error 2` is "No such file or directory") rather than a permission denial: 55 | ``` 56 | "Can't create directory for collection cleanup_test_db679546. Error: No such file or directory (os error 2)" 57 | ``` 58 | 59 | A simpler fix for testing purposes might be to modify the Qdrant Docker run command to include a volume mount, so that the storage directory exists and is writable: 60 | 61 | ```bash 62 | docker run -d -p 6333:6333 -p 6334:6334 -v "$(pwd)/qdrant_data:/qdrant/storage" qdrant/qdrant 63 | ``` 64 | 65 | This would ensure the storage directory exists and has the right permissions. 66 | ``` -------------------------------------------------------------------------------- /src/mcp_codebase_insight/utils/logger.py: -------------------------------------------------------------------------------- ```python 1 | """Structured logging module.""" 2 | 3 | import logging 4 | import sys 5 | from typing import Any, Dict, Optional 6 | 7 | import structlog 8 | 9 | # Configure structlog 10 | structlog.configure( 11 | processors=[ 12 | structlog.stdlib.filter_by_level, 13 | structlog.stdlib.add_logger_name, 14 | structlog.stdlib.add_log_level, 15 | structlog.stdlib.PositionalArgumentsFormatter(), 16 | structlog.processors.TimeStamper(fmt="iso"), 17 | structlog.processors.StackInfoRenderer(), 18 | structlog.processors.format_exc_info, 19 | structlog.processors.UnicodeDecoder(), 20 | structlog.processors.JSONRenderer() 21 | ], 22 | context_class=dict, 23 | logger_factory=structlog.stdlib.LoggerFactory(), 24 | wrapper_class=structlog.stdlib.BoundLogger, 25 | cache_logger_on_first_use=True, 26 | ) 27 | 28 | class Logger: 29 | """Structured logger.""" 30 | 31 | def __init__( 32 | self, 33 | name: str, 34 | level: str = "INFO", 35 | extra: Optional[Dict[str, Any]] = None 36 | ): 37 | """Initialize logger.""" 38 | # Set log level 39 | log_level = getattr(logging, level.upper()) 40 | logging.basicConfig( 41 |
format="%(message)s", 42 | stream=sys.stdout, 43 | level=log_level, 44 | ) 45 | 46 | # Create logger 47 | self.logger = structlog.get_logger(name) 48 | self.extra = extra or {} 49 | 50 | def bind(self, **kwargs) -> "Logger": 51 | """Create new logger with additional context.""" 52 | extra = {**self.extra, **kwargs} 53 | return Logger( 54 | name=self.logger.name, 55 | level=logging.getLevelName(self.logger.level), 56 | extra=extra 57 | ) 58 | 59 | def debug(self, event: str, **kwargs): 60 | """Log debug message.""" 61 | self.logger.debug( 62 | event, 63 | **{**self.extra, **kwargs} 64 | ) 65 | 66 | def info(self, event: str, **kwargs): 67 | """Log info message.""" 68 | self.logger.info( 69 | event, 70 | **{**self.extra, **kwargs} 71 | ) 72 | 73 | def warning(self, event: str, **kwargs): 74 | """Log warning message.""" 75 | self.logger.warning( 76 | event, 77 | **{**self.extra, **kwargs} 78 | ) 79 | 80 | def error(self, event: str, **kwargs): 81 | """Log error message.""" 82 | self.logger.error( 83 | event, 84 | **{**self.extra, **kwargs} 85 | ) 86 | 87 | def exception(self, event: str, exc_info: bool = True, **kwargs): 88 | """Log exception message.""" 89 | self.logger.exception( 90 | event, 91 | exc_info=exc_info, 92 | **{**self.extra, **kwargs} 93 | ) 94 | 95 | def critical(self, event: str, **kwargs): 96 | """Log critical message.""" 97 | self.logger.critical( 98 | event, 99 | **{**self.extra, **kwargs} 100 | ) 101 | 102 | def get_logger( 103 | name: str, 104 | level: str = "INFO", 105 | extra: Optional[Dict[str, Any]] = None 106 | ) -> Logger: 107 | """Get logger instance.""" 108 | return Logger(name, level, extra) 109 | 110 | # Default logger 111 | logger = get_logger("mcp_codebase_insight") 112 | ``` -------------------------------------------------------------------------------- /scripts/validate_vector_store.py: -------------------------------------------------------------------------------- ```python 1 | #!/usr/bin/env python3 2 | """ 3 | Vector Store 
Validation Script 4 | Tests vector store operations using local codebase. 5 | """ 6 | 7 | import asyncio 8 | import logging 9 | from pathlib import Path 10 | import sys 11 | 12 | # Add the src directory to the Python path 13 | sys.path.append(str(Path(__file__).parent.parent / "src")) 14 | 15 | from mcp_codebase_insight.core.vector_store import VectorStore 16 | from mcp_codebase_insight.core.embeddings import SentenceTransformerEmbedding 17 | 18 | logging.basicConfig(level=logging.INFO) 19 | logger = logging.getLogger(__name__) 20 | 21 | async def validate_vector_store(config: dict) -> bool: 22 | """Validate vector store operations.""" 23 | logger.info("Testing vector store operations...") 24 | 25 | try: 26 | # Initialize embedder 27 | embedder = SentenceTransformerEmbedding( 28 | model_name="sentence-transformers/all-MiniLM-L6-v2" 29 | ) 30 | await embedder.initialize() 31 | logger.info("Embedder initialized successfully") 32 | 33 | # Initialize vector store 34 | vector_store = VectorStore( 35 | url=config.get("QDRANT_URL", "http://localhost:6333"), 36 | embedder=embedder, 37 | collection_name=config.get("COLLECTION_NAME", "mcp-codebase-insight"), 38 | api_key=config.get("QDRANT_API_KEY", ""), 39 | vector_name="default" 40 | ) 41 | await vector_store.initialize() 42 | logger.info("Vector store initialized successfully") 43 | 44 | # Test vector operations 45 | test_text = "def test_function():\n pass" 46 | embedding = await embedder.embed(test_text) 47 | 48 | # Store vector 49 | await vector_store.add_vector( 50 | text=test_text, 51 | metadata={"type": "code", "content": test_text} 52 | ) 53 | logger.info("Vector storage test passed") 54 | 55 | # Search for similar vectors 56 | logger.info("Searching for similar vectors") 57 | results = await vector_store.search_similar( 58 | query=test_text, 59 | limit=1 60 | ) 61 | 62 | if not results or len(results) == 0: 63 | logger.error("Vector search test failed: No results found") 64 | return False 65 | 66 | 
        logger.info("Vector search test passed")

        # Verify result metadata
        result = results[0]
        if not result.metadata or result.metadata.get("type") != "code":
            logger.error("Vector metadata test failed: Invalid metadata")
            return False

        logger.info("Vector metadata test passed")
        return True

    except Exception as e:
        logger.error(f"Vector store validation failed: {e}")
        return False

if __name__ == "__main__":
    # Load config from environment or .env file
    from dotenv import load_dotenv
    load_dotenv()

    import os
    config = {
        "QDRANT_URL": os.getenv("QDRANT_URL", "http://localhost:6333"),
        "COLLECTION_NAME": os.getenv("COLLECTION_NAME", "mcp-codebase-insight"),
        "QDRANT_API_KEY": os.getenv("QDRANT_API_KEY", "")
    }

    success = asyncio.run(validate_vector_store(config))
    sys.exit(0 if success else 1)
```

--------------------------------------------------------------------------------
/tests/components/conftest.py:
--------------------------------------------------------------------------------

```python
"""
Component Test Fixture Configuration.

This file defines fixtures specifically for component tests that might have different
scope requirements than the main test fixtures.
"""
import pytest
import pytest_asyncio
import sys
import os
from pathlib import Path
import uuid
from typing import Dict

# Import required components
from src.mcp_codebase_insight.core.config import ServerConfig
from src.mcp_codebase_insight.core.vector_store import VectorStore
from src.mcp_codebase_insight.core.embeddings import SentenceTransformerEmbedding
from src.mcp_codebase_insight.core.knowledge import KnowledgeBase
from src.mcp_codebase_insight.core.tasks import TaskManager
# Ensure the src directory is in the Python path
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '../')))

@pytest.fixture
def test_config():
    """Create a server configuration for tests.

    This is an alias for test_server_config that allows component tests to use
    their expected fixture name.
    """
    config = ServerConfig(
        host="localhost",
        port=8000,
        log_level="DEBUG",
        qdrant_url="http://localhost:6333",
        docs_cache_dir=Path(".test_cache") / "docs",
        adr_dir=Path(".test_cache") / "docs/adrs",
        kb_storage_dir=Path(".test_cache") / "knowledge",
        embedding_model="all-MiniLM-L6-v2",
        collection_name=f"test_collection_{uuid.uuid4().hex[:8]}",
        debug_mode=True,
        metrics_enabled=False,
        cache_enabled=True,
        memory_cache_size=1000,
        disk_cache_dir=Path(".test_cache") / "cache"
    )
    return config

@pytest.fixture
def test_metadata() -> Dict:
    """Standard test metadata for consistency across tests."""
    return {
        "type": "code",
        "language": "python",
        "title": "Test Code",
        "description": "Test code snippet for vector store testing",
        "tags": ["test", "vector"]
    }

@pytest_asyncio.fixture
async def embedder():
    """Create an embedder for tests."""
    return SentenceTransformerEmbedding()

@pytest_asyncio.fixture
async def vector_store(test_config, embedder):
    """Create a vector store for tests."""
    store = VectorStore(test_config.qdrant_url, embedder)
    await store.initialize()
    yield store
    await store.cleanup()

@pytest_asyncio.fixture
async def task_manager(test_config):
    """Create a task manager for tests."""
    manager = TaskManager(test_config)
    await manager.initialize()
    yield manager
    await manager.cleanup()

@pytest.fixture
def test_code():
    """Provide sample code for testing task-related functionality."""
    return """
def example_function():
    \"\"\"This is a test function for task manager tests.\"\"\"
    return "Hello, world!"

class TestClass:
    def __init__(self):
        self.value = 42

    def method(self):
        return self.value
"""

@pytest_asyncio.fixture
async def knowledge_base(test_config, vector_store):
    """Create a knowledge base for tests."""
    kb = KnowledgeBase(test_config, vector_store)
    await kb.initialize()
    yield kb
    await kb.cleanup()
```

--------------------------------------------------------------------------------
/tests/test_file_relationships.py:
--------------------------------------------------------------------------------

```python
import pytest

@pytest.mark.asyncio
async def test_create_file_relationship(client):
    """Test creating a file relationship."""
    relationship_data = {
        "source_file": "src/main.py",
        "target_file": "src/utils.py",
        "relationship_type": "imports",
        "description": "Main imports utility functions",
        "metadata": {"importance": "high"}
    }

    response = await client.post("/relationships", json=relationship_data)
    assert response.status_code == 200
    data = response.json()
    assert data["source_file"] == relationship_data["source_file"]
    assert data["target_file"] == relationship_data["target_file"]
    assert data["relationship_type"] == relationship_data["relationship_type"]

@pytest.mark.asyncio
async def test_get_file_relationships(client):
    """Test getting file relationships."""
    # Create a test relationship first
    relationship_data = {
        "source_file": "src/test.py",
        "target_file": "src/helper.py",
        "relationship_type": "depends_on"
    }
    await client.post("/relationships", json=relationship_data)

    # Test getting all relationships
    response = await client.get("/relationships")
    assert response.status_code == 200
    data = response.json()
    assert len(data) > 0
    assert isinstance(data, list)

    # Test filtering by source file
    response = await client.get("/relationships", params={"source_file": "src/test.py"})
    assert response.status_code == 200
    data = response.json()
    assert all(r["source_file"] == "src/test.py" for r in data)

@pytest.mark.asyncio
async def test_create_web_source(client):
    """Test creating a web source."""
    source_data = {
        "url": "https://example.com/docs",
        "title": "API Documentation",
        "content_type": "documentation",
        "description": "External API documentation",
        "tags": ["api", "docs"],
        "metadata": {"version": "1.0"}
    }

    response = await client.post("/web-sources", json=source_data)
    assert response.status_code == 200
    data = response.json()
    assert data["url"] == source_data["url"]
    assert data["title"] == source_data["title"]
    assert data["content_type"] == source_data["content_type"]

@pytest.mark.asyncio
async def test_get_web_sources(client):
    """Test getting web sources."""
    # Create a test web source first
    source_data = {
        "url": "https://example.com/tutorial",
        "title": "Tutorial",
        "content_type": "tutorial",
        "tags": ["guide", "tutorial"]
    }
    await client.post("/web-sources", json=source_data)

    # Test getting all web sources
    response = await client.get("/web-sources")
    assert response.status_code == 200
    data = response.json()
    assert len(data) > 0
    assert isinstance(data, list)

    # Test filtering by content type
    response = await client.get("/web-sources", params={"content_type": "tutorial"})
    assert response.status_code == 200
    data = response.json()
    assert all(s["content_type"] == "tutorial" for s in data)

    # Test filtering by tags
    response = await client.get("/web-sources", params={"tags": ["guide"]})
    assert response.status_code == 200
    data = response.json()
    assert any("guide" in s["tags"] for s in data)
```

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "mcp-codebase-insight"
dynamic = ["version"]
description = "MCP Codebase Insight Server"
readme = "README.md"
requires-python = ">=3.10"
license = {text = "MIT"}
authors = [
    {name = "Tosin Akinosho"}
]
classifiers = [
    "Development Status :: 3 - Alpha",
    "Intended Audience :: Developers",
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
    "Programming Language :: Python :: 3.13",
    "Topic :: Software Development :: Libraries :: Python Modules",
]
dependencies = [
    "fastapi>=0.109.0",
    "uvicorn>=0.23.2",
    "pydantic>=2.4.2",
    "starlette>=0.35.0",
    "asyncio>=3.4.3",
    "aiohttp>=3.9.0",
    "qdrant-client>=1.13.3",
    "sentence-transformers>=2.2.2",
    "torch>=2.0.0",
    "transformers>=4.34.0",
    "python-frontmatter>=1.0.0",
    "markdown>=3.4.4",
    "PyYAML>=6.0.1",
    "structlog>=23.1.0",
    "psutil>=5.9.5",
    "python-dotenv>=1.0.0",
    "requests>=2.31.0",
    "beautifulsoup4>=4.12.0",
    "scipy>=1.11.0",
    "python-slugify>=8.0.0",
    "slugify>=0.0.1",
    "numpy>=1.24.0",
    # "uvx>=0.4.0", # Temporarily commented out for development installation
    "mcp-server-qdrant>=0.2.0",
    "mcp>=1.5.0,<1.6.0", # Pin to MCP 1.5.0 for API compatibility
]

[project.optional-dependencies]
test = [
    "pytest>=7.4.2",
    "pytest-asyncio>=0.21.1",
    "pytest-cov>=4.1.0",
    "httpx>=0.25.0",
]
dev = [
    "black>=23.9.1",
    "isort>=5.12.0",
    "mypy>=1.5.1",
    "flake8>=6.1.0",
    "bump2version>=1.0.1",
    "pre-commit>=3.5.0",
    "pdoc>=14.1.0",
]

[project.urls]
Homepage = "https://github.com/tosin2013/mcp-codebase-insight"
Documentation = "https://github.com/tosin2013/mcp-codebase-insight/docs"
Repository = "https://github.com/tosin2013/mcp-codebase-insight.git"
Issues = "https://github.com/tosin2013/mcp-codebase-insight/issues"

[project.scripts]
mcp-codebase-insight = "mcp_codebase_insight.server:run"

[tool.setuptools]
package-dir = {"" = "src"}

[tool.setuptools.packages.find]
where = ["src"]
include = ["mcp_codebase_insight*"]

[tool.black]
line-length = 88
target-version = ['py311']
include = '\.pyi?$'

[tool.isort]
profile = "black"
multi_line_output = 3
include_trailing_comma = true
force_grid_wrap = 0
use_parentheses = true
ensure_newline_before_comments = true
line_length = 88

[tool.mypy]
python_version = "3.11"
warn_return_any = true
warn_unused_configs = true
disallow_untyped_defs = true
check_untyped_defs = true
disallow_untyped_decorators = true
no_implicit_optional = true
warn_redundant_casts = true
warn_unused_ignores = true
warn_no_return = true
warn_unreachable = true

[tool.pytest.ini_options]
minversion = "6.0"
addopts = "-ra -q --cov=src --cov-report=term-missing"
testpaths = ["tests"]
asyncio_mode = "auto"

[tool.coverage.run]
source = ["src"]
branch = true

[tool.coverage.report]
exclude_lines = [
    "pragma: no cover",
    "def __repr__",
    "if self.debug:",
    "raise NotImplementedError",
    "if __name__ == .__main__.:",
    "pass",
    "raise ImportError",
]
ignore_errors = true
omit = ["tests/*", "setup.py"]
```

--------------------------------------------------------------------------------
/tests/components/test_task_manager.py:
--------------------------------------------------------------------------------

```python
import sys
import os
import pytest
import pytest_asyncio
from pathlib import Path
from typing import AsyncGenerator
from src.mcp_codebase_insight.core.tasks import TaskManager, TaskType, TaskStatus
from src.mcp_codebase_insight.core.config import ServerConfig

@pytest_asyncio.fixture
async def task_manager(test_config: ServerConfig):
    manager = TaskManager(test_config)
    await manager.initialize()
    yield manager
    await manager.cleanup()

@pytest.mark.asyncio
async def test_task_manager_initialization(task_manager: TaskManager):
    """Test that task manager initializes correctly."""
    assert task_manager is not None
    assert task_manager.config is not None

@pytest.mark.asyncio
async def test_create_and_get_task(task_manager: TaskManager, test_code: str):
    """Test creating and retrieving tasks."""
    # Create task
    task = await task_manager.create_task(
        type="code_analysis",
        title="Test task",
        description="Test task description",
        context={"code": test_code}
    )
    assert task is not None

    # Get task
    retrieved_task = await task_manager.get_task(task.id)
    assert retrieved_task.context["code"] == test_code
    assert retrieved_task.type == TaskType.CODE_ANALYSIS
    assert retrieved_task.description == "Test task description"

@pytest.mark.asyncio
async def test_task_status_updates(task_manager: TaskManager, test_code: str):
    """Test task status updates."""
    # Create task
    task = await task_manager.create_task(
        type="code_analysis",
        title="Status Test",
        description="Test task status updates",
        context={"code": test_code}
    )

    # Update status
    await task_manager.update_task(task.id, status=TaskStatus.IN_PROGRESS)
    updated_task = await task_manager.get_task(task.id)
    assert updated_task.status == TaskStatus.IN_PROGRESS

    await task_manager.update_task(task.id, status=TaskStatus.COMPLETED)
    completed_task = await task_manager.get_task(task.id)
    assert completed_task.status == TaskStatus.COMPLETED

@pytest.mark.asyncio
async def test_task_result_updates(task_manager: TaskManager, test_code: str):
    """Test updating task results."""
    # Create task
    task = await task_manager.create_task(
        type="code_analysis",
        title="Result Test",
        description="Test task result updates",
        context={"code": test_code}
    )

    # Update result
    result = {"analysis": "Test analysis result"}
    await task_manager.update_task(task.id, result=result)

    # Verify result
    updated_task = await task_manager.get_task(task.id)
    assert updated_task.result == result

@pytest.mark.asyncio
async def test_list_tasks(task_manager: TaskManager, test_code: str):
    """Test listing tasks."""
    # Create multiple tasks
    tasks = []
    for i in range(3):
        task = await task_manager.create_task(
            type="code_analysis",
            title=f"List Test {i}",
            description=f"Test task {i}",
            context={"code": test_code}
        )
        tasks.append(task)

    # List tasks
    task_list = await task_manager.list_tasks()
    assert len(task_list) >= 3

    # Verify task descriptions
    descriptions = [task.description for task in task_list]
    for i in range(3):
        assert f"Test task {i}" in descriptions
```

--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------

```markdown
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added
- Initial project setup
- Core server implementation
- ADR management system
- Documentation management
- Knowledge base with vector search
- Debug system
- Task management
- Metrics and health monitoring
- Caching system
- Structured logging
- Docker support
- CI/CD pipeline
- Test suite

### Changed
- None

### Deprecated
- None

### Removed
- None

### Fixed
- None

### Security
- None

## [0.2.2] - 2025-03-25

### Added
- Implemented single source of truth for versioning

### Changed
- Moved version to the package's __init__.py file as the canonical source
- Updated setup.py to dynamically read version from __init__.py
- Updated pyproject.toml to use dynamic versioning
- Synchronized dependencies between setup.py, pyproject.toml and requirements.in

### Fixed
- Missing dependencies in setup.py and pyproject.toml

## [0.2.1] - 2025-03-25

### Added
- Integrated Qdrant Docker container in CI/CD workflow for more realistic testing
- Added collection initialization step for proper Qdrant setup in CI/CD
- Created shared Qdrant client fixture for improved test reliability

### Changed
- Updated Python version requirement from >=3.11 to >=3.9 for broader compatibility
- Enhanced test fixture scoping to resolve event_loop fixture scope mismatches
- Improved connection verification for Qdrant in GitHub Actions workflow

### Fixed
- Resolved fixture scope mismatches in async tests
- Fixed environment variable handling in test configuration

### Removed
- None

### Security
- None

## [0.2.0] - 2025-03-24

### Added
- None

### Changed
- Improved async test fixture handling in component tests
- Enhanced test discovery to properly distinguish between test functions and fixtures
- Updated component test runner for better isolation and resource management

### Fixed
- Resolved fixture scope mismatches in async tests
- Fixed async event loop handling in component tests
- Corrected test_metadata fixture identification in test_vector_store.py

### Removed
- None

### Security
- None

## [0.1.0] - 2025-03-19

### Added
- Initial release
- Basic server functionality
- Core components:
  - ADR management
  - Documentation handling
  - Knowledge base
  - Vector search
  - Task management
  - Health monitoring
  - Metrics collection
  - Caching
  - Logging
- Docker support
- CI/CD pipeline with GitHub Actions
- Test coverage with pytest
- Code quality tools:
  - Black
  - isort
  - flake8
  - mypy
- Documentation:
  - README
  - API documentation
  - Contributing guidelines
  - ADR templates
- Development tools:
  - Makefile
  - Docker compose
  - Environment configuration
  - Version management

[Unreleased]: https://github.com/modelcontextprotocol/mcp-codebase-insight/compare/v0.2.2...HEAD
[0.2.2]: https://github.com/modelcontextprotocol/mcp-codebase-insight/compare/v0.2.1...v0.2.2
[0.2.1]: https://github.com/modelcontextprotocol/mcp-codebase-insight/releases/tag/v0.2.1
[0.2.0]: https://github.com/modelcontextprotocol/mcp-codebase-insight/releases/tag/v0.2.0
[0.1.0]: https://github.com/modelcontextprotocol/mcp-codebase-insight/releases/tag/v0.1.0
```

--------------------------------------------------------------------------------
/docs/documentation_map.md:
--------------------------------------------------------------------------------

```markdown
# Documentation Relationship Map

```mermaid
graph TD
    %% ADRs
    ADR1[ADR-0001: Testing Strategy]
    ADR2[ADR-0002: SSE Testing]
    ADR3[ADR-0003: Comprehensive Testing]
    ADR4[ADR-0004: Documentation Linking]

    %% Core Systems
    CS1[Vector Store System]
    CS2[Knowledge Base]
    CS3[Task Management]
    CS4[Health Monitoring]
    CS5[Error Handling]
    CS6[Metrics Collection]
    CS7[Cache Management]

    %% Features
    FA[Code Analysis]
    FB[ADR Management]
    FC[Documentation Management]

    %% Testing
    TA[Server Testing]
    TB[SSE Testing]

    %% Components
    C1[Server Framework]
    C2[Testing Framework]
    C3[Documentation Tools]

    %% Implementation Files
    I1[test_server_instance.py]
    I2[SSETestManager.py]
    I3[ServerTestFramework.py]
    I4[DocNode.py]
    I5[DocumentationMap.py]

    %% Core Classes
    CC1[ServerConfig]
    CC2[ErrorCode]
    CC3[ComponentState]
    CC4[TaskTracker]
    CC5[DocumentationType]

    %% Relationships - Core Systems
    CS1 --> CC1
    CS2 --> CS1
    CS2 --> CS7
    CS3 --> CC4
    CS4 --> CC3
    CS5 --> CC2

    %% Relationships - ADRs
    ADR1 --> I1
    ADR1 --> C1
    ADR2 --> I2
    ADR2 --> TB
    ADR3 --> I3
    ADR3 --> C2
    ADR4 --> I4
    ADR4 --> I5
    ADR4 --> C3

    %% Relationships - Features
    FA --> CS2
    FA --> CS1
    FB --> ADR1
    FB --> ADR2
    FB --> ADR3
    FB --> ADR4
    FC --> C3
    FC --> CC5

    %% Relationships - Testing
    TA --> I1
    TA --> I3
    TB --> I2
    TB --> ADR2

    %% Component Relationships
    C1 --> CC1
    C1 --> CS4
    C2 --> I2
    C2 --> I3
    C3 --> I4
    C3 --> I5

    %% Error Handling
    CS5 --> FA
    CS5 --> FB
    CS5 --> FC
    CS5 --> CS1
    CS5 --> CS2
    CS5 --> CS3

    %% Styling
    classDef adr fill:#f9f,stroke:#333,stroke-width:2px
    classDef feature fill:#bbf,stroke:#333,stroke-width:2px
    classDef testing fill:#bfb,stroke:#333,stroke-width:2px
    classDef component fill:#fbb,stroke:#333,stroke-width:2px
    classDef implementation fill:#ddd,stroke:#333,stroke-width:1px
    classDef core fill:#ffd,stroke:#333,stroke-width:2px
    classDef coreClass fill:#dff,stroke:#333,stroke-width:1px

    class ADR1,ADR2,ADR3,ADR4 adr
    class FA,FB,FC feature
    class TA,TB testing
    class C1,C2,C3 component
    class I1,I2,I3,I4,I5 implementation
    class CS1,CS2,CS3,CS4,CS5,CS6,CS7 core
    class CC1,CC2,CC3,CC4,CC5 coreClass
```

## Documentation Map Legend

### Node Types
- **Purple**: Architecture Decision Records (ADRs)
- **Blue**: Feature Documentation
- **Green**: Testing Documentation
- **Red**: Key Components
- **Gray**: Implementation Files
- **Yellow**: Core Systems
- **Light Blue**: Core Classes

### Relationship Types
- Arrows indicate dependencies or references between documents
- Direct connections show implementation relationships
- Indirect connections show conceptual relationships

### Key Areas
1. **Core Systems**
   - Vector Store and Knowledge Base
   - Task Management and Health Monitoring
   - Error Handling and Metrics Collection
   - Cache Management

2. **Testing Infrastructure**
   - Centered around ADR-0001 and ADR-0002
   - Connected to Server and SSE testing implementations

3. **Documentation Management**
   - Focused on ADR-0004
   - Links to Documentation Tools and models

4. **Feature Implementation**
   - Shows how features connect to components
   - Demonstrates implementation dependencies

5. **Error Handling**
   - Centralized error management
   - Connected to all major systems
   - Standardized error codes and types
```

--------------------------------------------------------------------------------
/src/mcp_codebase_insight/server_test_isolation.py:
--------------------------------------------------------------------------------

```python
"""Test isolation for ServerState.

This module provides utilities to create isolated ServerState instances for testing,
preventing state conflicts between parallel test runs.
"""

from typing import Dict, Optional
import asyncio
import uuid
import logging

from .core.state import ServerState
from .utils.logger import get_logger

logger = get_logger(__name__)

# Store of server states keyed by instance ID
_server_states: Dict[str, ServerState] = {}

def get_isolated_server_state(instance_id: Optional[str] = None) -> ServerState:
    """Get or create an isolated ServerState instance for tests.

    Args:
        instance_id: Optional unique ID for the server state

    Returns:
        An isolated ServerState instance
    """
    global _server_states

    if instance_id is None:
        # No ID supplied: generate a temporary one. The new state is still
        # stored below so that cleanup_all_server_states() can find it.
        instance_id = f"temp_{uuid.uuid4().hex}"

    if instance_id not in _server_states:
        logger.debug(f"Creating new isolated ServerState with ID: {instance_id}")
        _server_states[instance_id] = ServerState()

    return _server_states[instance_id]

async def cleanup_all_server_states():
    """Clean up all tracked server states."""
    global _server_states
    logger.debug(f"Cleaning up {len(_server_states)} isolated server states")

    # Make a copy of the states to avoid modification during iteration
    states_to_clean = list(_server_states.items())
    cleanup_tasks = []

    for instance_id, state in states_to_clean:
        try:
            logger.debug(f"Cleaning up ServerState: {instance_id}")
            if state.initialized:
                # Get active tasks before cleanup
                active_tasks = state.get_active_tasks()
                if active_tasks:
                    logger.debug(
                        f"Found {len(active_tasks)} active tasks for {instance_id}"
                    )

                # Schedule state cleanup with increased timeout
                cleanup_task = asyncio.create_task(
                    asyncio.wait_for(state.cleanup(), timeout=5.0)
                )
                cleanup_tasks.append((instance_id, cleanup_task))
            else:
                logger.debug(f"Skipping uninitialized ServerState: {instance_id}")
        except Exception as e:
            logger.error(
                f"Error preparing cleanup for ServerState {instance_id}: {e}",
                exc_info=True
            )

    # Wait for all cleanup tasks to complete
    if cleanup_tasks:
        for instance_id, task in cleanup_tasks:
            try:
                await task
                logger.debug(f"State {instance_id} cleaned up successfully")

                # Verify no tasks remain
                state = _server_states.get(instance_id)
                if state and state.get_task_count() > 0:
                    logger.warning(
                        f"State {instance_id} still has {state.get_task_count()} "
                        "active tasks after cleanup"
                    )
            except asyncio.TimeoutError:
                logger.warning(f"State cleanup timed out for {instance_id}")
                # Force cleanup
                state = _server_states.get(instance_id)
                if state:
                    state.initialized = False
            except Exception as e:
                logger.error(f"Error during state cleanup for {instance_id}: {e}")

    # Clear all states from global store
    _server_states.clear()
    logger.debug("All server states cleaned up")
```

--------------------------------------------------------------------------------
/src/mcp_codebase_insight/core/task_tracker.py:
--------------------------------------------------------------------------------

```python
"""Task tracking and management for async operations."""

import asyncio
import logging
from typing import Set, Optional
from datetime import datetime

from ..utils.logger import get_logger

logger = get_logger(__name__)

class TaskTracker:
    """Tracks and manages async tasks with improved error handling and logging."""

    def __init__(self):
        """Initialize the task tracker."""
        self._tasks: Set[asyncio.Task] = set()
        self._loop = asyncio.get_event_loop()
        self._loop_id = id(self._loop)
        self._start_time = datetime.utcnow()
        logger.debug(f"TaskTracker initialized with loop ID: {self._loop_id}")

    def track_task(self, task: asyncio.Task) -> None:
        """Track a new task and set up completion handling.

        Args:
            task: The asyncio.Task to track
        """
        if id(asyncio.get_event_loop()) != self._loop_id:
            logger.warning(
                f"Task created in different event loop context. "
                f"Expected: {self._loop_id}, Got: {id(asyncio.get_event_loop())}"
            )

        self._tasks.add(task)
        task.add_done_callback(self._handle_task_completion)
        logger.debug(f"Tracking new task: {task.get_name()}")

    def _handle_task_completion(self, task: asyncio.Task) -> None:
        """Handle task completion and cleanup.

        Args:
            task: The completed task
        """
        self._tasks.discard(task)
        # task.exception() raises CancelledError for a cancelled task, so
        # handle cancellation before inspecting the exception.
        if task.cancelled():
            logger.debug(f"Task {task.get_name()} was cancelled")
            return
        if task.exception():
            logger.error(
                f"Task {task.get_name()} failed with error: {task.exception()}",
                exc_info=True
            )
        else:
            logger.debug(f"Task {task.get_name()} completed successfully")

    async def cancel_all_tasks(self, timeout: float = 5.0) -> None:
        """Cancel all tracked tasks and wait for completion.

        Args:
            timeout: Maximum time to wait for tasks to cancel
        """
        if not self._tasks:
            logger.debug("No tasks to cancel")
            return

        logger.debug(f"Cancelling {len(self._tasks)} tasks")
        for task in self._tasks:
            if not task.done() and not task.cancelled():
                task.cancel()

        try:
            await asyncio.wait_for(
                asyncio.gather(*self._tasks, return_exceptions=True),
                timeout=timeout
            )
            logger.debug("All tasks cancelled successfully")
        except asyncio.TimeoutError:
            logger.warning(f"Task cancellation timed out after {timeout} seconds")
        except Exception as e:
            logger.error(f"Error during task cancellation: {e}", exc_info=True)

    def get_active_tasks(self) -> Set[asyncio.Task]:
        """Get all currently active tasks.

        Returns:
            Set of active asyncio.Task objects
        """
        return self._tasks.copy()

    def get_task_count(self) -> int:
        """Get the number of currently tracked tasks.

        Returns:
            Number of active tasks
        """
        return len(self._tasks)

    def get_uptime(self) -> float:
        """Get the uptime of the task tracker in seconds.

        Returns:
            Uptime in seconds
        """
        return (datetime.utcnow() - self._start_time).total_seconds()

    def __del__(self):
        """Cleanup when the tracker is destroyed."""
        if self._tasks:
            logger.warning(
                f"TaskTracker destroyed with {len(self._tasks)} "
                "unfinished tasks"
            )
```

--------------------------------------------------------------------------------
/docs/getting-started/installation.md:
--------------------------------------------------------------------------------

```markdown
# Installation Guide

> 🚧 **Documentation In Progress**
>
> This documentation is being actively developed. More details will be added soon.

## Prerequisites

Before installing MCP Codebase Insight, ensure you have the following:

- Python 3.11 or higher
- pip (Python package installer)
- Git
- Docker (optional, for containerized deployment)
- 4GB RAM minimum (8GB recommended)
- 2GB free disk space

## System Requirements

### Operating Systems
- Linux (Ubuntu 20.04+, CentOS 8+)
- macOS (10.15+)
- Windows 10/11 with WSL2

### Python Dependencies
- FastAPI
- Pydantic
- httpx
- sentence-transformers
- qdrant-client

## Installation Methods

### 1. Using pip (Recommended)

```bash
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install MCP Codebase Insight
pip install mcp-codebase-insight

# Verify installation
mcp-codebase-insight --version
```

### 2. Using Docker

```bash
# Pull the Docker image
docker pull modelcontextprotocol/mcp-codebase-insight

# Create necessary directories
mkdir -p docs knowledge cache

# Run the container (paths are quoted so directories with spaces work)
docker run -p 3000:3000 \
    --env-file .env \
    -v "$(pwd)/docs:/app/docs" \
    -v "$(pwd)/knowledge:/app/knowledge" \
    -v "$(pwd)/cache:/app/cache" \
    modelcontextprotocol/mcp-codebase-insight
```

### 3. From Source

```bash
# Clone the repository
git clone https://github.com/modelcontextprotocol/mcp-codebase-insight.git
cd mcp-codebase-insight

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install in development mode
pip install -e .
```

## Environment Setup

1. Create a `.env` file in your project root:

   ```bash
   MCP_HOST=127.0.0.1
   MCP_PORT=3000
   QDRANT_URL=http://localhost:6333
   MCP_DOCS_CACHE_DIR=./docs
   MCP_ADR_DIR=./docs/adrs
   MCP_KB_STORAGE_DIR=./knowledge
   MCP_DISK_CACHE_DIR=./cache
   LOG_LEVEL=INFO
   ```

2. Create required directories:

   ```bash
   mkdir -p docs/adrs knowledge cache
   ```

## Post-Installation Steps

1. **Vector Database Setup**
   - Follow the [Qdrant Setup Guide](qdrant_setup.md) to install and configure Qdrant

2. **Verify Installation**
   ```bash
   # Start the server
   mcp-codebase-insight --host 127.0.0.1 --port 3000

   # In another terminal, test the health endpoint
   curl http://localhost:3000/health
   ```

3. **Initial Configuration**
   - Configure authentication (if needed)
   - Set up logging
   - Configure metrics collection

## Common Installation Issues

### 1. Dependencies Installation Fails
```bash
# Try upgrading pip
pip install --upgrade pip

# Install wheel
pip install wheel

# Retry installation
pip install mcp-codebase-insight
```

### 2. Port Already in Use
```bash
# Check what's using port 3000
lsof -i :3000  # On Linux/macOS
netstat -ano | findstr :3000  # On Windows

# Use a different port
mcp-codebase-insight --port 3001
```

### 3. Permission Issues
```bash
# Fix directory permissions
chmod -R 755 docs knowledge cache
```

## Next Steps

- Read the [Configuration Guide](configuration.md) for detailed setup options
- Follow the [Quick Start Tutorial](quickstart.md) to begin using the system
- Check the [Best Practices](../development/best-practices.md) for optimal usage
- Follow the [Qdrant Setup](qdrant_setup.md) to set up the vector database

## Support

If you encounter any issues during installation:

1. Check the [Troubleshooting Guide](../troubleshooting/common-issues.md)
2. Search existing [GitHub Issues](https://github.com/modelcontextprotocol/mcp-codebase-insight/issues)
3. Open a new issue if needed
```

--------------------------------------------------------------------------------
/docs/SSE_INTEGRATION.md:
--------------------------------------------------------------------------------

```markdown
# Server-Sent Events (SSE) Integration

This document explains the Server-Sent Events (SSE) integration in the MCP Codebase Insight server, including its purpose, architecture, and usage instructions.

## Overview

The SSE integration enables real-time communication between the MCP Codebase Insight server and clients using the Model Context Protocol (MCP). SSE itself only streams events from server to client; client-to-server messages travel over a companion HTTP POST endpoint, and together the two form a two-way channel.
This allows clients to receive live updates for long-running operations and establish persistent connections for continuous data flow.

## Architecture

The SSE integration is built as a modular component within the MCP Codebase Insight system, following these design principles:

1. **Separation of Concerns**: The SSE transport layer is isolated from the core application logic
2. **Non-Interference**: SSE endpoints operate alongside existing REST API endpoints without disruption
3. **Shared Resources**: Both REST and SSE interfaces use the same underlying components and state

### Key Components

- **MCP_CodebaseInsightServer**: Manages the MCP protocol server and exposes system functionality as MCP tools
- **FastMCP**: The core MCP protocol implementation that handles messaging format and protocol features
- **SseServerTransport**: Implements the SSE protocol for persistent connections
- **Starlette Integration**: Low-level ASGI application that handles SSE connections

### Endpoint Structure

- `/mcp/sse/`: Establishes the SSE connection for real-time events
- `/mcp/messages/`: Handles incoming messages from clients via HTTP POST

### Data Flow

```
Client <---> SSE Connection (/mcp/sse/)    <---> MCP Server <---> Core Components
       <---> Message POST (/mcp/messages/) <--/
```

## Available Tools

The SSE integration exposes these core system capabilities as MCP tools:

1. **vector-search**: Search for code snippets semantically similar to a query text
2. **knowledge-search**: Search for patterns in the knowledge base
3. **adr-list**: Retrieve architectural decision records
4. **task-status**: Check status of long-running tasks

## Usage Instructions

### Client Configuration

To connect to the SSE endpoint, configure your MCP client as follows:

```json
{
  "mcpClients": {
    "codebase-insight-sse": {
      "url": "http://localhost:8000/mcp",
      "transport": "sse"
    }
  }
}
```

### Example: Connecting with MCP Client

```python
# Assumes the official `mcp` Python SDK and a server on localhost:8000;
# the URL combines the configured base URL with the /mcp/sse/ endpoint.
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client


async def main() -> None:
    # Open the SSE stream, then run an MCP session over it
    async with sse_client("http://localhost:8000/mcp/sse/") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Use vector search tool
            results = await session.call_tool(
                "vector-search",
                {"query": "function that parses JSON", "limit": 5}
            )
            print(results)


asyncio.run(main())
```

## Testing

The SSE implementation includes tests to verify:

1. Connection establishment and maintenance
2. Tool registration and execution
3. Error handling and reconnection behavior

Run SSE-specific tests with:

```bash
pytest tests/components/test_sse_components.py -v
```

## Security Considerations

The SSE integration inherits the security model of the main application. When security features like authentication are enabled, they apply to SSE connections as well.

## Performance Considerations

SSE connections are persistent and can consume server resources. Consider these guidelines:

- Implement client-side reconnection strategies with exponential backoff
- Set reasonable timeouts for idle connections
- Monitor connection counts in production environments

## Troubleshooting

Common issues and solutions:

1. **Connection Refused**: Ensure the server is running and the client is using the correct URL
2. **Invalid SSE Format**: Check for middleware that might buffer responses
3. **Connection Drops**: Verify network stability and implement reconnection logic
```
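
The SSE guide above recommends client-side reconnection with exponential backoff but does not show one. The following is a minimal sketch of that idea using only the standard library. The names `backoff_delays` and `connect_with_retry`, their parameters, and the default values are hypothetical and are not part of this repository or of the MCP SDK; `connect` stands for whatever callable opens the SSE connection in your client.

```python
import random
import time
from typing import Callable, Iterator, Optional


def backoff_delays(
    base: float = 1.0,
    factor: float = 2.0,
    max_delay: float = 30.0,
    max_retries: int = 5,
    jitter: Optional[Callable[[float], float]] = None,
) -> Iterator[float]:
    """Yield the wait in seconds before each reconnection attempt."""
    if jitter is None:
        # "Full jitter": a uniform draw between 0 and the capped delay,
        # so many clients do not all reconnect at the same instant.
        jitter = lambda cap: random.uniform(0.0, cap)
    for attempt in range(max_retries):
        capped = min(max_delay, base * factor ** attempt)
        yield jitter(capped)


def connect_with_retry(connect, sleep=time.sleep, **backoff_kwargs):
    """Call connect() once, then retry with backoff on ConnectionError."""
    try:
        return connect()
    except ConnectionError as exc:
        last_error = exc
    for delay in backoff_delays(**backoff_kwargs):
        sleep(delay)
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
    raise ConnectionError("reconnection attempts exhausted") from last_error
```

Capping the delay keeps a long outage from pushing waits to unbounded lengths. An asynchronous client could apply the same schedule by awaiting `asyncio.sleep(delay)` in place of the blocking `sleep` call.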