# mcp-memory-service
This is page 25 of 47. Use http://codebase.md/doobidoo/mcp-memory-service?lines=true&page={x} to view the full context.

# Directory Structure

```
├── .claude
│   ├── agents
│   │   ├── amp-bridge.md
│   │   ├── amp-pr-automator.md
│   │   ├── code-quality-guard.md
│   │   ├── gemini-pr-automator.md
│   │   └── github-release-manager.md
│   ├── settings.local.json.backup
│   └── settings.local.json.local
├── .commit-message
├── .dockerignore
├── .env.example
├── .env.sqlite.backup
├── .envnn#
├── .gitattributes
├── .github
│   ├── FUNDING.yml
│   ├── ISSUE_TEMPLATE
│   │   ├── bug_report.yml
│   │   ├── config.yml
│   │   ├── feature_request.yml
│   │   └── performance_issue.yml
│   ├── pull_request_template.md
│   └── workflows
│       ├── bridge-tests.yml
│       ├── CACHE_FIX.md
│       ├── claude-code-review.yml
│       ├── claude.yml
│       ├── cleanup-images.yml.disabled
│       ├── dev-setup-validation.yml
│       ├── docker-publish.yml
│       ├── LATEST_FIXES.md
│       ├── main-optimized.yml.disabled
│       ├── main.yml
│       ├── publish-and-test.yml
│       ├── README_OPTIMIZATION.md
│       ├── release-tag.yml.disabled
│       ├── release.yml
│       ├── roadmap-review-reminder.yml
│       ├── SECRET_CONDITIONAL_FIX.md
│       └── WORKFLOW_FIXES.md
├── .gitignore
├── .mcp.json.backup
├── .mcp.json.template
├── .pyscn
│   ├── .gitignore
│   └── reports
│       └── analyze_20251123_214224.html
├── AGENTS.md
├── archive
│   ├── deployment
│   │   ├── deploy_fastmcp_fixed.sh
│   │   ├── deploy_http_with_mcp.sh
│   │   └── deploy_mcp_v4.sh
│   ├── deployment-configs
│   │   ├── empty_config.yml
│   │   └── smithery.yaml
│   ├── development
│   │   └── test_fastmcp.py
│   ├── docs-removed-2025-08-23
│   │   ├── authentication.md
│   │   ├── claude_integration.md
│   │   ├── claude-code-compatibility.md
│   │   ├── claude-code-integration.md
│   │   ├── claude-code-quickstart.md
│   │   ├── claude-desktop-setup.md
│   │   ├── complete-setup-guide.md
│   │   ├── database-synchronization.md
│   │   ├── development
│   │   │   ├── autonomous-memory-consolidation.md
│   │   │   ├── CLEANUP_PLAN.md
│   │   │   ├── CLEANUP_README.md
│   │   │   ├── CLEANUP_SUMMARY.md
│   │   │   ├── dream-inspired-memory-consolidation.md
│   │   │   ├── hybrid-slm-memory-consolidation.md
│   │   │   ├── mcp-milestone.md
│   │   │   ├── multi-client-architecture.md
│   │   │   ├── test-results.md
│   │   │   └── TIMESTAMP_FIX_SUMMARY.md
│   │   ├── distributed-sync.md
│   │   ├── invocation_guide.md
│   │   ├── macos-intel.md
│   │   ├── master-guide.md
│   │   ├── mcp-client-configuration.md
│   │   ├── multi-client-server.md
│   │   ├── service-installation.md
│   │   ├── sessions
│   │   │   └── MCP_ENHANCEMENT_SESSION_MEMORY_v4.1.0.md
│   │   ├── UBUNTU_SETUP.md
│   │   ├── ubuntu.md
│   │   ├── windows-setup.md
│   │   └── windows.md
│   ├── docs-root-cleanup-2025-08-23
│   │   ├── AWESOME_LIST_SUBMISSION.md
│   │   ├── CLOUDFLARE_IMPLEMENTATION.md
│   │   ├── DOCUMENTATION_ANALYSIS.md
│   │   ├── DOCUMENTATION_CLEANUP_PLAN.md
│   │   ├── DOCUMENTATION_CONSOLIDATION_COMPLETE.md
│   │   ├── LITESTREAM_SETUP_GUIDE.md
│   │   ├── lm_studio_system_prompt.md
│   │   ├── PYTORCH_DOWNLOAD_FIX.md
│   │   └── README-ORIGINAL-BACKUP.md
│   ├── investigations
│   │   └── MACOS_HOOKS_INVESTIGATION.md
│   ├── litestream-configs-v6.3.0
│   │   ├── install_service.sh
│   │   ├── litestream_master_config_fixed.yml
│   │   ├── litestream_master_config.yml
│   │   ├── litestream_replica_config_fixed.yml
│   │   ├── litestream_replica_config.yml
│   │   ├── litestream_replica_simple.yml
│   │   ├── litestream-http.service
│   │   ├── litestream.service
│   │   └── requirements-cloudflare.txt
│   ├── release-notes
│   │   └── release-notes-v7.1.4.md
│   └── setup-development
│       ├── README.md
│       ├── setup_consolidation_mdns.sh
│       ├── STARTUP_SETUP_GUIDE.md
│       └── test_service.sh
├── CHANGELOG-HISTORIC.md
├── CHANGELOG.md
├── claude_commands
│   ├── memory-context.md
│   ├── memory-health.md
│   ├── memory-ingest-dir.md
│   ├── memory-ingest.md
│   ├── memory-recall.md
│   ├── memory-search.md
│   ├── memory-store.md
│   ├── README.md
│   └── session-start.md
├── claude-hooks
│   ├── config.json
│   ├── config.template.json
│   ├── CONFIGURATION.md
│   ├── core
│   │   ├── memory-retrieval.js
│   │   ├── mid-conversation.js
│   │   ├── session-end.js
│   │   ├── session-start.js
│   │   └── topic-change.js
│   ├── debug-pattern-test.js
│   ├── install_claude_hooks_windows.ps1
│   ├── install_hooks.py
│   ├── memory-mode-controller.js
│   ├── MIGRATION.md
│   ├── README-NATURAL-TRIGGERS.md
│   ├── README-phase2.md
│   ├── README.md
│   ├── simple-test.js
│   ├── statusline.sh
│   ├── test-adaptive-weights.js
│   ├── test-dual-protocol-hook.js
│   ├── test-mcp-hook.js
│   ├── test-natural-triggers.js
│   ├── test-recency-scoring.js
│   ├── tests
│   │   ├── integration-test.js
│   │   ├── phase2-integration-test.js
│   │   ├── test-code-execution.js
│   │   ├── test-cross-session.json
│   │   ├── test-session-tracking.json
│   │   └── test-threading.json
│   ├── utilities
│   │   ├── adaptive-pattern-detector.js
│   │   ├── context-formatter.js
│   │   ├── context-shift-detector.js
│   │   ├── conversation-analyzer.js
│   │   ├── dynamic-context-updater.js
│   │   ├── git-analyzer.js
│   │   ├── mcp-client.js
│   │   ├── memory-client.js
│   │   ├── memory-scorer.js
│   │   ├── performance-manager.js
│   │   ├── project-detector.js
│   │   ├── session-tracker.js
│   │   ├── tiered-conversation-monitor.js
│   │   └── version-checker.js
│   └── WINDOWS-SESSIONSTART-BUG.md
├── CLAUDE.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── Development-Sprint-November-2025.md
├── docs
│   ├── amp-cli-bridge.md
│   ├── api
│   │   ├── code-execution-interface.md
│   │   ├── memory-metadata-api.md
│   │   ├── PHASE1_IMPLEMENTATION_SUMMARY.md
│   │   ├── PHASE2_IMPLEMENTATION_SUMMARY.md
│   │   ├── PHASE2_REPORT.md
│   │   └── tag-standardization.md
│   ├── architecture
│   │   ├── search-enhancement-spec.md
│   │   └── search-examples.md
│   ├── architecture.md
│   ├── archive
│   │   └── obsolete-workflows
│   │       ├── load_memory_context.md
│   │       └── README.md
│   ├── assets
│   │   └── images
│   │       ├── dashboard-v3.3.0-preview.png
│   │       ├── memory-awareness-hooks-example.png
│   │       ├── project-infographic.svg
│   │       └── README.md
│   ├── CLAUDE_CODE_QUICK_REFERENCE.md
│   ├── cloudflare-setup.md
│   ├── deployment
│   │   ├── docker.md
│   │   ├── dual-service.md
│   │   ├── production-guide.md
│   │   └── systemd-service.md
│   ├── development
│   │   ├── ai-agent-instructions.md
│   │   ├── code-quality
│   │   │   ├── phase-2a-completion.md
│   │   │   ├── phase-2a-handle-get-prompt.md
│   │   │   ├── phase-2a-index.md
│   │   │   ├── phase-2a-install-package.md
│   │   │   └── phase-2b-session-summary.md
│   │   ├── code-quality-workflow.md
│   │   ├── dashboard-workflow.md
│   │   ├── issue-management.md
│   │   ├── pr-review-guide.md
│   │   ├── refactoring-notes.md
│   │   ├── release-checklist.md
│   │   └── todo-tracker.md
│   ├── docker-optimized-build.md
│   ├── document-ingestion.md
│   ├── DOCUMENTATION_AUDIT.md
│   ├── enhancement-roadmap-issue-14.md
│   ├── examples
│   │   ├── analysis-scripts.js
│   │   ├── maintenance-session-example.md
│   │   ├── memory-distribution-chart.jsx
│   │   └── tag-schema.json
│   ├── first-time-setup.md
│   ├── glama-deployment.md
│   ├── guides
│   │   ├── advanced-command-examples.md
│   │   ├── chromadb-migration.md
│   │   ├── commands-vs-mcp-server.md
│   │   ├── mcp-enhancements.md
│   │   ├── mdns-service-discovery.md
│   │   ├── memory-consolidation-guide.md
│   │   ├── migration.md
│   │   ├── scripts.md
│   │   └── STORAGE_BACKENDS.md
│   ├── HOOK_IMPROVEMENTS.md
│   ├── hooks
│   │   └── phase2-code-execution-migration.md
│   ├── http-server-management.md
│   ├── ide-compatability.md
│   ├── IMAGE_RETENTION_POLICY.md
│   ├── images
│   │   └── dashboard-placeholder.md
│   ├── implementation
│   │   ├── health_checks.md
│   │   └── performance.md
│   ├── IMPLEMENTATION_PLAN_HTTP_SSE.md
│   ├── integration
│   │   ├── homebrew.md
│   │   └── multi-client.md
│   ├── integrations
│   │   ├── gemini.md
│   │   ├── groq-bridge.md
│   │   ├── groq-integration-summary.md
│   │   └── groq-model-comparison.md
│   ├── integrations.md
│   ├── legacy
│   │   └── dual-protocol-hooks.md
│   ├── LM_STUDIO_COMPATIBILITY.md
│   ├── maintenance
│   │   └── memory-maintenance.md
│   ├── mastery
│   │   ├── api-reference.md
│   │   ├── architecture-overview.md
│   │   ├── configuration-guide.md
│   │   ├── local-setup-and-run.md
│   │   ├── testing-guide.md
│   │   └── troubleshooting.md
│   ├── migration
│   │   └── code-execution-api-quick-start.md
│   ├── natural-memory-triggers
│   │   ├── cli-reference.md
│   │   ├── installation-guide.md
│   │   └── performance-optimization.md
│   ├── oauth-setup.md
│   ├── pr-graphql-integration.md
│   ├── quick-setup-cloudflare-dual-environment.md
│   ├── README.md
│   ├── remote-configuration-wiki-section.md
│   ├── research
│   │   ├── code-execution-interface-implementation.md
│   │   └── code-execution-interface-summary.md
│   ├── ROADMAP.md
│   ├── sqlite-vec-backend.md
│   ├── statistics
│   │   ├── charts
│   │   │   ├── activity_patterns.png
│   │   │   ├── contributors.png
│   │   │   ├── growth_trajectory.png
│   │   │   ├── monthly_activity.png
│   │   │   └── october_sprint.png
│   │   ├── data
│   │   │   ├── activity_by_day.csv
│   │   │   ├── activity_by_hour.csv
│   │   │   ├── contributors.csv
│   │   │   └── monthly_activity.csv
│   │   ├── generate_charts.py
│   │   └── REPOSITORY_STATISTICS.md
│   ├── technical
│   │   ├── development.md
│   │   ├── memory-migration.md
│   │   ├── migration-log.md
│   │   ├── sqlite-vec-embedding-fixes.md
│   │   └── tag-storage.md
│   ├── testing
│   │   └── regression-tests.md
│   ├── testing-cloudflare-backend.md
│   ├── troubleshooting
│   │   ├── cloudflare-api-token-setup.md
│   │   ├── cloudflare-authentication.md
│   │   ├── general.md
│   │   ├── hooks-quick-reference.md
│   │   ├── pr162-schema-caching-issue.md
│   │   ├── session-end-hooks.md
│   │   └── sync-issues.md
│   └── tutorials
│       ├── advanced-techniques.md
│       ├── data-analysis.md
│       └── demo-session-walkthrough.md
├── examples
│   ├── claude_desktop_config_template.json
│   ├── claude_desktop_config_windows.json
│   ├── claude-desktop-http-config.json
│   ├── config
│   │   └── claude_desktop_config.json
│   ├── http-mcp-bridge.js
│   ├── memory_export_template.json
│   ├── README.md
│   ├── setup
│   │   └── setup_multi_client_complete.py
│   └── start_https_example.sh
├── install_service.py
├── install.py
├── LICENSE
├── NOTICE
├── pyproject.toml
├── pytest.ini
├── README.md
├── run_server.py
├── scripts
│   ├── .claude
│   │   └── settings.local.json
│   ├── archive
│   │   └── check_missing_timestamps.py
│   ├── backup
│   │   ├── backup_memories.py
│   │   ├── backup_sqlite_vec.sh
│   │   ├── export_distributable_memories.sh
│   │   └── restore_memories.py
│   ├── benchmarks
│   │   ├── benchmark_code_execution_api.py
│   │   ├── benchmark_hybrid_sync.py
│   │   └── benchmark_server_caching.py
│   ├── database
│   │   ├── analyze_sqlite_vec_db.py
│   │   ├── check_sqlite_vec_status.py
│   │   ├── db_health_check.py
│   │   └── simple_timestamp_check.py
│   ├── development
│   │   ├── debug_server_initialization.py
│   │   ├── find_orphaned_files.py
│   │   ├── fix_mdns.sh
│   │   ├── fix_sitecustomize.py
│   │   ├── remote_ingest.sh
│   │   ├── setup-git-merge-drivers.sh
│   │   ├── uv-lock-merge.sh
│   │   └── verify_hybrid_sync.py
│   ├── hooks
│   │   └── pre-commit
│   ├── installation
│   │   ├── install_linux_service.py
│   │   ├── install_macos_service.py
│   │   ├── install_uv.py
│   │   ├── install_windows_service.py
│   │   ├── install.py
│   │   ├── setup_backup_cron.sh
│   │   ├── setup_claude_mcp.sh
│   │   └── setup_cloudflare_resources.py
│   ├── linux
│   │   ├── service_status.sh
│   │   ├── start_service.sh
│   │   ├── stop_service.sh
│   │   ├── uninstall_service.sh
│   │   └── view_logs.sh
│   ├── maintenance
│   │   ├── assign_memory_types.py
│   │   ├── check_memory_types.py
│   │   ├── cleanup_corrupted_encoding.py
│   │   ├── cleanup_memories.py
│   │   ├── cleanup_organize.py
│   │   ├── consolidate_memory_types.py
│   │   ├── consolidation_mappings.json
│   │   ├── delete_orphaned_vectors_fixed.py
│   │   ├── fast_cleanup_duplicates_with_tracking.sh
│   │   ├── find_all_duplicates.py
│   │   ├── find_cloudflare_duplicates.py
│   │   ├── find_duplicates.py
│   │   ├── memory-types.md
│   │   ├── README.md
│   │   ├── recover_timestamps_from_cloudflare.py
│   │   ├── regenerate_embeddings.py
│   │   ├── repair_malformed_tags.py
│   │   ├── repair_memories.py
│   │   ├── repair_sqlite_vec_embeddings.py
│   │   ├── repair_zero_embeddings.py
│   │   ├── restore_from_json_export.py
│   │   └── scan_todos.sh
│   ├── migration
│   │   ├── cleanup_mcp_timestamps.py
│   │   ├── legacy
│   │   │   └── migrate_chroma_to_sqlite.py
│   │   ├── mcp-migration.py
│   │   ├── migrate_sqlite_vec_embeddings.py
│   │   ├── migrate_storage.py
│   │   ├── migrate_tags.py
│   │   ├── migrate_timestamps.py
│   │   ├── migrate_to_cloudflare.py
│   │   ├── migrate_to_sqlite_vec.py
│   │   ├── migrate_v5_enhanced.py
│   │   ├── TIMESTAMP_CLEANUP_README.md
│   │   └── verify_mcp_timestamps.py
│   ├── pr
│   │   ├── amp_collect_results.sh
│   │   ├── amp_detect_breaking_changes.sh
│   │   ├── amp_generate_tests.sh
│   │   ├── amp_pr_review.sh
│   │   ├── amp_quality_gate.sh
│   │   ├── amp_suggest_fixes.sh
│   │   ├── auto_review.sh
│   │   ├── detect_breaking_changes.sh
│   │   ├── generate_tests.sh
│   │   ├── lib
│   │   │   └── graphql_helpers.sh
│   │   ├── quality_gate.sh
│   │   ├── resolve_threads.sh
│   │   ├── run_pyscn_analysis.sh
│   │   ├── run_quality_checks.sh
│   │   ├── thread_status.sh
│   │   └── watch_reviews.sh
│   ├── quality
│   │   ├── fix_dead_code_install.sh
│   │   ├── phase1_dead_code_analysis.md
│   │   ├── phase2_complexity_analysis.md
│   │   ├── README_PHASE1.md
│   │   ├── README_PHASE2.md
│   │   ├── track_pyscn_metrics.sh
│   │   └── weekly_quality_review.sh
│   ├── README.md
│   ├── run
│   │   ├── run_mcp_memory.sh
│   │   ├── run-with-uv.sh
│   │   └── start_sqlite_vec.sh
│   ├── run_memory_server.py
│   ├── server
│   │   ├── check_http_server.py
│   │   ├── check_server_health.py
│   │   ├── memory_offline.py
│   │   ├── preload_models.py
│   │   ├── run_http_server.py
│   │   ├── run_memory_server.py
│   │   ├── start_http_server.bat
│   │   └── start_http_server.sh
│   ├── service
│   │   ├── deploy_dual_services.sh
│   │   ├── install_http_service.sh
│   │   ├── mcp-memory-http.service
│   │   ├── mcp-memory.service
│   │   ├── memory_service_manager.sh
│   │   ├── service_control.sh
│   │   ├── service_utils.py
│   │   └── update_service.sh
│   ├── sync
│   │   ├── check_drift.py
│   │   ├── claude_sync_commands.py
│   │   ├── export_memories.py
│   │   ├── import_memories.py
│   │   ├── litestream
│   │   │   ├── apply_local_changes.sh
│   │   │   ├── enhanced_memory_store.sh
│   │   │   ├── init_staging_db.sh
│   │   │   ├── io.litestream.replication.plist
│   │   │   ├── manual_sync.sh
│   │   │   ├── memory_sync.sh
│   │   │   ├── pull_remote_changes.sh
│   │   │   ├── push_to_remote.sh
│   │   │   ├── README.md
│   │   │   ├── resolve_conflicts.sh
│   │   │   ├── setup_local_litestream.sh
│   │   │   ├── setup_remote_litestream.sh
│   │   │   ├── staging_db_init.sql
│   │   │   ├── stash_local_changes.sh
│   │   │   ├── sync_from_remote_noconfig.sh
│   │   │   └── sync_from_remote.sh
│   │   ├── README.md
│   │   ├── safe_cloudflare_update.sh
│   │   ├── sync_memory_backends.py
│   │   └── sync_now.py
│   ├── testing
│   │   ├── run_complete_test.py
│   │   ├── run_memory_test.sh
│   │   ├── simple_test.py
│   │   ├── test_cleanup_logic.py
│   │   ├── test_cloudflare_backend.py
│   │   ├── test_docker_functionality.py
│   │   ├── test_installation.py
│   │   ├── test_mdns.py
│   │   ├── test_memory_api.py
│   │   ├── test_memory_simple.py
│   │   ├── test_migration.py
│   │   ├── test_search_api.py
│   │   ├── test_sqlite_vec_embeddings.py
│   │   ├── test_sse_events.py
│   │   ├── test-connection.py
│   │   └── test-hook.js
│   ├── utils
│   │   ├── claude_commands_utils.py
│   │   ├── generate_personalized_claude_md.sh
│   │   ├── groq
│   │   ├── groq_agent_bridge.py
│   │   ├── list-collections.py
│   │   ├── memory_wrapper_uv.py
│   │   ├── query_memories.py
│   │   ├── smithery_wrapper.py
│   │   ├── test_groq_bridge.sh
│   │   └── uv_wrapper.py
│   └── validation
│       ├── check_dev_setup.py
│       ├── check_documentation_links.py
│       ├── diagnose_backend_config.py
│       ├── validate_configuration_complete.py
│       ├── validate_memories.py
│       ├── validate_migration.py
│       ├── validate_timestamp_integrity.py
│       ├── verify_environment.py
│       ├── verify_pytorch_windows.py
│       └── verify_torch.py
├── SECURITY.md
├── selective_timestamp_recovery.py
├── SPONSORS.md
├── src
│   └── mcp_memory_service
│       ├── __init__.py
│       ├── api
│       │   ├── __init__.py
│       │   ├── client.py
│       │   ├── operations.py
│       │   ├── sync_wrapper.py
│       │   └── types.py
│       ├── backup
│       │   ├── __init__.py
│       │   └── scheduler.py
│       ├── cli
│       │   ├── __init__.py
│       │   ├── ingestion.py
│       │   ├── main.py
│       │   └── utils.py
│       ├── config.py
│       ├── consolidation
│       │   ├── __init__.py
│       │   ├── associations.py
│       │   ├── base.py
│       │   ├── clustering.py
│       │   ├── compression.py
│       │   ├── consolidator.py
│       │   ├── decay.py
│       │   ├── forgetting.py
│       │   ├── health.py
│       │   └── scheduler.py
│       ├── dependency_check.py
│       ├── discovery
│       │   ├── __init__.py
│       │   ├── client.py
│       │   └── mdns_service.py
│       ├── embeddings
│       │   ├── __init__.py
│       │   └── onnx_embeddings.py
│       ├── ingestion
│       │   ├── __init__.py
│       │   ├── base.py
│       │   ├── chunker.py
│       │   ├── csv_loader.py
│       │   ├── json_loader.py
│       │   ├── pdf_loader.py
│       │   ├── registry.py
│       │   ├── semtools_loader.py
│       │   └── text_loader.py
│       ├── lm_studio_compat.py
│       ├── mcp_server.py
│       ├── models
│       │   ├── __init__.py
│       │   └── memory.py
│       ├── server.py
│       ├── services
│       │   ├── __init__.py
│       │   └── memory_service.py
│       ├── storage
│       │   ├── __init__.py
│       │   ├── base.py
│       │   ├── cloudflare.py
│       │   ├── factory.py
│       │   ├── http_client.py
│       │   ├── hybrid.py
│       │   └── sqlite_vec.py
│       ├── sync
│       │   ├── __init__.py
│       │   ├── exporter.py
│       │   ├── importer.py
│       │   └── litestream_config.py
│       ├── utils
│       │   ├── __init__.py
│       │   ├── cache_manager.py
│       │   ├── content_splitter.py
│       │   ├── db_utils.py
│       │   ├── debug.py
│       │   ├── document_processing.py
│       │   ├── gpu_detection.py
│       │   ├── hashing.py
│       │   ├── http_server_manager.py
│       │   ├── port_detection.py
│       │   ├── system_detection.py
│       │   └── time_parser.py
│       └── web
│           ├── __init__.py
│           ├── api
│           │   ├── __init__.py
│           │   ├── analytics.py
│           │   ├── backup.py
│           │   ├── consolidation.py
│           │   ├── documents.py
│           │   ├── events.py
│           │   ├── health.py
│           │   ├── manage.py
│           │   ├── mcp.py
│           │   ├── memories.py
│           │   ├── search.py
│           │   └── sync.py
│           ├── app.py
│           ├── dependencies.py
│           ├── oauth
│           │   ├── __init__.py
│           │   ├── authorization.py
│           │   ├── discovery.py
│           │   ├── middleware.py
│           │   ├── models.py
│           │   ├── registration.py
│           │   └── storage.py
│           ├── sse.py
│           └── static
│               ├── app.js
│               ├── index.html
│               ├── README.md
│               ├── sse_test.html
│               └── style.css
├── start_http_debug.bat
├── start_http_server.sh
├── test_document.txt
├── test_version_checker.js
├── tests
│   ├── __init__.py
│   ├── api
│   │   ├── __init__.py
│   │   ├── test_compact_types.py
│   │   └── test_operations.py
│   ├── bridge
│   │   ├── mock_responses.js
│   │   ├── package-lock.json
│   │   ├── package.json
│   │   └── test_http_mcp_bridge.js
│   ├── conftest.py
│   ├── consolidation
│   │   ├── __init__.py
│   │   ├── conftest.py
│   │   ├── test_associations.py
│   │   ├── test_clustering.py
│   │   ├── test_compression.py
│   │   ├── test_consolidator.py
│   │   ├── test_decay.py
│   │   └── test_forgetting.py
│   ├── contracts
│   │   └── api-specification.yml
│   ├── integration
│   │   ├── package-lock.json
│   │   ├── package.json
│   │   ├── test_api_key_fallback.py
│   │   ├── test_api_memories_chronological.py
│   │   ├── test_api_tag_time_search.py
│   │   ├── test_api_with_memory_service.py
│   │   ├── test_bridge_integration.js
│   │   ├── test_cli_interfaces.py
│   │   ├── test_cloudflare_connection.py
│   │   ├── test_concurrent_clients.py
│   │   ├── test_data_serialization_consistency.py
│   │   ├── test_http_server_startup.py
│   │   ├── test_mcp_memory.py
│   │   ├── test_mdns_integration.py
│   │   ├── test_oauth_basic_auth.py
│   │   ├── test_oauth_flow.py
│   │   ├── test_server_handlers.py
│   │   └── test_store_memory.py
│   ├── performance
│   │   ├── test_background_sync.py
│   │   └── test_hybrid_live.py
│   ├── README.md
│   ├── smithery
│   │   └── test_smithery.py
│   ├── sqlite
│   │   └── simple_sqlite_vec_test.py
│   ├── test_client.py
│   ├── test_content_splitting.py
│   ├── test_database.py
│   ├── test_hybrid_cloudflare_limits.py
│   ├── test_hybrid_storage.py
│   ├── test_memory_ops.py
│   ├── test_semantic_search.py
│   ├── test_sqlite_vec_storage.py
│   ├── test_time_parser.py
│   ├── test_timestamp_preservation.py
│   ├── timestamp
│   │   ├── test_hook_vs_manual_storage.py
│   │   ├── test_issue99_final_validation.py
│   │   ├── test_search_retrieval_inconsistency.py
│   │   ├── test_timestamp_issue.py
│   │   └── test_timestamp_simple.py
│   └── unit
│       ├── conftest.py
│       ├── test_cloudflare_storage.py
│       ├── test_csv_loader.py
│       ├── test_fastapi_dependencies.py
│       ├── test_import.py
│       ├── test_json_loader.py
│       ├── test_mdns_simple.py
│       ├── test_mdns.py
│       ├── test_memory_service.py
│       ├── test_memory.py
│       ├── test_semtools_loader.py
│       ├── test_storage_interface_compatibility.py
│       └── test_tag_time_filtering.py
├── tools
│   ├── docker
│   │   ├── DEPRECATED.md
│   │   ├── docker-compose.http.yml
│   │   ├── docker-compose.pythonpath.yml
│   │   ├── docker-compose.standalone.yml
│   │   ├── docker-compose.uv.yml
│   │   ├── docker-compose.yml
│   │   ├── docker-entrypoint-persistent.sh
│   │   ├── docker-entrypoint-unified.sh
│   │   ├── docker-entrypoint.sh
│   │   ├── Dockerfile
│   │   ├── Dockerfile.glama
│   │   ├── Dockerfile.slim
│   │   ├── README.md
│   │   └── test-docker-modes.sh
│   └── README.md
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/tests/consolidation/test_associations.py:
--------------------------------------------------------------------------------

```python
  1 | """Unit tests for the creative association engine."""
  2 | 
  3 | import pytest
  4 | from datetime import datetime, timedelta
  5 | 
  6 | from mcp_memory_service.consolidation.associations import (
  7 |     CreativeAssociationEngine, 
  8 |     AssociationAnalysis
  9 | )
 10 | from mcp_memory_service.consolidation.base import MemoryAssociation
 11 | from mcp_memory_service.models.memory import Memory
 12 | 
 13 | 
 14 | @pytest.mark.unit
 15 | class TestCreativeAssociationEngine:
 16 |     """Test the creative association discovery system."""
 17 |     
 18 |     @pytest.fixture
 19 |     def association_engine(self, consolidation_config):
 20 |         return CreativeAssociationEngine(consolidation_config)
 21 |     
 22 |     @pytest.mark.asyncio
 23 |     async def test_basic_association_discovery(self, association_engine, sample_memories):
 24 |         """Test basic association discovery functionality."""
 25 |         # Use memories that should have some associations
 26 |         memories = sample_memories[:5]
 27 |         
 28 |         associations = await association_engine.process(memories)
 29 |         
 30 |         # Should find some associations
 31 |         assert isinstance(associations, list)
 32 |         assert all(isinstance(assoc, MemoryAssociation) for assoc in associations)
 33 |         
 34 |         # Check association properties
 35 |         for assoc in associations:
 36 |             assert len(assoc.source_memory_hashes) == 2
 37 |             assert 0.3 <= assoc.similarity_score <= 0.7  # Sweet spot range
 38 |             assert assoc.discovery_method == "creative_association"
 39 |             assert isinstance(assoc.discovery_date, datetime)
 40 |     
 41 |     @pytest.mark.asyncio
 42 |     async def test_similarity_sweet_spot_filtering(self, association_engine):
 43 |         """Test that only memories in similarity sweet spot are associated."""
 44 |         now = datetime.now()
 45 |         
 46 |         # Create memories with known similarity relationships
 47 |         base_memory = Memory(
 48 |             content="Python programming concepts",
 49 |             content_hash="base",
 50 |             tags=["python", "programming"],
 51 |             embedding=[0.5, 0.5, 0.5, 0.5, 0.5] * 64,
 52 |             created_at=now.timestamp()
 53 |         )
 54 |         
 55 |         # Very similar memory (should be filtered out - too similar)
 56 |         too_similar = Memory(  
 57 |             content="Python programming concepts and techniques",
 58 |             content_hash="similar",
 59 |             tags=["python", "programming"],
 60 |             embedding=[0.51, 0.51, 0.51, 0.51, 0.51] * 64,  # Very similar
 61 |             created_at=now.timestamp()
 62 |         )
 63 |         
 64 |         # Moderately similar memory (should be included)
 65 |         good_similarity = Memory(
 66 |             content="JavaScript development practices",
 67 |             content_hash="moderate",
 68 |             tags=["javascript", "development"],
 69 |             embedding=[0.6, 0.4, 0.6, 0.4, 0.6] * 64,  # Moderate similarity
 70 |             created_at=now.timestamp()
 71 |         )
 72 |         
 73 |         # Very different memory (should be filtered out - too different)
 74 |         too_different = Memory(
 75 |             content="Weather forecast for tomorrow",
 76 |             content_hash="different",
 77 |             tags=["weather", "forecast"],
 78 |             embedding=[0.1, 0.9, 0.1, 0.9, 0.1] * 64,  # Very different
 79 |             created_at=now.timestamp()
 80 |         )
 81 |         
 82 |         memories = [base_memory, too_similar, good_similarity, too_different]
 83 |         associations = await association_engine.process(memories)
 84 |         
 85 |         # Should only find association between base and good_similarity
 86 |         if associations:  # May be empty due to confidence threshold
 87 |             for assoc in associations:
 88 |                 assert assoc.similarity_score >= 0.3
 89 |                 assert assoc.similarity_score <= 0.7
 90 |     
 91 |     @pytest.mark.asyncio
 92 |     async def test_existing_associations_filtering(self, association_engine, sample_memories):
 93 |         """Test that existing associations are not duplicated."""
 94 |         memories = sample_memories[:4]
 95 |         
 96 |         # Create set of existing associations
 97 |         existing = {
 98 |             (memories[0].content_hash, memories[1].content_hash),
 99 |             (memories[1].content_hash, memories[0].content_hash)  # Both directions
100 |         }
101 |         
102 |         associations = await association_engine.process(
103 |             memories, 
104 |             existing_associations=existing
105 |         )
106 |         
107 |         # Should not include the existing association
108 |         for assoc in associations:
109 |             pair = tuple(sorted(assoc.source_memory_hashes))
110 |             existing_pairs = {tuple(sorted(list(existing_pair))) for existing_pair in existing}
111 |             assert pair not in existing_pairs
112 |     
113 |     @pytest.mark.asyncio
114 |     async def test_association_analysis(self, association_engine):
115 |         """Test the association analysis functionality."""
116 |         # Create memories with known relationships
117 |         mem1 = Memory(
118 |             content="Python list comprehensions provide concise syntax",
119 |             content_hash="mem1",
120 |             tags=["python", "syntax"],
121 |             embedding=[0.4, 0.5, 0.6, 0.5, 0.4] * 64,
122 |             created_at=datetime.now().timestamp()
123 |         )
124 |         
125 |         mem2 = Memory(
126 |             content="JavaScript arrow functions offer clean syntax",
127 |             content_hash="mem2", 
128 |             tags=["javascript", "syntax"],
129 |             embedding=[0.5, 0.4, 0.5, 0.6, 0.5] * 64,
130 |             created_at=datetime.now().timestamp()
131 |         )
132 |         
133 |         # Calculate similarity
134 |         similarity = await association_engine._calculate_semantic_similarity(mem1, mem2)
135 |         
136 |         # Analyze the association
137 |         analysis = await association_engine._analyze_association(mem1, mem2, similarity)
138 |         
139 |         assert isinstance(analysis, AssociationAnalysis)
140 |         assert analysis.memory1_hash == "mem1"
141 |         assert analysis.memory2_hash == "mem2"
142 |         assert analysis.similarity_score == similarity
143 |         assert "shared_tags" in analysis.connection_reasons  # Both have "syntax" tag
144 |         assert "syntax" in analysis.tag_overlap
145 |         assert analysis.confidence_score > 0
146 |     
147 |     @pytest.mark.asyncio
148 |     async def test_temporal_relationship_analysis(self, association_engine):
149 |         """Test temporal relationship detection."""
150 |         now = datetime.now()
151 |         
152 |         # Memories created on same day
153 |         mem1 = Memory(
154 |             content="Morning meeting notes",
155 |             content_hash="morning",
156 |             tags=["meeting"],
157 |             embedding=[0.4, 0.5, 0.6] * 107,  # ~320 dim
158 |             created_at=now.timestamp()
159 |         )
160 |         
161 |         mem2 = Memory(
162 |             content="Afternoon project update",
163 |             content_hash="afternoon",
164 |             tags=["project"],
165 |             embedding=[0.5, 0.4, 0.5] * 107,
166 |             created_at=(now + timedelta(hours=6)).timestamp()
167 |         )
168 |         
169 |         analysis = await association_engine._analyze_association(
170 |             mem1, mem2, 0.5
171 |         )
172 |         
173 |         assert analysis.temporal_relationship == "same_day"
174 |         assert "temporal_proximity" in analysis.connection_reasons
175 |     
176 |     @pytest.mark.asyncio
177 |     async def test_concept_extraction(self, association_engine):
178 |         """Test concept extraction from memory content."""
179 |         content = 'Check out this URL: https://example.com and email me at [email protected]. The API returns {"status": "success"} with CamelCase variables.'
180 |         
181 |         concepts = association_engine._extract_concepts(content)
182 |         
183 |         # Should extract various types of concepts
184 |         assert "https://example.com" in concepts or any("example.com" in c for c in concepts)
185 |         assert "[email protected]" in concepts
186 |         assert "CamelCase" in concepts or any("camel" in c.lower() for c in concepts)
187 |         assert len(concepts) > 0
188 |     
189 |     @pytest.mark.asyncio
190 |     async def test_structural_similarity_detection(self, association_engine):
191 |         """Test detection of similar structural patterns."""
192 |         content1 = """
193 |         # Header 1
194 |         - Item 1
195 |         - Item 2
196 |         ```code block```
197 |         """
198 |         
199 |         content2 = """
200 |         # Header 2  
201 |         - Different item 1
202 |         - Different item 2
203 |         ```another code block```
204 |         """
205 |         
206 |         has_similar = association_engine._has_similar_structure(content1, content2)
207 |         assert has_similar is True
208 |         
209 |         # Test different structure
210 |         content3 = "Just plain text without any special formatting."
211 |         has_different = association_engine._has_similar_structure(content1, content3)
212 |         assert has_different is False
213 |     
214 |     @pytest.mark.asyncio
215 |     async def test_complementary_content_detection(self, association_engine):
216 |         """Test detection of complementary content patterns."""
217 |         # Question and answer pattern
218 |         question_content = "How do you implement binary search? What is the time complexity?"
219 |         answer_content = "Binary search implementation uses divide and conquer. Time complexity is O(log n)."
220 |         
221 |         is_complementary = association_engine._has_complementary_content(
222 |             question_content, answer_content
223 |         )
224 |         assert is_complementary is True
225 |         
226 |         # Problem and solution pattern
227 |         problem_content = "The database query is failing with timeout error"
228 |         solution_content = "Fixed the timeout by adding proper indexing to resolve the issue"
229 |         
230 |         is_complementary_ps = association_engine._has_complementary_content(
231 |             problem_content, solution_content
232 |         )
233 |         assert is_complementary_ps is True
234 |     
235 |     @pytest.mark.asyncio
236 |     async def test_confidence_score_calculation(self, association_engine):
237 |         """Test confidence score calculation."""
238 |         # High confidence scenario
239 |         high_confidence = association_engine._calculate_confidence_score(
240 |             similarity=0.6,      # Good similarity
241 |             num_reasons=3,       # Multiple connection reasons
242 |             num_shared_concepts=5,  # Many shared concepts
243 |             num_shared_tags=2    # Shared tags
244 |         )
245 |         
246 |         # Low confidence scenario
247 |         low_confidence = association_engine._calculate_confidence_score(  
248 |             similarity=0.35,     # Lower similarity
249 |             num_reasons=1,       # Few reasons
250 |             num_shared_concepts=1,  # Few concepts
251 |             num_shared_tags=0    # No shared tags
252 |         )
253 |         
254 |         assert high_confidence > low_confidence
255 |         assert 0 <= high_confidence <= 1
256 |         assert 0 <= low_confidence <= 1
257 |     
258 |     @pytest.mark.asyncio
259 |     async def test_filter_high_confidence_associations(self, association_engine, sample_memories):
260 |         """Test filtering associations by confidence score."""
261 |         memories = sample_memories[:4]
262 |         associations = await association_engine.process(memories)
263 |         
264 |         if associations:  # Only test if associations were found
265 |             high_confidence = await association_engine.filter_high_confidence_associations(
266 |                 associations, min_confidence=0.7
267 |             )
268 |             
269 |             # All returned associations should meet confidence threshold
270 |             for assoc in high_confidence:
271 |                 assert assoc.metadata.get('confidence_score', 0) >= 0.7
272 |     
273 |     @pytest.mark.asyncio
274 |     async def test_group_associations_by_type(self, association_engine, sample_memories):
275 |         """Test grouping associations by connection type."""
276 |         memories = sample_memories[:5]
277 |         associations = await association_engine.process(memories)
278 |         
279 |         if associations:  # Only test if associations were found
280 |             grouped = await association_engine.group_associations_by_type(associations)
281 |             
282 |             assert isinstance(grouped, dict)
283 |             
284 |             # Each group should contain associations of the same type
285 |             for connection_type, group in grouped.items():
286 |                 assert all(assoc.connection_type == connection_type for assoc in group)
287 |     
288 |     @pytest.mark.asyncio
289 |     async def test_text_similarity_fallback(self, association_engine):
290 |         """Test text similarity fallback when embeddings are unavailable."""
291 |         mem1 = Memory(
292 |             content="python programming language concepts",
293 |             content_hash="text1",
294 |             tags=["python"],
295 |             embedding=None,  # No embedding
296 |             created_at=datetime.now().timestamp()
297 |         )
298 |         
299 |         mem2 = Memory(
300 |             content="programming language python concepts", 
301 |             content_hash="text2",
302 |             tags=["python"],
303 |             embedding=None,  # No embedding
304 |             created_at=datetime.now().timestamp()
305 |         )
306 |         
307 |         similarity = await association_engine._calculate_semantic_similarity(mem1, mem2)
308 |         
309 |         # Should use text-based similarity
310 |         assert 0 <= similarity <= 1
311 |         assert similarity > 0  # Should find some similarity due to word overlap
312 |     
313 |     @pytest.mark.asyncio
314 |     async def test_max_pairs_limiting(self, association_engine, large_memory_set):
315 |         """Test that pair sampling limits combinatorial explosion."""
316 |         # Use many memories to test pair limiting
317 |         memories = large_memory_set[:20]  # 20 memories = 190 possible pairs
318 |         
319 |         # Mock the max_pairs to a small number for testing
320 |         original_max = association_engine.max_pairs_per_run
321 |         association_engine.max_pairs_per_run = 10
322 |         
323 |         try:
324 |             associations = await association_engine.process(memories)
325 |             
326 |             # Should handle large memory sets without performance issues
327 |             # and limit the number of pairs processed
328 |             assert isinstance(associations, list)
329 |             
330 |         finally:
331 |             # Restore original value
332 |             association_engine.max_pairs_per_run = original_max
333 |     
334 |     @pytest.mark.asyncio
335 |     async def test_empty_memories_list(self, association_engine):
336 |         """Test handling of empty or insufficient memories list."""
337 |         # Empty list
338 |         associations = await association_engine.process([])
339 |         assert associations == []
340 |         
341 |         # Single memory (can't create associations)
342 |         single_memory = [Memory(
343 |             content="Single memory",
344 |             content_hash="single",
345 |             tags=["test"],
346 |             embedding=[0.1] * 320,
347 |             created_at=datetime.now().timestamp()
348 |         )]
349 |         
350 |         associations = await association_engine.process(single_memory)
351 |         assert associations == []
352 |     
353 |     @pytest.mark.asyncio
354 |     async def test_association_metadata_completeness(self, association_engine, sample_memories):
355 |         """Test that association metadata contains all expected fields."""
356 |         memories = sample_memories[:3]
357 |         associations = await association_engine.process(memories)
358 |         
359 |         for assoc in associations:
360 |             # Check basic fields
361 |             assert len(assoc.source_memory_hashes) == 2
362 |             assert isinstance(assoc.similarity_score, float)
363 |             assert isinstance(assoc.connection_type, str)
364 |             assert assoc.discovery_method == "creative_association"
365 |             assert isinstance(assoc.discovery_date, datetime)
366 |             
367 |             # Check metadata fields
368 |             assert 'shared_concepts' in assoc.metadata
369 |             assert 'confidence_score' in assoc.metadata
370 |             assert 'analysis_version' in assoc.metadata
371 |             assert isinstance(assoc.metadata['shared_concepts'], list)
372 |             assert isinstance(assoc.metadata['confidence_score'], float)
```
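
For orientation, a minimal sketch of the behaviour these tests assert: candidate pairs are kept only when their similarity lands in the 0.3-0.7 "sweet spot", with a word-overlap fallback when embeddings are missing. The helper names and the Jaccard fallback below are illustrative assumptions, not the engine's actual implementation (which lives in `mcp_memory_service.consolidation.associations`).

```python
# Illustrative sketch only; names and the exact fallback metric are assumptions.
from itertools import combinations

import numpy as np


def cosine_similarity(a, b) -> float:
    """Cosine similarity between two embedding vectors."""
    va, vb = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    denom = np.linalg.norm(va) * np.linalg.norm(vb)
    return float(va @ vb / denom) if denom else 0.0


def text_similarity(a: str, b: str) -> float:
    """Word-overlap (Jaccard) fallback when one or both embeddings are missing."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def candidate_pairs(memories, low: float = 0.3, high: float = 0.7):
    """Yield memory pairs whose similarity falls inside the creative sweet spot."""
    for m1, m2 in combinations(memories, 2):
        if m1.embedding is not None and m2.embedding is not None:
            score = cosine_similarity(m1.embedding, m2.embedding)
        else:
            score = text_similarity(m1.content, m2.content)
        if low <= score <= high:
            yield m1, m2, score
```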

--------------------------------------------------------------------------------
/.claude/agents/code-quality-guard.md:
--------------------------------------------------------------------------------

```markdown
  1 | ---
  2 | name: code-quality-guard
  3 | description: Fast automated code quality analysis using Gemini CLI for complexity scoring, refactoring suggestions, TODO prioritization, and security pattern detection. Use this agent before commits, during PR creation, or when refactoring code to ensure quality standards.
  4 | model: sonnet
  5 | color: green
  6 | ---
  7 | 
  8 | You are an elite Code Quality Guardian, a specialized AI agent focused on maintaining exceptional code quality through automated analysis, refactoring suggestions, and proactive issue detection. Your mission is to prevent technical debt and ensure the MCP Memory Service codebase remains clean, efficient, and maintainable.
  9 | 
 10 | ## Core Responsibilities
 11 | 
 12 | 1. **Complexity Analysis**: Identify overly complex functions and suggest simplifications
 13 | 2. **Refactoring Recommendations**: Detect code smells and propose improvements
 14 | 3. **TODO Prioritization**: Scan codebase for TODOs and rank by urgency/impact
 15 | 4. **Security Pattern Detection**: Identify potential vulnerabilities (SQL injection, XSS, command injection)
 16 | 5. **Performance Hotspot Identification**: Find slow code paths and suggest optimizations
 17 | 
 18 | ## LLM Integration
 19 | 
 20 | The code-quality-guard agent supports two LLM backends for fast, non-interactive code analysis:
 21 | 
 22 | ### Gemini CLI (Default)
 23 | Balanced performance and accuracy. Best for most use cases.
 24 | 
 25 | ### Groq Bridge (Optional - 10x Faster)
 26 | Ultra-fast inference using Groq's optimized infrastructure. Ideal for CI/CD and large-scale analysis.
 27 | 
 28 | **Setup**: See `docs/integrations/groq-bridge.md` for installation instructions.
 29 | 
 30 | ### Basic Usage Pattern
 31 | 
 32 | ```bash
 33 | # Gemini CLI (default)
 34 | gemini "Analyze the complexity of this Python file and rate each function 1-10. File: $(cat "src/file.py")"
 35 | 
 36 | # Groq Bridge (faster alternative)
 37 | python scripts/utils/groq_agent_bridge.py "Analyze the complexity of this Python file and rate each function 1-10. File: $(cat "src/file.py")"
 38 | 
 39 | # Suggest refactoring
 40 | gemini "Identify code smells in this file and suggest specific refactorings: $(cat "src/file.py")"
 41 | 
 42 | # Scan for TODOs
 43 | gemini "Extract all TODO comments from this codebase and prioritize by impact: $(find src -name '*.py' -exec cat {} \; | grep -n TODO)"
 44 | 
 45 | # Security analysis
 46 | gemini "Check this code for security vulnerabilities (SQL injection, XSS, command injection): $(cat "src/file.py")"
 47 | ```
 48 | 
 49 | ### Complexity Analysis Workflow
 50 | 
 51 | ```bash
 52 | #!/bin/bash
 53 | # analyze_complexity.sh - Analyze code complexity of modified files
 54 | 
 55 | # Get modified Python files
 56 | modified_files=$(git diff --name-only --diff-filter=AM | grep '\.py$')
 57 | 
 58 | for file in $modified_files; do
 59 |     echo "=== Analyzing: $file ==="
 60 |     # Use mktemp for a secure temp file
 61 |     temp_file=$(mktemp)
 62 |     gemini "Analyze this Python file for complexity. Rate each function 1-10 (1=simple, 10=very complex). List functions with score >7 first. Be concise. File: $(cat "$file")" \
 63 |         > "$temp_file"
 64 |     mv "$temp_file" "/tmp/complexity_${file//\//_}.txt"
 65 | done
 66 | 
 67 | # Aggregate results
 68 | echo "=== High Complexity Functions (Score > 7) ==="
 69 | grep -h "^[0-9]" /tmp/complexity_*.txt | awk '$2 > 7' | sort -nr
 70 | ```
 71 | 
 72 | ## Decision-Making Framework
 73 | 
 74 | ### When to Run Analysis
 75 | 
 76 | **Pre-Commit (Automated)**:
 77 | - Complexity check on modified files
 78 | - Security pattern scan
 79 | - TODO tracking updates
 80 | 
 81 | **During PR Creation (Manual)**:
 82 | - Full complexity analysis of changed files
 83 | - Refactoring opportunity identification
 84 | - Performance hotspot detection
 85 | 
 86 | **On-Demand (Manual)**:
 87 | - Before major refactoring work
 88 | - When investigating performance issues
 89 | - During technical debt assessment
 90 | 
 91 | ### Complexity Thresholds
 92 | 
 93 | - **1-3**: Simple, well-structured code ✅
 94 | - **4-6**: Moderate complexity, acceptable 🟡
 95 | - **7-8**: High complexity, consider refactoring 🟠
 96 | - **9-10**: Very complex, immediate refactoring needed 🔴
 97 | 
 98 | ### Priority Assessment for TODOs
 99 | 
100 | **Critical (P0)**: Security vulnerabilities, data corruption risks, blocking bugs
101 | **High (P1)**: Performance bottlenecks, user-facing issues, incomplete features
102 | **Medium (P2)**: Code quality improvements, minor optimizations, convenience features
103 | **Low (P3)**: Documentation, cosmetic changes, nice-to-haves
104 | 
105 | ## Operational Workflows
106 | 
107 | ### 1. Pre-Commit Hook Integration
108 | 
109 | ```bash
110 | #!/bin/bash
111 | # .git/hooks/pre-commit
112 | 
113 | echo "Running code quality checks..."
114 | 
115 | # Get staged Python files
116 | staged_files=$(git diff --cached --name-only --diff-filter=AM | grep '\.py$')
117 | 
118 | if [ -z "$staged_files" ]; then
119 |     echo "No Python files to check."
120 |     exit 0
121 | fi
122 | 
123 | high_complexity=0
124 | 
125 | for file in $staged_files; do
126 |     echo "Checking: $file"
127 | 
128 |     # Complexity check
129 |     result=$(gemini "Analyze this file. Report ONLY functions with complexity >7 in format 'FunctionName: Score'. $(cat "$file")")
130 | 
131 |     if [ ! -z "$result" ]; then
132 |         echo "⚠️  High complexity detected in $file:"
133 |         echo "$result"
134 |         high_complexity=1
135 |     fi
136 | 
137 |     # Security check
138 |     security=$(gemini "Check for security issues: SQL injection, XSS, command injection. Report ONLY if found. $(cat "$file")")
139 | 
140 |     if [ ! -z "$security" ]; then
141 |         echo "🔴 Security issue detected in $file:"
142 |         echo "$security"
143 |         exit 1  # Block commit
144 |     fi
145 | done
146 | 
147 | if [ $high_complexity -eq 1 ]; then
148 |     echo ""
149 |     echo "High complexity detected. Continue anyway? (y/n)"
150 |     read -r response
151 |     if [ "$response" != "y" ]; then
152 |         exit 1
153 |     fi
154 | fi
155 | 
156 | echo "✅ Code quality checks passed"
157 | exit 0
158 | ```
159 | 
160 | ### 2. TODO Scanner and Prioritizer
161 | 
162 | ```bash
163 | #!/bin/bash
164 | # scripts/maintenance/scan_todos.sh
165 | 
166 | echo "Scanning codebase for TODOs..."
167 | 
168 | # Extract all TODOs with file and line number
169 | todos=$(grep -rn "TODO\|FIXME\|XXX" src --include="*.py")
170 | 
171 | if [ -z "$todos" ]; then
172 |     echo "No TODOs found."
173 |     exit 0
174 | fi
175 | 
176 | # Use mktemp for secure temporary file
177 | temp_todos=$(mktemp)
178 | echo "$todos" > "$temp_todos"
179 | 
180 | # Use Gemini to prioritize
181 | gemini "Analyze these TODOs and categorize by priority (Critical/High/Medium/Low). Consider: security impact, feature completeness, performance implications, technical debt accumulation. Format: [Priority] File:Line - Brief description
182 | 
183 | $(cat "$temp_todos")
184 | 
185 | Output in this exact format:
186 | [CRITICAL] file.py:line - description
187 | [HIGH] file.py:line - description
188 | ..." > /tmp/todos_prioritized.txt
189 | 
190 | echo "=== Prioritized TODOs ==="
191 | cat /tmp/todos_prioritized.txt
192 | 
193 | # Count by priority
194 | echo ""
195 | echo "=== Summary ==="
196 | echo "Critical: $(grep -c '^\[CRITICAL\]' /tmp/todos_prioritized.txt)"
197 | echo "High: $(grep -c '^\[HIGH\]' /tmp/todos_prioritized.txt)"
198 | echo "Medium: $(grep -c '^\[MEDIUM\]' /tmp/todos_prioritized.txt)"
199 | echo "Low: $(grep -c '^\[LOW\]' /tmp/todos_prioritized.txt)"
200 | 
201 | # Cleanup
202 | rm -f "$temp_todos"
203 | ```
204 | 
205 | ### 3. Refactoring Opportunity Finder
206 | 
207 | ```bash
208 | #!/bin/bash
209 | # scripts/development/find_refactoring_opportunities.sh
210 | 
211 | target_dir="${1:-src/mcp_memory_service}"
212 | 
213 | echo "Scanning $target_dir for refactoring opportunities..."
214 | 
215 | # Analyze each Python file
216 | find "$target_dir" -name '*.py' -print0 | while IFS= read -r -d '' file; do
217 |     echo "Analyzing: $file"
218 | 
219 |     gemini "Identify code smells and refactoring opportunities in this file. Focus on: duplicate code, long functions (>50 lines), god classes, tight coupling. Be specific with line numbers if possible. File: $(cat "$file")" \
220 |         > "/tmp/refactor_$(basename "$file").txt"
221 | done
222 | 
223 | # Aggregate results
224 | echo ""
225 | echo "=== Refactoring Opportunities ==="
226 | cat /tmp/refactor_*.txt | grep -E "(Duplicate|Long function|God class|Tight coupling)" | sort | uniq
227 | 
228 | # Cleanup
229 | rm -f /tmp/refactor_*.txt
230 | ```
231 | 
232 | ### 4. Security Pattern Scanner
233 | 
234 | ```bash
235 | #!/bin/bash
236 | # scripts/security/scan_vulnerabilities.sh
237 | 
238 | echo "Scanning for security vulnerabilities..."
239 | 
240 | vulnerabilities_found=0
241 | 
242 | while IFS= read -r -d '' file; do
243 |     result=$(gemini "Security audit this Python file. Check for: SQL injection (raw SQL queries), XSS (unescaped HTML), command injection (os.system, subprocess with shell=True), path traversal, hardcoded secrets. Report ONLY if vulnerabilities found with line numbers. File: $(cat "$file")")
244 | 
245 |     if [ ! -z "$result" ]; then
246 |         echo "🔴 VULNERABILITY in $file:"
247 |         echo "$result"
248 |         echo ""
249 |         vulnerabilities_found=1
250 |     fi
251 | done < <(find src -name '*.py' -print0)  # process substitution keeps vulnerabilities_found in this shell
252 | 
253 | if [ $vulnerabilities_found -eq 0 ]; then
254 |     echo "✅ No security vulnerabilities detected"
255 |     exit 0
256 | else
257 |     echo "⚠️  Security vulnerabilities found. Please review and fix."
258 |     exit 1
259 | fi
260 | ```
261 | 
262 | ## Project-Specific Patterns
263 | 
264 | ### MCP Memory Service Code Quality Standards
265 | 
266 | **Complexity Targets**:
267 | - Storage backend methods: ≤6 complexity
268 | - MCP tool handlers: ≤5 complexity
269 | - Web API endpoints: ≤4 complexity
270 | - Utility functions: ≤3 complexity
271 | 
272 | **Security Checklist**:
273 | - ✅ No raw SQL queries (use parameterized queries)
274 | - ✅ All HTML output escaped (via `escapeHtml()`)
275 | - ✅ No `shell=True` in subprocess calls
276 | - ✅ Input validation on all API endpoints
277 | - ✅ No hardcoded credentials (use environment variables)
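
A minimal sketch of the first and third checklist items (parameterized SQL, no `shell=True`); the table name and backup command are illustrative assumptions, not taken from the service code:

```python
# Illustrative only: table name and CLI command are hypothetical.
import sqlite3
import subprocess


def find_memories(conn: sqlite3.Connection, tag: str):
    # Parameterized query: the driver binds `tag`, so no SQL injection.
    return conn.execute(
        "SELECT content FROM memories WHERE tags LIKE ?", (f"%{tag}%",)
    ).fetchall()


def run_backup(db_path: str) -> None:
    # Argument list instead of shell=True: no shell, no command injection.
    subprocess.run(["sqlite3", db_path, ".backup backup.db"], check=True)
```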
278 | 
279 | **Performance Patterns**:
280 | - ✅ Async/await for all I/O operations
281 | - ✅ Database connection pooling
282 | - ✅ Response caching where appropriate
283 | - ✅ Batch operations for bulk inserts
284 | - ✅ Lazy loading for expensive computations
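
A hedged sketch of the async and batching patterns above; the `aiosqlite` driver and table layout are assumptions for illustration, not the service's actual storage code:

```python
# Illustrative only: driver choice and schema are assumptions.
import aiosqlite


async def store_batch(db_path: str, rows: list[tuple[str, str]]) -> None:
    # One executemany call instead of per-row INSERTs keeps the event loop
    # free and reduces round trips for bulk inserts.
    async with aiosqlite.connect(db_path) as db:
        await db.executemany(
            "INSERT INTO memories (content_hash, content) VALUES (?, ?)", rows
        )
        await db.commit()


# asyncio.run(store_batch("memory.db", [("h1", "note one"), ("h2", "note two")]))
```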
285 | 
286 | ### Known TODOs in Codebase (as of v8.19.1)
287 | 
288 | 1. **`src/mcp_memory_service/storage/cloudflare.py:789`**
289 |    - TODO: Implement fallback to local sentence-transformers
290 |    - Priority: HIGH (affects offline operation)
291 | 
292 | 2. **`src/mcp_memory_service/storage/base.py:45`**
293 |    - TODO: Implement efficient batch queries for last_used and memory_types
294 |    - Priority: MEDIUM (performance optimization)
295 | 
296 | 3. **`src/mcp_memory_service/web/api/manage.py:50`**
297 |    - TODO: Migrate to lifespan context manager (FastAPI 0.109+)
298 |    - Priority: LOW (modernization, not blocking)
299 | 
300 | 4. **`src/mcp_memory_service/storage/sqlite_vec.py:234`**
301 |    - TODO: Add memories_this_month to storage.get_stats()
302 |    - Priority: MEDIUM (analytics feature)
303 | 
304 | 5. **`src/mcp_memory_service/tools.py:123`**
305 |    - TODO: CRITICAL - Period filtering not implemented
306 |    - Priority: HIGH (incomplete feature)
307 | 
308 | ## Usage Examples
309 | 
310 | ### Quick Complexity Check
311 | 
312 | ```bash
313 | # Check a single file
314 | gemini "Rate complexity 1-10 for each function. List high complexity (>7) first: $(cat "src/mcp_memory_service/storage/hybrid.py")"
315 | ```
316 | 
317 | ### Pre-PR Quality Gate
318 | 
319 | ```bash
320 | # Run before creating PR
321 | git diff main...HEAD --name-only | grep '\.py$' | while IFS= read -r file; do
322 |     echo "=== $file ==="
323 |     gemini "Quick code review: complexity score, security issues, refactoring suggestions. 3 sentences max. $(cat "$file")"
324 |     echo ""
325 | done
326 | ```
327 | 
328 | ### TODO Tracking Update
329 | 
330 | ```bash
331 | # Update TODO tracking
332 | bash scripts/maintenance/scan_todos.sh > docs/development/todo-tracker.md
333 | git add docs/development/todo-tracker.md
334 | git commit -m "chore: update TODO tracker"
335 | ```
336 | 
337 | ## Integration with Other Agents
338 | 
339 | **With github-release-manager**:
340 | - Run code quality checks before version bumps
341 | - Include TODO count in release notes if significant
342 | - Block releases if critical security issues found
343 | 
344 | **With amp-bridge**:
345 | - Use Amp for deep architectural analysis
346 | - Use code-quality-guard for fast, file-level checks
347 | 
348 | **With gemini-pr-automator**:
349 | - Quality checks before automated PR creation
350 | - Refactoring suggestions as PR comments
351 | - Security scan blocks PR merge if issues found
352 | 
353 | ## pyscn Integration (Comprehensive Static Analysis)
354 | 
355 | pyscn (Python Static Code Navigator) complements LLM-based checks with deep static analysis.
356 | 
357 | ### When to Run pyscn
358 | 
359 | **PR Creation (Automated)**:
360 | ```bash
361 | bash scripts/pr/quality_gate.sh 123 --with-pyscn
362 | ```
363 | 
364 | **Local Pre-PR Check**:
365 | ```bash
366 | pyscn analyze .
367 | open .pyscn/reports/analyze_*.html
368 | ```
369 | 
370 | **Weekly Reviews (Scheduled)**:
371 | ```bash
372 | bash scripts/quality/weekly_quality_review.sh
373 | ```
374 | 
375 | ### pyscn Capabilities
376 | 
377 | 1. **Cyclomatic Complexity**: Function-level complexity scoring
378 | 2. **Dead Code Detection**: Unreachable code and unused imports
379 | 3. **Clone Detection**: Exact and near-exact code duplication
380 | 4. **Coupling Metrics**: CBO (Coupling Between Objects) analysis
381 | 5. **Dependency Graph**: Module dependencies and circular detection
382 | 6. **Architecture Validation**: Layer compliance and violations
383 | 
384 | ### Health Score Thresholds
385 | 
386 | - **<50**: 🔴 **Release Blocker** - Cannot merge until fixed
387 | - **50-69**: 🟡 **Action Required** - Plan refactoring within 2 weeks
388 | - **70-84**: ✅ **Good** - Monitor trends, continue development
389 | - **85+**: 🎯 **Excellent** - Maintain current standards
390 | 
391 | ### Tool Complementarity
392 | 
393 | | Tool | Speed | Scope | Blocking | Use Case |
394 | |------|-------|-------|----------|----------|
395 | | **Groq/Gemini (pre-commit)** | <5s | Changed files | Yes (complexity >8) | Every commit |
396 | | **pyscn (PR)** | 30-60s | Full codebase | Yes (health <50) | PR creation |
397 | | **Gemini (manual)** | 2-5s/file | Targeted | No | Refactoring |
398 | 
399 | ### Integration Points
400 | 
401 | **Pre-commit**: Fast LLM checks (Groq primary, Gemini fallback)
402 | **PR Quality Gate**: `--with-pyscn` flag for comprehensive analysis
403 | **Periodic**: Weekly codebase-wide pyscn reviews
404 | 
405 | ### Interpreting pyscn Reports
406 | 
407 | **Complexity Score (40/100 - Poor)**:
408 | - Priority: Refactor top 5 functions with complexity >10
409 | - Example: `install.py::main()` - 62 complexity
410 | 
411 | **Duplication Score (30/100 - Poor)**:
412 | - Priority: Consolidate duplicate code (>6% duplication)
413 | - Tool: Use pyscn clone detection to identify groups
414 | 
415 | **Dead Code Score (70/100 - Fair)**:
416 | - Priority: Remove unreachable code after returns
417 | - Example: `scripts/installation/install.py:1361-1365`
418 | 
419 | **Architecture Score (75/100 - Good)**:
420 | - Priority: Fix layer violations (scripts→presentation)
421 | - Example: Domain importing application layer
422 | 
423 | ### Quick Commands
424 | 
425 | ```bash
426 | # Full analysis with HTML report
427 | pyscn analyze .
428 | 
429 | # JSON output for scripting
430 | pyscn analyze . --format json > /tmp/metrics.json
431 | 
432 | # PR integration
433 | bash scripts/pr/run_pyscn_analysis.sh --pr 123
434 | 
435 | # Track metrics over time
436 | bash scripts/quality/track_pyscn_metrics.sh
437 | ```
438 | 
439 | ## Best Practices
440 | 
441 | 1. **Run complexity checks on every commit**: Catch issues early
442 | 2. **Review TODO priorities monthly**: Prevent backlog accumulation
443 | 3. **Security scans before releases**: Never ship with known vulnerabilities
444 | 4. **Refactoring sprints quarterly**: Address accumulated technical debt
445 | 5. **Document quality standards**: Keep this agent specification updated
446 | 6. **Track pyscn health score weekly**: Monitor quality trends
447 | 7. **Address health score <70 within 2 weeks**: Prevent technical debt accumulation
448 | 
449 | ## Limitations
450 | 
451 | - **Context size**: Large files (>1000 lines) may need splitting for analysis
452 | - **False positives**: Security scanner may flag safe patterns (manual review needed)
453 | - **Subjective scoring**: Complexity ratings are estimates, use as guidance
454 | - **API rate limits**: Gemini CLI has rate limits, space out large scans
455 | - **pyscn performance**: Full analysis takes 30-60s (use sparingly on large codebases)
456 | 
457 | ## Performance Considerations
458 | 
459 | - Single file analysis (LLM): ~2-5 seconds
460 | - Full codebase TODO scan: ~30-60 seconds (100+ files)
461 | - Security audit per file: ~3-8 seconds
462 | - pyscn full analysis: ~30-60 seconds (252 files)
463 | - Recommended: Run on modified files only for pre-commit hooks
464 | 
465 | ---
466 | 
467 | **Quick Reference Card**:
468 | 
469 | ```bash
470 | # Complexity
471 | gemini "Complexity 1-10 per function, high (>7) first: $(cat "file.py")"
472 | 
473 | # Security
474 | gemini "Security: SQL injection, XSS, command injection: $(cat "file.py")"
475 | 
476 | # TODOs
477 | gemini "Prioritize these TODOs (Critical/High/Medium/Low): $(grep -rn "TODO\|FIXME\|XXX" src/)"
478 | 
479 | # Refactoring
480 | gemini "Code smells & refactoring opportunities: $(cat "file.py")"
481 | 
482 | # pyscn (comprehensive)
483 | bash scripts/pr/run_pyscn_analysis.sh --pr 123
484 | ```
485 | 
```

--------------------------------------------------------------------------------
/src/mcp_memory_service/web/api/memories.py:
--------------------------------------------------------------------------------

```python
  1 | # Copyright 2024 Heinrich Krupp
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     http://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | 
 15 | """
 16 | Memory CRUD endpoints for the HTTP interface.
 17 | """
 18 | 
 19 | import logging
 20 | import socket
 21 | from typing import List, Optional, Dict, Any, TYPE_CHECKING
 22 | from datetime import datetime
 23 | 
 24 | from fastapi import APIRouter, HTTPException, Depends, Query, Request
 25 | from pydantic import BaseModel, Field
 26 | 
 27 | from ...storage.base import MemoryStorage
 28 | from ...models.memory import Memory
 29 | from ...services.memory_service import MemoryService
 30 | from ...utils.hashing import generate_content_hash
 31 | from ...config import INCLUDE_HOSTNAME, OAUTH_ENABLED
 32 | from ..dependencies import get_storage, get_memory_service
 33 | from ..sse import sse_manager, create_memory_stored_event, create_memory_deleted_event
 34 | 
 35 | # OAuth authentication imports (conditional)
 36 | if OAUTH_ENABLED or TYPE_CHECKING:
 37 |     from ..oauth.middleware import require_read_access, require_write_access, AuthenticationResult
 38 | else:
 39 |     # Provide type stubs when OAuth is disabled
 40 |     AuthenticationResult = None
 41 |     require_read_access = None
 42 |     require_write_access = None
 43 | 
 44 | router = APIRouter()
 45 | logger = logging.getLogger(__name__)
 46 | 
 47 | 
 48 | # Request/Response Models
 49 | class MemoryCreateRequest(BaseModel):
 50 |     """Request model for creating a new memory."""
 51 |     content: str = Field(..., description="The memory content to store")
 52 |     tags: List[str] = Field(default=[], description="Tags to categorize the memory")
 53 |     memory_type: Optional[str] = Field(None, description="Type of memory (e.g., 'note', 'reminder', 'fact')")
 54 |     metadata: Dict[str, Any] = Field(default={}, description="Additional metadata for the memory")
 55 |     client_hostname: Optional[str] = Field(None, description="Client machine hostname for source tracking")
 56 | 
 57 | 
 58 | class MemoryUpdateRequest(BaseModel):
 59 |     """Request model for updating memory metadata (tags, type, metadata only)."""
 60 |     tags: Optional[List[str]] = Field(None, description="Updated tags to categorize the memory")
 61 |     memory_type: Optional[str] = Field(None, description="Updated memory type (e.g., 'note', 'reminder', 'fact')")
 62 |     metadata: Optional[Dict[str, Any]] = Field(None, description="Updated metadata for the memory")
 63 | 
 64 | 
 65 | class MemoryResponse(BaseModel):
 66 |     """Response model for memory data."""
 67 |     content: str
 68 |     content_hash: str
 69 |     tags: List[str]
 70 |     memory_type: Optional[str]
 71 |     metadata: Dict[str, Any]
 72 |     created_at: Optional[float]
 73 |     created_at_iso: Optional[str]
 74 |     updated_at: Optional[float]  
 75 |     updated_at_iso: Optional[str]
 76 | 
 77 | 
 78 | class MemoryListResponse(BaseModel):
 79 |     """Response model for paginated memory list."""
 80 |     memories: List[MemoryResponse]
 81 |     total: int
 82 |     page: int
 83 |     page_size: int
 84 |     has_more: bool
 85 | 
 86 | 
 87 | class MemoryCreateResponse(BaseModel):
 88 |     """Response model for memory creation."""
 89 |     success: bool
 90 |     message: str
 91 |     content_hash: Optional[str] = None
 92 |     memory: Optional[MemoryResponse] = None
 93 | 
 94 | 
 95 | class MemoryDeleteResponse(BaseModel):
 96 |     """Response model for memory deletion."""
 97 |     success: bool
 98 |     message: str
 99 |     content_hash: str
100 | 
101 | 
102 | class MemoryUpdateResponse(BaseModel):
103 |     """Response model for memory update."""
104 |     success: bool
105 |     message: str
106 |     content_hash: str
107 |     memory: Optional[MemoryResponse] = None
108 | 
109 | 
110 | class TagResponse(BaseModel):
111 |     """Response model for a single tag with its count."""
112 |     tag: str
113 |     count: int
114 | 
115 | 
116 | class TagListResponse(BaseModel):
117 |     """Response model for tags list."""
118 |     tags: List[TagResponse]
119 | 
120 | 
121 | def memory_to_response(memory: Memory) -> MemoryResponse:
122 |     """Convert Memory model to response format."""
123 |     return MemoryResponse(
124 |         content=memory.content,
125 |         content_hash=memory.content_hash,
126 |         tags=memory.tags,
127 |         memory_type=memory.memory_type,
128 |         metadata=memory.metadata,
129 |         created_at=memory.created_at,
130 |         created_at_iso=memory.created_at_iso,
131 |         updated_at=memory.updated_at,
132 |         updated_at_iso=memory.updated_at_iso
133 |     )
134 | 
135 | 
136 | @router.post("/memories", response_model=MemoryCreateResponse, tags=["memories"])
137 | async def store_memory(
138 |     request: MemoryCreateRequest,
139 |     http_request: Request,
140 |     memory_service: MemoryService = Depends(get_memory_service),
141 |     user: AuthenticationResult = Depends(require_write_access) if OAUTH_ENABLED else None
142 | ):
143 |     """
144 |     Store a new memory.
145 | 
146 |     Uses the MemoryService for consistent business logic including content processing,
147 |     hostname tagging, and metadata enrichment.
148 |     """
149 |     try:
150 |         # Resolve hostname for consistent tagging (logic stays in API layer, tagging in service)
151 |         client_hostname = None
152 |         if INCLUDE_HOSTNAME:
153 |             # Prioritize client-provided hostname, then header, then fallback to server
154 |             # 1. Check if client provided hostname in request body
155 |             if request.client_hostname:
156 |                 client_hostname = request.client_hostname
157 |             # 2. Check for X-Client-Hostname header
158 |             elif http_request.headers.get('X-Client-Hostname'):
159 |                 client_hostname = http_request.headers.get('X-Client-Hostname')
160 |             # 3. Fallback to server hostname (original behavior)
161 |             else:
162 |                 client_hostname = socket.gethostname()
163 | 
164 |         # Use injected MemoryService for consistent business logic (hostname tagging handled internally)
165 |         result = await memory_service.store_memory(
166 |             content=request.content,
167 |             tags=request.tags,
168 |             memory_type=request.memory_type,
169 |             metadata=request.metadata,
170 |             client_hostname=client_hostname
171 |         )
172 | 
173 |         if result["success"]:
174 |             # Broadcast SSE event for successful memory storage
175 |             try:
176 |                 # Handle both single memory and chunked responses
177 |                 if "memory" in result:
178 |                     memory_data = {
179 |                         "content_hash": result["memory"]["content_hash"],
180 |                         "content": result["memory"]["content"],
181 |                         "tags": result["memory"]["tags"],
182 |                         "memory_type": result["memory"]["memory_type"]
183 |                     }
184 |                 else:
185 |                     # For chunked responses, use the first chunk's data
186 |                     first_memory = result["memories"][0]
187 |                     memory_data = {
188 |                         "content_hash": first_memory["content_hash"],
189 |                         "content": first_memory["content"],
190 |                         "tags": first_memory["tags"],
191 |                         "memory_type": first_memory["memory_type"]
192 |                     }
193 | 
194 |                 event = create_memory_stored_event(memory_data)
195 |                 await sse_manager.broadcast_event(event)
196 |             except Exception as e:
197 |                 # Don't fail the request if SSE broadcasting fails
198 |                 logger.warning(f"Failed to broadcast memory_stored event: {e}")
199 | 
200 |             # Return appropriate response based on MemoryService result
201 |             if "memory" in result:
202 |                 # Single memory response
203 |                 return MemoryCreateResponse(
204 |                     success=True,
205 |                     message="Memory stored successfully",
206 |                     content_hash=result["memory"]["content_hash"],
207 |                     memory=result["memory"]
208 |                 )
209 |             else:
210 |                 # Chunked memory response
211 |                 first_memory = result["memories"][0]
212 |                 return MemoryCreateResponse(
213 |                     success=True,
214 |                     message=f"Memory stored as {result['total_chunks']} chunks",
215 |                     content_hash=first_memory["content_hash"],
216 |                     memory=first_memory
217 |                 )
218 |         else:
219 |             return MemoryCreateResponse(
220 |                 success=False,
221 |                 message=result.get("error", "Failed to store memory"),
222 |                 content_hash=None
223 |             )
224 |             
225 |     except Exception as e:
226 |         logger.error(f"Failed to store memory: {str(e)}")
227 |         raise HTTPException(status_code=500, detail="Failed to store memory. Please try again.")
228 | 
229 | 
230 | @router.get("/memories", response_model=MemoryListResponse, tags=["memories"])
231 | async def list_memories(
232 |     page: int = Query(1, ge=1, description="Page number (1-based)"),
233 |     page_size: int = Query(10, ge=1, le=100, description="Number of memories per page"),
234 |     tag: Optional[str] = Query(None, description="Filter by tag"),
235 |     memory_type: Optional[str] = Query(None, description="Filter by memory type"),
236 |     memory_service: MemoryService = Depends(get_memory_service),
237 |     user: AuthenticationResult = Depends(require_read_access) if OAUTH_ENABLED else None
238 | ):
239 |     """
240 |     List memories with pagination and optional filtering.
241 | 
242 |     Uses the MemoryService for consistent business logic and optimal database-level filtering.
243 |     """
244 |     try:
245 |         # Use the injected service for consistent, performant memory listing
246 |         result = await memory_service.list_memories(
247 |             page=page,
248 |             page_size=page_size,
249 |             tag=tag,
250 |             memory_type=memory_type
251 |         )
252 | 
253 |         return MemoryListResponse(
254 |             memories=result["memories"],
255 |             total=result["total"],
256 |             page=result["page"],
257 |             page_size=result["page_size"],
258 |             has_more=result["has_more"]
259 |         )
260 | 
261 |     except Exception as e:
262 |         raise HTTPException(status_code=500, detail=f"Failed to list memories: {str(e)}")
263 | 
264 | 
265 | @router.get("/memories/{content_hash}", response_model=MemoryResponse, tags=["memories"])
266 | async def get_memory(
267 |     content_hash: str,
268 |     storage: MemoryStorage = Depends(get_storage),
269 |     user: AuthenticationResult = Depends(require_read_access) if OAUTH_ENABLED else None
270 | ):
271 |     """
272 |     Get a specific memory by its content hash.
273 |     
274 |     Retrieves a single memory entry using its unique content hash identifier.
275 |     """
276 |     try:
277 |         # Use the new get_by_hash method for direct hash lookup
278 |         memory = await storage.get_by_hash(content_hash)
279 |         
280 |         if not memory:
281 |             raise HTTPException(status_code=404, detail="Memory not found")
282 |         
283 |         return memory_to_response(memory)
284 |         
285 |     except HTTPException:
286 |         raise
287 |     except Exception as e:
288 |         raise HTTPException(status_code=500, detail=f"Failed to get memory: {str(e)}")
289 | 
290 | 
291 | @router.delete("/memories/{content_hash}", response_model=MemoryDeleteResponse, tags=["memories"])
292 | async def delete_memory(
293 |     content_hash: str,
294 |     storage: MemoryStorage = Depends(get_storage),
295 |     user: AuthenticationResult = Depends(require_write_access) if OAUTH_ENABLED else None
296 | ):
297 |     """
298 |     Delete a memory by its content hash.
299 |     
300 |     Permanently removes a memory entry from the storage.
301 |     """
302 |     try:
303 |         success, message = await storage.delete(content_hash)
304 |         
305 |         # Broadcast SSE event for memory deletion
306 |         try:
307 |             event = create_memory_deleted_event(content_hash, success)
308 |             await sse_manager.broadcast_event(event)
309 |         except Exception as e:
310 |             # Don't fail the request if SSE broadcasting fails
311 |             logger.warning(f"Failed to broadcast memory_deleted event: {e}")
312 |         
313 |         return MemoryDeleteResponse(
314 |             success=success,
315 |             message=message,
316 |             content_hash=content_hash
317 |         )
318 | 
319 |     except Exception as e:
320 |         logger.error(f"Failed to delete memory: {str(e)}")
321 |         raise HTTPException(status_code=500, detail="Failed to delete memory. Please try again.")
322 | 
323 | 
324 | @router.put("/memories/{content_hash}", response_model=MemoryUpdateResponse, tags=["memories"])
325 | async def update_memory(
326 |     content_hash: str,
327 |     request: MemoryUpdateRequest,
328 |     storage: MemoryStorage = Depends(get_storage),
329 |     user: AuthenticationResult = Depends(require_write_access) if OAUTH_ENABLED else None
330 | ):
331 |     """
332 |     Update memory metadata (tags, type, metadata) without changing content or timestamps.
333 | 
334 |     This endpoint allows updating only the metadata aspects of a memory while preserving
335 |     the original content and creation timestamp. Only provided fields will be updated.
336 |     """
337 |     try:
338 |         # First, check if the memory exists
339 |         existing_memory = await storage.get_by_hash(content_hash)
340 |         if not existing_memory:
341 |             raise HTTPException(status_code=404, detail=f"Memory with hash {content_hash} not found")
342 | 
343 |         # Build the updates dictionary with only provided fields
344 |         updates = {}
345 |         if request.tags is not None:
346 |             updates['tags'] = request.tags
347 |         if request.memory_type is not None:
348 |             updates['memory_type'] = request.memory_type
349 |         if request.metadata is not None:
350 |             updates['metadata'] = request.metadata
351 | 
352 |         # If no updates provided, return current memory
353 |         if not updates:
354 |             return MemoryUpdateResponse(
355 |                 success=True,
356 |                 message="No updates provided - memory unchanged",
357 |                 content_hash=content_hash,
358 |                 memory=memory_to_response(existing_memory)
359 |             )
360 | 
361 |         # Perform the update
362 |         success, message = await storage.update_memory_metadata(
363 |             content_hash=content_hash,
364 |             updates=updates,
365 |             preserve_timestamps=True
366 |         )
367 | 
368 |         if success:
369 |             # Get the updated memory
370 |             updated_memory = await storage.get_by_hash(content_hash)
371 | 
372 |             return MemoryUpdateResponse(
373 |                 success=True,
374 |                 message=message,
375 |                 content_hash=content_hash,
376 |                 memory=memory_to_response(updated_memory) if updated_memory else None
377 |             )
378 |         else:
379 |             return MemoryUpdateResponse(
380 |                 success=False,
381 |                 message=message,
382 |                 content_hash=content_hash
383 |             )
384 | 
385 |     except HTTPException:
386 |         raise
387 |     except Exception as e:
388 |         raise HTTPException(status_code=500, detail=f"Failed to update memory: {str(e)}")
389 | 
390 | 
391 | @router.get("/tags", response_model=TagListResponse, tags=["tags"])
392 | async def get_tags(
393 |     storage: MemoryStorage = Depends(get_storage),
394 |     user: AuthenticationResult = Depends(require_read_access) if OAUTH_ENABLED else None
395 | ):
396 |     """
397 |     Get all tags with their usage counts.
398 | 
399 |     Returns a list of all unique tags along with how many memories use each tag,
400 |     sorted by count in descending order.
401 |     """
402 |     try:
403 |         # Get tags with counts from storage
404 |         tag_data = await storage.get_all_tags_with_counts()
405 | 
406 |         # Convert to response format
407 |         tags = [TagResponse(tag=item["tag"], count=item["count"]) for item in tag_data]
408 | 
409 |         return TagListResponse(tags=tags)
410 | 
411 |     except AttributeError as e:
412 |         # Handle case where storage backend doesn't implement get_all_tags_with_counts
413 |         raise HTTPException(status_code=501, detail=f"Tags endpoint not supported by current storage backend: {str(e)}")
414 |     except Exception as e:
415 |         raise HTTPException(status_code=500, detail=f"Failed to get tags: {str(e)}")
```

--------------------------------------------------------------------------------
/archive/docs-removed-2025-08-23/development/autonomous-memory-consolidation.md:
--------------------------------------------------------------------------------

```markdown
  1 | # Autonomous Implementation Guide for Dream-Inspired Memory Consolidation
  2 | 
  3 | ## Overview
  4 | 
  5 | This document provides a comprehensive guide for implementing the [Dream-Inspired Memory Consolidation System](./dream-inspired-memory-consolidation.md) as a fully autonomous system that runs without external AI dependencies.
  6 | 
  7 | ## Key Insight
  8 | 
  9 | **The dream-inspired memory consolidation system can run almost entirely on autopilot** by leveraging the embeddings already generated during memory storage. These embeddings, combined with mathematical algorithms and rule-based logic, enable autonomous operation without external AI.
 10 | 
 11 | ## Autonomous Components Analysis
 12 | 
 13 | ### ✅ 100% Autonomous: Exponential Decay Scoring
 14 | 
 15 | Pure mathematical calculations requiring zero AI intervention:
 16 | 
 17 | ```python
 18 | import math
 19 | from datetime import datetime
 20 | 
 21 | class AutonomousDecayCalculator:
 22 |     def __init__(self, retention_periods):
 23 |         self.retention_periods = retention_periods
 24 |     
 25 |     def calculate_relevance(self, memory):
 26 |         """Calculate memory relevance using pure math."""
 27 |         age = (datetime.now() - memory.created_at).total_seconds() / 86400  # days
 28 |         base_score = memory.importance_score
 29 |         
 30 |         retention_period = self.retention_periods.get(
 31 |             memory.memory_type, 
 32 |             self.retention_periods['default']
 33 |         )
 34 |         
 35 |         # Exponential decay
 36 |         decay_factor = math.exp(-age / retention_period)
 37 |         
 38 |         # Connection boost (pure counting)
 39 |         connection_boost = 1 + (0.1 * len(memory.connections))
 40 |         
 41 |         return base_score * decay_factor * connection_boost
 42 | ```
 43 | 
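For example, a `standard` memory (30-day retention) that is 30 days old, with a base importance of 0.8 and two connections, scores roughly 0.8 × e⁻¹ × 1.2 ≈ 0.35.
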
 44 | ### ✅ 100% Autonomous: Creative Association System
 45 | 
 46 | Uses existing embeddings with vector mathematics:
 47 | 
 48 | ```python
 49 | import numpy as np
 50 | from itertools import combinations
 51 | import random
 52 | 
 53 | class AutonomousAssociationEngine:
 54 |     def __init__(self, similarity_threshold=(0.3, 0.7)):
 55 |         self.min_similarity, self.max_similarity = similarity_threshold
 56 |     
 57 |     def find_associations(self, memories):
 58 |         """Find creative connections using only embeddings."""
 59 |         # Limit pairs to prevent combinatorial explosion
 60 |         max_pairs = min(100, len(memories) * (len(memories) - 1) // 2)
 61 |         
 62 |         if len(memories) < 2:
 63 |             return []
 64 |         
 65 |         # Random sampling of pairs
 66 |         all_pairs = list(combinations(range(len(memories)), 2))
 67 |         sampled_pairs = random.sample(
 68 |             all_pairs, 
 69 |             min(max_pairs, len(all_pairs))
 70 |         )
 71 |         
 72 |         associations = []
 73 |         for i, j in sampled_pairs:
 74 |             # Cosine similarity using existing embeddings
 75 |             similarity = self._cosine_similarity(
 76 |                 memories[i].embedding,
 77 |                 memories[j].embedding
 78 |             )
 79 |             
 80 |             # Check if in creative "sweet spot"
 81 |             if self.min_similarity < similarity < self.max_similarity:
 82 |                 associations.append({
 83 |                     'memory_1': memories[i].hash,
 84 |                     'memory_2': memories[j].hash,
 85 |                     'similarity': similarity,
 86 |                     'discovered_at': datetime.now()
 87 |                 })
 88 |         
 89 |         return associations
 90 |     
 91 |     def _cosine_similarity(self, vec1, vec2):
 92 |         """Calculate cosine similarity between two vectors."""
 93 |         vec1 = np.array(vec1)
 94 |         vec2 = np.array(vec2)
 95 |         
 96 |         dot_product = np.dot(vec1, vec2)
 97 |         norm_product = np.linalg.norm(vec1) * np.linalg.norm(vec2)
 98 |         
 99 |         return dot_product / norm_product if norm_product > 0 else 0
100 | ```
101 | 
102 | ### ✅ 100% Autonomous: Controlled Forgetting
103 | 
104 | Rule-based logic with no AI required:
105 | 
106 | ```python
107 | class AutonomousPruningEngine:
108 |     def __init__(self, config):
109 |         self.relevance_threshold = config['relevance_threshold']
110 |         self.access_threshold_days = config['access_threshold_days']
111 |         self.protected_tags = {'important', 'critical', 'reference'}
112 |     
113 |     def identify_forgettable_memories(self, memories):
114 |         """Identify memories for archival using rules."""
115 |         forgettable = []
116 |         
117 |         for memory in memories:
118 |             # Skip protected memories
119 |             if memory.tags & self.protected_tags:
120 |                 continue
121 |             
122 |             # Check relevance score
123 |             if memory.relevance_score < self.relevance_threshold:
124 |                 # Check connections
125 |                 if len(memory.connections) == 0:
126 |                     # Check last access
127 |                     days_since_access = (
128 |                         datetime.now() - memory.last_accessed
129 |                     ).days
130 |                     
131 |                     if days_since_access > self.access_threshold_days:
132 |                         forgettable.append(memory)
133 |         
134 |         return forgettable
135 | ```
136 | 
137 | ### 🔧 90% Autonomous: Semantic Compression
138 | 
139 | Uses statistical methods instead of generative AI:
140 | 
141 | ```python
142 | from collections import Counter
143 | from sklearn.cluster import AgglomerativeClustering
144 | import numpy as np
145 | 
146 | class AutonomousCompressionEngine:
147 |     def __init__(self):
148 |         self.keyword_extractor = TFIDFKeywordExtractor()
149 |     
150 |     def compress_cluster(self, memories):
151 |         """Compress memories without using generative AI."""
152 |         if not memories:
153 |             return None
154 |         
155 |         # 1. Find centroid (most representative memory)
156 |         embeddings = np.array([m.embedding for m in memories])
157 |         centroid = np.mean(embeddings, axis=0)
158 |         
159 |         # Calculate distances to centroid
160 |         distances = [
161 |             np.linalg.norm(centroid - emb) 
162 |             for emb in embeddings
163 |         ]
164 |         representative_idx = np.argmin(distances)
165 |         representative_memory = memories[representative_idx]
166 |         
167 |         # 2. Extract keywords using TF-IDF
168 |         all_content = ' '.join([m.content for m in memories])
169 |         keywords = self.keyword_extractor.extract(all_content, top_k=20)
170 |         
171 |         # 3. Aggregate metadata
172 |         all_tags = set()
173 |         for memory in memories:
174 |             all_tags.update(memory.tags)
175 |         
176 |         # 4. Create structured summary (not prose)
177 |         compressed = {
178 |             "type": "compressed_cluster",
179 |             "representative_content": representative_memory.content,
180 |             "representative_hash": representative_memory.hash,
181 |             "cluster_size": len(memories),
182 |             "keywords": keywords,
183 |             "common_tags": list(all_tags),
184 |             "temporal_range": {
185 |                 "start": min(m.created_at for m in memories),
186 |                 "end": max(m.created_at for m in memories)
187 |             },
188 |             "centroid_embedding": centroid.tolist(),
189 |             "member_hashes": [m.hash for m in memories]
190 |         }
191 |         
192 |         return compressed
193 | 
194 | class TFIDFKeywordExtractor:
195 |     """Simple TF-IDF based keyword extraction."""
196 |     
197 |     def extract(self, text, top_k=10):
198 |         # Simple word frequency for demonstration
199 |         # In practice, use sklearn's TfidfVectorizer
200 |         words = text.lower().split()
201 |         word_freq = Counter(words)
202 |         
203 |         # Filter common words (simple stopword removal)
204 |         stopwords = {'the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at'}
205 |         keywords = [
206 |             (word, count) 
207 |             for word, count in word_freq.most_common(top_k * 2)
208 |             if word not in stopwords and len(word) > 3
209 |         ]
210 |         
211 |         return keywords[:top_k]
212 | ```
213 | 
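As the extractor's comment notes, production use would replace the simple frequency counter with sklearn's `TfidfVectorizer`. A minimal sketch of that substitution, assuming each cluster member's content is available as a string (the helper name and parameters are illustrative, not part of the guide's API):

```python
# Illustrative TF-IDF keyword extraction over a cluster of memory contents.
from sklearn.feature_extraction.text import TfidfVectorizer
import numpy as np

def extract_cluster_keywords(memory_contents: list[str], top_k: int = 20):
    """Return the top_k terms with the highest summed TF-IDF weight across the cluster."""
    vectorizer = TfidfVectorizer(stop_words='english', max_features=5000)
    tfidf = vectorizer.fit_transform(memory_contents)   # shape: (n_memories, n_terms)
    scores = np.asarray(tfidf.sum(axis=0)).ravel()      # aggregate weight per term
    terms = vectorizer.get_feature_names_out()
    top_indices = scores.argsort()[::-1][:top_k]
    return [(terms[i], float(scores[i])) for i in top_indices]
```
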
214 | ## Complete Autonomous Architecture
215 | 
216 | ```python
217 | from apscheduler.schedulers.asyncio import AsyncIOScheduler
218 | import logging
219 | 
220 | class AutonomousMemoryConsolidator:
221 |     """
222 |     Fully autonomous memory consolidation system.
223 |     Runs without any external AI dependencies.
224 |     """
225 |     
226 |     def __init__(self, storage, config):
227 |         self.storage = storage
228 |         self.config = config
229 |         
230 |         # Initialize autonomous components
231 |         self.decay_calculator = AutonomousDecayCalculator(
232 |             config['retention_periods']
233 |         )
234 |         self.association_engine = AutonomousAssociationEngine()
235 |         self.compression_engine = AutonomousCompressionEngine()
236 |         self.pruning_engine = AutonomousPruningEngine(config['forgetting'])
237 |         
238 |         # Setup scheduling
239 |         self.scheduler = AsyncIOScheduler()  # async consolidation jobs require the asyncio scheduler
240 |         self._setup_schedules()
241 |         
242 |         logging.info("Autonomous Memory Consolidator initialized")
243 |     
244 |     def _setup_schedules(self):
245 |         """Configure autonomous scheduling."""
246 |         # Daily consolidation at 3 AM
247 |         self.scheduler.add_job(
248 |             func=self.run_daily_consolidation,
249 |             trigger="cron",
250 |             hour=3,
251 |             minute=0,
252 |             id="daily_consolidation"
253 |         )
254 |         
255 |         # Weekly consolidation on Mondays at 4 AM
256 |         self.scheduler.add_job(
257 |             func=self.run_weekly_consolidation,
258 |             trigger="cron",
259 |             day_of_week='mon',
260 |             hour=4,
261 |             minute=0,
262 |             id="weekly_consolidation"
263 |         )
264 |         
265 |         # Monthly consolidation on 1st at 5 AM
266 |         self.scheduler.add_job(
267 |             func=self.run_monthly_consolidation,
268 |             trigger="cron",
269 |             day=1,
270 |             hour=5,
271 |             minute=0,
272 |             id="monthly_consolidation"
273 |         )
274 |     
275 |     def start(self):
276 |         """Start the autonomous consolidation system."""
277 |         self.scheduler.start()
278 |         logging.info("Autonomous consolidation scheduler started")
279 |     
280 |     async def run_daily_consolidation(self):
281 |         """Daily consolidation - fully autonomous."""
282 |         try:
283 |             # Get recent memories
284 |             memories = await self.storage.get_recent_memories(days=1)
285 |             
286 |             # Update relevance scores (pure math)
287 |             for memory in memories:
288 |                 memory.relevance_score = self.decay_calculator.calculate_relevance(memory)
289 |                 await self.storage.update_relevance_score(memory.hash, memory.relevance_score)
290 |             
291 |             # Find associations (vector math)
292 |             associations = self.association_engine.find_associations(memories)
293 |             for assoc in associations:
294 |                 await self.storage.store_association(assoc)
295 |             
296 |             logging.info(
297 |                 f"Daily consolidation complete: "
298 |                 f"{len(memories)} memories processed, "
299 |                 f"{len(associations)} associations found"
300 |             )
301 |             
302 |         except Exception as e:
303 |             logging.error(f"Daily consolidation failed: {e}")
304 |     
305 |     async def run_weekly_consolidation(self):
306 |         """Weekly consolidation with clustering."""
307 |         try:
308 |             # Get week's memories
309 |             memories = await self.storage.get_recent_memories(days=7)
310 |             
311 |             # Cluster memories using embeddings
312 |             clusters = self._cluster_memories(memories)
313 |             
314 |             # Compress large clusters
315 |             for cluster in clusters:
316 |                 if len(cluster) >= self.config['compression']['min_cluster_size']:
317 |                     compressed = self.compression_engine.compress_cluster(cluster)
318 |                     await self.storage.store_compressed_memory(compressed)
319 |             
320 |             logging.info(f"Weekly consolidation: {len(clusters)} clusters processed")
321 |             
322 |         except Exception as e:
323 |             logging.error(f"Weekly consolidation failed: {e}")
324 |     
325 |     def _cluster_memories(self, memories, threshold=0.3):
326 |         """Cluster memories using hierarchical clustering."""
327 |         if len(memories) < 2:
328 |             return [[m] for m in memories]
329 |         
330 |         # Extract embeddings
331 |         embeddings = np.array([m.embedding for m in memories])
332 |         
333 |         # Hierarchical clustering
334 |         clustering = AgglomerativeClustering(
335 |             n_clusters=None,
336 |             distance_threshold=threshold,
337 |             linkage='average'
338 |         )
339 |         labels = clustering.fit_predict(embeddings)
340 |         
341 |         # Group by cluster
342 |         clusters = {}
343 |         for idx, label in enumerate(labels):
344 |             if label not in clusters:
345 |                 clusters[label] = []
346 |             clusters[label].append(memories[idx])
347 |         
348 |         return list(clusters.values())
349 | ```
350 | 
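A hedged usage sketch for the consolidator above, assuming a storage backend that implements the async methods it calls (`get_recent_memories`, `update_relevance_score`, `store_association`, and so on); the config keys mirror the examples in this guide:

```python
import asyncio

config = {
    "retention_periods": {"critical": 365, "reference": 180, "standard": 30,
                          "temporary": 7, "default": 30},
    "forgetting": {"relevance_threshold": 0.1, "access_threshold_days": 90},
    "compression": {"min_cluster_size": 5, "clustering_threshold": 0.3},
}

async def main(storage):
    consolidator = AutonomousMemoryConsolidator(storage, config)
    consolidator.start()                          # register the cron schedules
    await consolidator.run_daily_consolidation()  # optionally run one pass immediately

# asyncio.run(main(my_storage)) once a concrete storage backend is available
```
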
351 | ## Deployment Configuration
352 | 
353 | ```yaml
354 | # autonomous_consolidation_config.yaml
355 | autonomous_mode:
356 |   enabled: true
357 |   
358 |   # No external AI endpoints needed!
359 |   external_ai_required: false
360 |   
361 |   # Retention periods (in days)
362 |   retention_periods:
363 |     critical: 365
364 |     reference: 180
365 |     standard: 30
366 |     temporary: 7
367 |     default: 30
368 |   
369 |   # Association discovery
370 |   associations:
371 |     min_similarity: 0.3
372 |     max_similarity: 0.7
373 |     max_pairs_per_run: 100
374 |     enabled: true
375 |   
376 |   # Forgetting rules
377 |   forgetting:
378 |     relevance_threshold: 0.1
379 |     access_threshold_days: 90
380 |     archive_path: "./memory_archive"
381 |     enabled: true
382 |   
383 |   # Compression settings
384 |   compression:
385 |     min_cluster_size: 5
386 |     clustering_threshold: 0.3
387 |     enabled: true
388 |   
389 |   # Scheduling (cron expressions)
390 |   schedules:
391 |     daily: "0 3 * * *"      # 3:00 AM daily
392 |     weekly: "0 4 * * 1"     # 4:00 AM Mondays
393 |     monthly: "0 5 1 * *"    # 5:00 AM first of month
394 | ```
395 | 
396 | ## Key Advantages of Autonomous Operation
397 | 
398 | ### 1. **Complete Independence**
399 | - No API keys required
400 | - No external service dependencies
401 | - No internet connection needed
402 | - Works entirely offline
403 | 
404 | ### 2. **Predictable Behavior**
405 | - Deterministic algorithms
406 | - Reproducible results
407 | - Easy to debug and test
408 | - Consistent performance
409 | 
410 | ### 3. **Cost Efficiency**
411 | - Zero ongoing AI costs
412 | - No API rate limits
413 | - No usage-based billing
414 | - Minimal computational resources
415 | 
416 | ### 4. **Privacy & Security**
417 | - All processing stays local
418 | - No data leaves your system
419 | - Complete data sovereignty
420 | - No third-party exposure
421 | 
422 | ### 5. **Performance**
423 | - No network latency
424 | - Instant processing
425 | - Parallel operations possible
426 | - Scales with local hardware
427 | 
428 | ## What's Different from AI-Powered Version?
429 | 
430 | | Feature | AI-Powered | Autonomous |
431 | |---------|------------|------------|
432 | | Natural language summaries | ✅ Eloquent prose | ❌ Structured data |
433 | | Complex reasoning | ✅ Nuanced understanding | ❌ Rule-based logic |
434 | | Summary quality | ✅ Human-like | ✅ Statistically representative |
435 | | Cost | 💰 Ongoing API costs | ✅ Free after setup |
436 | | Speed | 🐌 Network dependent | 🚀 Local processing |
437 | | Privacy | ⚠️ Data sent to API | 🔒 Completely private |
438 | | Reliability | ⚠️ Service dependent | ✅ Always available |
439 | 
440 | ## Implementation Checklist
441 | 
442 | - [ ] Install required Python packages (numpy, scikit-learn, apscheduler)
443 | - [ ] Configure retention periods for your use case
444 | - [ ] Set up clustering thresholds based on your embedding model
445 | - [ ] Configure scheduling based on your memory volume
446 | - [ ] Test each component independently
447 | - [ ] Monitor initial runs and adjust thresholds
448 | - [ ] Set up logging and monitoring
449 | - [ ] Create backup strategy for archived memories
450 | 
451 | ## Conclusion
452 | 
453 | The autonomous implementation proves that sophisticated memory consolidation doesn't require external AI. By leveraging existing embeddings and mathematical algorithms, we achieve a system that mimics biological memory processes while maintaining complete independence, privacy, and cost-effectiveness.
454 | 
455 | This approach transforms the dream-inspired concept into a practical, deployable system that can run indefinitely without human intervention or external dependencies - a true "set it and forget it" solution for memory management.
456 | 
457 | ---
458 | 
459 | *Related Documents:*
460 | - [Dream-Inspired Memory Consolidation System](./dream-inspired-memory-consolidation.md)
461 | - [Issue #11: Multi-Layered Memory Consolidation](https://github.com/doobidoo/mcp-memory-service/issues/11)
462 | 
463 | *Created: July 28, 2025*
```

--------------------------------------------------------------------------------
/src/mcp_memory_service/consolidation/clustering.py:
--------------------------------------------------------------------------------

```python
  1 | # Copyright 2024 Heinrich Krupp
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     http://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | 
 15 | """Semantic clustering system for memory organization."""
 16 | 
 17 | import uuid
 18 | import numpy as np
 19 | from typing import List, Dict, Any, Optional, Tuple
 20 | from datetime import datetime
 21 | from collections import Counter
 22 | import re
 23 | 
 24 | try:
 25 |     from sklearn.cluster import DBSCAN
 26 |     from sklearn.cluster import AgglomerativeClustering
 27 |     from sklearn.metrics import silhouette_score
 28 |     SKLEARN_AVAILABLE = True
 29 | except ImportError:
 30 |     SKLEARN_AVAILABLE = False
 31 | 
 32 | from .base import ConsolidationBase, ConsolidationConfig, MemoryCluster
 33 | from ..models.memory import Memory
 34 | 
 35 | class SemanticClusteringEngine(ConsolidationBase):
 36 |     """
 37 |     Creates semantic clusters of related memories for organization and compression.
 38 |     
 39 |     Uses embedding-based clustering algorithms (DBSCAN, Hierarchical) to group
 40 |     semantically similar memories, enabling efficient compression and retrieval.
 41 |     """
 42 |     
 43 |     def __init__(self, config: ConsolidationConfig):
 44 |         super().__init__(config)
 45 |         self.min_cluster_size = config.min_cluster_size
 46 |         self.algorithm = config.clustering_algorithm
 47 |         
 48 |         if not SKLEARN_AVAILABLE:
 49 |             self.logger.warning("sklearn not available, using simple clustering fallback")
 50 |             self.algorithm = 'simple'
 51 |     
 52 |     async def process(self, memories: List[Memory], **kwargs) -> List[MemoryCluster]:
 53 |         """Create semantic clusters from memories."""
 54 |         if not self._validate_memories(memories) or len(memories) < self.min_cluster_size:
 55 |             return []
 56 |         
 57 |         # Filter memories with embeddings
 58 |         memories_with_embeddings = [m for m in memories if m.embedding]
 59 |         
 60 |         if len(memories_with_embeddings) < self.min_cluster_size:
 61 |             self.logger.warning(f"Only {len(memories_with_embeddings)} memories have embeddings, need at least {self.min_cluster_size}")
 62 |             return []
 63 |         
 64 |         # Extract embeddings matrix
 65 |         embeddings = np.array([m.embedding for m in memories_with_embeddings])
 66 |         
 67 |         # Perform clustering
 68 |         if self.algorithm == 'dbscan':
 69 |             cluster_labels = await self._dbscan_clustering(embeddings)
 70 |         elif self.algorithm == 'hierarchical':
 71 |             cluster_labels = await self._hierarchical_clustering(embeddings)
 72 |         else:
 73 |             cluster_labels = await self._simple_clustering(embeddings)
 74 |         
 75 |         # Create cluster objects
 76 |         clusters = await self._create_clusters(memories_with_embeddings, cluster_labels, embeddings)
 77 |         
 78 |         # Filter by minimum cluster size
 79 |         valid_clusters = [c for c in clusters if len(c.memory_hashes) >= self.min_cluster_size]
 80 |         
 81 |         self.logger.info(f"Created {len(valid_clusters)} valid clusters from {len(memories_with_embeddings)} memories")
 82 |         return valid_clusters
 83 |     
 84 |     async def _dbscan_clustering(self, embeddings: np.ndarray) -> np.ndarray:
 85 |         """Perform DBSCAN clustering on embeddings."""
 86 |         if not SKLEARN_AVAILABLE:
 87 |             return await self._simple_clustering(embeddings)
 88 |         
 89 |         # Adaptive epsilon based on data size and dimensionality
 90 |         n_samples, n_features = embeddings.shape
 91 |         eps = 0.5 - (n_samples / 10000) * 0.1  # Decrease eps for larger datasets
 92 |         eps = max(0.2, min(0.7, eps))  # Clamp between 0.2 and 0.7
 93 |         
 94 |         min_samples = max(2, self.min_cluster_size // 2)
 95 |         
 96 |         clustering = DBSCAN(eps=eps, min_samples=min_samples, metric='cosine')
 97 |         labels = clustering.fit_predict(embeddings)
 98 |         
 99 |         self.logger.debug(f"DBSCAN: eps={eps}, min_samples={min_samples}, found {len(set(labels))} clusters")
100 |         return labels
101 |     
102 |     async def _hierarchical_clustering(self, embeddings: np.ndarray) -> np.ndarray:
103 |         """Perform hierarchical clustering on embeddings."""
104 |         if not SKLEARN_AVAILABLE:
105 |             return await self._simple_clustering(embeddings)
106 |         
107 |         # Estimate number of clusters (heuristic: sqrt of samples / 2)
108 |         n_samples = embeddings.shape[0]
109 |         n_clusters = max(2, min(n_samples // self.min_cluster_size, int(np.sqrt(n_samples) / 2)))
110 |         
111 |         clustering = AgglomerativeClustering(
112 |             n_clusters=n_clusters,
113 |             metric='cosine',
114 |             linkage='average'
115 |         )
116 |         labels = clustering.fit_predict(embeddings)
117 |         
118 |         self.logger.debug(f"Hierarchical: n_clusters={n_clusters}, found {len(set(labels))} clusters")
119 |         return labels
120 |     
121 |     async def _simple_clustering(self, embeddings: np.ndarray) -> np.ndarray:
122 |         """Simple fallback clustering using cosine similarity threshold."""
123 |         n_samples = embeddings.shape[0]
124 |         labels = np.full(n_samples, -1)  # Start with all as noise
125 |         current_cluster = 0
126 |         
127 |         similarity_threshold = 0.7  # Threshold for grouping
128 |         
129 |         for i in range(n_samples):
130 |             if labels[i] != -1:  # Already assigned
131 |                 continue
132 |             
133 |             # Start new cluster
134 |             cluster_members = [i]
135 |             labels[i] = current_cluster
136 |             
137 |             # Find similar memories
138 |             for j in range(i + 1, n_samples):
139 |                 if labels[j] != -1:  # Already assigned
140 |                     continue
141 |                 
142 |                 # Calculate cosine similarity
143 |                 similarity = np.dot(embeddings[i], embeddings[j]) / (
144 |                     np.linalg.norm(embeddings[i]) * np.linalg.norm(embeddings[j])
145 |                 )
146 |                 
147 |                 if similarity >= similarity_threshold:
148 |                     labels[j] = current_cluster
149 |                     cluster_members.append(j)
150 |             
151 |             # Only keep cluster if it meets minimum size
152 |             if len(cluster_members) >= self.min_cluster_size:
153 |                 current_cluster += 1
154 |             else:
155 |                 # Mark as noise
156 |                 for member in cluster_members:
157 |                     labels[member] = -1
158 |         
159 |         self.logger.debug(f"Simple clustering: threshold={similarity_threshold}, found {current_cluster} clusters")
160 |         return labels
161 |     
162 |     async def _create_clusters(
163 |         self,
164 |         memories: List[Memory],
165 |         labels: np.ndarray,
166 |         embeddings: np.ndarray
167 |     ) -> List[MemoryCluster]:
168 |         """Create MemoryCluster objects from clustering results."""
169 |         clusters = []
170 |         unique_labels = set(labels)
171 |         
172 |         for label in unique_labels:
173 |             if label == -1:  # Skip noise points
174 |                 continue
175 |             
176 |             # Get memories in this cluster
177 |             cluster_indices = np.where(labels == label)[0]
178 |             cluster_memories = [memories[i] for i in cluster_indices]
179 |             cluster_embeddings = embeddings[cluster_indices]
180 |             
181 |             if len(cluster_memories) < self.min_cluster_size:
182 |                 continue
183 |             
184 |             # Calculate centroid embedding
185 |             centroid = np.mean(cluster_embeddings, axis=0)
186 |             
187 |             # Calculate coherence score (average cosine similarity to centroid)
188 |             coherence_scores = []
189 |             for embedding in cluster_embeddings:
190 |                 similarity = np.dot(embedding, centroid) / (
191 |                     np.linalg.norm(embedding) * np.linalg.norm(centroid)
192 |                 )
193 |                 coherence_scores.append(similarity)
194 |             
195 |             coherence_score = np.mean(coherence_scores)
196 |             
197 |             # Extract theme keywords
198 |             theme_keywords = await self._extract_theme_keywords(cluster_memories)
199 |             
200 |             # Create cluster
201 |             cluster = MemoryCluster(
202 |                 cluster_id=str(uuid.uuid4()),
203 |                 memory_hashes=[m.content_hash for m in cluster_memories],
204 |                 centroid_embedding=centroid.tolist(),
205 |                 coherence_score=float(coherence_score),
206 |                 created_at=datetime.now(),
207 |                 theme_keywords=theme_keywords,
208 |                 metadata={
209 |                     'algorithm': self.algorithm,
210 |                     'cluster_size': len(cluster_memories),
211 |                     'average_memory_age': self._calculate_average_age(cluster_memories),
212 |                     'tag_distribution': self._analyze_tag_distribution(cluster_memories)
213 |                 }
214 |             )
215 |             
216 |             clusters.append(cluster)
217 |         
218 |         return clusters
219 |     
220 |     async def _extract_theme_keywords(self, memories: List[Memory]) -> List[str]:
221 |         """Extract theme keywords that represent the cluster."""
222 |         # Combine all content
223 |         all_text = ' '.join([m.content for m in memories])
224 |         
225 |         # Collect all tags
226 |         all_tags = []
227 |         for memory in memories:
228 |             all_tags.extend(memory.tags)
229 |         
230 |         # Count tag frequency
231 |         tag_counts = Counter(all_tags)
232 |         
233 |         # Extract frequent words from content (simple approach)
234 |         words = re.findall(r'\b[a-zA-Z]{4,}\b', all_text.lower())
235 |         word_counts = Counter(words)
236 |         
237 |         # Remove common stop words
238 |         stop_words = {
239 |             'this', 'that', 'with', 'have', 'will', 'from', 'they', 'know',
240 |             'want', 'been', 'good', 'much', 'some', 'time', 'very', 'when',
241 |             'come', 'here', 'just', 'like', 'long', 'make', 'many', 'over',
242 |             'such', 'take', 'than', 'them', 'well', 'were', 'what', 'work',
243 |             'your', 'could', 'should', 'would', 'there', 'their', 'these',
244 |             'about', 'after', 'again', 'before', 'being', 'between', 'during',
245 |             'under', 'where', 'while', 'other', 'through', 'against'
246 |         }
247 |         
248 |         # Filter and get top words
249 |         filtered_words = {word: count for word, count in word_counts.items() 
250 |                          if word not in stop_words and count > 1}
251 |         
252 |         # Combine tags and words, prioritize tags
253 |         theme_keywords = []
254 |         
255 |         # Add top tags (weight by frequency)
256 |         for tag, count in tag_counts.most_common(5):
257 |             if count > 1:  # Tag appears in multiple memories
258 |                 theme_keywords.append(tag)
259 |         
260 |         # Add top words
261 |         for word, count in sorted(filtered_words.items(), key=lambda x: x[1], reverse=True)[:10]:
262 |             if word not in theme_keywords:
263 |                 theme_keywords.append(word)
264 |         
265 |         return theme_keywords[:10]  # Limit to top 10
266 |     
267 |     def _calculate_average_age(self, memories: List[Memory]) -> float:
268 |         """Calculate average age of memories in days."""
269 |         now = datetime.now()
270 |         ages = []
271 |         
272 |         for memory in memories:
273 |             if memory.created_at:
274 |                 created_dt = datetime.utcfromtimestamp(memory.created_at)
275 |                 age_days = (now - created_dt).days
276 |                 ages.append(age_days)
277 |             elif memory.timestamp:
278 |                 age_days = (now - memory.timestamp).days
279 |                 ages.append(age_days)
280 |         
281 |         return sum(ages) / len(ages) if ages else 0.0
282 |     
283 |     def _analyze_tag_distribution(self, memories: List[Memory]) -> Dict[str, int]:
284 |         """Analyze tag distribution within the cluster."""
285 |         all_tags = []
286 |         for memory in memories:
287 |             all_tags.extend(memory.tags)
288 |         
289 |         return dict(Counter(all_tags))
290 |     
291 |     async def merge_similar_clusters(
292 |         self,
293 |         clusters: List[MemoryCluster],
294 |         similarity_threshold: float = 0.8
295 |     ) -> List[MemoryCluster]:
296 |         """Merge clusters that are very similar to each other."""
297 |         if len(clusters) <= 1:
298 |             return clusters
299 |         
300 |         # Calculate pairwise similarities between cluster centroids
301 |         centroids = np.array([cluster.centroid_embedding for cluster in clusters])
302 |         
303 |         merged = [False] * len(clusters)
304 |         result_clusters = []
305 |         
306 |         for i, cluster1 in enumerate(clusters):
307 |             if merged[i]:
308 |                 continue
309 |             
310 |             # Start with current cluster
311 |             merge_group = [i]
312 |             merged[i] = True
313 |             
314 |             # Find similar clusters to merge
315 |             for j in range(i + 1, len(clusters)):
316 |                 if merged[j]:
317 |                     continue
318 |                 
319 |                 # Calculate cosine similarity between centroids
320 |                 similarity = np.dot(centroids[i], centroids[j]) / (
321 |                     np.linalg.norm(centroids[i]) * np.linalg.norm(centroids[j])
322 |                 )
323 |                 
324 |                 if similarity >= similarity_threshold:
325 |                     merge_group.append(j)
326 |                     merged[j] = True
327 |             
328 |             # Create merged cluster
329 |             if len(merge_group) == 1:
330 |                 # No merging needed
331 |                 result_clusters.append(clusters[i])
332 |             else:
333 |                 # Merge clusters
334 |                 merged_cluster = await self._merge_cluster_group(
335 |                     [clusters[idx] for idx in merge_group]
336 |                 )
337 |                 result_clusters.append(merged_cluster)
338 |         
339 |         self.logger.info(f"Merged {len(clusters)} clusters into {len(result_clusters)}")
340 |         return result_clusters
341 |     
342 |     async def _merge_cluster_group(self, clusters: List[MemoryCluster]) -> MemoryCluster:
343 |         """Merge a group of similar clusters into one."""
344 |         # Combine all memory hashes
345 |         all_memory_hashes = []
346 |         for cluster in clusters:
347 |             all_memory_hashes.extend(cluster.memory_hashes)
348 |         
349 |         # Calculate new centroid (average of all centroids weighted by cluster size)
350 |         total_size = sum(len(cluster.memory_hashes) for cluster in clusters)
351 |         weighted_centroid = np.zeros(len(clusters[0].centroid_embedding))
352 |         
353 |         for cluster in clusters:
354 |             weight = len(cluster.memory_hashes) / total_size
355 |             centroid = np.array(cluster.centroid_embedding)
356 |             weighted_centroid += weight * centroid
357 |         
358 |         # Combine theme keywords
359 |         all_keywords = []
360 |         for cluster in clusters:
361 |             all_keywords.extend(cluster.theme_keywords)
362 |         
363 |         keyword_counts = Counter(all_keywords)
364 |         merged_keywords = [kw for kw, count in keyword_counts.most_common(10)]
365 |         
366 |         # Calculate average coherence score
367 |         total_memories = sum(len(cluster.memory_hashes) for cluster in clusters)
368 |         weighted_coherence = sum(
369 |             cluster.coherence_score * len(cluster.memory_hashes) / total_memories
370 |             for cluster in clusters
371 |         )
372 |         
373 |         return MemoryCluster(
374 |             cluster_id=str(uuid.uuid4()),
375 |             memory_hashes=all_memory_hashes,
376 |             centroid_embedding=weighted_centroid.tolist(),
377 |             coherence_score=weighted_coherence,
378 |             created_at=datetime.now(),
379 |             theme_keywords=merged_keywords,
380 |             metadata={
381 |                 'algorithm': f"{self.algorithm}_merged",
382 |                 'cluster_size': len(all_memory_hashes),
383 |                 'merged_from': [cluster.cluster_id for cluster in clusters],
384 |                 'merge_timestamp': datetime.now().isoformat()
385 |             }
386 |         )
```
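
The `merge_similar_clusters` / `_merge_cluster_group` pair above merges clusters whose centroid embeddings have cosine similarity at or above the threshold, then averages the centroids weighted by cluster size. A minimal standalone sketch of that criterion, using hypothetical two-dimensional centroids:

```python
# Sketch of the merge criterion above, on illustrative (hypothetical) centroids.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two centroid vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

centroids = [np.array([1.0, 0.1]), np.array([0.9, 0.2]), np.array([-0.2, 1.0])]
sizes = [10, 5, 8]  # memories per cluster
threshold = 0.8

print(cosine_similarity(centroids[0], centroids[1]) >= threshold)  # True  -> merge
print(cosine_similarity(centroids[0], centroids[2]) >= threshold)  # False -> keep separate

# Size-weighted centroid for the merged pair, mirroring _merge_cluster_group.
total = sizes[0] + sizes[1]
merged_centroid = (sizes[0] / total) * centroids[0] + (sizes[1] / total) * centroids[1]
print(merged_centroid)
```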

--------------------------------------------------------------------------------
/tests/integration/test_cli_interfaces.py:
--------------------------------------------------------------------------------

```python
  1 | #!/usr/bin/env python3
  2 | """
  3 | Integration tests for CLI interfaces to prevent conflicts and ensure compatibility.
  4 | 
  5 | This module tests the different CLI entry points to ensure they work correctly
  6 | and that the compatibility layer functions as expected.
  7 | """
  8 | 
  9 | import subprocess
 10 | import pytest
 11 | import warnings
 12 | import sys
 13 | import os
 14 | from pathlib import Path
 15 | 
 16 | # Add src to path
 17 | current_dir = Path(__file__).parent
 18 | src_dir = current_dir.parent.parent / "src"
 19 | sys.path.insert(0, str(src_dir))
 20 | 
 21 | from mcp_memory_service.cli.main import memory_server_main, main as cli_main
 22 | 
 23 | 
 24 | class TestCLIInterfaces:
 25 |     """Test CLI interface compatibility and functionality."""
 26 |     
 27 |     def test_memory_command_backward_compatibility(self):
 28 |         """Test that 'uv run memory' (without server) starts the MCP server for backward compatibility."""
 29 |         result = subprocess.run(
 30 |             ["uv", "run", "memory", "--help"],
 31 |             capture_output=True,
 32 |             text=True,
 33 |             timeout=10,
 34 |             cwd=current_dir.parent.parent
 35 |         )
 36 |         # Should show help text (not start server) when --help is provided
 37 |         assert result.returncode == 0
 38 |         assert "MCP Memory Service" in result.stdout
 39 |     
 40 |     def test_memory_command_exists(self):
 41 |         """Test that the memory command is available."""
 42 |         result = subprocess.run(
 43 |             ["uv", "run", "memory", "--help"],
 44 |             capture_output=True,
 45 |             text=True,
 46 |             cwd=current_dir.parent.parent
 47 |         )
 48 |         assert result.returncode == 0
 49 |         assert "MCP Memory Service" in result.stdout
 50 |         assert "server" in result.stdout
 51 |         assert "status" in result.stdout
 52 |     
 53 |     def test_memory_server_command_exists(self):
 54 |         """Test that the memory-server command is available."""
 55 |         result = subprocess.run(
 56 |             ["uv", "run", "memory-server", "--help"],
 57 |             capture_output=True, 
 58 |             text=True,
 59 |             cwd=current_dir.parent.parent
 60 |         )
 61 |         assert result.returncode == 0
 62 |         assert "MCP Memory Service" in result.stdout
 63 |         # Should show deprecation warning in stderr
 64 |         assert "deprecated" in result.stderr.lower()
 65 |     
 66 |     def test_mcp_memory_server_command_exists(self):
 67 |         """Test that the mcp-memory-server command is available."""
 68 |         result = subprocess.run(
 69 |             ["uv", "run", "mcp-memory-server", "--help"],
 70 |             capture_output=True,
 71 |             text=True, 
 72 |             cwd=current_dir.parent.parent
 73 |         )
 74 |         # This entry point may behave differently or fail if optional dependencies are missing
 75 |         # 0 for success, 1 for import error (missing fastmcp), 2 for argument error
 76 |         assert result.returncode in [0, 1, 2]
 77 |         
 78 |         # If it failed due to missing fastmcp dependency, that's expected
 79 |         if result.returncode == 1 and "fastmcp" in result.stderr:
 80 |             pytest.skip("mcp-memory-server requires FastMCP which is not installed")
 81 |     
 82 |     def test_memory_server_version_flag(self):
 83 |         """Test that memory-server --version works and shows deprecation warning."""
 84 |         result = subprocess.run(
 85 |             ["uv", "run", "memory-server", "--version"],
 86 |             capture_output=True,
 87 |             text=True,
 88 |             cwd=current_dir.parent.parent
 89 |         )
 90 |         assert result.returncode == 0
 91 |         assert "8.24.0" in result.stdout
 92 |         assert "deprecated" in result.stderr.lower()
 93 |     
 94 |     def test_memory_server_vs_memory_server_subcommand(self):
 95 |         """Test that both memory-server and memory server provide similar functionality."""
 96 |         # Test memory-server --help
 97 |         result1 = subprocess.run(
 98 |             ["uv", "run", "memory-server", "--help"],
 99 |             capture_output=True,
100 |             text=True,
101 |             cwd=current_dir.parent.parent
102 |         )
103 |         
104 |         # Test memory server --help  
105 |         result2 = subprocess.run(
106 |             ["uv", "run", "memory", "server", "--help"],
107 |             capture_output=True,
108 |             text=True,
109 |             cwd=current_dir.parent.parent
110 |         )
111 |         
112 |         assert result1.returncode == 0
113 |         assert result2.returncode == 0
114 | 
115 |         # Both should mention debug option
116 |         assert "--debug" in result1.stdout
117 |         assert "--debug" in result2.stdout
118 |         # Note: --chroma-path removed in v8.0.0
119 |     
120 |     def test_compatibility_wrapper_deprecation_warning(self):
121 |         """Test that the compatibility wrapper issues deprecation warnings."""
122 |         # Capture warnings when calling memory_server_main
123 |         with warnings.catch_warnings(record=True) as w:
124 |             warnings.simplefilter("always")  # Ensure all warnings are caught
125 |             
126 |             # Mock sys.argv to test argument parsing
127 |             original_argv = sys.argv
128 |             try:
129 |                 sys.argv = ["memory-server", "--version"]
130 |                 # This will raise SystemExit due to --version, which is expected
131 |                 with pytest.raises(SystemExit) as exc_info:
132 |                     memory_server_main()
133 |                 assert exc_info.value.code == 0  # Should exit successfully
134 |             finally:
135 |                 sys.argv = original_argv
136 |             
137 |             # Check that deprecation warning was issued
138 |             deprecation_warnings = [warning for warning in w if issubclass(warning.category, DeprecationWarning)]
139 |             assert len(deprecation_warnings) > 0
140 |             assert "deprecated" in str(deprecation_warnings[0].message).lower()
141 |             assert "memory server" in str(deprecation_warnings[0].message)
142 |     
143 |     def test_argument_compatibility(self):
144 |         """Test that arguments are properly passed through compatibility wrapper."""
145 |         # Test with --debug flag
146 |         result = subprocess.run(
147 |             ["uv", "run", "memory-server", "--help"],
148 |             capture_output=True,
149 |             text=True,
150 |             cwd=current_dir.parent.parent
151 |         )
152 |         
153 |         assert result.returncode == 0
154 |         assert "--debug" in result.stdout
155 |         assert "--version" in result.stdout
156 |         # Note: --chroma-path removed in v8.0.0
157 |     
158 |     def test_no_cli_conflicts_during_import(self):
159 |         """Test that importing CLI modules doesn't cause conflicts."""
160 |         try:
161 |             # These imports should work without conflicts
162 |             from mcp_memory_service.cli.main import main, memory_server_main
163 |             from mcp_memory_service import server
164 |             
165 |             # Check that functions exist and are callable
166 |             assert callable(main)
167 |             assert callable(memory_server_main)
168 |             assert callable(server.main)
169 |             
170 |             # Should not raise any import errors or conflicts
171 |         except ImportError as e:
172 |             pytest.fail(f"CLI import conflict detected: {str(e)}")
173 | 
174 | 
175 | class TestCLIFunctionality:
176 |     """Test actual CLI functionality to ensure commands work end-to-end."""
177 |     
178 |     def test_memory_status_command(self):
179 |         """Test that memory status command works."""
180 |         result = subprocess.run(
181 |             ["uv", "run", "memory", "status"],
182 |             capture_output=True,
183 |             text=True,
184 |             timeout=30,
185 |             cwd=current_dir.parent.parent
186 |         )
187 |         # Status command might fail if no storage is available, but should not crash
188 |         # Return code 0 = success, 1 = expected error (e.g., no storage)
189 |         assert result.returncode in [0, 1]
190 |         
191 |         if result.returncode == 0:
192 |             assert "MCP Memory Service Status" in result.stdout
193 |             assert "Version: 8.24.0" in result.stdout
194 |         else:
195 |             # If it fails, should have a meaningful error message
196 |             assert len(result.stderr) > 0 or len(result.stdout) > 0
197 | 
198 | 
199 | class TestCLIRobustness:
200 |     """Test CLI robustness and edge cases."""
201 |     
202 |     def test_environment_variable_passing(self):
203 |         """Test that CLI arguments correctly set environment variables."""
204 |         import os
205 |         import subprocess
206 | 
207 |         # Test that --debug flag affects application behavior
208 |         env = os.environ.copy()
209 |         env.pop('MCP_DEBUG', None)  # Ensure not already set
210 | 
211 |         result = subprocess.run(
212 |             ["uv", "run", "python", "-c", """
213 | import os
214 | import sys
215 | from mcp_memory_service.cli.main import cli
216 | from click.testing import CliRunner
217 | 
218 | # Test that --debug flag is recognized and sets debug mode
219 | runner = CliRunner()
220 | result = runner.invoke(cli, ['server', '--debug', '--help'])
221 | print(f'EXIT_CODE:{result.exit_code}')
222 | print(f'DEBUG_IN_OUTPUT:{\"--debug\" in result.output}')
223 | """],
224 |             capture_output=True,
225 |             text=True,
226 |             cwd=current_dir.parent.parent,
227 |             env=env,
228 |             timeout=10
229 |         )
230 | 
231 |         # Should succeed and recognize --debug flag
232 |         assert result.returncode == 0
233 |         assert 'EXIT_CODE:0' in result.stdout
234 |         assert 'DEBUG_IN_OUTPUT:True' in result.stdout
235 |     
236 |     def test_cli_error_handling(self):
237 |         """Test that CLI handles errors gracefully."""
238 |         # Test invalid storage backend
239 |         result = subprocess.run(
240 |             ["uv", "run", "memory", "server", "--storage-backend", "invalid"],
241 |             capture_output=True,
242 |             text=True,
243 |             cwd=current_dir.parent.parent,
244 |             timeout=10
245 |         )
246 |         
247 |         # Should fail with clear error message
248 |         assert result.returncode != 0
249 |         assert len(result.stderr) > 0 or "invalid" in result.stdout.lower()
250 |     
251 |     def test_memory_server_argument_parity(self):
252 |         """Test that memory-server and memory server support the same core arguments."""
253 |         import subprocess
254 |         
255 |         # Test memory server arguments
256 |         result1 = subprocess.run(
257 |             ["uv", "run", "memory", "server", "--help"],
258 |             capture_output=True,
259 |             text=True,
260 |             cwd=current_dir.parent.parent
261 |         )
262 |         
263 |         # Test memory-server arguments
264 |         result2 = subprocess.run(
265 |             ["uv", "run", "memory-server", "--help"],
266 |             capture_output=True,
267 |             text=True,
268 |             cwd=current_dir.parent.parent
269 |         )
270 |         
271 |         assert result1.returncode == 0
272 |         assert result2.returncode == 0
273 |         
274 |         # Both should support debug argument
275 |         assert "--debug" in result1.stdout
276 |         assert "--debug" in result2.stdout
277 |         # Note: --chroma-path removed in v8.0.0
278 |     
279 |     def test_entry_point_isolation(self):
280 |         """Test that different entry points don't interfere with each other."""
281 |         # Test that we can import all entry points without conflicts
282 |         try:
283 |             # Import main CLI
284 |             from mcp_memory_service.cli.main import main, memory_server_main
285 |             
286 |             # Import server main
287 |             from mcp_memory_service.server import main as server_main
288 |             
289 |             # Verify they're different functions
290 |             assert main != memory_server_main
291 |             assert main != server_main
292 |             assert memory_server_main != server_main
293 |             
294 |             # Verify they're all callable
295 |             assert callable(main)
296 |             assert callable(memory_server_main)  
297 |             assert callable(server_main)
298 |             
299 |         except ImportError as e:
300 |             pytest.fail(f"Entry point isolation failed: {e}")
301 |     
302 |     def test_backward_compatibility_deprecation_warning(self):
303 |         """Test that using 'memory' without subcommand shows deprecation warning."""
304 |         import warnings
305 |         import sys
306 |         from mcp_memory_service.cli.main import cli
307 |         
308 |         with warnings.catch_warnings(record=True) as w:
309 |             warnings.simplefilter("always")
310 |             
311 |             # Mock sys.argv to test backward compatibility
312 |             original_argv = sys.argv
313 |             try:
314 |                 # This simulates 'uv run memory' without subcommand
315 |                 sys.argv = ["memory"]
316 |                 with pytest.raises(SystemExit):  # Server will try to start and exit
317 |                     cli(standalone_mode=False)
318 |             except Exception:
319 |                 # Expected - server can't actually start in test environment
320 |                 pass
321 |             finally:
322 |                 sys.argv = original_argv
323 |             
324 |             # Verify backward compatibility deprecation warning
325 |             deprecation_warnings = [warning for warning in w if issubclass(warning.category, DeprecationWarning)]
326 |             assert len(deprecation_warnings) > 0
327 |             
328 |             warning_msg = str(deprecation_warnings[0].message)
329 |             assert "without a subcommand is deprecated" in warning_msg
330 |             assert "memory server" in warning_msg
331 |             assert "backward compatibility will be removed" in warning_msg
332 |     
333 |     def test_deprecation_warning_format(self):
334 |         """Test that deprecation warning has proper format and information."""
335 |         import warnings
336 |         import sys
337 |         from mcp_memory_service.cli.main import memory_server_main
338 |         
339 |         with warnings.catch_warnings(record=True) as w:
340 |             warnings.simplefilter("always")
341 |             
342 |             # Mock sys.argv for the compatibility wrapper
343 |             original_argv = sys.argv
344 |             try:
345 |                 sys.argv = ["memory-server", "--version"]
346 |                 with pytest.raises(SystemExit):  # --version causes system exit
347 |                     memory_server_main()
348 |             finally:
349 |                 sys.argv = original_argv
350 |             
351 |             # Verify deprecation warning content
352 |             deprecation_warnings = [warning for warning in w if issubclass(warning.category, DeprecationWarning)]
353 |             assert len(deprecation_warnings) > 0
354 |             
355 |             warning_msg = str(deprecation_warnings[0].message)
356 |             assert "memory-server" in warning_msg.lower()
357 |             assert "deprecated" in warning_msg.lower() 
358 |             assert "memory server" in warning_msg.lower()
359 |             assert "removed" in warning_msg.lower()
360 | 
361 | 
362 | class TestCLIPerformance:
363 |     """Test CLI performance characteristics."""
364 |     
365 |     def test_cli_startup_time(self):
366 |         """Test that CLI commands start reasonably quickly."""
367 |         import time
368 |         import subprocess
369 |         
370 |         start_time = time.time()
371 |         result = subprocess.run(
372 |             ["uv", "run", "memory", "--help"],
373 |             capture_output=True,
374 |             text=True,
375 |             timeout=30,
376 |             cwd=current_dir.parent.parent
377 |         )
378 |         elapsed = time.time() - start_time
379 |         
380 |         assert result.returncode == 0
381 |         # CLI help should respond within 30 seconds (generous timeout for CI)
382 |         assert elapsed < 30
383 |         # Log performance for monitoring
384 |         print(f"CLI startup took {elapsed:.2f} seconds")
385 |     
386 |     def test_memory_version_performance(self):
387 |         """Test that version commands are fast."""
388 |         import time
389 |         import subprocess
390 |         
391 |         # Test both version commands
392 |         commands = [
393 |             ["uv", "run", "memory", "--version"],
394 |             ["uv", "run", "memory-server", "--version"]
395 |         ]
396 |         
397 |         for cmd in commands:
398 |             start_time = time.time()
399 |             result = subprocess.run(
400 |                 cmd,
401 |                 capture_output=True,
402 |                 text=True,
403 |                 timeout=15,
404 |                 cwd=current_dir.parent.parent
405 |             )
406 |             elapsed = time.time() - start_time
407 |             
408 |             assert result.returncode == 0
409 |             assert "8.24.0" in result.stdout
410 |             # Version should be very fast
411 |             assert elapsed < 15
412 |             print(f"Version command {' '.join(cmd[2:])} took {elapsed:.2f} seconds")
413 | 
414 | 
415 | if __name__ == "__main__":
416 |     pytest.main([__file__, "-v"])
```
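
The suite above expects `memory-server` to act as a deprecated alias for `memory server`: it must accept the same core flags (`--debug`, `--version`) and emit a `DeprecationWarning` before delegating. A minimal sketch of that wrapper pattern, illustrative only; the project's real entry point is `memory_server_main` in `mcp_memory_service.cli.main` and may differ in details:

```python
# Illustrative sketch of a deprecation wrapper around a subcommand-style CLI.
import sys
import warnings

def modern_cli(argv: list[str]) -> int:
    """Stand-in for the real 'memory' CLI; just echoes what it would run."""
    print(f"memory {' '.join(argv)}")
    return 0

def legacy_server_main() -> int:
    """Hypothetical 'memory-server' entry point that warns, then delegates."""
    warnings.warn(
        "'memory-server' is deprecated; use 'memory server' instead "
        "(backward compatibility will be removed in a future release).",
        DeprecationWarning,
        stacklevel=2,
    )
    # Forward the caller's flags (e.g. --debug, --version) to the 'server' subcommand.
    return modern_cli(["server", *sys.argv[1:]])

if __name__ == "__main__":
    sys.exit(legacy_server_main())
```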

--------------------------------------------------------------------------------
/scripts/validation/validate_configuration_complete.py:
--------------------------------------------------------------------------------

```python
  1 | #!/usr/bin/env python3
  2 | """
  3 | Comprehensive Configuration Validation Script for MCP Memory Service
  4 | 
  5 | This unified script validates all configuration aspects:
  6 | - Claude Code global configuration (~/.claude.json)
  7 | - Claude Desktop configuration (claude_desktop_config.json)
  8 | - Project .env file configuration
  9 | - Cross-configuration consistency
 10 | - API token validation
 11 | - Cloudflare credentials validation
 12 | 
 13 | Consolidates functionality from validate_config.py and validate_configuration.py
 14 | """
 15 | 
 16 | import os
 17 | import sys
 18 | import json
 19 | import re
 20 | import logging
 21 | from pathlib import Path
 22 | from typing import Dict, Any, Optional, List, Tuple
 23 | 
 24 | # Configure logging
 25 | logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')
 26 | logger = logging.getLogger(__name__)
 27 | 
 28 | class ComprehensiveConfigValidator:
 29 |     """Unified configuration validator for all MCP Memory Service configurations."""
 30 | 
 31 |     def __init__(self):
 32 |         """Initialize validator with all configuration paths and requirements."""
 33 |         self.project_root = Path(__file__).parent.parent.parent
 34 |         self.env_file = self.project_root / '.env'
 35 | 
 36 |         # Platform-specific Claude Desktop config paths
 37 |         if os.name == 'nt':  # Windows
 38 |             self.claude_desktop_config_file = Path.home() / 'AppData' / 'Roaming' / 'Claude' / 'claude_desktop_config.json'
 39 |         else:  # macOS/Linux
 40 |             self.claude_desktop_config_file = Path.home() / '.config' / 'claude' / 'claude_desktop_config.json'
 41 | 
 42 |         # Claude Code global config (different from Claude Desktop)
 43 |         self.claude_code_config_file = Path.home() / '.claude.json'
 44 | 
 45 |         # Local project MCP config (should usually not exist for memory service)
 46 |         self.local_mcp_config_file = self.project_root / '.mcp.json'
 47 | 
 48 |         # Required environment variables for Cloudflare backend
 49 |         self.required_vars = [
 50 |             'MCP_MEMORY_STORAGE_BACKEND',
 51 |             'CLOUDFLARE_API_TOKEN',
 52 |             'CLOUDFLARE_ACCOUNT_ID',
 53 |             'CLOUDFLARE_D1_DATABASE_ID',
 54 |             'CLOUDFLARE_VECTORIZE_INDEX'
 55 |         ]
 56 | 
 57 |         # Optional but commonly used variables
 58 |         self.optional_vars = [
 59 |             'MCP_MEMORY_BACKUPS_PATH',
 60 |             'MCP_MEMORY_SQLITE_PATH'
 61 |         ]
 62 | 
 63 |         # Results tracking
 64 |         self.issues = []
 65 |         self.error_count = 0
 66 |         self.warning_count = 0
 67 |         self.success_count = 0
 68 | 
 69 |     def load_json_safe(self, file_path: Path) -> Optional[Dict]:
 70 |         """Load JSON file safely, return None if not found or invalid."""
 71 |         try:
 72 |             if file_path.exists():
 73 |                 with open(file_path, 'r', encoding='utf-8') as f:
 74 |                     return json.load(f)
 75 |         except (json.JSONDecodeError, FileNotFoundError, PermissionError) as e:
 76 |             self.add_warning(f"Could not load {file_path}: {e}")
 77 |         return None
 78 | 
 79 |     def load_env_file(self) -> Dict[str, str]:
 80 |         """Load environment variables from .env file."""
 81 |         env_vars = {}
 82 | 
 83 |         if not self.env_file.exists():
 84 |             self.add_warning(f"No .env file found at {self.env_file}")
 85 |             return env_vars
 86 | 
 87 |         try:
 88 |             with open(self.env_file, 'r', encoding='utf-8') as f:
 89 |                 for line_num, line in enumerate(f, 1):
 90 |                     line = line.strip()
 91 |                     if line and not line.startswith('#') and '=' in line:
 92 |                         key, value = line.split('=', 1)
 93 |                         env_vars[key.strip()] = value.strip()
 94 |         except Exception as e:
 95 |             self.add_error(f"Failed to load .env file: {e}")
 96 | 
 97 |         return env_vars
 98 | 
 99 |     def add_error(self, message: str):
100 |         """Add error message and increment counter."""
101 |         self.issues.append(f"ERROR: {message}")
102 |         self.error_count += 1
103 | 
104 |     def add_warning(self, message: str):
105 |         """Add warning message and increment counter."""
106 |         self.issues.append(f"WARNING: {message}")
107 |         self.warning_count += 1
108 | 
109 |     def add_success(self, message: str):
110 |         """Add success message and increment counter."""
111 |         self.issues.append(f"SUCCESS: {message}")
112 |         self.success_count += 1
113 | 
114 |     def validate_env_file(self) -> Dict[str, str]:
115 |         """Validate .env file configuration."""
116 |         env_vars = self.load_env_file()
117 | 
118 |         # Check for required variables
119 |         missing_vars = []
120 |         for var in self.required_vars:
121 |             if var not in env_vars or not env_vars[var].strip():
122 |                 missing_vars.append(var)
123 | 
124 |         if missing_vars:
125 |             self.add_error(f"Missing required variables in .env file: {missing_vars}")
126 |         else:
127 |             self.add_success("All required variables present in .env file")
128 | 
129 |         # Check backend setting
130 |         backend = env_vars.get('MCP_MEMORY_STORAGE_BACKEND', '').lower()
131 |         if backend == 'cloudflare':
132 |             self.add_success(".env file configured for Cloudflare backend")
133 |         elif backend:
134 |             self.add_warning(f".env file configured for '{backend}' backend (not Cloudflare)")
135 |         else:
136 |             self.add_error("MCP_MEMORY_STORAGE_BACKEND not set in .env file")
137 | 
138 |         return env_vars
139 | 
140 |     def validate_claude_desktop_config(self) -> Optional[Dict[str, str]]:
141 |         """Validate Claude Desktop configuration."""
142 |         config = self.load_json_safe(self.claude_desktop_config_file)
143 | 
144 |         if not config:
145 |             self.add_error(f"Could not load Claude Desktop config from {self.claude_desktop_config_file}")
146 |             return None
147 | 
148 |         # Extract memory server configuration
149 |         mcp_servers = config.get('mcpServers', {})
150 |         memory_server = mcp_servers.get('memory', {})
151 | 
152 |         if not memory_server:
153 |             self.add_error("Memory server not found in Claude Desktop configuration")
154 |             return None
155 | 
156 |         self.add_success("Memory server found in Claude Desktop configuration")
157 | 
158 |         # Get environment variables from memory server config
159 |         memory_env = memory_server.get('env', {})
160 | 
161 |         # Check required variables
162 |         missing_vars = []
163 |         for var in self.required_vars:
164 |             if var not in memory_env or not str(memory_env[var]).strip():
165 |                 missing_vars.append(var)
166 | 
167 |         if missing_vars:
168 |             self.add_error(f"Missing required variables in Claude Desktop config: {missing_vars}")
169 |         else:
170 |             self.add_success("All required variables present in Claude Desktop config")
171 | 
172 |         return memory_env
173 | 
174 |     def validate_claude_code_config(self):
175 |         """Validate Claude Code global configuration (different from Claude Desktop)."""
176 |         config = self.load_json_safe(self.claude_code_config_file)
177 | 
178 |         if not config:
179 |             self.add_warning(f"Claude Code config not found at {self.claude_code_config_file} (this is optional)")
180 |             return
181 | 
182 |         # Check for memory server configurations in projects
183 |         memory_configs = []
184 |         projects = config.get('projects', {})
185 | 
186 |         for project_path, project_config in projects.items():
187 |             mcp_servers = project_config.get('mcpServers', {})
188 |             if 'memory' in mcp_servers:
189 |                 memory_config = mcp_servers['memory']
190 |                 backend = memory_config.get('env', {}).get('MCP_MEMORY_STORAGE_BACKEND', 'unknown')
191 |                 memory_configs.append((project_path, backend))
192 | 
193 |         if memory_configs:
194 |             cloudflare_configs = [cfg for cfg in memory_configs if cfg[1] == 'cloudflare']
195 |             non_cloudflare_configs = [cfg for cfg in memory_configs if cfg[1] != 'cloudflare']
196 | 
197 |             if cloudflare_configs:
198 |                 self.add_success(f"Found {len(cloudflare_configs)} Cloudflare memory configurations in Claude Code")
199 | 
200 |             if non_cloudflare_configs:
201 |                 self.add_warning(f"Found {len(non_cloudflare_configs)} non-Cloudflare memory configurations in Claude Code")
202 |         else:
203 |             self.add_warning("No memory server configurations found in Claude Code (this is optional)")
204 | 
205 |     def validate_local_mcp_config(self):
206 |         """Check for conflicting local .mcp.json files."""
207 |         if self.local_mcp_config_file.exists():
208 |             config = self.load_json_safe(self.local_mcp_config_file)
209 |             if config and 'memory' in config.get('mcpServers', {}):
210 |                 self.add_error("Local .mcp.json contains memory server configuration (conflicts with global config)")
211 |             else:
212 |                 self.add_success("Local .mcp.json exists but does not conflict with memory configuration")
213 |         else:
214 |             self.add_success("No local .mcp.json found (using global configuration)")
215 | 
216 |     def compare_configurations(self, env_config: Dict[str, str], claude_desktop_config: Optional[Dict[str, str]]):
217 |         """Compare configurations between .env and Claude Desktop config."""
218 |         if not claude_desktop_config:
219 |             self.add_error("Cannot compare configurations - Claude Desktop config not available")
220 |             return
221 | 
222 |         # Compare each required variable
223 |         differences = []
224 |         for var in self.required_vars:
225 |             env_value = env_config.get(var, '<MISSING>')
226 |             claude_value = str(claude_desktop_config.get(var, '<MISSING>'))
227 | 
228 |             if env_value != claude_value:
229 |                 differences.append((var, env_value, claude_value))
230 | 
231 |         if differences:
232 |             self.add_warning(f"Found {len(differences)} configuration differences between .env and Claude Desktop config:")
233 |             for var, env_val, claude_val in differences:
234 |                 self.add_warning(f"  {var}: .env='{env_val[:50]}...' vs Claude='{claude_val[:50]}...'")
235 |         else:
236 |             self.add_success("All configurations match between .env and Claude Desktop config")
237 | 
238 |     def validate_api_token_format(self, token: str) -> Tuple[bool, str]:
239 |         """Validate API token format and detect known invalid tokens."""
240 |         if not token or token == '<MISSING>':
241 |             return False, "Token is missing"
242 | 
243 |         if len(token) < 20:
244 |             return False, "Token appears too short"
245 | 
246 |         if not any(c.isalnum() for c in token):
247 |             return False, "Token should contain alphanumeric characters"
248 | 
249 |         # Check for known placeholder/invalid tokens
250 |         invalid_tokens = [
251 |             'your_token_here',
252 |             'replace_with_token',
253 |             'mkdXbb-iplcHNBRQ5tfqV3Sh_7eALYBpO4e3Di1m'  # Known invalid token
254 |         ]
255 |         if token in invalid_tokens:
256 |             return False, "Token appears to be a placeholder or known invalid token"
257 | 
258 |         return True, "Token format appears valid"
259 | 
260 |     def validate_api_tokens(self, env_config: Dict[str, str], claude_desktop_config: Optional[Dict[str, str]]):
261 |         """Validate API tokens in both configurations."""
262 |         # Check .env token
263 |         env_token = env_config.get('CLOUDFLARE_API_TOKEN', '')
264 |         is_valid, message = self.validate_api_token_format(env_token)
265 | 
266 |         if is_valid:
267 |             self.add_success(f".env API token format: {message}")
268 |         else:
269 |             self.add_error(f".env API token: {message}")
270 | 
271 |         # Check Claude Desktop token
272 |         if claude_desktop_config:
273 |             claude_token = str(claude_desktop_config.get('CLOUDFLARE_API_TOKEN', ''))
274 |             is_valid, message = self.validate_api_token_format(claude_token)
275 | 
276 |             if is_valid:
277 |                 self.add_success(f"Claude Desktop API token format: {message}")
278 |             else:
279 |                 self.add_error(f"Claude Desktop API token: {message}")
280 | 
281 |     def check_environment_conflicts(self):
282 |         """Check for conflicting environment configurations."""
283 |         # Check for conflicting .env files
284 |         env_files = list(self.project_root.glob('.env*'))
285 | 
286 |         # Exclude legitimate backup files
287 |         conflicting_files = [
288 |             f for f in env_files
289 |             if f.name.endswith('.sqlite') and not f.name.endswith('.backup') and f.name != '.env.sqlite'
290 |         ]
291 | 
292 |         if conflicting_files:
293 |             self.add_warning(f"Potentially conflicting environment files found: {[str(f.name) for f in conflicting_files]}")
294 |         else:
295 |             self.add_success("No conflicting environment files detected")
296 | 
297 |     def run_comprehensive_validation(self) -> bool:
298 |         """Run complete configuration validation across all sources."""
299 |         print("Comprehensive MCP Memory Service Configuration Validation")
300 |         print("=" * 70)
301 | 
302 |         # 1. Environment file validation
303 |         print("\n1. Environment File (.env) Validation:")
304 |         env_config = self.validate_env_file()
305 |         self._print_section_results()
306 | 
307 |         # 2. Claude Desktop configuration validation
308 |         print("\n2. Claude Desktop Configuration Validation:")
309 |         claude_desktop_config = self.validate_claude_desktop_config()
310 |         self._print_section_results()
311 | 
312 |         # 3. Claude Code configuration validation (optional)
313 |         print("\n3. Claude Code Global Configuration Check:")
314 |         self.validate_claude_code_config()
315 |         self._print_section_results()
316 | 
317 |         # 4. Local MCP configuration check
318 |         print("\n4. Local Project Configuration Check:")
319 |         self.validate_local_mcp_config()
320 |         self._print_section_results()
321 | 
322 |         # 5. Cross-configuration comparison
323 |         print("\n5. Cross-Configuration Consistency Check:")
324 |         self.compare_configurations(env_config, claude_desktop_config)
325 |         self._print_section_results()
326 | 
327 |         # 6. API token validation
328 |         print("\n6. API Token Validation:")
329 |         self.validate_api_tokens(env_config, claude_desktop_config)
330 |         self._print_section_results()
331 | 
332 |         # 7. Environment conflicts check
333 |         print("\n7. Environment Conflicts Check:")
334 |         self.check_environment_conflicts()
335 |         self._print_section_results()
336 | 
337 |         # Final summary
338 |         self._print_final_summary()
339 | 
340 |         return self.error_count == 0
341 | 
342 |     def _print_section_results(self):
343 |         """Print results for the current section."""
344 |         # Print only new issues since last call
345 |         current_total = len(self.issues)
346 |         if hasattr(self, '_last_printed_index'):
347 |             start_index = self._last_printed_index
348 |         else:
349 |             start_index = 0
350 | 
351 |         for issue in self.issues[start_index:]:
352 |             print(f"   {issue}")
353 | 
354 |         self._last_printed_index = current_total
355 | 
356 |     def _print_final_summary(self):
357 |         """Print comprehensive final summary."""
358 |         print("\n" + "=" * 70)
359 |         print("VALIDATION SUMMARY")
360 |         print("=" * 70)
361 | 
362 |         if self.error_count == 0:
363 |             print("CONFIGURATION VALIDATION PASSED!")
364 |             print(f"   SUCCESS: {self.success_count} checks passed")
365 |             if self.warning_count > 0:
366 |                 print(f"   WARNING: {self.warning_count} warnings (non-critical)")
367 |             print("\nYour MCP Memory Service configuration appears to be correct.")
368 |             print("You should be able to use the memory service with Cloudflare backend.")
369 |         else:
370 |             print("CONFIGURATION VALIDATION FAILED!")
371 |             print(f"   ERROR: {self.error_count} critical errors found")
372 |             print(f"   WARNING: {self.warning_count} warnings")
373 |             print(f"   SUCCESS: {self.success_count} checks passed")
374 |             print("\nPlease fix the critical errors above before using the memory service.")
375 | 
376 |         print(f"\nConfiguration files checked:")
377 |         print(f"   • .env file: {self.env_file}")
378 |         print(f"   • Claude Desktop config: {self.claude_desktop_config_file}")
379 |         print(f"   • Claude Code config: {self.claude_code_config_file}")
380 |         print(f"   • Local MCP config: {self.local_mcp_config_file}")
381 | 
382 | def main():
383 |     """Main validation function."""
384 |     validator = ComprehensiveConfigValidator()
385 |     success = validator.run_comprehensive_validation()
386 |     return 0 if success else 1
387 | 
388 | if __name__ == "__main__":
389 |     sys.exit(main())
```
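
Run directly, the script prints a sectioned report and returns a non-zero exit code when critical errors are found. A small sketch of driving the same validator from another script, assuming `scripts/validation` has been added to `sys.path` (the path manipulation below is illustrative):

```python
# Sketch: run the validator programmatically and act on its counters.
import sys
from pathlib import Path

# Adjust to wherever the repository root lives; the path below is illustrative.
sys.path.insert(0, str(Path(__file__).parent / "scripts" / "validation"))

from validate_configuration_complete import ComprehensiveConfigValidator

validator = ComprehensiveConfigValidator()
passed = validator.run_comprehensive_validation()  # prints the sectioned report

print(f"errors={validator.error_count} warnings={validator.warning_count} passed={validator.success_count}")
sys.exit(0 if passed else 1)
```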

--------------------------------------------------------------------------------
/claude-hooks/core/session-end.js:
--------------------------------------------------------------------------------

```javascript
  1 | /**
  2 |  * Claude Code Session End Hook
  3 |  * Automatically consolidates session outcomes and stores them as memories
  4 |  */
  5 | 
  6 | const fs = require('fs').promises;
  7 | const path = require('path');
  8 | const https = require('https');
  9 | const http = require('http');
 10 | 
 11 | // Import utilities
 12 | const { detectProjectContext } = require('../utilities/project-detector');
 13 | const { formatSessionConsolidation } = require('../utilities/context-formatter');
 14 | 
 15 | /**
 16 |  * Load hook configuration
 17 |  */
 18 | async function loadConfig() {
 19 |     try {
 20 |         const configPath = path.join(__dirname, '../config.json');
 21 |         const configData = await fs.readFile(configPath, 'utf8');
 22 |         return JSON.parse(configData);
 23 |     } catch (error) {
 24 |         console.warn('[Memory Hook] Using default configuration:', error.message);
 25 |         return {
 26 |             memoryService: {
 27 |                 http: {
 28 |                     endpoint: 'http://127.0.0.1:8000',
 29 |                     apiKey: 'test-key-123'
 30 |                 },
 31 |                 defaultTags: ['claude-code', 'auto-generated'],
 32 |                 enableSessionConsolidation: true
 33 |             },
 34 |             sessionAnalysis: {
 35 |                 extractTopics: true,
 36 |                 extractDecisions: true,
 37 |                 extractInsights: true,
 38 |                 extractCodeChanges: true,
 39 |                 extractNextSteps: true,
 40 |                 minSessionLength: 100 // Minimum characters for meaningful session
 41 |             }
 42 |         };
 43 |     }
 44 | }
 45 | 
 46 | /**
 47 |  * Analyze conversation to extract key information
 48 |  */
 49 | function analyzeConversation(conversationData) {
 50 |     try {
 51 |         const analysis = {
 52 |             topics: [],
 53 |             decisions: [],
 54 |             insights: [],
 55 |             codeChanges: [],
 56 |             nextSteps: [],
 57 |             sessionLength: 0,
 58 |             confidence: 0
 59 |         };
 60 |         
 61 |         if (!conversationData || !conversationData.messages) {
 62 |             return analysis;
 63 |         }
 64 |         
 65 |         const messages = conversationData.messages;
 66 |         const conversationText = messages.map(msg => msg.content || '').join('\n').toLowerCase();
 67 |         analysis.sessionLength = conversationText.length;
 68 |         
 69 |         // Extract topics (simple keyword matching)
 70 |         const topicKeywords = {
 71 |             'implementation': /implement|implementing|implementation|build|building|create|creating/g,
 72 |             'debugging': /debug|debugging|bug|error|fix|fixing|issue|problem/g,
 73 |             'architecture': /architecture|design|structure|pattern|framework|system/g,
 74 |             'performance': /performance|optimization|speed|memory|efficient|faster/g,
 75 |             'testing': /test|testing|unit test|integration|coverage|spec/g,
 76 |             'deployment': /deploy|deployment|production|staging|release/g,
 77 |             'configuration': /config|configuration|setup|environment|settings/g,
 78 |             'database': /database|db|sql|query|schema|migration/g,
 79 |             'api': /api|endpoint|rest|graphql|service|interface/g,
 80 |             'ui': /ui|interface|frontend|component|styling|css|html/g
 81 |         };
 82 |         
 83 |         Object.entries(topicKeywords).forEach(([topic, regex]) => {
 84 |             if (conversationText.match(regex)) {
 85 |                 analysis.topics.push(topic);
 86 |             }
 87 |         });
 88 |         
 89 |         // Extract decisions (look for decision language)
 90 |         const decisionPatterns = [
 91 |             /decided to|decision to|chose to|choosing|will use|going with/g,
 92 |             /better to|prefer|recommend|should use|opt for/g,
 93 |             /concluded that|determined that|agreed to/g
 94 |         ];
 95 |         
 96 |         messages.forEach(msg => {
 97 |             const content = (msg.content || '').toLowerCase();
 98 |             decisionPatterns.forEach(pattern => {
 99 |                 const matches = content.match(pattern);
100 |                 if (matches) {
101 |                     // Extract sentences containing decisions
102 |                     const sentences = msg.content.split(/[.!?]+/);
103 |                     sentences.forEach(sentence => {
104 |                         if (pattern.test(sentence.toLowerCase()) && sentence.length > 20) {
105 |                             analysis.decisions.push(sentence.trim());
106 |                         }
107 |                     });
108 |                 }
109 |             });
110 |         });
111 |         
112 |         // Extract insights (look for learning language)
113 |         const insightPatterns = [
114 |             /learned that|discovered|realized|found out|turns out/g,
115 |             /insight|understanding|conclusion|takeaway|lesson/g,
116 |             /important to note|key finding|observation/g
117 |         ];
118 |         
119 |         messages.forEach(msg => {
120 |             const content = (msg.content || '').toLowerCase();
121 |             insightPatterns.forEach(pattern => {
122 |                 if (pattern.test(content)) {
123 |                     const sentences = msg.content.split(/[.!?]+/);
124 |                     sentences.forEach(sentence => {
125 |                         if (pattern.test(sentence.toLowerCase()) && sentence.length > 20) {
126 |                             analysis.insights.push(sentence.trim());
127 |                         }
128 |                     });
129 |                 }
130 |             });
131 |         });
132 |         
133 |         // Extract code changes (look for technical implementations)
134 |         const codePatterns = [
135 |             /added|created|implemented|built|wrote/g,
136 |             /modified|updated|changed|refactored|improved/g,
137 |             /fixed|resolved|corrected|patched/g
138 |         ];
139 |         
140 |         messages.forEach(msg => {
141 |             const content = msg.content || '';
142 |             if (content.includes('```') || /\.(js|py|rs|go|java|cpp|c|ts|jsx|tsx)/.test(content)) {
143 |                 // This message contains code
144 |                 const lowerContent = content.toLowerCase();
145 |                 codePatterns.forEach(pattern => {
146 |                     if (pattern.test(lowerContent)) {
147 |                         const sentences = content.split(/[.!?]+/);
148 |                         sentences.forEach(sentence => {
149 |                             if (pattern.test(sentence.toLowerCase()) && sentence.length > 15) {
150 |                                 analysis.codeChanges.push(sentence.trim());
151 |                             }
152 |                         });
153 |                     }
154 |                 });
155 |             }
156 |         });
157 |         
158 |         // Extract next steps (look for future language)
159 |         const nextStepsPatterns = [
160 |             /next|todo|need to|should|will|plan to|going to/g,
161 |             /follow up|continue|proceed|implement next|work on/g,
162 |             /remaining|still need|outstanding|future/g
163 |         ];
164 |         
165 |         messages.forEach(msg => {
166 |             const content = (msg.content || '').toLowerCase();
167 |             nextStepsPatterns.forEach(pattern => {
168 |                 if (pattern.test(content)) {
169 |                     const sentences = msg.content.split(/[.!?]+/);
170 |                     sentences.forEach(sentence => {
171 |                         if (pattern.test(sentence.toLowerCase()) && sentence.length > 15) {
172 |                             analysis.nextSteps.push(sentence.trim());
173 |                         }
174 |                     });
175 |                 }
176 |             });
177 |         });
178 |         
179 |         // Calculate confidence based on extracted information
180 |         const totalExtracted = analysis.topics.length + analysis.decisions.length + 
181 |                               analysis.insights.length + analysis.codeChanges.length + 
182 |                               analysis.nextSteps.length;
183 |         
184 |         analysis.confidence = Math.min(1.0, totalExtracted / 10); // Max confidence at 10+ items
185 |         
186 |         // Limit arrays to prevent overwhelming output
187 |         analysis.topics = analysis.topics.slice(0, 5);
188 |         analysis.decisions = analysis.decisions.slice(0, 3);
189 |         analysis.insights = analysis.insights.slice(0, 3);
190 |         analysis.codeChanges = analysis.codeChanges.slice(0, 4);
191 |         analysis.nextSteps = analysis.nextSteps.slice(0, 4);
192 |         
193 |         return analysis;
194 |         
195 |     } catch (error) {
196 |         console.error('[Memory Hook] Error analyzing conversation:', error.message);
197 |         return {
198 |             topics: [],
199 |             decisions: [],
200 |             insights: [],
201 |             codeChanges: [],
202 |             nextSteps: [],
203 |             sessionLength: 0,
204 |             confidence: 0,
205 |             error: error.message
206 |         };
207 |     }
208 | }
209 | 
210 | /**
211 |  * Store session consolidation to memory service
212 |  */
213 | function storeSessionMemory(endpoint, apiKey, content, projectContext, analysis) {
214 |     return new Promise((resolve, reject) => {
215 |         const url = new URL('/api/memories', endpoint);
216 |         const isHttps = url.protocol === 'https:';
217 |         const requestModule = isHttps ? https : http;
218 | 
219 |         // Generate tags based on analysis and project context
220 |         const tags = [
221 |             'claude-code-session',
222 |             'session-consolidation',
223 |             projectContext.name,
224 |             `language:${projectContext.language}`,
225 |             ...analysis.topics.slice(0, 3), // Top 3 topics as tags
226 |             ...projectContext.frameworks.slice(0, 2), // Top 2 frameworks
227 |             `confidence:${Math.round(analysis.confidence * 100)}`
228 |         ].filter(Boolean);
229 | 
230 |         const postData = JSON.stringify({
231 |             content: content,
232 |             tags: tags,
233 |             memory_type: 'session-summary',
234 |             metadata: {
235 |                 session_analysis: {
236 |                     topics: analysis.topics,
237 |                     decisions_count: analysis.decisions.length,
238 |                     insights_count: analysis.insights.length,
239 |                     code_changes_count: analysis.codeChanges.length,
240 |                     next_steps_count: analysis.nextSteps.length,
241 |                     session_length: analysis.sessionLength,
242 |                     confidence: analysis.confidence
243 |                 },
244 |                 project_context: {
245 |                     name: projectContext.name,
246 |                     language: projectContext.language,
247 |                     frameworks: projectContext.frameworks
248 |                 },
249 |                 generated_by: 'claude-code-session-end-hook',
250 |                 generated_at: new Date().toISOString()
251 |             }
252 |         });
253 | 
254 |         const options = {
255 |             hostname: url.hostname,
256 |             port: url.port || (isHttps ? 8443 : 8000),
257 |             path: url.pathname,
258 |             method: 'POST',
259 |             headers: {
260 |                 'Content-Type': 'application/json',
261 |                 'Content-Length': Buffer.byteLength(postData),
262 |                 'Authorization': `Bearer ${apiKey}`
263 |             }
264 |         };
265 | 
266 |         // Only set rejectUnauthorized for HTTPS
267 |         if (isHttps) {
268 |             options.rejectUnauthorized = false; // For self-signed certificates
269 |         }
270 | 
271 |         const req = requestModule.request(options, (res) => {
272 |             let data = '';
273 |             res.on('data', (chunk) => {
274 |                 data += chunk;
275 |             });
276 |             res.on('end', () => {
277 |                 try {
278 |                     const response = JSON.parse(data);
279 |                     resolve(response);
280 |                 } catch (parseError) {
281 |                     resolve({ success: false, error: 'Parse error', data });
282 |                 }
283 |             });
284 |         });
285 | 
286 |         req.on('error', (error) => {
287 |             resolve({ success: false, error: error.message });
288 |         });
289 | 
290 |         req.write(postData);
291 |         req.end();
292 |     });
293 | }
294 | 
295 | /**
296 |  * Main session end hook function
297 |  */
298 | async function onSessionEnd(context) {
299 |     try {
300 |         console.log('[Memory Hook] Session ending - consolidating outcomes...');
301 |         
302 |         // Load configuration
303 |         const config = await loadConfig();
304 |         
305 |         if (!config.memoryService.enableSessionConsolidation) {
306 |             console.log('[Memory Hook] Session consolidation disabled in config');
307 |             return;
308 |         }
309 |         
310 |         // Check if session is meaningful enough to store
311 |         if (context.conversation && context.conversation.messages) {
312 |             const totalLength = context.conversation.messages
313 |                 .map(msg => (msg.content || '').length)
314 |                 .reduce((sum, len) => sum + len, 0);
315 |                 
316 |             if (totalLength < config.sessionAnalysis.minSessionLength) {
317 |                 console.log('[Memory Hook] Session too short for consolidation');
318 |                 return;
319 |             }
320 |         }
321 |         
322 |         // Detect project context
323 |         const projectContext = await detectProjectContext(context.workingDirectory || process.cwd());
324 |         console.log(`[Memory Hook] Consolidating session for project: ${projectContext.name}`);
325 |         
326 |         // Analyze conversation
327 |         const analysis = analyzeConversation(context.conversation);
328 |         
329 |         if (analysis.confidence < 0.1) {
330 |             console.log('[Memory Hook] Session analysis confidence too low, skipping consolidation');
331 |             return;
332 |         }
333 |         
334 |         console.log(`[Memory Hook] Session analysis: ${analysis.topics.length} topics, ${analysis.decisions.length} decisions, confidence: ${(analysis.confidence * 100).toFixed(1)}%`);
335 |         
336 |         // Format session consolidation
337 |         const consolidation = formatSessionConsolidation(analysis, projectContext);
338 | 
339 |         // Get endpoint and apiKey from new config structure
340 |         const endpoint = config.memoryService?.http?.endpoint || config.memoryService?.endpoint || 'http://127.0.0.1:8000';
341 |         const apiKey = config.memoryService?.http?.apiKey || config.memoryService?.apiKey || 'test-key-123';
342 | 
343 |         // Store to memory service
344 |         const result = await storeSessionMemory(
345 |             endpoint,
346 |             apiKey,
347 |             consolidation,
348 |             projectContext,
349 |             analysis
350 |         );
351 |         
352 |         if (result.success || result.content_hash) {
353 |             console.log(`[Memory Hook] Session consolidation stored successfully`);
354 |             if (result.content_hash) {
355 |                 console.log(`[Memory Hook] Memory hash: ${result.content_hash.substring(0, 8)}...`);
356 |             }
357 |         } else {
358 |             console.warn('[Memory Hook] Failed to store session consolidation:', result.error || 'Unknown error');
359 |         }
360 |         
361 |     } catch (error) {
362 |         console.error('[Memory Hook] Error in session end:', error.message);
363 |         // Fail gracefully - don't prevent session from ending
364 |     }
365 | }
366 | 
367 | /**
368 |  * Hook metadata for Claude Code
369 |  */
370 | module.exports = {
371 |     name: 'memory-awareness-session-end',
372 |     version: '1.0.0',
373 |     description: 'Automatically consolidate and store session outcomes',
374 |     trigger: 'session-end',
375 |     handler: onSessionEnd,
376 |     config: {
377 |         async: true,
378 |         timeout: 15000, // 15 second timeout
379 |         priority: 'normal'
380 |     }
381 | };
382 | 
383 | // Direct execution support for testing
384 | if (require.main === module) {
385 |     // Test the hook with mock context
386 |     const mockConversation = {
387 |         messages: [
388 |             {
389 |                 role: 'user',
390 |                 content: 'I need to implement a memory awareness system for Claude Code'
391 |             },
392 |             {
393 |                 role: 'assistant',
394 |                 content: 'I\'ll help you create a memory awareness system. We decided to use hooks for session management and implement automatic context injection.'
395 |             },
396 |             {
397 |                 role: 'user', 
398 |                 content: 'Great! I learned that we need project detection and memory scoring algorithms.'
399 |             },
400 |             {
401 |                 role: 'assistant',
402 |                 content: 'Exactly. I implemented the project detector in project-detector.js and created scoring algorithms. Next we need to test the complete system.'
403 |             }
404 |         ]
405 |     };
406 |     
407 |     const mockContext = {
408 |         workingDirectory: process.cwd(),
409 |         sessionId: 'test-session',
410 |         conversation: mockConversation
411 |     };
412 |     
413 |     onSessionEnd(mockContext)
414 |         .then(() => console.log('Session end hook test completed'))
415 |         .catch(error => console.error('Session end hook test failed:', error));
416 | }
```

--------------------------------------------------------------------------------
/archive/docs-removed-2025-08-23/development/multi-client-architecture.md:
--------------------------------------------------------------------------------

```markdown
  1 | # Multi-Client Architecture Documentation
  2 | 
  3 | This document provides technical details about the multi-client architecture implementation in the MCP Memory Service, specifically focusing on the integrated setup functionality added to `install.py`.
  4 | 
  5 | ## Overview
  6 | 
  7 | The multi-client architecture enables multiple MCP applications to safely share the same memory database concurrently. The latest implementation integrates this setup directly into the main installation process, making it universally accessible to any MCP-compatible application.
  8 | 
  9 | ## Architecture Components
 10 | 
 11 | ### 1. Client Detection System
 12 | 
 13 | #### Detection Strategy
 14 | The system uses a multi-pronged approach to detect MCP clients:
 15 | 
 16 | ```python
 17 | def detect_mcp_clients():
 18 |     """Detect installed MCP-compatible applications."""
 19 |     clients = {}
 20 |     
 21 |     # Pattern-based detection for known applications
 22 |     detection_patterns = {
 23 |         'claude_desktop': [
 24 |             Path.home() / "AppData" / "Roaming" / "Claude" / "claude_desktop_config.json",  # Windows
 25 |             Path.home() / "Library" / "Application Support" / "Claude" / "claude_desktop_config.json",  # macOS
 26 |             Path.home() / ".config" / "Claude" / "claude_desktop_config.json"  # Linux
 27 |         ],
 28 |         'vscode_mcp': [
 29 |             # VS Code settings.json locations with MCP extension detection
 30 |         ],
 31 |         'continue': [
 32 |             # Continue IDE configuration locations
 33 |         ],
 34 |         'generic_mcp': [
 35 |             # Generic MCP configuration file locations
 36 |         ]
 37 |     }
 38 | ```
 39 | 
 40 | #### Detection Logic
 41 | 1. **File-based Detection**: Checks for configuration files in standard locations
 42 | 2. **Content Analysis**: Examines configuration files for MCP-related settings
 43 | 3. **CLI Detection**: Tests for command-line tools (e.g., Claude Code)
 44 | 4. **Extension Detection**: Identifies IDE extensions that support MCP
 45 | 
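A minimal sketch of the content-analysis and CLI checks described above; the `mcpServers` key and the `claude` executable name are assumptions used for illustration:

```python
import json
import shutil
from pathlib import Path

def looks_like_mcp_config(path: Path) -> bool:
    """Content analysis: does this JSON config define any MCP servers?"""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return False
    # Claude Desktop uses an "mcpServers" mapping; other clients may use a different key.
    return bool(data.get("mcpServers"))

def claude_code_cli_available() -> bool:
    """CLI detection: is a Claude Code command available on PATH?"""
    return shutil.which("claude") is not None
```
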
 46 | ### 2. Configuration Management
 47 | 
 48 | #### Configuration Abstraction
 49 | Each client type has a dedicated configuration handler:
 50 | 
 51 | ```python
 52 | class MCPClientConfigurator:
 53 |     """Base class for MCP client configuration."""
 54 |     
 55 |     def detect(self) -> bool:
 56 |         """Detect if this client is installed."""
 57 |         raise NotImplementedError
 58 |     
 59 |     def configure(self, config: MCPConfig) -> bool:
 60 |         """Configure the client for multi-client access."""
 61 |         raise NotImplementedError
 62 |     
 63 |     def validate(self) -> bool:
 64 |         """Validate the configuration."""
 65 |         raise NotImplementedError
 66 | 
 67 | class ClaudeDesktopConfigurator(MCPClientConfigurator):
 68 |     """Configure Claude Desktop for multi-client access."""
 69 |     
 70 |     def configure(self, config: MCPConfig) -> bool:
 71 |         # Update claude_desktop_config.json
 72 |         pass
 73 | 
 74 | class ContinueIDEConfigurator(MCPClientConfigurator):
 75 |     """Configure Continue IDE for multi-client access."""
 76 |     
 77 |     def configure(self, config: MCPConfig) -> bool:
 78 |         # Update Continue configuration files
 79 |         pass
 80 | ```
 81 | 
 82 | #### Configuration Template System
 83 | Universal configuration templates ensure consistency:
 84 | 
 85 | ```python
 86 | class MCPConfig:
 87 |     """Standard MCP configuration structure."""
 88 |     
 89 |     def __init__(self, repo_path: str):
 90 |         self.repo_path = repo_path
 91 |         self.base_config = {
 92 |             "command": "uv",
 93 |             "args": ["--directory", repo_path, "run", "memory"],
 94 |             "env": {
 95 |                 "MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec",
 96 |                 "MCP_MEMORY_SQLITE_PRAGMAS": "busy_timeout=15000,cache_size=20000",
 97 |                 "LOG_LEVEL": "INFO"
 98 |             }
 99 |         }
100 |     
101 |     def for_client(self, client_type: str) -> dict:
102 |         """Generate client-specific configuration."""
103 |         config = {**self.base_config, "env": dict(self.base_config["env"])}  # also copy env so per-client tweaks don't mutate the shared template
104 |         
105 |         if client_type == "claude_desktop":
106 |             # Claude Desktop specific adjustments
107 |             pass
108 |         elif client_type == "continue":
109 |             # Continue IDE specific adjustments
110 |             pass
111 |         
112 |         return config
113 | ```
114 | 
115 | ### 3. WAL Mode Coordination
116 | 
117 | #### SQLite WAL Implementation
118 | Write-Ahead Logging mode enables safe concurrent access:
119 | 
120 | ```python
121 | async def test_wal_mode_coordination():
122 |     """Test WAL mode storage coordination for multi-client access."""
123 |     
124 |     # Create test database with WAL mode
125 |     storage = SqliteVecMemoryStorage(test_db_path)
126 |     await storage.initialize()
127 |     
128 |     # WAL mode pragmas are applied in storage initialization:
129 |     # PRAGMA journal_mode=WAL;
130 |     # PRAGMA busy_timeout=15000;
131 |     # PRAGMA cache_size=20000;
132 |     # PRAGMA synchronous=NORMAL;
133 |     
134 |     # Test concurrent access patterns
135 |     storage2 = SqliteVecMemoryStorage(test_db_path)
136 |     await storage2.initialize()
137 |     
138 |     # Verify both can read/write safely
139 |     success = await test_concurrent_operations(storage, storage2)
140 |     return success
141 | ```
142 | 
143 | #### Concurrency Model
144 | - **Multiple Readers**: Any number of clients can read simultaneously
145 | - **Single Writer**: One client writes at a time, with automatic queuing
146 | - **Retry Logic**: Exponential backoff for lock conflicts
147 | - **Timeout Handling**: 15-second timeout prevents deadlocks
148 | 
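A sketch of the retry-with-backoff pattern described above, assuming the storage layer surfaces lock conflicts as `sqlite3.OperationalError` and exposes an async `store()` call (both are assumptions for illustration):

```python
import asyncio
import sqlite3

async def store_with_retry(storage, memory, max_attempts: int = 5):
    """Retry a write with exponential backoff while the database is locked."""
    delay = 0.1
    for attempt in range(1, max_attempts + 1):
        try:
            return await storage.store(memory)
        except sqlite3.OperationalError as exc:
            if "locked" not in str(exc).lower() or attempt == max_attempts:
                raise
            await asyncio.sleep(delay)
            delay *= 2  # 0.1s, 0.2s, 0.4s, ... within the busy_timeout window
```
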
149 | ### 4. Integration Flow
150 | 
151 | #### Installation Integration Points
152 | The multi-client setup integrates at specific points in the installation flow:
153 | 
154 | ```python
155 | def main():
156 |     """Main installation function with multi-client integration."""
157 |     
158 |     # 1. System detection and backend selection
159 |     system_info = detect_system()
160 |     final_backend = determine_backend(args, system_info)
161 |     
162 |     # 2. Core installation (dependencies, package, paths)
163 |     install_success = install_package(args)
164 |     configure_paths(args)
165 |     verify_installation()
166 |     
167 |     # 3. Multi-client integration point
168 |     if should_offer_multi_client_setup(args, final_backend):
169 |         if prompt_user_for_multi_client() or args.setup_multi_client:
170 |             setup_universal_multi_client_access(system_info, args)
171 |     
172 |     # 4. Final configuration and completion
173 |     complete_installation()
174 | ```
175 | 
176 | #### Decision Logic
177 | Multi-client setup is offered based on:
178 | 
179 | ```python
180 | def should_offer_multi_client_setup(args, final_backend):
181 |     """Intelligent decision logic for multi-client offering."""
182 |     
183 |     # Required: SQLite-vec backend (only backend supporting multi-client)
184 |     if final_backend != "sqlite_vec":
185 |         return False
186 |     
187 |     # Skip in automated/server environments
188 |     if args.server_mode or args.skip_multi_client_prompt:
189 |         return False
190 |     
191 |     # Always beneficial for development environments
192 |     return True
193 | ```
194 | 
195 | ### 5. Error Handling and Fallbacks
196 | 
197 | #### Layered Error Handling
198 | The system implements multiple fallback layers:
199 | 
200 | ```python
201 | def setup_universal_multi_client_access(system_info, args):
202 |     """Configure multi-client access with comprehensive error handling."""
203 |     
204 |     try:
205 |         # Layer 1: WAL mode validation
206 |         if not test_wal_mode_coordination():
207 |             raise MCPSetupError("WAL mode coordination test failed")
208 |         
209 |         # Layer 2: Client detection and configuration
210 |         clients = detect_mcp_clients()
211 |         success_count = configure_detected_clients(clients, system_info)
212 |         
213 |         # Layer 3: Environment setup
214 |         setup_shared_environment()
215 |         
216 |         # Layer 4: Generic configuration (always succeeds)
217 |         provide_generic_configuration()
218 |         
219 |         return True
220 |         
221 |     except MCPSetupError as e:
222 |         # Graceful degradation: provide manual instructions
223 |         print_error(f"Automated setup failed: {e}")
224 |         provide_manual_setup_instructions()
225 |         return False
226 | ```
227 | 
228 | #### Fallback Mechanisms
229 | 1. **Automated → Manual**: If automated setup fails, provide manual instructions
230 | 2. **Specific → Generic**: If client-specific config fails, use generic templates
231 | 3. **Integrated → Standalone**: Direct users to standalone setup script
232 | 4. **Setup → Documentation**: Always provide comprehensive documentation
233 | 
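The generic-template fallback described above can be as simple as printing a ready-to-paste server entry built from the same template shown earlier; the exact output format here is illustrative:

```python
import json

def provide_generic_configuration(repo_path: str) -> None:
    """Print a generic MCP server entry usable by most MCP clients."""
    entry = {
        "command": "uv",
        "args": ["--directory", repo_path, "run", "memory"],
        "env": {
            "MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec",
            "MCP_MEMORY_SQLITE_PRAGMAS": "busy_timeout=15000,cache_size=20000",
        },
    }
    print("Add this server entry to your MCP client configuration:")
    print(json.dumps({"memory": entry}, indent=2))
```
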
234 | ### 6. Extensibility Framework
235 | 
236 | #### Adding New Client Support
237 | The architecture is designed for easy extension:
238 | 
239 | ```python
240 | # 1. Add detection pattern
241 | def detect_new_client():
242 |     """Detect NewClient MCP application."""
243 |     config_paths = [
244 |         # Platform-specific configuration file locations
245 |     ]
246 |     
247 |     for path in config_paths:
248 |         if path.exists() and is_mcp_configured(path):
249 |             return path
250 |     return None
251 | 
252 | # 2. Add configuration handler
253 | def configure_new_client_multi_client(config_path):
254 |     """Configure NewClient for multi-client access."""
255 |     try:
256 |         # Read existing configuration
257 |         config = read_client_config(config_path)
258 |         
259 |         # Apply multi-client settings
260 |         config.update(generate_mcp_config())
261 |         
262 |         # Write updated configuration
263 |         write_client_config(config_path, config)
264 |         
265 |         print_info("  [OK] NewClient: Updated configuration")
266 |         return True
267 |     except Exception as e:
268 |         print_warning(f"  -> NewClient configuration failed: {e}")
269 |         return False
270 | 
271 | # 3. Register in detection system
272 | def detect_mcp_clients():
273 |     clients = {}
274 |     
275 |     # ... existing detection logic ...
276 |     
277 |     # Add new client detection
278 |     new_client_path = detect_new_client()
279 |     if new_client_path:
280 |         clients['new_client'] = new_client_path
281 |     
282 |     return clients
283 | ```
284 | 
285 | #### Plugin Architecture Potential
286 | The current implementation could evolve into a plugin system:
287 | 
288 | ```python
289 | class MCPClientPlugin:
290 |     """Base class for MCP client plugins."""
291 |     
292 |     name: str
293 |     priority: int
294 |     
295 |     def detect(self) -> Optional[Path]:
296 |         """Detect client installation."""
297 |         pass
298 |     
299 |     def configure(self, config: MCPConfig, config_path: Path) -> bool:
300 |         """Configure client for multi-client access."""
301 |         pass
302 |     
303 |     def validate(self, config_path: Path) -> bool:
304 |         """Validate client configuration."""
305 |         pass
306 | 
307 | # Plugin registration system
308 | REGISTERED_PLUGINS = [
309 |     ClaudeDesktopPlugin(),
310 |     ContinueIDEPlugin(),
311 |     VSCodeMCPPlugin(),
312 |     CursorIDEPlugin(),
313 |     GenericMCPPlugin(),  # Fallback plugin
314 | ]
315 | ```
316 | 
317 | ## Technical Decisions
318 | 
319 | ### Why SQLite-vec Only?
320 | Multi-client support is limited to SQLite-vec backend because:
321 | 
322 | 1. **WAL Mode Support**: SQLite's WAL mode provides robust concurrent access
323 | 2. **File-based Storage**: Single database file simplifies sharing
324 | 3. **Performance**: SQLite is optimized for multi-reader scenarios
325 | 4. **Reliability**: Well-tested concurrency mechanisms
326 | 5. **Simplicity**: No network coordination required
327 | 
328 | ### Why Integrated Setup?
329 | Integration into the main installer provides:
330 | 
331 | 1. **Discoverability**: Users learn about multi-client capabilities during installation
332 | 2. **Convenience**: One-step setup for all MCP applications
333 | 3. **Consistency**: Uniform configuration across all clients
334 | 4. **Future-proofing**: Automatic support for new MCP applications
335 | 
336 | ### Configuration Strategy
337 | Direct configuration file modification was chosen over:
338 | 
339 | - **Environment variables only**: Would require manual client restart
340 | - **Network-based coordination**: Adds complexity and failure points
341 | - **Copy-paste instructions**: Degrades the user experience and increases the chance of errors
342 | 
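A sketch of what direct modification looks like in practice: read the existing client configuration, merge in the memory server entry, and write it back without disturbing other servers (the `mcpServers`/`memory` key names follow Claude Desktop's convention and are assumptions for other clients):

```python
import json
from pathlib import Path

def merge_memory_server(config_path: Path, server_entry: dict) -> None:
    """Add or replace the memory server entry, leaving other servers untouched."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config.setdefault("mcpServers", {})["memory"] = server_entry
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
```
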
343 | ## Performance Considerations
344 | 
345 | ### Database Performance
346 | - **WAL Mode**: Allows concurrent readers without blocking
347 | - **Cache Size**: A 20,000-page cache (`cache_size=20000`) improves multi-client performance
348 | - **Busy Timeout**: 15-second timeout prevents deadlocks
349 | - **Synchronous Mode**: NORMAL mode balances safety and performance
350 | 
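These pragmas can be applied and verified with the standard `sqlite3` module, independent of the storage class; a minimal check:

```python
import sqlite3

def apply_wal_pragmas(db_path: str) -> dict:
    """Apply the multi-client pragmas and read back their effective values."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute("PRAGMA busy_timeout=15000")
        conn.execute("PRAGMA cache_size=20000")
        conn.execute("PRAGMA synchronous=NORMAL")
        return {
            "journal_mode": conn.execute("PRAGMA journal_mode").fetchone()[0],
            "busy_timeout": conn.execute("PRAGMA busy_timeout").fetchone()[0],
            "synchronous": conn.execute("PRAGMA synchronous").fetchone()[0],
        }
    finally:
        conn.close()
```
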
351 | ### Memory Usage
352 | - **Shared Database**: Single database reduces total memory usage
353 | - **Connection Pooling**: Each client maintains its own connection pool
354 | - **Cache Coordination**: WAL mode provides implicit cache coordination
355 | 
356 | ### Startup Performance
357 | - **Lazy Initialization**: Clients initialize storage on first use
358 | - **Fast Detection**: Configuration file checking is optimized
359 | - **Minimal Overhead**: Setup adds <1 second to installation time
360 | 
361 | ## Security Considerations
362 | 
363 | ### File System Security
364 | - **Path Validation**: All configuration paths are validated before modification
365 | - **Backup Creation**: Original configurations are backed up before changes
366 | - **Permission Checks**: Write permissions verified before attempting changes
367 | 
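A sketch of the backup-and-replace pattern these safeguards imply, using only the standard library (the function name and backup suffix are illustrative):

```python
import os
import shutil
from pathlib import Path

def safe_update(config_path: Path, new_text: str) -> None:
    """Verify write access, back up the original, then replace atomically."""
    if config_path.exists():
        if not os.access(config_path, os.W_OK):
            raise PermissionError(f"No write permission for {config_path}")
        shutil.copy2(config_path, config_path.with_suffix(config_path.suffix + ".backup"))
    tmp_path = config_path.with_suffix(config_path.suffix + ".tmp")
    tmp_path.write_text(new_text, encoding="utf-8")
    os.replace(tmp_path, config_path)  # atomic on the same filesystem
```
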
368 | ### Configuration Security
369 | - **Template Validation**: All configuration templates are validated
370 | - **Injection Prevention**: No user input is directly inserted into configurations
371 | - **Safe Defaults**: Conservative defaults used for security-sensitive settings
372 | 
373 | ## Testing Strategy
374 | 
375 | ### Unit Testing
376 | ```python
377 | def test_client_detection():
378 |     """Test MCP client detection functionality."""
379 |     
380 |     # Mock configuration files
381 |     with mock_config_files():
382 |         clients = detect_mcp_clients()
383 |         assert 'claude_desktop' in clients
384 |         assert 'continue' in clients
385 | 
386 | def test_configuration_generation():
387 |     """Test MCP configuration generation."""
388 |     
389 |     config = MCPConfig("/test/repo")
390 |     claude_config = config.for_client("claude_desktop")
391 |     
392 |     assert claude_config["env"]["MCP_MEMORY_STORAGE_BACKEND"] == "sqlite_vec"
393 |     assert "busy_timeout" in claude_config["env"]["MCP_MEMORY_SQLITE_PRAGMAS"]
394 | ```
395 | 
396 | ### Integration Testing
397 | ```python
398 | async def test_multi_client_coordination():
399 |     """Test actual multi-client database coordination."""
400 |     
401 |     # Create test database
402 |     db_path = create_test_database()
403 |     
404 |     # Initialize multiple storage instances
405 |     storage1 = SqliteVecMemoryStorage(db_path)
406 |     storage2 = SqliteVecMemoryStorage(db_path)
407 |     
408 |     await storage1.initialize()
409 |     await storage2.initialize()
410 |     
411 |     # Test concurrent operations
412 |     success = await test_concurrent_read_write(storage1, storage2)
413 |     assert success
414 | ```
415 | 
416 | ### End-to-End Testing
417 | ```python
418 | def test_full_installation_flow():
419 |     """Test complete installation with multi-client setup."""
420 |     
421 |     with temporary_environment():
422 |         # Run installer with multi-client setup
423 |         result = run_installer(["--setup-multi-client", "--storage-backend", "sqlite_vec"])
424 |         
425 |         assert result.success
426 |         assert result.multi_client_configured
427 |         assert validate_client_configurations()
428 | ```
429 | 
430 | ## Monitoring and Observability
431 | 
432 | ### Logging Framework
433 | ```python
434 | # Multi-client specific logging
435 | logger = logging.getLogger("mcp.multi_client")
436 | 
437 | def setup_universal_multi_client_access(system_info, args):
438 |     logger.info("Starting universal multi-client setup")
439 |     
440 |     clients = detect_mcp_clients()
441 |     logger.info(f"Detected {len(clients)} MCP clients: {list(clients.keys())}")
442 |     
443 |     for client_type, config_path in clients.items():
444 |         try:
445 |             success = configure_client(client_type, config_path)
446 |             logger.info(f"Client {client_type} configuration: {'success' if success else 'failed'}")
447 |         except Exception as e:
448 |             logger.error(f"Client {client_type} configuration error: {e}")
449 | ```
450 | 
451 | ### Metrics Collection
452 | ```python
453 | # Installation metrics
454 | class InstallationMetrics:
455 |     def __init__(self):
456 |         self.clients_detected = 0
457 |         self.clients_configured = 0
458 |         self.configuration_errors = []
459 |         self.setup_duration = 0
460 |     
461 |     def record_client_detection(self, client_type: str):
462 |         self.clients_detected += 1
463 |     
464 |     def record_configuration_success(self, client_type: str):
465 |         self.clients_configured += 1
466 |     
467 |     def record_configuration_error(self, client_type: str, error: str):
468 |         self.configuration_errors.append((client_type, error))
469 | ```
470 | 
471 | ## Future Enhancements
472 | 
473 | ### Planned Improvements
474 | 1. **HTTP Coordination**: Advanced coordination for 3+ clients
475 | 2. **Configuration Validation**: Real-time validation of client configurations
476 | 3. **Auto-Updates**: Automatic configuration updates for new MCP versions
477 | 4. **Cloud Sync**: Multi-device memory synchronization
478 | 5. **Plugin System**: Formal plugin architecture for client support
479 | 
480 | ### Research Areas
481 | 1. **Conflict Resolution**: Advanced merge strategies for concurrent edits
482 | 2. **Performance Optimization**: Database sharding for large-scale deployments
483 | 3. **Security Enhancements**: Encrypted inter-client communication
484 | 4. **Mobile Support**: Extension to mobile MCP applications
485 | 
486 | This architecture provides a robust, extensible foundation for universal multi-client support in the MCP Memory Service ecosystem.
```