This is page 18 of 23. Use http://codebase.md/basicmachines-co/basic-memory?lines=true&page={x} to view the full context.
# Directory Structure
```
├── .claude
│ ├── agents
│ │ ├── python-developer.md
│ │ └── system-architect.md
│ └── commands
│ ├── release
│ │ ├── beta.md
│ │ ├── changelog.md
│ │ ├── release-check.md
│ │ └── release.md
│ ├── spec.md
│ └── test-live.md
├── .dockerignore
├── .github
│ ├── dependabot.yml
│ ├── ISSUE_TEMPLATE
│ │ ├── bug_report.md
│ │ ├── config.yml
│ │ ├── documentation.md
│ │ └── feature_request.md
│ └── workflows
│ ├── claude-code-review.yml
│ ├── claude-issue-triage.yml
│ ├── claude.yml
│ ├── dev-release.yml
│ ├── docker.yml
│ ├── pr-title.yml
│ ├── release.yml
│ └── test.yml
├── .gitignore
├── .python-version
├── CHANGELOG.md
├── CITATION.cff
├── CLA.md
├── CLAUDE.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── docker-compose.yml
├── Dockerfile
├── docs
│ ├── ai-assistant-guide-extended.md
│ ├── character-handling.md
│ ├── cloud-cli.md
│ └── Docker.md
├── justfile
├── LICENSE
├── llms-install.md
├── pyproject.toml
├── README.md
├── SECURITY.md
├── smithery.yaml
├── specs
│ ├── SPEC-1 Specification-Driven Development Process.md
│ ├── SPEC-10 Unified Deployment Workflow and Event Tracking.md
│ ├── SPEC-11 Basic Memory API Performance Optimization.md
│ ├── SPEC-12 OpenTelemetry Observability.md
│ ├── SPEC-13 CLI Authentication with Subscription Validation.md
│ ├── SPEC-14 Cloud Git Versioning & GitHub Backup.md
│ ├── SPEC-14- Cloud Git Versioning & GitHub Backup.md
│ ├── SPEC-15 Configuration Persistence via Tigris for Cloud Tenants.md
│ ├── SPEC-16 MCP Cloud Service Consolidation.md
│ ├── SPEC-17 Semantic Search with ChromaDB.md
│ ├── SPEC-18 AI Memory Management Tool.md
│ ├── SPEC-19 Sync Performance and Memory Optimization.md
│ ├── SPEC-2 Slash Commands Reference.md
│ ├── SPEC-3 Agent Definitions.md
│ ├── SPEC-4 Notes Web UI Component Architecture.md
│ ├── SPEC-5 CLI Cloud Upload via WebDAV.md
│ ├── SPEC-6 Explicit Project Parameter Architecture.md
│ ├── SPEC-7 POC to spike Tigris Turso for local access to cloud data.md
│ ├── SPEC-8 TigrisFS Integration.md
│ ├── SPEC-9 Multi-Project Bidirectional Sync Architecture.md
│ ├── SPEC-9 Signed Header Tenant Information.md
│ └── SPEC-9-1 Follow-Ups- Conflict, Sync, and Observability.md
├── src
│ └── basic_memory
│ ├── __init__.py
│ ├── alembic
│ │ ├── alembic.ini
│ │ ├── env.py
│ │ ├── migrations.py
│ │ ├── script.py.mako
│ │ └── versions
│ │ ├── 3dae7c7b1564_initial_schema.py
│ │ ├── 502b60eaa905_remove_required_from_entity_permalink.py
│ │ ├── 5fe1ab1ccebe_add_projects_table.py
│ │ ├── 647e7a75e2cd_project_constraint_fix.py
│ │ ├── 9d9c1cb7d8f5_add_mtime_and_size_columns_to_entity_.py
│ │ ├── a1b2c3d4e5f6_fix_project_foreign_keys.py
│ │ ├── b3c3938bacdb_relation_to_name_unique_index.py
│ │ ├── cc7172b46608_update_search_index_schema.py
│ │ └── e7e1f4367280_add_scan_watermark_tracking_to_project.py
│ ├── api
│ │ ├── __init__.py
│ │ ├── app.py
│ │ ├── routers
│ │ │ ├── __init__.py
│ │ │ ├── directory_router.py
│ │ │ ├── importer_router.py
│ │ │ ├── knowledge_router.py
│ │ │ ├── management_router.py
│ │ │ ├── memory_router.py
│ │ │ ├── project_router.py
│ │ │ ├── prompt_router.py
│ │ │ ├── resource_router.py
│ │ │ ├── search_router.py
│ │ │ └── utils.py
│ │ └── template_loader.py
│ ├── cli
│ │ ├── __init__.py
│ │ ├── app.py
│ │ ├── auth.py
│ │ ├── commands
│ │ │ ├── __init__.py
│ │ │ ├── cloud
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api_client.py
│ │ │ │ ├── bisync_commands.py
│ │ │ │ ├── cloud_utils.py
│ │ │ │ ├── core_commands.py
│ │ │ │ ├── mount_commands.py
│ │ │ │ ├── rclone_config.py
│ │ │ │ ├── rclone_installer.py
│ │ │ │ ├── upload_command.py
│ │ │ │ └── upload.py
│ │ │ ├── command_utils.py
│ │ │ ├── db.py
│ │ │ ├── import_chatgpt.py
│ │ │ ├── import_claude_conversations.py
│ │ │ ├── import_claude_projects.py
│ │ │ ├── import_memory_json.py
│ │ │ ├── mcp.py
│ │ │ ├── project.py
│ │ │ ├── status.py
│ │ │ ├── sync.py
│ │ │ └── tool.py
│ │ └── main.py
│ ├── config.py
│ ├── db.py
│ ├── deps.py
│ ├── file_utils.py
│ ├── ignore_utils.py
│ ├── importers
│ │ ├── __init__.py
│ │ ├── base.py
│ │ ├── chatgpt_importer.py
│ │ ├── claude_conversations_importer.py
│ │ ├── claude_projects_importer.py
│ │ ├── memory_json_importer.py
│ │ └── utils.py
│ ├── markdown
│ │ ├── __init__.py
│ │ ├── entity_parser.py
│ │ ├── markdown_processor.py
│ │ ├── plugins.py
│ │ ├── schemas.py
│ │ └── utils.py
│ ├── mcp
│ │ ├── __init__.py
│ │ ├── async_client.py
│ │ ├── project_context.py
│ │ ├── prompts
│ │ │ ├── __init__.py
│ │ │ ├── ai_assistant_guide.py
│ │ │ ├── continue_conversation.py
│ │ │ ├── recent_activity.py
│ │ │ ├── search.py
│ │ │ └── utils.py
│ │ ├── resources
│ │ │ ├── ai_assistant_guide.md
│ │ │ └── project_info.py
│ │ ├── server.py
│ │ └── tools
│ │ ├── __init__.py
│ │ ├── build_context.py
│ │ ├── canvas.py
│ │ ├── chatgpt_tools.py
│ │ ├── delete_note.py
│ │ ├── edit_note.py
│ │ ├── list_directory.py
│ │ ├── move_note.py
│ │ ├── project_management.py
│ │ ├── read_content.py
│ │ ├── read_note.py
│ │ ├── recent_activity.py
│ │ ├── search.py
│ │ ├── utils.py
│ │ ├── view_note.py
│ │ └── write_note.py
│ ├── models
│ │ ├── __init__.py
│ │ ├── base.py
│ │ ├── knowledge.py
│ │ ├── project.py
│ │ └── search.py
│ ├── repository
│ │ ├── __init__.py
│ │ ├── entity_repository.py
│ │ ├── observation_repository.py
│ │ ├── project_info_repository.py
│ │ ├── project_repository.py
│ │ ├── relation_repository.py
│ │ ├── repository.py
│ │ └── search_repository.py
│ ├── schemas
│ │ ├── __init__.py
│ │ ├── base.py
│ │ ├── cloud.py
│ │ ├── delete.py
│ │ ├── directory.py
│ │ ├── importer.py
│ │ ├── memory.py
│ │ ├── project_info.py
│ │ ├── prompt.py
│ │ ├── request.py
│ │ ├── response.py
│ │ ├── search.py
│ │ └── sync_report.py
│ ├── services
│ │ ├── __init__.py
│ │ ├── context_service.py
│ │ ├── directory_service.py
│ │ ├── entity_service.py
│ │ ├── exceptions.py
│ │ ├── file_service.py
│ │ ├── initialization.py
│ │ ├── link_resolver.py
│ │ ├── project_service.py
│ │ ├── search_service.py
│ │ └── service.py
│ ├── sync
│ │ ├── __init__.py
│ │ ├── background_sync.py
│ │ ├── sync_service.py
│ │ └── watch_service.py
│ ├── templates
│ │ └── prompts
│ │ ├── continue_conversation.hbs
│ │ └── search.hbs
│ └── utils.py
├── test-int
│ ├── BENCHMARKS.md
│ ├── cli
│ │ ├── test_project_commands_integration.py
│ │ ├── test_sync_commands_integration.py
│ │ └── test_version_integration.py
│ ├── conftest.py
│ ├── mcp
│ │ ├── test_build_context_underscore.py
│ │ ├── test_build_context_validation.py
│ │ ├── test_chatgpt_tools_integration.py
│ │ ├── test_default_project_mode_integration.py
│ │ ├── test_delete_note_integration.py
│ │ ├── test_edit_note_integration.py
│ │ ├── test_list_directory_integration.py
│ │ ├── test_move_note_integration.py
│ │ ├── test_project_management_integration.py
│ │ ├── test_project_state_sync_integration.py
│ │ ├── test_read_content_integration.py
│ │ ├── test_read_note_integration.py
│ │ ├── test_search_integration.py
│ │ ├── test_single_project_mcp_integration.py
│ │ └── test_write_note_integration.py
│ ├── test_db_wal_mode.py
│ ├── test_disable_permalinks_integration.py
│ └── test_sync_performance_benchmark.py
├── tests
│ ├── __init__.py
│ ├── api
│ │ ├── conftest.py
│ │ ├── test_async_client.py
│ │ ├── test_continue_conversation_template.py
│ │ ├── test_directory_router.py
│ │ ├── test_importer_router.py
│ │ ├── test_knowledge_router.py
│ │ ├── test_management_router.py
│ │ ├── test_memory_router.py
│ │ ├── test_project_router_operations.py
│ │ ├── test_project_router.py
│ │ ├── test_prompt_router.py
│ │ ├── test_relation_background_resolution.py
│ │ ├── test_resource_router.py
│ │ ├── test_search_router.py
│ │ ├── test_search_template.py
│ │ ├── test_template_loader_helpers.py
│ │ └── test_template_loader.py
│ ├── cli
│ │ ├── conftest.py
│ │ ├── test_bisync_commands.py
│ │ ├── test_cli_tools.py
│ │ ├── test_cloud_authentication.py
│ │ ├── test_cloud_utils.py
│ │ ├── test_ignore_utils.py
│ │ ├── test_import_chatgpt.py
│ │ ├── test_import_claude_conversations.py
│ │ ├── test_import_claude_projects.py
│ │ ├── test_import_memory_json.py
│ │ └── test_upload.py
│ ├── conftest.py
│ ├── db
│ │ └── test_issue_254_foreign_key_constraints.py
│ ├── importers
│ │ ├── test_importer_base.py
│ │ └── test_importer_utils.py
│ ├── markdown
│ │ ├── __init__.py
│ │ ├── test_date_frontmatter_parsing.py
│ │ ├── test_entity_parser_error_handling.py
│ │ ├── test_entity_parser.py
│ │ ├── test_markdown_plugins.py
│ │ ├── test_markdown_processor.py
│ │ ├── test_observation_edge_cases.py
│ │ ├── test_parser_edge_cases.py
│ │ ├── test_relation_edge_cases.py
│ │ └── test_task_detection.py
│ ├── mcp
│ │ ├── conftest.py
│ │ ├── test_obsidian_yaml_formatting.py
│ │ ├── test_permalink_collision_file_overwrite.py
│ │ ├── test_prompts.py
│ │ ├── test_resources.py
│ │ ├── test_tool_build_context.py
│ │ ├── test_tool_canvas.py
│ │ ├── test_tool_delete_note.py
│ │ ├── test_tool_edit_note.py
│ │ ├── test_tool_list_directory.py
│ │ ├── test_tool_move_note.py
│ │ ├── test_tool_read_content.py
│ │ ├── test_tool_read_note.py
│ │ ├── test_tool_recent_activity.py
│ │ ├── test_tool_resource.py
│ │ ├── test_tool_search.py
│ │ ├── test_tool_utils.py
│ │ ├── test_tool_view_note.py
│ │ ├── test_tool_write_note.py
│ │ └── tools
│ │ └── test_chatgpt_tools.py
│ ├── Non-MarkdownFileSupport.pdf
│ ├── repository
│ │ ├── test_entity_repository_upsert.py
│ │ ├── test_entity_repository.py
│ │ ├── test_entity_upsert_issue_187.py
│ │ ├── test_observation_repository.py
│ │ ├── test_project_info_repository.py
│ │ ├── test_project_repository.py
│ │ ├── test_relation_repository.py
│ │ ├── test_repository.py
│ │ ├── test_search_repository_edit_bug_fix.py
│ │ └── test_search_repository.py
│ ├── schemas
│ │ ├── test_base_timeframe_minimum.py
│ │ ├── test_memory_serialization.py
│ │ ├── test_memory_url_validation.py
│ │ ├── test_memory_url.py
│ │ ├── test_schemas.py
│ │ └── test_search.py
│ ├── Screenshot.png
│ ├── services
│ │ ├── test_context_service.py
│ │ ├── test_directory_service.py
│ │ ├── test_entity_service_disable_permalinks.py
│ │ ├── test_entity_service.py
│ │ ├── test_file_service.py
│ │ ├── test_initialization.py
│ │ ├── test_link_resolver.py
│ │ ├── test_project_removal_bug.py
│ │ ├── test_project_service_operations.py
│ │ ├── test_project_service.py
│ │ └── test_search_service.py
│ ├── sync
│ │ ├── test_character_conflicts.py
│ │ ├── test_sync_service_incremental.py
│ │ ├── test_sync_service.py
│ │ ├── test_sync_wikilink_issue.py
│ │ ├── test_tmp_files.py
│ │ ├── test_watch_service_edge_cases.py
│ │ ├── test_watch_service_reload.py
│ │ └── test_watch_service.py
│ ├── test_config.py
│ ├── test_db_migration_deduplication.py
│ ├── test_deps.py
│ ├── test_production_cascade_delete.py
│ └── utils
│ ├── test_file_utils.py
│ ├── test_frontmatter_obsidian_compatible.py
│ ├── test_parse_tags.py
│ ├── test_permalink_formatting.py
│ ├── test_utf8_handling.py
│ └── test_validate_project_path.py
├── uv.lock
├── v0.15.0-RELEASE-DOCS.md
└── v15-docs
├── api-performance.md
├── background-relations.md
├── basic-memory-home.md
├── bug-fixes.md
├── chatgpt-integration.md
├── cloud-authentication.md
├── cloud-bisync.md
├── cloud-mode-usage.md
├── cloud-mount.md
├── default-project-mode.md
├── env-file-removal.md
├── env-var-overrides.md
├── explicit-project-parameter.md
├── gitignore-integration.md
├── project-root-env-var.md
├── README.md
└── sqlite-performance.md
```
# Files
--------------------------------------------------------------------------------
/specs/SPEC-13 CLI Authentication with Subscription Validation.md:
--------------------------------------------------------------------------------
```markdown
1 | ---
2 | title: 'SPEC-13: CLI Authentication with Subscription Validation'
3 | type: spec
4 | permalink: specs/spec-13-cli-auth-subscription-validation
5 | tags:
6 | - authentication
7 | - security
8 | - cli
9 | - subscription
10 | status: draft
11 | created: 2025-10-02
12 | ---
13 |
14 | # SPEC-13: CLI Authentication with Subscription Validation
15 |
16 | ## Why
17 |
18 | The Basic Memory Cloud CLI currently has a security gap in authentication that allows unauthorized access:
19 |
20 | **Current Web Flow (Secure)**:
21 | 1. User signs up via WorkOS AuthKit
22 | 2. User creates Polar subscription
23 | 3. Web app validates subscription before calling `POST /tenants/setup`
24 | 4. Tenant provisioned only after subscription validation ✅
25 |
26 | **Current CLI Flow (Insecure)**:
27 | 1. User signs up via WorkOS AuthKit (OAuth device flow)
28 | 2. User runs `bm cloud login`
29 | 3. CLI receives JWT token from WorkOS
30 | 4. CLI can access all cloud endpoints without subscription check ❌
31 |
32 | **Problem**: Anyone can sign up with WorkOS and immediately access cloud infrastructure via CLI without having an active Polar subscription. This creates:
33 | - Revenue loss (free resource consumption)
34 | - Security risk (unauthorized data access)
35 | - Support burden (users accessing features they haven't paid for)
36 |
37 | **Root Cause**: The CLI authentication flow validates JWT tokens but doesn't verify subscription status before granting access to cloud resources.
38 |
39 | ## What
40 |
41 | Add subscription validation to authentication flow to ensure only users with active Polar subscriptions can access cloud resources across all access methods (CLI, MCP, Web App, Direct API).
42 |
43 | **Affected Components**:
44 |
45 | ### basic-memory-cloud (Cloud Service)
46 | - `apps/cloud/src/basic_memory_cloud/deps.py` - Add subscription validation dependency
47 | - `apps/cloud/src/basic_memory_cloud/services/subscription_service.py` - Add subscription check method
48 | - `apps/cloud/src/basic_memory_cloud/api/tenant_mount.py` - Protect mount endpoints
49 | - `apps/cloud/src/basic_memory_cloud/api/proxy.py` - Protect proxy endpoints
50 |
51 | ### basic-memory (CLI)
52 | - `src/basic_memory/cli/commands/cloud/core_commands.py` - Handle 403 errors
53 | - `src/basic_memory/cli/commands/cloud/api_client.py` - Parse subscription errors
54 | - `docs/cloud-cli.md` - Document subscription requirement
55 |
56 | **Endpoints to Protect**:
57 | - `GET /tenant/mount/info` - Used by CLI bisync setup
58 | - `POST /tenant/mount/credentials` - Used by CLI bisync credentials
59 | - `GET /proxy/{path:path}` - Used by Web App, MCP tools, CLI tools, Direct API
60 | - All other `/proxy/*` endpoints - Centralized access point for all user operations
61 |
62 | ## Complete Authentication Flow Analysis
63 |
64 | ### Overview of All Access Flows
65 |
66 | Basic Memory Cloud has **7 distinct authentication flows**. This spec closes subscription validation gaps in flows 2-4 and 6, which all converge on the `/proxy/*` endpoints.
67 |
68 | ### Flow 1: Polar Webhook → Registration ✅ SECURE
69 | ```
70 | Polar webhook → POST /api/webhooks/polar
71 | → Validates Polar webhook signature
72 | → Creates/updates subscription in database
73 | → No direct user access - webhook only
74 | ```
75 | **Auth**: Polar webhook signature validation
76 | **Subscription Check**: N/A (webhook creates subscriptions)
77 | **Status**: ✅ Secure - webhook validated, no user JWT involved
78 |
79 | ### Flow 2: Web App Login ❌ NEEDS FIX
80 | ```
81 | User → apps/web (Vue.js/Nuxt)
82 | → WorkOS AuthKit magic link authentication
83 | → JWT stored in browser session
84 | → Web app calls /proxy/{project}/... endpoints (memory, directory, projects)
85 | → proxy.py validates JWT but does NOT check subscription
86 | → Access granted without subscription ❌
87 | ```
88 | **Auth**: WorkOS JWT via `CurrentUserProfileHybridJwtDep`
89 | **Subscription Check**: ❌ Missing
90 | **Fixed By**: Task 1.4 (protect `/proxy/*` endpoints)
91 |
92 | ### Flow 3: MCP (Model Context Protocol) ❌ NEEDS FIX
93 | ```
94 | AI Agent (Claude, Cursor, etc.) → https://mcp.basicmemory.com
95 | → AuthKit OAuth device flow
96 | → JWT stored in AI agent
97 | → MCP tools call {cloud_host}/proxy/{endpoint} with Authorization header
98 | → proxy.py validates JWT but does NOT check subscription
99 | → MCP tools can access all cloud resources without subscription ❌
100 | ```
101 | **Auth**: AuthKit JWT via `CurrentUserProfileHybridJwtDep`
102 | **Subscription Check**: ❌ Missing
103 | **Fixed By**: Task 1.4 (protect `/proxy/*` endpoints)
104 |
105 | ### Flow 4: CLI Auth (basic-memory) ❌ NEEDS FIX
106 | ```
107 | User → bm cloud login
108 | → AuthKit OAuth device flow
109 | → JWT stored in ~/.basic-memory/tokens.json
110 | → CLI calls:
111 | - {cloud_host}/tenant/mount/info (for bisync setup)
112 | - {cloud_host}/tenant/mount/credentials (for bisync credentials)
113 | - {cloud_host}/proxy/{endpoint} (for all MCP tools)
114 | → tenant_mount.py and proxy.py validate JWT but do NOT check subscription
115 | → Access granted without subscription ❌
116 | ```
117 | **Auth**: AuthKit JWT via `CurrentUserProfileHybridJwtDep`
118 | **Subscription Check**: ❌ Missing
119 | **Fixed By**: Task 1.3 (protect `/tenant/mount/*`) + Task 1.4 (protect `/proxy/*`)
120 |
121 | ### Flow 5: Cloud CLI (Admin Tasks) ✅ SECURE
122 | ```
123 | Admin → python -m basic_memory_cloud.cli.tenant_cli
124 | → Uses CLIAuth with admin WorkOS OAuth client
125 | → Gets JWT token with admin org membership
126 | → Calls /tenants/* endpoints (create, list, delete tenants)
127 | → tenants.py validates JWT AND admin org membership via AdminUserHybridDep
128 | → Access granted only to admin organization members ✅
129 | ```
130 | **Auth**: AuthKit JWT + Admin org validation via `AdminUserHybridDep`
131 | **Subscription Check**: N/A (admins bypass subscription requirement)
132 | **Status**: ✅ Secure - admin-only endpoints, separate from user flows
133 |
134 | ### Flow 6: Direct API Calls ❌ NEEDS FIX
135 | ```
136 | Any HTTP client → {cloud_host}/proxy/{endpoint}
137 | → Sends Authorization: Bearer {jwt} header
138 | → proxy.py validates JWT but does NOT check subscription
139 | → Direct API access without subscription ❌
140 | ```
141 | **Auth**: WorkOS or AuthKit JWT via `CurrentUserProfileHybridJwtDep`
142 | **Subscription Check**: ❌ Missing
143 | **Fixed By**: Task 1.4 (protect `/proxy/*` endpoints)
144 |
145 | ### Flow 7: Tenant API Instance (Internal) ✅ SECURE
146 | ```
147 | /proxy/* → Tenant API (basic-memory-{tenant_id}.fly.dev)
148 | → Validates signed header from proxy (tenant_id + signature)
149 | → Direct external access will be disabled in production
150 | → Only accessible via /proxy endpoints
151 | ```
152 | **Auth**: Signed header validation from proxy
153 | **Subscription Check**: N/A (internal only, validated at proxy layer)
154 | **Status**: ✅ Secure - validates proxy signature, not directly accessible
155 |
156 | ### Authentication Flow Summary Matrix
157 |
158 | | Flow | Access Method | Current Auth | Subscription Check | Fixed By SPEC-13 |
159 | |------|---------------|--------------|-------------------|------------------|
160 | | 1. Polar Webhook | Polar webhook → `/api/webhooks/polar` | Polar signature | N/A (webhook) | N/A |
161 | | 2. Web App | Browser → `/proxy/*` | WorkOS JWT ✅ | ❌ Missing | ✅ Task 1.4 |
162 | | 3. MCP | AI Agent → `/proxy/*` | AuthKit JWT ✅ | ❌ Missing | ✅ Task 1.4 |
163 | | 4. CLI | `bm cloud` → `/tenant/mount/*` + `/proxy/*` | AuthKit JWT ✅ | ❌ Missing | ✅ Task 1.3 + 1.4 |
164 | | 5. Cloud CLI (Admin) | `tenant_cli` → `/tenants/*` | AuthKit JWT ✅ + Admin org | N/A (admin) | N/A (admin bypass) |
165 | | 6. Direct API | HTTP client → `/proxy/*` | WorkOS/AuthKit JWT ✅ | ❌ Missing | ✅ Task 1.4 |
166 | | 7. Tenant API | Proxy → tenant instance | Proxy signature ✅ | N/A (internal) | N/A |
167 |
168 | ### Key Insights
169 |
170 | 1. **Single Point of Failure**: All user access (Web, MCP, CLI, Direct API) converges on `/proxy/*` endpoints
171 | 2. **Centralized Fix**: Protecting `/proxy/*` with subscription validation closes gaps in flows 2, 3, 4, and 6 simultaneously
172 | 3. **Admin Bypass**: Cloud CLI admin tasks use separate `/tenants/*` endpoints with admin-only access (no subscription needed)
173 | 4. **Defense in Depth**: `/tenant/mount/*` endpoints also protected for CLI bisync operations
174 |
175 | ### Architecture Benefits
176 |
177 | The `/proxy` layer serves as the **single centralized authorization point** for all user access:
178 | - ✅ One place to validate JWT tokens
179 | - ✅ One place to check subscription status
180 | - ✅ One place to handle tenant routing
181 | - ✅ Protects Web App, MCP, CLI, and Direct API simultaneously
182 |
183 | This architecture makes the fix comprehensive and maintainable.
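
The check order at this centralized point can be distilled into a short sketch. This is illustrative only: `validate_jwt`, `check_subscription`, and `lookup_profile` below are hypothetical stand-ins for the cloud service's real dependencies, and the 401/403 shapes mirror the error contract described in this spec.

```python
# Illustrative sketch of the centralized /proxy authorization order.
# validate_jwt / check_subscription / lookup_profile are hypothetical
# stand-ins, not the actual basic-memory-cloud functions.

class AuthError(Exception):
    def __init__(self, status_code, detail):
        self.status_code = status_code
        self.detail = detail

def authorize(token, validate_jwt, check_subscription, lookup_profile):
    """Apply the three checks every user-facing request passes through."""
    user_id = validate_jwt(token)  # bad token -> 401
    if user_id is None:
        raise AuthError(401, "Invalid JWT token. Authentication required.")
    if not check_subscription(user_id):  # missing/expired subscription -> 403
        raise AuthError(403, {
            "error": "subscription_required",
            "message": "Active subscription required",
            "subscribe_url": "https://basicmemory.com/subscribe",
        })
    profile = lookup_profile(user_id)  # unknown user -> 401
    if profile is None:
        raise AuthError(401, "User profile not found")
    return profile
```

Because Web, MCP, CLI, and Direct API traffic all funnel through this same sequence, fixing it once at the proxy closes all four gaps at once.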
184 |
185 | ## How (High Level)
186 |
187 | ### Option A: Database Subscription Check (Recommended)
188 |
189 | **Approach**: Add FastAPI dependency that validates subscription status from database before allowing access.
190 |
191 | **Implementation**:
192 |
193 | 1. **Create Subscription Validation Dependency** (`deps.py`)
194 | ```python
195 | async def get_authorized_cli_user_profile(
196 | credentials: Annotated[HTTPAuthorizationCredentials, Depends(security)],
197 | session: DatabaseSessionDep,
198 | user_profile_repo: UserProfileRepositoryDep,
199 | subscription_service: SubscriptionServiceDep,
200 | ) -> UserProfile:
201 | """
202 | Hybrid authentication with subscription validation for CLI access.
203 |
204 | Validates JWT (WorkOS or AuthKit) and checks for active subscription.
205 | Returns UserProfile if both checks pass.
206 | """
207 | # Try WorkOS JWT first (faster validation path)
208 | try:
209 | user_context = await validate_workos_jwt(credentials.credentials)
210 | except HTTPException:
211 | # Fall back to AuthKit JWT validation
212 | try:
213 | user_context = await validate_authkit_jwt(credentials.credentials)
214 | except HTTPException as e:
215 | raise HTTPException(
216 | status_code=401,
217 | detail="Invalid JWT token. Authentication required.",
218 | ) from e
219 |
220 | # Check subscription status
221 | has_subscription = await subscription_service.check_user_has_active_subscription(
222 | session, user_context.workos_user_id
223 | )
224 |
225 | if not has_subscription:
226 | raise HTTPException(
227 | status_code=403,
228 | detail={
229 | "error": "subscription_required",
230 | "message": "Active subscription required for CLI access",
231 | "subscribe_url": "https://basicmemory.com/subscribe"
232 | }
233 | )
234 |
235 | # Look up and return user profile
236 | user_profile = await user_profile_repo.get_user_profile_by_workos_user_id(
237 | session, user_context.workos_user_id
238 | )
239 | if not user_profile:
240 | raise HTTPException(401, detail="User profile not found")
241 |
242 | return user_profile
243 | ```
244 |
245 | ```python
246 | AuthorizedCLIUserProfileDep = Annotated[UserProfile, Depends(get_authorized_cli_user_profile)]
247 | ```
248 |
249 | 2. **Add Subscription Check Method** (`subscription_service.py`)
250 | ```python
251 | async def check_user_has_active_subscription(
252 | self, session: AsyncSession, workos_user_id: str
253 | ) -> bool:
254 | """Check if user has active subscription."""
255 | # Use existing repository method to get subscription by workos_user_id
256 | # This joins UserProfile -> Subscription in a single query
257 | subscription = await self.subscription_repository.get_subscription_by_workos_user_id(
258 | session, workos_user_id
259 | )
260 |
261 | return subscription is not None and subscription.status == "active"
262 | ```
263 |
264 | 3. **Protect Endpoints** (Replace `CurrentUserProfileHybridJwtDep` with `AuthorizedCLIUserProfileDep`)
265 | ```python
266 | # Before
267 | @router.get("/mount/info")
268 | async def get_mount_info(
269 | user_profile: CurrentUserProfileHybridJwtDep,
270 | session: DatabaseSessionDep,
271 | ):
272 | tenant_id = user_profile.tenant_id
273 | ...
274 |
275 | # After
276 | @router.get("/mount/info")
277 | async def get_mount_info(
278 | user_profile: AuthorizedCLIUserProfileDep, # Now includes subscription check
279 | session: DatabaseSessionDep,
280 | ):
281 | tenant_id = user_profile.tenant_id # No changes needed to endpoint logic
282 | ...
283 | ```
284 |
285 | 4. **Update CLI Error Handling**
286 | ```python
287 | # In core_commands.py login()
288 | try:
289 | success = await auth.login()
290 | if success:
291 | # Test subscription by calling protected endpoint
292 | await make_api_request("GET", f"{host_url}/tenant/mount/info")
293 | except CloudAPIError as e:
294 | if e.status_code == 403 and e.detail.get("error") == "subscription_required":
295 | console.print("[red]Subscription required[/red]")
296 | console.print(f"Subscribe at: {e.detail['subscribe_url']}")
297 | raise typer.Exit(1)
298 | ```
299 |
300 | **Pros**:
301 | - Simple to implement
302 | - Fast (single database query)
303 | - Clear error messages
304 | - Works with existing subscription flow
305 |
306 | **Cons**:
307 | - Database is source of truth (could get out of sync with Polar)
308 | - Adds one extra subscription lookup query per request (lightweight JOIN query)
309 |
310 | ### Option B: WorkOS Organizations
311 |
312 | **Approach**: Add users to "beta-users" organization in WorkOS after subscription creation, validate org membership via JWT claims.
313 |
314 | **Implementation**:
315 | 1. After Polar subscription webhook, add user to WorkOS org via API
316 | 2. Validate `org_id` claim in JWT matches authorized org
317 | 3. Use existing `get_admin_workos_jwt` pattern
318 |
319 | **Pros**:
320 | - WorkOS as single source of truth
321 | - No database queries needed
322 | - More secure (harder to bypass)
323 |
324 | **Cons**:
325 | - More complex (requires WorkOS API integration)
326 | - Requires managing WorkOS org membership
327 | - Less control over error messages
328 | - Additional API calls during registration
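
If Option B were adopted, the per-request authorization check would reduce to comparing a JWT claim against the authorized organization. A minimal sketch, assuming the decoded token exposes an `org_id` claim (the exact claim name would need to be confirmed against WorkOS documentation):

```python
# Hedged sketch of Option B's check: authorize by JWT org membership.
# The "org_id" claim name and the authorized org value are assumptions.

def has_authorized_org(claims: dict, authorized_org_id: str) -> bool:
    """Return True if the decoded JWT claims show membership in the authorized org."""
    return claims.get("org_id") == authorized_org_id
```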
329 |
330 | ### Recommendation
331 |
332 | **Start with Option A (Database Check)** for:
333 | - Faster implementation
334 | - Clearer error messages
335 | - Easier testing
336 | - Existing subscription infrastructure
337 |
338 | **Consider Option B later** if:
339 | - Need tighter security
340 | - Want to reduce database dependency
341 | - Scale requires fewer database queries
342 |
343 | ## How to Evaluate
344 |
345 | ### Success Criteria
346 |
347 | **1. Unauthorized Users Blocked**
348 | - [ ] User without subscription cannot complete `bm cloud login`
349 | - [ ] User without subscription receives clear error with subscribe link
350 | - [ ] User without subscription cannot run `bm cloud setup`
351 | - [ ] User without subscription cannot run `bm sync` in cloud mode
352 |
353 | **2. Authorized Users Work**
354 | - [ ] User with active subscription can login successfully
355 | - [ ] User with active subscription can setup bisync
356 | - [ ] User with active subscription can sync files
357 | - [ ] User with active subscription can use all MCP tools via proxy
358 |
359 | **3. Subscription State Changes**
360 | - [ ] Expired subscription blocks access with clear error
361 | - [ ] Renewed subscription immediately restores access
362 | - [ ] Cancelled subscription blocks access after grace period
363 |
364 | **4. Error Messages**
365 | - [ ] 403 errors include "subscription_required" error code
366 | - [ ] Error messages include subscribe URL
367 | - [ ] CLI displays user-friendly messages
368 | - [ ] Errors logged appropriately for debugging
369 |
370 | **5. No Regressions**
371 | - [ ] Web app login/subscription flow unaffected
372 | - [ ] Admin endpoints still work (bypass check)
373 | - [ ] Tenant provisioning workflow unchanged
374 | - [ ] Performance not degraded
375 |
376 | ### Test Cases
377 |
378 | **Manual Testing**:
379 | ```bash
380 | # Test 1: Unauthorized user
381 | 1. Create new WorkOS account (no subscription)
382 | 2. Run `bm cloud login`
383 | 3. Verify: Login succeeds but shows subscription required error
384 | 4. Verify: Cannot run `bm cloud setup`
385 | 5. Verify: Clear error message with subscribe link
386 |
387 | # Test 2: Authorized user
388 | 1. Use account with active Polar subscription
389 | 2. Run `bm cloud login`
390 | 3. Verify: Login succeeds without errors
391 | 4. Run `bm cloud setup`
392 | 5. Verify: Setup completes successfully
393 | 6. Run `bm sync`
394 | 7. Verify: Sync works normally
395 |
396 | # Test 3: Subscription expiration
397 | 1. Use account with active subscription
398 | 2. Manually expire subscription in database
399 | 3. Run `bm cloud login`
400 | 4. Verify: Blocked with clear error
401 | 5. Renew subscription
402 | 6. Run `bm cloud login` again
403 | 7. Verify: Access restored
404 | ```
405 |
406 | **Automated Tests**:
407 | ```python
408 | # Test subscription validation dependency
409 | async def test_authorized_user_allowed(
410 | db_session,
411 | user_profile_repo,
412 | subscription_service,
413 | mock_jwt_credentials
414 | ):
415 | # Create user with active subscription
416 | user_profile = await create_user_with_subscription(db_session, status="active")
417 |
418 | # Mock JWT credentials for the user
419 | credentials = mock_jwt_credentials(user_profile.workos_user_id)
420 |
421 | # Should not raise exception
422 | result = await get_authorized_cli_user_profile(
423 | credentials, db_session, user_profile_repo, subscription_service
424 | )
425 | assert result.id == user_profile.id
426 | assert result.workos_user_id == user_profile.workos_user_id
427 |
428 | async def test_unauthorized_user_blocked(
429 | db_session,
430 | user_profile_repo,
431 | subscription_service,
432 | mock_jwt_credentials
433 | ):
434 | # Create user without subscription
435 | user_profile = await create_user_without_subscription(db_session)
436 | credentials = mock_jwt_credentials(user_profile.workos_user_id)
437 |
438 | # Should raise 403
439 | with pytest.raises(HTTPException) as exc:
440 | await get_authorized_cli_user_profile(
441 | credentials, db_session, user_profile_repo, subscription_service
442 | )
443 |
444 | assert exc.value.status_code == 403
445 | assert exc.value.detail["error"] == "subscription_required"
446 |
447 | async def test_inactive_subscription_blocked(
448 | db_session,
449 | user_profile_repo,
450 | subscription_service,
451 | mock_jwt_credentials
452 | ):
453 | # Create user with cancelled/inactive subscription
454 | user_profile = await create_user_with_subscription(db_session, status="cancelled")
455 | credentials = mock_jwt_credentials(user_profile.workos_user_id)
456 |
457 | # Should raise 403
458 | with pytest.raises(HTTPException) as exc:
459 | await get_authorized_cli_user_profile(
460 | credentials, db_session, user_profile_repo, subscription_service
461 | )
462 |
463 | assert exc.value.status_code == 403
464 | assert exc.value.detail["error"] == "subscription_required"
465 | ```
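
The tests above rely on `mock_jwt_credentials` and `create_user_*` helpers that this spec does not define. One minimal, hypothetical shape for the credentials helper (real tests would pair it with monkeypatching the JWT validators so the fake token resolves to the given user):

```python
from types import SimpleNamespace

def make_mock_jwt_credentials(workos_user_id: str) -> SimpleNamespace:
    """Stand-in for HTTPAuthorizationCredentials carrying a fake bearer token.

    Hypothetical helper: real tests would also monkeypatch
    validate_workos_jwt to resolve this token to the given workos_user_id.
    """
    return SimpleNamespace(scheme="Bearer", credentials=f"fake-jwt-{workos_user_id}")
```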
466 |
467 | ## Implementation Tasks
468 |
469 | ### Phase 1: Cloud Service (basic-memory-cloud)
470 |
471 | #### Task 1.1: Add subscription check method to SubscriptionService ✅
472 | **File**: `apps/cloud/src/basic_memory_cloud/services/subscription_service.py`
473 |
474 | - [x] Add method `check_subscription(session: AsyncSession, workos_user_id: str) -> bool`
475 | - [x] Use existing `self.subscription_repository.get_subscription_by_workos_user_id(session, workos_user_id)`
476 | - [x] Check both `status == "active"` AND `current_period_end >= now()`
477 | - [x] Log both values when check fails
478 | - [x] Add docstring explaining the method
479 | - [x] Run `just typecheck` to verify types
480 |
481 | **Actual implementation**:
482 | ```python
483 | async def check_subscription(
484 | self, session: AsyncSession, workos_user_id: str
485 | ) -> bool:
486 | """Check if user has active subscription with valid period."""
487 | subscription = await self.subscription_repository.get_subscription_by_workos_user_id(
488 | session, workos_user_id
489 | )
490 |
491 | if subscription is None:
492 | return False
493 |
494 | if subscription.status != "active":
495 | logger.warning("Subscription inactive", workos_user_id=workos_user_id,
496 | status=subscription.status, current_period_end=subscription.current_period_end)
497 | return False
498 |
499 | now = datetime.now(timezone.utc)
500 | if subscription.current_period_end is None or subscription.current_period_end < now:
501 | logger.warning("Subscription expired", workos_user_id=workos_user_id,
502 | status=subscription.status, current_period_end=subscription.current_period_end)
503 | return False
504 |
505 | return True
506 | ```
507 |
508 | #### Task 1.2: Add subscription validation dependency ✅
509 | **File**: `apps/cloud/src/basic_memory_cloud/deps.py`
510 |
511 | - [x] Import necessary types at top of file (if not already present)
512 | - [x] Add `get_authorized_cli_user_profile()` async function
513 | - [x] Implement hybrid JWT validation (WorkOS first, AuthKit fallback)
514 | - [x] Add subscription check using `subscription_service.check_subscription()`
515 | - [x] Raise `HTTPException(403)` with structured error detail if no active subscription
516 | - [x] Look up and return `UserProfile` after validation
517 | - [x] Add `AuthorizedCLIUserProfileDep` type annotation
518 | - [x] Use `settings.subscription_url` from config (env var)
519 | - [x] Run `just typecheck` to verify types
520 |
521 | **Expected code**:
522 | ```python
523 | async def get_authorized_cli_user_profile(
524 | credentials: Annotated[HTTPAuthorizationCredentials, Depends(security)],
525 | session: DatabaseSessionDep,
526 | user_profile_repo: UserProfileRepositoryDep,
527 | subscription_service: SubscriptionServiceDep,
528 | ) -> UserProfile:
529 | """
530 | Hybrid authentication with subscription validation for CLI access.
531 |
532 | Validates JWT (WorkOS or AuthKit) and checks for active subscription.
533 | Returns UserProfile if both checks pass.
534 |
535 | Raises:
536 | HTTPException(401): Invalid JWT token
537 | HTTPException(403): No active subscription
538 | """
539 | # Try WorkOS JWT first (faster validation path)
540 | try:
541 | user_context = await validate_workos_jwt(credentials.credentials)
542 | except HTTPException:
543 | # Fall back to AuthKit JWT validation
544 | try:
545 | user_context = await validate_authkit_jwt(credentials.credentials)
546 | except HTTPException as e:
547 | raise HTTPException(
548 | status_code=401,
549 | detail="Invalid JWT token. Authentication required.",
550 | ) from e
551 |
552 | # Check subscription status
553 |     has_subscription = await subscription_service.check_subscription(
554 |         session, user_context.workos_user_id
555 |     )
556 |
557 | if not has_subscription:
558 | logger.warning(
559 | "CLI access denied: no active subscription",
560 | workos_user_id=user_context.workos_user_id,
561 | )
562 | raise HTTPException(
563 | status_code=403,
564 | detail={
565 | "error": "subscription_required",
566 | "message": "Active subscription required for CLI access",
567 | "subscribe_url": "https://basicmemory.com/subscribe"
568 | }
569 | )
570 |
571 | # Look up and return user profile
572 | user_profile = await user_profile_repo.get_user_profile_by_workos_user_id(
573 | session, user_context.workos_user_id
574 | )
575 | if not user_profile:
576 | logger.error(
577 | "User profile not found after successful auth",
578 | workos_user_id=user_context.workos_user_id,
579 | )
580 | raise HTTPException(401, detail="User profile not found")
581 |
582 | logger.info(
583 | "CLI access granted",
584 | workos_user_id=user_context.workos_user_id,
585 | user_profile_id=str(user_profile.id),
586 | )
587 | return user_profile
588 |
589 |
590 | AuthorizedCLIUserProfileDep = Annotated[UserProfile, Depends(get_authorized_cli_user_profile)]
591 | ```
592 |
593 | #### Task 1.3: Protect tenant mount endpoints ✅
594 | **File**: `apps/cloud/src/basic_memory_cloud/api/tenant_mount.py`
595 |
596 | - [x] Update import: add `AuthorizedCLIUserProfileDep` from `..deps`
597 | - [x] Replace `user_profile: CurrentUserProfileHybridJwtDep` with `user_profile: AuthorizedCLIUserProfileDep` in:
598 | - [x] `get_tenant_mount_info()` (line ~23)
599 | - [x] `create_tenant_mount_credentials()` (line ~88)
600 | - [x] `revoke_tenant_mount_credentials()` (line ~244)
601 | - [x] `list_tenant_mount_credentials()` (line ~326)
602 | - [x] Verify no other code changes needed (parameter name and usage stays the same)
603 | - [x] Run `just typecheck` to verify types
604 |
605 | #### Task 1.4: Protect proxy endpoints ✅
606 | **File**: `apps/cloud/src/basic_memory_cloud/api/proxy.py`
607 |
608 | - [x] Update import: add `AuthorizedCLIUserProfileDep` from `..deps`
609 | - [x] Replace `user_profile: CurrentUserProfileHybridJwtDep` with `user_profile: AuthorizedCLIUserProfileDep` in:
610 | - [x] `check_tenant_health()` (line ~21)
611 | - [x] `proxy_to_tenant()` (line ~63)
612 | - [x] Verify no other code changes needed (parameter name and usage stays the same)
613 | - [x] Run `just typecheck` to verify types
614 |
615 | **Why Keep /proxy Architecture:**
616 |
617 | The proxy layer is valuable because it:
618 | 1. **Centralizes authorization** - Single place for JWT + subscription validation (closes both CLI and MCP auth gaps)
619 | 2. **Handles tenant routing** - Maps tenant_id → fly_app_name without exposing infrastructure details
620 | 3. **Abstracts infrastructure** - MCP and CLI don't need to know about Fly.io naming conventions
621 | 4. **Enables features** - Can add rate limiting, caching, request logging, etc. at proxy layer
622 | 5. **Supports both flows** - CLI tools and MCP tools both use /proxy endpoints
623 |
624 | The extra HTTP hop adds minimal latency (<10ms) and is worth it for the architectural benefits.
625 |
626 | **Performance Note:** The cloud app has Redis available, so subscription status can be cached to reduce database queries if needed. The initial implementation uses a direct database query (simple, with acceptable ~5-10ms latency).
627 |
628 | #### Task 1.5: Add unit tests for subscription service
629 | **File**: `apps/cloud/tests/services/test_subscription_service.py` (create if doesn't exist)
630 |
631 | - [ ] Create test file if it doesn't exist
632 | - [ ] Add test: `test_check_subscription_returns_true_for_active()`
633 |   - Create user with active subscription
634 |   - Call `check_subscription()`
635 |   - Assert returns `True`
636 | - [ ] Add test: `test_check_subscription_returns_false_for_pending()`
637 |   - Create user with pending subscription
638 |   - Assert returns `False`
639 | - [ ] Add test: `test_check_subscription_returns_false_for_cancelled()`
640 |   - Create user with cancelled subscription
641 |   - Assert returns `False`
642 | - [ ] Add test: `test_check_subscription_returns_false_for_no_subscription()`
643 |   - Create user without subscription
644 |   - Assert returns `False`
645 | - [ ] Run `just test` to verify tests pass
646 |
647 | #### Task 1.6: Add integration tests for dependency
648 | **File**: `apps/cloud/tests/test_deps.py` (create if doesn't exist)
649 |
650 | - [ ] Create test file if it doesn't exist
651 | - [ ] Add fixtures for mocking JWT credentials
652 | - [ ] Add test: `test_authorized_cli_user_profile_with_active_subscription()`
653 | - Mock valid JWT + active subscription
654 | - Call dependency
655 | - Assert returns UserProfile
656 | - [ ] Add test: `test_authorized_cli_user_profile_without_subscription_raises_403()`
657 | - Mock valid JWT + no subscription
658 | - Assert raises HTTPException(403) with correct error detail
659 | - [ ] Add test: `test_authorized_cli_user_profile_with_inactive_subscription_raises_403()`
660 | - Mock valid JWT + cancelled subscription
661 | - Assert raises HTTPException(403)
662 | - [ ] Add test: `test_authorized_cli_user_profile_with_invalid_jwt_raises_401()`
663 | - Mock invalid JWT
664 | - Assert raises HTTPException(401)
665 | - [ ] Run `just test` to verify tests pass
666 |
667 | #### Task 1.7: Deploy and verify cloud service
668 | - [ ] Run `just check` to verify all quality checks pass
669 | - [ ] Commit changes with message: "feat: add subscription validation to CLI endpoints"
670 | - [ ] Deploy to preview environment: `flyctl deploy --config apps/cloud/fly.toml`
671 | - [ ] Test manually:
672 | - [ ] Call `/tenant/mount/info` with valid JWT but no subscription → expect 403
673 | - [ ] Call `/tenant/mount/info` with valid JWT and active subscription → expect 200
674 | - [ ] Verify error response structure matches spec
675 |
676 | ### Phase 2: CLI (basic-memory)
677 |
678 | #### Task 2.1: Review and understand CLI authentication flow
679 | **Files**: `src/basic_memory/cli/commands/cloud/`
680 |
681 | - [ ] Read `core_commands.py` to understand current login flow
682 | - [ ] Read `api_client.py` to understand current error handling
683 | - [ ] Identify where 403 errors should be caught
684 | - [ ] Identify what error messages should be displayed
685 | - [ ] Document current behavior in spec if needed
686 |
687 | #### Task 2.2: Update API client error handling
688 | **File**: `src/basic_memory/cli/commands/cloud/api_client.py`
689 |
690 | - [ ] Add custom exception class `SubscriptionRequiredError` (or similar)
691 | - [ ] Update HTTP error handling to parse 403 responses
692 | - [ ] Extract `error`, `message`, and `subscribe_url` from error detail
693 | - [ ] Raise specific exception for subscription_required errors
694 | - [ ] Run `just typecheck` in basic-memory repo to verify types
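A minimal sketch of what Task 2.2 could produce. Names like `SubscriptionRequiredError` and `raise_for_subscription_error` are suggestions for illustration, not the final API:

```python
class SubscriptionRequiredError(Exception):
    """Raised when the cloud API returns 403 with error == "subscription_required"."""

    def __init__(self, message: str, subscribe_url: str):
        super().__init__(message)
        self.message = message
        self.subscribe_url = subscribe_url


def raise_for_subscription_error(status_code: int, detail: dict) -> None:
    """Translate the structured 403 error detail into a typed exception."""
    if status_code == 403 and detail.get("error") == "subscription_required":
        raise SubscriptionRequiredError(
            message=detail.get("message", "Active subscription required"),
            subscribe_url=detail.get("subscribe_url", "https://basicmemory.com/subscribe"),
        )
```

The API client would call this after each response; other 403s fall through to the existing generic error handling.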
695 |
696 | #### Task 2.3: Update CLI login command error handling
697 | **File**: `src/basic_memory/cli/commands/cloud/core_commands.py`
698 |
699 | - [ ] Import the subscription error exception
700 | - [ ] Wrap login flow with try/except for subscription errors
701 | - [ ] Display user-friendly error message with rich console
702 | - [ ] Show subscribe URL prominently
703 | - [ ] Provide actionable next steps
704 | - [ ] Run `just typecheck` to verify types
705 |
706 | **Expected error handling**:
707 | ```python
708 | try:
709 | # Existing login logic
710 | success = await auth.login()
711 | if success:
712 | # Test access to protected endpoint
713 | await api_client.test_connection()
714 | except SubscriptionRequiredError as e:
715 | console.print("\n[red]✗ Subscription Required[/red]\n")
716 | console.print(f"[yellow]{e.message}[/yellow]\n")
717 | console.print(f"Subscribe at: [blue underline]{e.subscribe_url}[/blue underline]\n")
718 | console.print("[dim]Once you have an active subscription, run [bold]bm cloud login[/bold] again.[/dim]")
719 | raise typer.Exit(1)
720 | ```
721 |
722 | #### Task 2.4: Update CLI tests
723 | **File**: `tests/cli/test_cloud_commands.py`
724 |
725 | - [ ] Add test: `test_login_without_subscription_shows_error()`
726 | - Mock 403 subscription_required response
727 | - Call login command
728 | - Assert error message displayed
729 | - Assert subscribe URL shown
730 | - [ ] Add test: `test_login_with_subscription_succeeds()`
731 | - Mock successful authentication + subscription check
732 | - Call login command
733 | - Assert success message
734 | - [ ] Run `just test` to verify tests pass
735 |
736 | #### Task 2.5: Update CLI documentation
737 | **File**: `docs/cloud-cli.md` (in basic-memory-docs repo)
738 |
739 | - [ ] Add "Prerequisites" section if not present
740 | - [ ] Document subscription requirement
741 | - [ ] Add "Troubleshooting" section
742 | - [ ] Document "Subscription Required" error
743 | - [ ] Provide subscribe URL
744 | - [ ] Add FAQ entry about subscription errors
745 | - [ ] Build docs locally to verify formatting
746 |
747 | ### Phase 3: End-to-End Testing
748 |
749 | #### Task 3.1: Create test user accounts
750 | **Prerequisites**: Access to WorkOS admin and database
751 |
752 | - [ ] Create test user WITHOUT subscription:
753 | - [ ] Sign up via WorkOS AuthKit
754 | - [ ] Get workos_user_id from database
755 | - [ ] Verify no subscription record exists
756 | - [ ] Save credentials for testing
757 | - [ ] Create test user WITH active subscription:
758 | - [ ] Sign up via WorkOS AuthKit
759 | - [ ] Create subscription via Polar or dev endpoint
760 | - [ ] Verify subscription.status = "active" in database
761 | - [ ] Save credentials for testing
762 |
763 | #### Task 3.2: Manual testing - User without subscription
764 | **Environment**: Preview/staging deployment
765 |
766 | - [ ] Run `bm cloud login` with no-subscription user
767 | - [ ] Verify: Login shows "Subscription Required" error
768 | - [ ] Verify: Subscribe URL is displayed
769 | - [ ] Verify: Cannot run `bm cloud setup`
770 | - [ ] Verify: Cannot call `/tenant/mount/info` directly via curl
771 | - [ ] Document any issues found
772 |
773 | #### Task 3.3: Manual testing - User with active subscription
774 | **Environment**: Preview/staging deployment
775 |
776 | - [ ] Run `bm cloud login` with active-subscription user
777 | - [ ] Verify: Login succeeds without errors
778 | - [ ] Verify: Can run `bm cloud setup`
779 | - [ ] Verify: Can call `/tenant/mount/info` successfully
780 | - [ ] Verify: Can call `/proxy/*` endpoints successfully
781 | - [ ] Document any issues found
782 |
783 | #### Task 3.4: Test subscription state transitions
784 | **Environment**: Preview/staging deployment + database access
785 |
786 | - [ ] Start with active subscription user
787 | - [ ] Verify: All operations work
788 | - [ ] Update subscription.status to "cancelled" in database
789 | - [ ] Verify: Login now shows "Subscription Required" error
790 | - [ ] Verify: Existing tokens are rejected with 403
791 | - [ ] Update subscription.status back to "active"
792 | - [ ] Verify: Access restored immediately
793 | - [ ] Document any issues found
794 |
795 | #### Task 3.5: Integration test suite
796 | **File**: `apps/cloud/tests/integration/test_cli_subscription_flow.py` (create if doesn't exist)
797 |
798 | - [ ] Create integration test file
799 | - [ ] Add test: `test_cli_flow_without_subscription()`
800 | - Simulate full CLI flow without subscription
801 | - Assert 403 at appropriate points
802 | - [ ] Add test: `test_cli_flow_with_active_subscription()`
803 | - Simulate full CLI flow with active subscription
804 | - Assert all operations succeed
805 | - [ ] Add test: `test_subscription_expiration_blocks_access()`
806 | - Start with active subscription
807 | - Change status to cancelled
808 | - Assert access denied
809 | - [ ] Run tests in CI/CD pipeline
810 | - [ ] Document test coverage
811 |
812 | #### Task 3.6: Load/performance testing (optional)
813 | **Environment**: Staging environment
814 |
815 | - [ ] Test subscription check performance under load
816 | - [ ] Measure latency added by subscription check
817 | - [ ] Verify database query performance
818 | - [ ] Document any performance concerns
819 | - [ ] Optimize if needed
820 |
821 | ## Implementation Summary Checklist
822 |
823 | Use this high-level checklist to track overall progress:
824 |
825 | ### Phase 1: Cloud Service 🔄
826 | - [x] Add subscription check method to SubscriptionService
827 | - [x] Add subscription validation dependency to deps.py
828 | - [x] Add subscription_url config (env var)
829 | - [x] Protect tenant mount endpoints (4 endpoints)
830 | - [x] Protect proxy endpoints (2 endpoints)
831 | - [ ] Add unit tests for subscription service
832 | - [ ] Add integration tests for dependency
833 | - [ ] Deploy and verify cloud service
834 |
835 | ### Phase 2: CLI Updates 🔄
836 | - [ ] Review CLI authentication flow
837 | - [ ] Update API client error handling
838 | - [ ] Update CLI login command error handling
839 | - [ ] Add CLI tests
840 | - [ ] Update CLI documentation
841 |
842 | ### Phase 3: End-to-End Testing 🧪
843 | - [ ] Create test user accounts
844 | - [ ] Manual testing - user without subscription
845 | - [ ] Manual testing - user with active subscription
846 | - [ ] Test subscription state transitions
847 | - [ ] Integration test suite
848 | - [ ] Load/performance testing (optional)
849 |
850 | ## Questions to Resolve
851 |
852 | ### Resolved ✅
853 |
854 | 1. **Admin Access**
855 | - ✅ **Decision**: Admin users bypass subscription check
856 | - **Rationale**: Admin endpoints already use `AdminUserHybridDep`, which is separate from CLI user endpoints
857 | - **Implementation**: No changes needed to admin endpoints
858 |
859 | 2. **Subscription Check Implementation**
860 | - ✅ **Decision**: Use Option A (Database Check)
861 | - **Rationale**: Simpler, faster to implement, works with existing infrastructure
862 | - **Implementation**: Single JOIN query via `get_subscription_by_workos_user_id()`
863 |
864 | 3. **Dependency Return Type**
865 | - ✅ **Decision**: Return `UserProfile` (not `UserContext`)
866 | - **Rationale**: Drop-in compatibility with existing endpoints, no refactoring needed
867 | - **Implementation**: `AuthorizedCLIUserProfileDep` returns `UserProfile`
868 |
869 | ### To Be Resolved ⏳
870 |
871 | 1. **Subscription Check Frequency**
872 | - **Options**:
873 | - Check on every API call (slower, more secure) ✅ **RECOMMENDED**
874 | - Cache subscription status (faster, risk of stale data)
875 | - Check only on login/setup (fast, but allows expired subscriptions temporarily)
876 | - **Recommendation**: Check on every call via dependency injection (simple, secure, acceptable performance)
877 | - **Impact**: ~5-10ms per request (single indexed JOIN query)
878 |
879 | 2. **Grace Period**
880 | - **Options**:
881 | - No grace period - immediate block when status != "active" ✅ **RECOMMENDED**
882 | - 7-day grace period after period_end
883 | - 14-day grace period after period_end
884 | - **Recommendation**: No grace period initially, add later if needed based on customer feedback
885 | - **Implementation**: Check `subscription.status == "active"` only (ignore period_end initially)
886 |
887 | 3. **Subscription Expiration Handling**
888 | - **Question**: Should we check `current_period_end < now()` in addition to `status == "active"`?
889 | - **Options**:
890 | - Only check status field (rely on Polar webhooks to update status) ✅ **RECOMMENDED**
891 | - Check both status and current_period_end (more defensive)
892 | - **Recommendation**: Only check status field, assume Polar webhooks keep it current
893 | - **Risk**: If webhooks fail, expired subscriptions might retain access until webhook succeeds
894 |
895 | 4. **Subscribe URL**
896 | - **Question**: What's the actual subscription URL?
897 | - **Current**: Spec uses `https://basicmemory.com/subscribe`
898 | - **Action Required**: Verify correct URL before implementation
899 |
900 | 5. **Dev Mode / Testing Bypass**
901 | - **Question**: Support bypass for development/testing?
902 | - **Options**:
903 | - Environment variable: `DISABLE_SUBSCRIPTION_CHECK=true`
904 | - Always enforce (more realistic testing) ✅ **RECOMMENDED**
905 | - **Recommendation**: No bypass - use test users with real subscriptions for realistic testing
906 | - **Implementation**: Create dev endpoint to activate subscriptions for testing
907 |
908 | ## Related Specs
909 |
910 | - SPEC-9: Multi-Project Bidirectional Sync Architecture (CLI affected by this change)
911 | - SPEC-8: TigrisFS Integration (Mount endpoints protected)
912 |
913 | ## Notes
914 |
915 | - This spec prioritizes security over convenience - better to block unauthorized access than risk revenue loss
916 | - Clear error messages are critical - users should understand why they're blocked and how to resolve it
917 | - Consider adding telemetry to track subscription_required errors for monitoring signup conversion
918 |
```
--------------------------------------------------------------------------------
/specs/SPEC-19 Sync Performance and Memory Optimization.md:
--------------------------------------------------------------------------------
```markdown
1 | ---
2 | title: 'SPEC-19: Sync Performance and Memory Optimization'
3 | type: spec
4 | permalink: specs/spec-19-sync-performance-optimization
5 | tags:
6 | - performance
7 | - memory
8 | - sync
9 | - optimization
10 | - core
11 | status: draft
12 | ---
13 |
14 | # SPEC-19: Sync Performance and Memory Optimization
15 |
16 | ## Why
17 |
18 | ### Problem Statement
19 |
20 | Current sync implementation causes Out-of-Memory (OOM) kills and poor performance on production systems:
21 |
22 | **Evidence from Production**:
23 | - **Tenant-6d2ff1a3**: OOM killed on 1GB machine
24 | - Files: 2,621 total (31 PDFs, 80MB binary data)
25 | - Memory: 1.5-1.7GB peak usage
26 | - Sync duration: 15+ minutes
27 | - Error: `Out of memory: Killed process 693 (python)`
28 |
29 | **Root Causes**:
30 |
31 | 1. **Checksum-based scanning loads ALL files into memory**
32 | - `scan_directory()` computes checksums for ALL 2,624 files upfront
33 | - Results stored in multiple dicts (`ScanResult.files`, `SyncReport.checksums`)
34 | - Even unchanged files are fully read and checksummed
35 |
36 | 2. **Large files read entirely for checksums**
37 | - 16MB PDF → Full read into memory → Compute checksum
38 | - No streaming or chunked processing
39 | - TigrisFS caching compounds memory usage
40 |
41 | 3. **Unbounded concurrency**
42 | - All 2,624 files processed simultaneously
43 | - Each file loads full content into memory
44 | - No semaphore limiting concurrent operations
45 |
46 | 4. **Cloud-specific resource leaks**
47 | - aiohttp session leak in keepalive (not in context manager)
48 | - Circuit breaker resets every 30s sync cycle (ineffective)
49 | - Thundering herd: all tenants sync at :00 and :30
50 |
51 | ### Impact
52 |
53 | - **Production stability**: OOM kills are unacceptable
54 | - **User experience**: 15+ minute syncs are too slow
55 | - **Cost**: Forced upgrades from 1GB → 2GB machines ($5-10/mo per tenant)
56 | - **Scalability**: Current approach won't scale to 100+ tenants
57 |
58 | ### Architectural Decision
59 |
60 | **Fix in basic-memory core first, NOT UberSync**
61 |
62 | Rationale:
63 | - Root causes are algorithmic, not architectural
64 | - Benefits all users (CLI + Cloud)
65 | - Lower risk than new centralized service
66 | - Known solutions (rsync/rclone use same pattern)
67 | - Can defer UberSync until metrics prove it necessary
68 |
69 | ## What
70 |
71 | ### Affected Components
72 |
73 | **basic-memory (core)**:
74 | - `src/basic_memory/sync/sync_service.py` - Core sync algorithm (~42KB)
75 | - `src/basic_memory/models.py` - Entity model (add mtime/size columns)
76 | - `src/basic_memory/file_utils.py` - Checksum computation functions
77 | - `src/basic_memory/repository/entity_repository.py` - Database queries
78 | - `alembic/versions/` - Database migration for schema changes
79 |
80 | **basic-memory-cloud (wrapper)**:
81 | - `apps/api/src/basic_memory_cloud_api/sync_worker.py` - Cloud sync wrapper
82 | - Circuit breaker implementation
83 | - Sync coordination logic
84 |
85 | ### Database Schema Changes
86 |
87 | Add to Entity model:
88 | ```python
89 | mtime: float # File modification timestamp
90 | size: int # File size in bytes
91 | ```
92 |
93 | ## How (High Level)
94 |
95 | ### Phase 1: Core Algorithm Fixes (basic-memory)
96 |
97 | **Priority: P0 - Critical**
98 |
99 | #### 1.1 mtime-based Scanning (Issue #383)
100 |
101 | Replace expensive checksum-based scanning with lightweight stat-based comparison:
102 |
103 | ```python
104 | async def scan_directory(self, directory: Path) -> ScanResult:
105 | """Scan using mtime/size instead of checksums"""
106 | result = ScanResult()
107 |
108 | for root, dirnames, filenames in os.walk(str(directory)):
109 | for filename in filenames:
110 |             path = Path(root) / filename
111 |             rel_path = path.relative_to(directory).as_posix()
112 |             stat = path.stat()
113 | # Store lightweight metadata instead of checksum
114 | result.files[rel_path] = {
115 | 'mtime': stat.st_mtime,
116 | 'size': stat.st_size
117 | }
118 |
119 | return result
120 |
121 | async def scan(self, directory: Path):
122 | """Compare mtime/size, only compute checksums for changed files"""
123 | db_state = await self.get_db_file_state() # Include mtime/size
124 | scan_result = await self.scan_directory(directory)
125 |
126 | for file_path, metadata in scan_result.files.items():
127 | db_metadata = db_state.get(file_path)
128 |
129 | # Only compute expensive checksum if mtime/size changed
130 |         if not db_metadata or metadata['mtime'] != db_metadata['mtime'] or metadata['size'] != db_metadata['size']:
131 | checksum = await self._compute_checksum_streaming(file_path)
132 | # Process immediately, don't accumulate in memory
133 | ```
134 |
135 | **Benefits**:
136 | - No file reads during initial scan (just stat calls)
137 | - ~90% reduction in memory usage
138 | - ~10x faster scan phase
139 | - Only checksum files that actually changed
140 |
141 | #### 1.2 Streaming Checksum Computation (Issue #382)
142 |
143 | For large files (>1MB), use chunked reading to avoid loading entire file:
144 |
145 | ```python
146 | async def _compute_checksum_streaming(self, path: Path, chunk_size: int = 65536) -> str:
147 | """Compute checksum using 64KB chunks for large files"""
148 | hasher = hashlib.sha256()
149 |
150 |     loop = asyncio.get_running_loop()
151 |
152 | def read_chunks():
153 | with open(path, 'rb') as f:
154 | while chunk := f.read(chunk_size):
155 | hasher.update(chunk)
156 |
157 | await loop.run_in_executor(None, read_chunks)
158 | return hasher.hexdigest()
159 |
160 | async def _compute_checksum_async(self, file_path: Path) -> str:
161 | """Choose appropriate checksum method based on file size"""
162 | stat = file_path.stat()
163 |
164 | if stat.st_size > 1_048_576: # 1MB threshold
165 | return await self._compute_checksum_streaming(file_path)
166 | else:
167 | # Small files: existing fast path
168 | content = await self._read_file_async(file_path)
169 | return compute_checksum(content)
170 | ```
171 |
172 | **Benefits**:
173 | - Constant memory usage regardless of file size
174 | - 16MB PDF uses 64KB memory (not 16MB)
175 | - Works well with TigrisFS network I/O
176 |
177 | #### 1.3 Bounded Concurrency (Issue #198)
178 |
179 | Add semaphore to limit concurrent file operations, or consider using aiofiles and async reads
180 |
181 | ```python
182 | class SyncService:
183 | def __init__(self, ...):
184 | # ... existing code ...
185 | self._file_semaphore = asyncio.Semaphore(10) # Max 10 concurrent
186 | self._max_tracked_failures = 100 # LRU cache limit
187 |
188 | async def _read_file_async(self, file_path: Path) -> str:
189 | async with self._file_semaphore:
190 |             loop = asyncio.get_running_loop()
191 | return await loop.run_in_executor(
192 | self._thread_pool,
193 | file_path.read_text,
194 | "utf-8"
195 | )
196 |
197 | async def _record_failure(self, path: str, error: str):
198 | # ... existing code ...
199 |
200 |         # LRU-style eviction (assumes self._file_failures is an OrderedDict)
201 |         if len(self._file_failures) > self._max_tracked_failures:
202 |             self._file_failures.popitem(last=False)  # Remove oldest entry
203 | ```
204 |
205 | **Benefits**:
206 | - Maximum 10 files in memory at once (vs all 2,624)
207 | - 90%+ reduction in peak memory usage
208 | - Prevents unbounded memory growth on error-prone projects
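The semaphore's effect can be verified with a small self-contained check, mirroring the unit-test plan of 100 tasks with at most 10 concurrent (a sketch; `measure_peak_concurrency` is an illustrative name):

```python
import asyncio


async def measure_peak_concurrency(num_tasks: int = 100, limit: int = 10) -> int:
    """Run num_tasks workers behind a semaphore and report peak concurrency."""
    sem = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def worker() -> None:
        nonlocal active, peak
        async with sem:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.001)  # stand-in for file I/O
            active -= 1

    await asyncio.gather(*(worker() for _ in range(num_tasks)))
    return peak
```

With `limit=10`, the measured peak never exceeds 10 regardless of how many tasks are scheduled, which is exactly the memory bound the semaphore is meant to provide.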
209 |
210 | ### Phase 2: Cloud-Specific Fixes (basic-memory-cloud)
211 |
212 | **Priority: P1 - High**
213 |
214 | #### 2.1 Fix Resource Leaks
215 |
216 | ```python
217 | # apps/api/src/basic_memory_cloud_api/sync_worker.py
218 |
219 | async def send_keepalive():
220 | """Send keepalive pings using proper session management"""
221 | # Use context manager to ensure cleanup
222 | async with aiohttp.ClientSession(
223 | timeout=aiohttp.ClientTimeout(total=5)
224 | ) as session:
225 | while True:
226 | try:
227 | await session.get(f"https://{fly_app_name}.fly.dev/health")
228 | await asyncio.sleep(10)
229 | except asyncio.CancelledError:
230 | raise # Exit cleanly
231 | except Exception as e:
232 | logger.warning(f"Keepalive failed: {e}")
233 | ```
234 |
235 | #### 2.2 Improve Circuit Breaker
236 |
237 | Track failures across sync cycles instead of resetting every 30s:
238 |
239 | ```python
240 | # Persistent failure tracking
241 | class SyncWorker:
242 | def __init__(self):
243 | self._persistent_failures: Dict[str, int] = {} # file -> failure_count
244 | self._failure_window_start = time.time()
245 |
246 |     async def should_skip_file(self, file_path: str) -> bool:
247 |         # Reset the hourly window so skipped files are eventually retried
248 |         if time.time() - self._failure_window_start >= 3600:
249 |             self._persistent_failures.clear()
250 |             self._failure_window_start = time.time()
251 |         return self._persistent_failures.get(file_path, 0) > 3  # skip after >3 failures
252 | ```
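The thundering-herd root cause (all tenants syncing at :00 and :30) can be addressed with deterministic per-tenant jitter; a sketch under the assumption that a helper like this would be called when scheduling each tenant's sync (the function name is hypothetical):

```python
import hashlib


def sync_offset_seconds(tenant_id: str, window_seconds: int = 30) -> int:
    """Deterministic offset in [0, window_seconds) derived from the tenant id.

    Spreads sync start times across the window instead of every tenant
    firing at the same instant.
    """
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % window_seconds
```

Because the offset is a pure function of the tenant id, it is stable across restarts and requires no coordination between instances.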
253 |
254 | ### Phase 3: Measurement & Decision
255 |
256 | **Priority: P2 - Future**
257 |
258 | After implementing Phases 1-2, collect metrics for 2 weeks:
259 | - Memory usage per tenant sync
260 | - Sync duration (scan + process)
261 | - Concurrent sync load at peak times
262 | - OOM incidents
263 | - Resource costs
264 |
265 | **UberSync Decision Criteria**:
266 |
267 | Build centralized sync service ONLY if metrics show:
268 | - ✅ Core fixes insufficient for >100 tenants
269 | - ✅ Resource contention causing problems
270 | - ✅ Need for tenant tier prioritization (paid > free)
271 | - ✅ Cost savings justify complexity
272 |
273 | Otherwise, defer UberSync as premature optimization.
274 |
275 | ## How to Evaluate
276 |
277 | ### Success Metrics (Phase 1)
278 |
279 | **Memory Usage**:
280 | - ✅ Peak memory <500MB for 2,000+ file projects (was 1.5-1.7GB)
281 | - ✅ Memory usage linear with concurrent files (10 max), not total files
282 | - ✅ Large file memory usage: 64KB chunks (not 16MB)
283 |
284 | **Performance**:
285 | - ✅ Initial scan <30 seconds (was 5+ minutes)
286 | - ✅ Full sync <5 minutes for 2,000+ files (was 15+ minutes)
287 | - ✅ Subsequent syncs <10 seconds (only changed files)
288 |
289 | **Stability**:
290 | - ✅ 2,000+ file projects run on 1GB machines
291 | - ✅ Zero OOM kills in production
292 | - ✅ No degradation with binary files (PDFs, images)
293 |
294 | ### Success Metrics (Phase 2)
295 |
296 | **Resource Management**:
297 | - ✅ Zero aiohttp session leaks (verified via monitoring)
298 | - ✅ Circuit breaker prevents repeated failures (>3 fails = skip for 1 hour)
299 | - ✅ Tenant syncs distributed over 30s window (no thundering herd)
300 |
301 | **Observability**:
302 | - ✅ Logfire traces show memory usage per sync
303 | - ✅ Clear logging of skipped files and reasons
304 | - ✅ Metrics on sync duration, file counts, failure rates
305 |
306 | ### Test Plan
307 |
308 | **Unit Tests** (basic-memory):
309 | - mtime comparison logic
310 | - Streaming checksum correctness
311 | - Semaphore limiting (mock 100 files, verify max 10 concurrent)
312 | - LRU cache eviction
313 | - Checksum computation: streaming vs non-streaming equivalence
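The streaming-vs-non-streaming equivalence check can be sketched with the standard library alone (helper names here are illustrative, mirroring `_compute_checksum_streaming()` and the small-file fast path):

```python
import hashlib
import os
import tempfile


def sha256_streaming(path: str, chunk_size: int = 65536) -> str:
    """Chunked sha256, mirroring the streaming checksum method."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()


def sha256_full_read(path: str) -> str:
    """Whole-file read, mirroring the small-file fast path."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def check_equivalence(size_bytes: int = 2_000_000) -> bool:
    """Both methods must agree on a multi-chunk random file."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(os.urandom(size_bytes))
        return sha256_streaming(path) == sha256_full_read(path)
    finally:
        os.unlink(path)
```

Using a file larger than one chunk exercises the loop boundary; a zero-byte file covers the empty-input edge case.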
314 |
315 | **Integration Tests** (basic-memory):
316 | - Large file handling (create 20MB test file)
317 | - Mixed file types (text + binary)
318 | - Changed file detection via mtime
319 | - Sync with 1,000+ files
320 |
321 | **Load Tests** (basic-memory-cloud):
322 | - Test on tenant-6d2ff1a3 (2,621 files, 31 PDFs)
323 | - Monitor memory during full sync with Logfire
324 | - Measure scan and sync duration
325 | - Run on 1GB machine (downgrade from 2GB to verify)
326 | - Simulate 10 concurrent tenant syncs
327 |
328 | **Regression Tests**:
329 | - Verify existing sync scenarios still work
330 | - CLI sync behavior unchanged
331 | - File watcher integration unaffected
332 |
333 | ### Performance Benchmarks
334 |
335 | Establish baseline, then compare after each phase:
336 |
337 | | Metric | Baseline | Phase 1 Target | Phase 2 Target |
338 | |--------|----------|----------------|----------------|
339 | | Peak Memory (2,600 files) | 1.5-1.7GB | <500MB | <450MB |
340 | | Initial Scan Time | 5+ min | <30 sec | <30 sec |
341 | | Full Sync Time | 15+ min | <5 min | <5 min |
342 | | Subsequent Sync | 2+ min | <10 sec | <10 sec |
343 | | OOM Incidents/Week | 2-3 | 0 | 0 |
344 | | Min RAM Required | 2GB | 1GB | 1GB |
345 |
346 | ## Implementation Phases
347 |
348 | ### Phase 0.5: Database Schema & Streaming Foundation
349 |
350 | **Priority: P0 - Required for Phase 1**
351 |
352 | This phase establishes the foundation for streaming sync with mtime-based change detection.
353 |
354 | **Database Schema Changes**:
355 | - [x] Add `mtime` column to Entity model (REAL type for float timestamp)
356 | - [x] Add `size` column to Entity model (INTEGER type for file size in bytes)
357 | - [x] Create Alembic migration for new columns (nullable initially)
358 | - [x] Add indexes on `(file_path, project_id)` for optimistic upsert performance
359 | - [ ] Backfill existing entities with mtime/size from filesystem
360 |
361 | **Streaming Architecture**:
362 | - [x] Replace `os.walk()` with `os.scandir()` for cached stat info
363 | - [ ] Eliminate `get_db_file_state()` - no upfront SELECT all entities
364 | - [x] Implement streaming iterator `_scan_directory_streaming()`
365 | - [x] Add `get_by_file_path()` optimized query (single file lookup)
366 | - [x] Add `get_all_file_paths()` for deletion detection (paths only, no entities)
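
The deletion-detection idea behind `get_all_file_paths()` reduces to a set difference over paths, which is why no entity objects need to be loaded (illustrative sketch; `find_deleted` is a hypothetical helper, not an existing API):

```python
def find_deleted(db_paths: set[str], seen_paths: set[str]) -> set[str]:
    # Anything the database knows about that the scan didn't see was deleted
    return db_paths - seen_paths

db = {"notes/a.md", "notes/b.md", "docs/c.pdf"}
seen = {"notes/a.md", "docs/c.pdf", "notes/new.md"}
deleted = find_deleted(db, seen)  # {"notes/b.md"}
new = seen - db                   # {"notes/new.md"}
```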
367 |
368 | **Benefits**:
369 | - **50% fewer network calls** on Tigris (scandir returns cached stat)
370 | - **No large dicts in memory** (process files one at a time)
371 | - **Indexed lookups** instead of full table scan
372 | - **Foundation for mtime comparison** (Phase 1)
373 |
374 | **Code Changes**:
375 |
376 | ```python
377 | # Before: Load all entities upfront
378 | db_paths = await self.get_db_file_state() # SELECT * FROM entity WHERE project_id = ?
379 | scan_result = await self.scan_directory() # os.walk() + stat() per file
380 |
381 | # After: Stream and query incrementally
382 | async for file_path, stat_info in self.scan_directory(): # scandir() with cached stat
383 | db_entity = await self.entity_repository.get_by_file_path(rel_path) # Indexed lookup
384 | # Process immediately, no accumulation
385 | ```
386 |
387 | **Files Modified**:
388 | - `src/basic_memory/models.py` - Add mtime/size columns
389 | - `alembic/versions/xxx_add_mtime_size.py` - Migration
390 | - `src/basic_memory/sync/sync_service.py` - Streaming implementation
391 | - `src/basic_memory/repository/entity_repository.py` - Add get_all_file_paths()
392 |
393 | **Migration Strategy**:
394 | ```sql
395 | -- Migration: Add nullable columns
396 | ALTER TABLE entity ADD COLUMN mtime REAL;
397 | ALTER TABLE entity ADD COLUMN size INTEGER;
398 |
399 | -- Backfill from filesystem during first sync after upgrade
400 | -- (Handled in sync_service on first scan)
401 | ```
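
The backfill noted in the migration comment can be sketched as a lazy, per-file fill during the first scan after upgrade (hypothetical helper; the real logic lives in `sync_service`):

```python
import os
import tempfile

def backfill_stat(entity_mtime, entity_size, file_path):
    """Return (mtime, size) for an entity, reading from the filesystem
    only when the migrated columns are still NULL."""
    if entity_mtime is None or entity_size is None:
        st = os.stat(file_path)
        return st.st_mtime, st.st_size
    # Columns already populated: keep the stored values
    return entity_mtime, entity_size
```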
402 |
403 | ### Phase 1: Core Fixes
404 |
405 | **mtime-based scanning**:
406 | - [x] Add mtime/size columns to Entity model (completed in Phase 0.5)
407 | - [x] Database migration (alembic) (completed in Phase 0.5)
408 | - [x] Refactor `scan()` to use streaming architecture with mtime/size comparison
409 | - [x] Update `sync_markdown_file()` and `sync_regular_file()` to store mtime/size in database
410 | - [x] Only compute checksums for changed files (mtime/size differ)
411 | - [x] Unit tests for streaming scan (6 tests passing)
412 | - [ ] Integration test with 1,000 files (defer to benchmarks)
413 |
414 | **Streaming checksums**:
415 | - [x] Implement `_compute_checksum_streaming()` with chunked reading
416 | - [x] Add file size threshold logic (1MB)
417 | - [x] Test with large files (16MB PDF)
418 | - [x] Verify memory usage stays constant
419 | - [x] Test checksum equivalence (streaming vs non-streaming)
420 |
421 | **Bounded concurrency**:
422 | - [x] Add semaphore (10 concurrent) to `_read_file_async()` (already existed)
423 | - [x] Add LRU cache for failures (100 max) (already existed)
424 | - [ ] Review thread pool size configuration
425 | - [ ] Load test with 2,000+ files
426 | - [ ] Verify <500MB peak memory
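
The semaphore and LRU failure cache can be sketched together (a minimal sketch; `BoundedReader` is a hypothetical name, with the limits taken from the checklist above):

```python
import asyncio
from collections import OrderedDict

MAX_CONCURRENT = 10  # matches the spec's semaphore limit
MAX_FAILURES = 100   # matches the spec's LRU failure cache size

class BoundedReader:
    def __init__(self):
        self._sem = asyncio.Semaphore(MAX_CONCURRENT)
        self._failures: OrderedDict[str, str] = OrderedDict()

    def record_failure(self, path: str, reason: str) -> None:
        # LRU eviction keeps the failure cache bounded
        self._failures[path] = reason
        self._failures.move_to_end(path)
        while len(self._failures) > MAX_FAILURES:
            self._failures.popitem(last=False)  # evict oldest entry

    async def read(self, path: str) -> bytes:
        async with self._sem:  # at most 10 reads in flight
            return await asyncio.to_thread(lambda: open(path, "rb").read())
```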
427 |
428 | **Cleanup & Optimization**:
429 | - [x] Eliminate `get_db_file_state()` - no upfront SELECT all entities (streaming architecture complete)
430 | - [x] Consolidate file operations in FileService (eliminate duplicate checksum logic)
431 | - [x] Add aiofiles dependency (already present)
432 | - [x] FileService streaming checksums for files >1MB
433 | - [x] SyncService delegates all file operations to FileService
434 | - [x] Complete true async I/O refactoring - all file operations use aiofiles
435 | - [x] Added `FileService.read_file_content()` using aiofiles
436 | - [x] Removed `SyncService._read_file_async()` wrapper method
437 | - [x] Removed `SyncService._compute_checksum_async()` wrapper method
438 | - [x] Inlined all 7 checksum calls to use `file_service.compute_checksum()` directly
439 | - [x] All file I/O operations now properly consolidated in FileService with non-blocking I/O
440 | - [x] Removed sync_status_service completely (unnecessary complexity and state tracking)
441 | - [x] Removed `sync_status_service.py` and `sync_status` MCP tool
442 | - [x] Removed all `sync_status_tracker` calls from `sync_service.py`
443 | - [x] Removed migration status checks from MCP tools (`write_note`, `read_note`, `build_context`)
444 | - [x] Removed `check_migration_status()` and `wait_for_migration_or_return_status()` from `utils.py`
445 | - [x] Removed all related tests (4 test files deleted)
446 | - [x] All 1184 tests passing
447 |
448 | **Phase 1 Implementation Summary:**
449 |
450 | Phase 1 is now complete with all core fixes implemented and tested:
451 |
452 | 1. **Streaming Architecture** (Phase 0.5 + Phase 1):
453 | - Replaced `os.walk()` with `os.scandir()` for cached stat info
454 | - Eliminated upfront `get_db_file_state()` SELECT query
455 | - Implemented `_scan_directory_streaming()` for incremental processing
456 | - Added indexed `get_by_file_path()` lookups
457 | - Result: 50% fewer network calls on TigrisFS, no large dicts in memory
458 |
459 | 2. **mtime-based Change Detection**:
460 | - Added `mtime` and `size` columns to Entity model
461 | - Alembic migration completed and deployed
462 | - Only compute checksums when mtime/size differs from database
463 | - Result: ~90% reduction in checksum operations during typical syncs
464 |
465 | 3. **True Async I/O with aiofiles**:
466 | - All file operations consolidated in FileService
467 | - `FileService.compute_checksum()`: 64KB chunked reading for constant memory (lines 261-296 of file_service.py)
468 | - `FileService.read_file_content()`: Non-blocking file reads with aiofiles (lines 160-193 of file_service.py)
469 | - Removed all wrapper methods from SyncService (`_read_file_async`, `_compute_checksum_async`)
470 | - Semaphore controls concurrency (max 10 concurrent file operations)
471 | - Result: Constant memory usage regardless of file size, true non-blocking I/O
472 |
473 | 4. **Test Coverage**:
474 | - 41/43 sync tests passing (2 skipped as expected)
475 | - Circuit breaker tests updated for new architecture
476 | - Streaming checksum equivalence verified
477 | - All edge cases covered (large files, concurrent operations, failures)
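
The skip-unchanged decision from point 2 reduces to a pure comparison of stat values against the stored columns (illustrative sketch; `needs_checksum` is a hypothetical name, and the float tolerance is an assumption):

```python
def needs_checksum(db_mtime, db_size, fs_mtime: float, fs_size: int) -> bool:
    # New file (no row) or unmigrated columns: must checksum
    if db_mtime is None or db_size is None:
        return True
    # Unchanged mtime AND size: skip the expensive checksum
    return not (abs(db_mtime - fs_mtime) < 1e-6 and db_size == fs_size)
```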
478 |
479 | **Key Files Modified**:
480 | - `src/basic_memory/models.py` - Added mtime/size columns
481 | - `alembic/versions/xxx_add_mtime_size.py` - Database migration
482 | - `src/basic_memory/sync/sync_service.py` - Streaming implementation, removed wrapper methods
483 | - `src/basic_memory/services/file_service.py` - Added `read_file_content()`, streaming checksums
484 | - `src/basic_memory/repository/entity_repository.py` - Added `get_all_file_paths()`
485 | - `tests/sync/test_sync_service.py` - Updated circuit breaker test mocks
486 |
487 | **Performance Improvements Achieved**:
488 | - Memory usage: Constant per file (64KB chunks) vs full file in memory
489 | - Scan speed: Stat-only scan (no checksums for unchanged files)
490 | - I/O efficiency: True async with aiofiles (no thread pool blocking)
491 | - Network efficiency: 50% fewer calls on TigrisFS via scandir caching
492 | - Architecture: Clean separation of concerns (FileService owns all file I/O)
493 | - Reduced complexity: Removed unnecessary sync_status_service state tracking
494 |
495 | **Observability**:
496 | - [x] Added Logfire instrumentation to `sync_file()` and `sync_markdown_file()`
497 | - [x] Logfire disabled by default via `ignore_no_config = true` in pyproject.toml
498 | - [x] No telemetry in FOSS version unless explicitly configured
499 | - [x] Cloud deployment can enable Logfire for performance monitoring
500 |
501 | **Next Steps**: Phase 1.5 scan watermark optimization for large project performance.
502 |
503 | ### Phase 1.5: Scan Watermark Optimization
504 |
505 | **Priority: P0 - Critical for Large Projects**
506 |
507 | This phase addresses Issue #388 where large projects (1,460+ files) take 7+ minutes for sync operations even when no files have changed.
508 |
509 | **Problem Analysis**:
510 |
511 | From production data (tenant-0a20eb58):
512 | - Total sync time: 420-450 seconds (7+ minutes) with 0 changes
513 | - Scan phase: 321 seconds (75% of total time)
514 | - Per-file cost: 220ms × 1,460 files = 5+ minutes
515 | - Root cause: Network I/O to TigrisFS for stat operations (even with mtime columns)
516 | - 15 concurrent syncs every 30 seconds compounds the problem
517 |
518 | **Current Behavior** (Phase 1):
519 | ```python
520 | async def scan(self, directory: Path):
521 | """Scan filesystem using mtime/size comparison"""
522 | # Still stats ALL 1,460 files every sync cycle
523 | async for file_path, stat_info in self._scan_directory_streaming():
524 | db_entity = await self.entity_repository.get_by_file_path(file_path)
525 | # Compare mtime/size, skip unchanged files
526 | # Only checksum if changed (✅ already optimized)
527 | ```
528 |
529 | **Problem**: Even with mtime optimization, we stat every file on every scan. On TigrisFS (network FUSE mount), this means 1,460 network calls taking 5+ minutes.
530 |
531 | **Solution: Scan Watermark + File Count Detection**
532 |
533 | Track when we last scanned and how many files existed. Use filesystem-level filtering to only examine files modified since last scan.
534 |
535 | **Key Insight**: File count changes signal deletions
536 | - Count same → incremental scan (95% of syncs)
537 | - Count increased → incremental scan picks up the new files (4% of syncs)
538 | - Count decreased → files deleted, need full scan (1% of syncs)
539 |
540 | **Database Schema Changes**:
541 |
542 | Add to Project model:
543 | ```python
544 | last_scan_timestamp: float | None # Unix timestamp of last successful scan start
545 | last_file_count: int | None # Number of files found in last scan
546 | ```
547 |
548 | **Implementation Strategy**:
549 |
550 | ```python
551 | async def scan(self, directory: Path):
552 | """Smart scan using watermark and file count"""
553 | project = await self.project_repository.get_current()
554 |
555 | # Step 1: Quick file count (fast on TigrisFS: 1.4s for 1,460 files)
556 | current_count = await self._quick_count_files(directory)
557 |
558 | # Step 2: Determine scan strategy
559 | if project.last_file_count is None:
560 | # First sync ever → full scan
561 | file_paths = await self._scan_directory_full(directory)
562 | scan_type = "full_initial"
563 |
564 | elif current_count < project.last_file_count:
565 | # Files deleted → need full scan to detect which ones
566 | file_paths = await self._scan_directory_full(directory)
567 | scan_type = "full_deletions"
568 | logger.info(f"File count decreased ({project.last_file_count} → {current_count}), running full scan")
569 |
570 | elif project.last_scan_timestamp is not None:
571 | # Incremental scan: only files modified since last scan
572 | file_paths = await self._scan_directory_modified_since(
573 | directory,
574 | project.last_scan_timestamp
575 | )
576 | scan_type = "incremental"
577 | logger.info(f"Incremental scan since {project.last_scan_timestamp}, found {len(file_paths)} changed files")
578 | else:
579 | # Fallback to full scan
580 | file_paths = await self._scan_directory_full(directory)
581 | scan_type = "full_fallback"
582 |
583 | # Step 3: Process changed files (existing logic)
584 | for file_path in file_paths:
585 | await self._process_file(file_path)
586 |
587 | # Step 4: Update watermark AFTER successful scan
588 | await self.project_repository.update(
589 | project.id,
590 | last_scan_timestamp=time.time(), # Start of THIS scan
591 | last_file_count=current_count
592 | )
593 |
594 | # Step 5: Record metrics
595 | logfire.metric_counter(f"sync.scan.{scan_type}").add(1)
596 | logfire.metric_histogram("sync.scan.files_scanned", unit="files").record(len(file_paths))
597 | ```
598 |
599 | **Helper Methods**:
600 |
601 | ```python
602 | async def _quick_count_files(self, directory: Path) -> int:
603 | """Fast file count using find command"""
604 | # TigrisFS: 1.4s for 1,460 files
605 | result = await asyncio.create_subprocess_shell(
606 | f'find "{directory}" -type f | wc -l',
607 | stdout=asyncio.subprocess.PIPE
608 | )
609 | stdout, _ = await result.communicate()
610 | return int(stdout.strip())
611 |
612 | async def _scan_directory_modified_since(
613 | self,
614 | directory: Path,
615 | since_timestamp: float
616 | ) -> List[str]:
617 | """Use find -newermt for filesystem-level filtering"""
618 | # Convert timestamp to find-compatible format
619 | since_date = datetime.fromtimestamp(since_timestamp).strftime("%Y-%m-%d %H:%M:%S")
620 |
621 | # TigrisFS: 0.2s for 0 changed files (vs 5+ minutes for full scan)
622 | result = await asyncio.create_subprocess_shell(
623 | f'find "{directory}" -type f -newermt "{since_date}"',
624 | stdout=asyncio.subprocess.PIPE
625 | )
626 | stdout, _ = await result.communicate()
627 |
628 | # Convert absolute paths to relative
629 | file_paths = []
630 | for line in stdout.decode().splitlines():
631 | if line:
632 | rel_path = Path(line).relative_to(directory).as_posix()
633 | file_paths.append(rel_path)
634 |
635 | return file_paths
636 | ```
637 |
638 | **TigrisFS Testing Results** (SSH to production-basic-memory-tenant-0a20eb58):
639 |
640 | ```bash
641 | # Full file count
642 | $ time find . -type f | wc -l
643 | 1460
644 | real 0m1.362s # ✅ Acceptable
645 |
646 | # Incremental scan (1 hour window)
647 | $ time find . -type f -newermt "2025-01-20 10:00:00" | wc -l
648 | 0
649 | real 0m0.161s # ✅ 8.5x faster!
650 |
651 | # Incremental scan (24 hours)
652 | $ time find . -type f -newermt "2025-01-19 11:00:00" | wc -l
653 | 0
654 | real 0m0.239s # ✅ 5.7x faster!
655 | ```
656 |
657 | **Conclusion**: `find -newermt` works reliably on TigrisFS, reducing the incremental scan to sub-second times (0.16-0.24s versus 321s for the stat-every-file scan phase).
658 |
659 | **Expected Performance Improvements**:
660 |
661 | | Scenario | Files Changed | Current Time | With Watermark | Speedup |
662 | |----------|---------------|--------------|----------------|---------|
663 | | No changes (common) | 0 | 420s | ~2s | 210x |
664 | | Few changes | 5-10 | 420s | ~5s | 84x |
665 | | Many changes | 100+ | 420s | ~30s | 14x |
666 | | Deletions (rare) | N/A | 420s | 420s | 1x |
667 |
668 | **Full sync breakdown** (1,460 files, 0 changes):
669 | - File count: 1.4s
670 | - Incremental scan: 0.2s
671 | - Database updates: 0.4s
672 | - **Total: ~2s (210-225x faster, from 420-450s)**
673 |
674 | **Metrics to Track**:
675 |
676 | ```python
677 | # Scan type distribution
678 | logfire.metric_counter("sync.scan.full_initial").add(1)
679 | logfire.metric_counter("sync.scan.full_deletions").add(1)
680 | logfire.metric_counter("sync.scan.incremental").add(1)
681 |
682 | # Performance metrics
683 | logfire.metric_histogram("sync.scan.duration", unit="ms").record(scan_ms)
684 | logfire.metric_histogram("sync.scan.files_scanned", unit="files").record(file_count)
685 | logfire.metric_histogram("sync.scan.files_changed", unit="files").record(changed_count)
686 |
687 | # Watermark effectiveness
688 | logfire.metric_histogram("sync.scan.watermark_age", unit="s").record(
689 | time.time() - project.last_scan_timestamp
690 | )
691 | ```
692 |
693 | **Edge Cases Handled**:
694 |
695 | 1. **First sync**: No watermark → full scan (expected)
696 | 2. **Deletions**: File count decreased → full scan (rare but correct)
697 | 3. **Clock skew**: Use scan start time, not end time (captures files created during scan)
698 | 4. **Scan failure**: Don't update watermark on failure (retry will re-scan)
699 | 5. **New files**: Count increased → incremental scan finds them (common, fast)
700 |
701 | **Files to Modify**:
702 | - `src/basic_memory/models.py` - Add last_scan_timestamp, last_file_count to Project
703 | - `alembic/versions/xxx_add_scan_watermark.py` - Migration for new columns
704 | - `src/basic_memory/sync/sync_service.py` - Implement watermark logic
705 | - `src/basic_memory/repository/project_repository.py` - Update methods
706 | - `tests/sync/test_sync_watermark.py` - Test watermark behavior
707 |
708 | **Test Plan**:
709 | - [x] SSH test on TigrisFS confirms `find -newermt` works (completed)
710 | - [x] Unit tests for scan strategy selection (4 tests)
711 | - [x] Unit tests for file count detection (integrated in strategy tests)
712 | - [x] Integration test: verify incremental scan finds changed files (4 tests)
713 | - [x] Integration test: verify deletion detection triggers full scan (2 tests)
714 | - [ ] Load test on tenant-0a20eb58 (1,460 files) - pending production deployment
715 | - [ ] Verify <3s for no-change sync - pending production deployment
716 |
717 | **Implementation Status**: ✅ **COMPLETED**
718 |
719 | **Code Changes** (Commit: `fb16055d`):
720 | - ✅ Added `last_scan_timestamp` and `last_file_count` to Project model
721 | - ✅ Created database migration `e7e1f4367280_add_scan_watermark_tracking_to_project.py`
722 | - ✅ Implemented smart scan strategy selection in `sync_service.py`
723 | - ✅ Added `_quick_count_files()` using `find | wc -l` (~1.4s for 1,460 files)
724 | - ✅ Added `_scan_directory_modified_since()` using `find -newermt` (~0.2s)
725 | - ✅ Added `_scan_directory_full()` wrapper for full scans
726 | - ✅ Watermark update logic after successful sync (uses sync START time)
727 | - ✅ Logfire metrics for scan types and performance tracking
728 |
729 | **Test Coverage** (18 tests in `test_sync_service_incremental.py`):
730 | - ✅ Scan strategy selection (4 tests)
731 | - First sync uses full scan
732 | - File count decreased triggers full scan
733 | - Same file count uses incremental scan
734 | - Increased file count uses incremental scan
735 | - ✅ Incremental scan base cases (4 tests)
736 | - No changes scenario
737 | - Detects new files
738 | - Detects modified files
739 | - Detects multiple changes
740 | - ✅ Deletion detection (2 tests)
741 | - Single file deletion
742 | - Multiple file deletions
743 | - ✅ Move detection (2 tests)
744 | - Moves require full scan (renames don't update mtime)
745 | - Moves detected in full scan via checksum
746 | - ✅ Watermark update (3 tests)
747 | - Watermark updated after successful sync
748 | - Watermark uses sync start time
749 | - File count accuracy
750 | - ✅ Edge cases (3 tests)
751 | - Concurrent file changes
752 | - Empty directory handling
753 | - Respects .gitignore patterns
754 |
755 | **Performance Expectations** (to be verified in production):
756 | - No changes: 420s → ~2s (210x faster)
757 | - Few changes (5-10): 420s → ~5s (84x faster)
758 | - Many changes (100+): 420s → ~30s (14x faster)
759 | - Deletions: 420s → 420s (full scan, rare case)
760 |
761 | **Rollout Strategy**:
762 | 1. ✅ Code complete and tested (18 new tests, all passing)
763 | 2. ✅ Pushed to `phase-0.5-streaming-foundation` branch
764 | 3. ⏳ Windows CI tests running
765 | 4. 📊 Deploy to staging tenant with watermark optimization
766 | 5. 📊 Monitor scan performance metrics via Logfire
767 | 6. 📊 Verify no missed files (compare full vs incremental results)
768 | 7. 📊 Deploy to production tenant-0a20eb58
769 | 8. 📊 Measure actual improvement (expect 420s → 2-3s)
770 |
771 | **Success Criteria**:
772 | - ✅ Implementation complete with comprehensive tests
773 | - [ ] No-change syncs complete in <3 seconds (was 420s) - pending production test
774 | - [ ] Incremental scans (95% of cases) use watermark - pending production test
775 | - [ ] Deletion detection works correctly (full scan when needed) - tested in unit tests ✅
776 | - [ ] No files missed due to watermark logic - tested in unit tests ✅
777 | - [ ] Metrics show scan type distribution matches expectations - pending production test
778 |
779 | **Next Steps**:
780 | 1. Production deployment to tenant-0a20eb58
781 | 2. Measure actual performance improvements
782 | 3. Monitor metrics for 1 week
783 | 4. Phase 2 cloud-specific fixes
784 | 5. Phase 3 production measurement and UberSync decision
785 |
786 | ### Phase 2: Cloud Fixes
787 |
788 | **Resource leaks**:
789 | - [ ] Fix aiohttp session context manager
790 | - [ ] Implement persistent circuit breaker
791 | - [ ] Add memory monitoring/alerts
792 | - [ ] Test on production tenant
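
The session-leak fix is essentially "always close in a `finally`"; the shape can be sketched stdlib-only with a stand-in resource (`FakeSession` is hypothetical and stands in for `aiohttp.ClientSession`, which supports the same `async with` lifecycle):

```python
import asyncio
from contextlib import asynccontextmanager

class FakeSession:
    """Stand-in for aiohttp.ClientSession (hypothetical, for illustration)."""
    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True

@asynccontextmanager
async def managed_session():
    session = FakeSession()
    try:
        yield session
    finally:
        # Guaranteed cleanup even when a request raises: this is the leak fix
        await session.close()

async def fetch_with_cleanup() -> bool:
    async with managed_session() as s:
        pass  # HTTP requests would go here
    return s.closed
```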
793 |
794 | **Sync coordination**:
795 | - [ ] Implement hash-based staggering
796 | - [ ] Add jitter to sync intervals
797 | - [ ] Load test with 10 concurrent tenants
798 | - [ ] Verify no thundering herd
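
Hash-based staggering with jitter can be sketched as a deterministic per-tenant offset within the sync interval (illustrative sketch; `sync_offset` and the 30-second interval constant are assumptions drawn from the sync cadence described in Phase 1.5):

```python
import hashlib
import random

SYNC_INTERVAL = 30  # seconds, per the 30-second sync cadence

def sync_offset(tenant_id: str, jitter: float = 2.0) -> float:
    """Deterministic per-tenant offset within the interval, plus random
    jitter, so tenants don't all start syncing at the same instant."""
    digest = hashlib.sha256(tenant_id.encode()).digest()
    base = int.from_bytes(digest[:4], "big") % SYNC_INTERVAL
    return base + random.uniform(0, jitter)
```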
799 |
800 | ### Phase 3: Measurement
801 |
802 | **Deploy to production**:
803 | - [ ] Deploy Phase 1+2 changes
804 | - [ ] Downgrade tenant-6d2ff1a3 to 1GB
805 | - [ ] Monitor for OOM incidents
806 |
807 | **Collect metrics**:
808 | - [ ] Memory usage patterns
809 | - [ ] Sync duration distributions
810 | - [ ] Concurrent sync load
811 | - [ ] Cost analysis
812 |
813 | **UberSync decision**:
814 | - [ ] Review metrics against decision criteria
815 | - [ ] Document findings
816 | - [ ] Create SPEC-18 for UberSync if needed
817 |
818 | ## Related Issues
819 |
820 | ### basic-memory (core)
821 | - [#383](https://github.com/basicmachines-co/basic-memory/issues/383) - Refactor sync to use mtime-based scanning
822 | - [#382](https://github.com/basicmachines-co/basic-memory/issues/382) - Optimize memory for large file syncs
823 | - [#371](https://github.com/basicmachines-co/basic-memory/issues/371) - aiofiles for non-blocking I/O (future)
824 |
825 | ### basic-memory-cloud
826 | - [#198](https://github.com/basicmachines-co/basic-memory-cloud/issues/198) - Memory optimization for sync worker
827 | - [#189](https://github.com/basicmachines-co/basic-memory-cloud/issues/189) - Circuit breaker for infinite retry loops
828 |
829 | ## References
830 |
831 | **Standard sync tools using mtime**:
832 | - rsync: Uses mtime-based comparison by default, only checksums on `--checksum` flag
833 | - rclone: Default is mtime/size, `--checksum` mode optional
834 | - syncthing: Block-level sync with mtime tracking
835 |
836 | **fsnotify polling** (future consideration):
837 | - [fsnotify/fsnotify#9](https://github.com/fsnotify/fsnotify/issues/9) - Polling mode for network filesystems
838 |
839 | ## Notes
840 |
841 | ### Why Not UberSync Now?
842 |
843 | **Premature Optimization**:
844 | - Current problems are algorithmic, not architectural
845 | - No evidence that multi-tenant coordination is the issue
846 | - Single tenant OOM proves algorithm is the problem
847 |
848 | **Benefits of Core-First Approach**:
849 | - ✅ Helps all users (CLI + Cloud)
850 | - ✅ Lower risk (no new service)
851 | - ✅ Clear path (issues specify fixes)
852 | - ✅ Can defer UberSync until proven necessary
853 |
854 | **When UberSync Makes Sense**:
855 | - >100 active tenants causing resource contention
856 | - Need for tenant tier prioritization (paid > free)
857 | - Centralized observability requirements
858 | - Cost optimization at scale
859 |
860 | ### Migration Strategy
861 |
862 | **Backward Compatibility**:
863 | - New mtime/size columns nullable initially
864 | - Existing entities sync normally (compute mtime on first scan)
865 | - No breaking changes to MCP API
866 | - CLI behavior unchanged
867 |
868 | **Rollout**:
869 | 1. Deploy to staging with test tenant
870 | 2. Validate memory/performance improvements
871 | 3. Deploy to production (blue-green)
872 | 4. Monitor for 1 week
873 | 5. Downgrade tenant machines if successful
874 |
875 | ## Further Considerations
876 |
877 | ### Version Control System (VCS) Integration
878 |
879 | **Context:** Users frequently request git versioning, and large projects with PDFs/images pose memory challenges.
880 |
881 | #### Git-Based Sync
882 |
883 | **Approach:** Use git for change detection instead of custom mtime comparison.
884 |
885 | ```python
886 | # Git automatically tracks changes
887 | repo = git.Repo(project_path)
888 | repo.git.add(A=True)
889 | diff = repo.index.diff('HEAD')
890 |
891 | for change in diff:
892 | if change.change_type == 'M': # Modified
893 | await sync_file(change.b_path)
894 | ```
895 |
896 | **Pros:**
897 | - ✅ Proven, battle-tested change detection
898 | - ✅ Built-in rename/move detection (similarity index)
899 | - ✅ Efficient for cloud sync (git protocol over HTTP)
900 | - ✅ Could enable version history as bonus feature
901 | - ✅ Users want git integration anyway
902 |
903 | **Cons:**
904 | - ❌ User confusion (`.git` folder in knowledge base)
905 | - ❌ Conflicts with existing git repos (submodule complexity)
906 | - ❌ Adds dependency (git binary or dulwich/pygit2)
907 | - ❌ Less control over sync logic
908 | - ❌ Doesn't solve large file problem (PDFs still checksummed)
909 | - ❌ Git LFS adds complexity
910 |
911 | #### Jujutsu (jj) Alternative
912 |
913 | **Why jj is compelling:**
914 |
915 | 1. **Working Copy as Source of Truth**
916 | - Git: Staging area is intermediate state
917 | - Jujutsu: Working copy IS a commit
918 | - Aligns with "files are source of truth" philosophy!
919 |
920 | 2. **Automatic Change Tracking**
921 | - No manual staging required
922 | - Working copy changes tracked automatically
923 | - Better fit for sync operations vs git's commit-centric model
924 |
925 | 3. **Conflict Handling**
926 | - User edits + sync changes both preserved
927 | - Operation log vs linear history
928 | - Built for operations, not just history
929 |
930 | **Cons:**
931 | - ❌ New/immature (2020 vs git's 2005)
932 | - ❌ Not universally available
933 | - ❌ Steeper learning curve for users
934 | - ❌ No LFS equivalent yet
935 | - ❌ Still doesn't solve large file checksumming
936 |
937 | #### Git Index Format (Hybrid Approach)
938 |
939 | **Best of both worlds:** Use git's index format without full git repo.
940 |
941 | ```python
942 | from dulwich.index import Index # Pure Python
943 |
944 | # Use git index format for tracking
945 | idx = Index(project_path / '.basic-memory' / 'index')
946 |
947 | for file in files:
948 | stat = file.stat()
949 | if idx.get(file) and idx[file].mtime == stat.st_mtime:
950 | continue # Unchanged (git's proven logic)
951 |
952 | await sync_file(file)
953 | idx[file] = (stat.st_mtime, stat.st_size, sha)
954 | ```
955 |
956 | **Pros:**
957 | - ✅ Git's proven change detection logic
958 | - ✅ No user-visible `.git` folder
959 | - ✅ No git dependency (pure Python)
960 | - ✅ Full control over sync
961 |
962 | **Cons:**
963 | - ❌ Adds dependency (dulwich)
964 | - ❌ Doesn't solve large files
965 | - ❌ No built-in versioning
966 |
967 | ### Large File Handling
968 |
969 | **Problem:** PDFs/images cause memory issues regardless of VCS choice.
970 |
971 | **Solutions (Phase 1+):**
972 |
973 | **1. Skip Checksums for Large Files**
974 | ```python
975 | if stat.st_size > 10_000_000: # 10MB threshold
976 | checksum = None # Use mtime/size only
977 | logger.info(f"Skipping checksum for {file_path}")
978 | ```
979 |
980 | **2. Partial Hashing**
981 | ```python
982 | if file.suffix in ['.pdf', '.jpg', '.png']:
983 | # Hash first/last 64KB instead of entire file
984 | checksum = hash_partial(file, chunk_size=65536)
985 | ```
986 |
987 | **3. External Blob Storage**
988 | ```python
989 | if stat.st_size > 10_000_000:
990 | blob_id = await upload_to_tigris_blob(file)
991 | entity.blob_id = blob_id
992 | entity.file_path = None # Not in main sync
993 | ```
994 |
995 | ### Recommendation & Timeline
996 |
997 | **Phase 0.5-1 (Now):** Custom streaming + mtime
998 | - ✅ Solves urgent memory issues
999 | - ✅ No dependencies
1000 | - ✅ Full control
1001 | - ✅ Skip checksums for large files (>10MB)
1002 | - ✅ Proven pattern (rsync/rclone)
1003 |
1004 | **Phase 2 (After metrics):** Git index format exploration
1005 | ```python
1006 | # Optional: Use git index for tracking if beneficial
1007 | from dulwich.index import Index
1008 | # No git repo, just index file format
1009 | ```
1010 |
1011 | **Future (User feature):** User-facing versioning
1012 | ```python
1013 | # Let users opt into VCS:
1014 | basic-memory config set versioning git
1015 | basic-memory config set versioning jj
1016 | basic-memory config set versioning none # Current behavior
1017 |
1018 | # Integrate with their chosen workflow
1019 | # Not forced upon them
1020 | ```
1021 |
1022 | **Rationale:**
1023 | 1. **Don't block on VCS decision** - Memory issues are P0
1024 | 2. **Learn from deployment** - See actual usage patterns
1025 | 3. **Keep options open** - Can add git/jj later
1026 | 4. **Files as source of truth** - Core philosophy preserved
1027 | 5. **Large files need attention regardless** - VCS won't solve that
1028 |
1029 | **Decision Point:**
1030 | - If Phase 0.5/1 achieves memory targets → VCS integration deferred
1031 | - If users strongly request versioning → Add as opt-in feature
1032 | - If change detection becomes bottleneck → Explore git index format
1033 |
1034 | ## Agent Assignment
1035 |
1036 | **Phase 1 Implementation**: `python-developer` agent
1037 | - Expertise in FastAPI, async Python, database migrations
1038 | - Handles basic-memory core changes
1039 |
1040 | **Phase 2 Implementation**: `python-developer` agent
1041 | - Same agent continues with cloud-specific fixes
1042 | - Maintains consistency across phases
1043 |
1044 | **Phase 3 Review**: `system-architect` agent
1045 | - Analyzes metrics and makes UberSync decision
1046 | - Creates SPEC-18 if centralized service needed
1047 |
```
--------------------------------------------------------------------------------
/specs/SPEC-9 Multi-Project Bidirectional Sync Architecture.md:
--------------------------------------------------------------------------------
```markdown
1 | ---
2 | title: 'SPEC-9: Multi-Project Bidirectional Sync Architecture'
3 | type: spec
4 | permalink: specs/spec-9-multi-project-bisync
5 | tags:
6 | - cloud
7 | - bisync
8 | - architecture
9 | - multi-project
10 | ---
11 |
12 | # SPEC-9: Multi-Project Bidirectional Sync Architecture
13 |
14 | ## Status: ✅ Implementation Complete
15 |
16 | **Completed Phases:**
17 | - ✅ Phase 1: Cloud Mode Toggle & Config
18 | - ✅ Phase 2: Bisync Updates (Multi-Project)
19 | - ✅ Phase 3: Sync Command Dual Mode
20 | - ✅ Phase 4: Remove Duplicate Commands & Cloud Mode Auth
21 | - ✅ Phase 5: Mount Updates
22 | - ✅ Phase 6: Safety & Validation
23 | - ⏸️ Phase 7: Cloud-Side Implementation (Deferred to cloud repo)
24 | - ✅ Phase 8.1: Testing (All test scenarios validated)
25 | - ✅ Phase 8.2: Documentation (Core docs complete, demos pending)
26 |
27 | **Key Achievements:**
28 | - Unified CLI: `bm sync`, `bm project`, `bm tool` work transparently in both local and cloud modes
29 | - Multi-project sync: Single `bm sync` operation handles all projects bidirectionally
30 | - Cloud mode toggle: `bm cloud login` / `bm cloud logout` switches modes seamlessly
31 | - Integrity checking: `bm cloud check` verifies file matching without data transfer
32 | - Directory isolation: Mount and bisync use separate directories with conflict prevention
33 | - Clean UX: No RCLONE_TEST files, clear error messages, transparent implementation
34 |
35 | ## Why
36 |
37 | **Current State:**
38 | SPEC-8 implemented rclone bisync for cloud file synchronization, but has several architectural limitations:
39 | 1. Syncs only a single project subdirectory (`bucket:/basic-memory`)
40 | 2. Requires separate `bm cloud` command namespace, duplicating existing CLI commands
41 | 3. Users must learn different commands for local vs cloud operations
42 | 4. RCLONE_TEST marker files clutter user directories
43 |
44 | **Problems:**
45 | 1. **Duplicate Commands**: `bm project` vs `bm cloud project`, `bm tool` vs (no cloud equivalent)
46 | 2. **Inconsistent UX**: Same operations require different command syntax depending on mode
47 | 3. **Single Project Sync**: Users can only sync one project at a time
48 | 4. **Manual Coordination**: Creating new projects requires manual setup on both the local and cloud sides
49 | 5. **Confusing Artifacts**: RCLONE_TEST marker files confuse users
50 |
51 | **Goals:**
52 | - **Unified CLI**: All existing `bm` commands work in both local and cloud mode via toggle
53 | - **Multi-Project Sync**: Single sync operation handles all projects bidirectionally
54 | - **Simple Mode Switch**: `bm cloud login` enables cloud mode, `logout` returns to local
55 | - **Automatic Registration**: Projects auto-register on both local and cloud sides
56 | - **Clean UX**: Remove unnecessary safety checks and confusing artifacts
57 |
58 | ## Cloud Access Paradigm: The Dropbox Model
59 |
60 | **Mental Model Shift:**
61 |
62 | Basic Memory cloud access follows the **Dropbox/iCloud paradigm** - not a per-project cloud connection model.
63 |
64 | **What This Means:**
65 |
66 | ```
67 | Traditional Project-Based Model (❌ Not This):
68 | bm cloud mount --project work # Mount individual project
69 | bm cloud mount --project personal # Mount another project
70 | bm cloud sync --project research # Sync specific project
71 | → Multiple connections, multiple credentials, complex management
72 |
73 | Dropbox Model (✅ This):
74 | bm cloud mount # One mount, all projects
75 | bm sync # One sync, all projects
76 | ~/basic-memory-cloud/ # One folder, all content
77 | → Single connection, organized by folders (projects)
78 | ```
79 |
80 | **Key Principles:**
81 |
82 | 1. **Mount/Bisync = Access Methods, Not Project Tools**
83 | - Mount: Read-through cache to cloud (like Dropbox folder)
84 | - Bisync: Bidirectional sync with cloud (like Dropbox sync)
85 | - Both operate at **bucket level** (all projects)
86 |
87 | 2. **Projects = Organization Within Cloud Space**
88 | - Projects are folders within your cloud storage
89 | - Creating a folder creates a project (auto-discovered)
90 | - Projects are managed via `bm project` commands
91 |
92 | 3. **One Cloud Space Per Machine**
93 | - One set of IAM credentials per tenant
94 | - One mount point: `~/basic-memory-cloud/`
95 | - One bisync directory: `~/basic-memory-cloud-sync/` (default)
96 | - All projects accessible through this single entry point
97 |
98 | 4. **Why This Works Better**
99 | - **Credential Management**: One credential set, not N sets per project
100 | - **Resource Efficiency**: One rclone process, not N processes
101 | - **Familiar Pattern**: Users already understand Dropbox/iCloud
102 | - **Operational Simplicity**: `mount` once, `unmount` once
103 | - **Scales Naturally**: Add projects by creating folders, not reconfiguring cloud access
104 |
105 | **User Journey:**
106 |
107 | ```bash
108 | # Set up cloud access (once)
109 | bm cloud login
110 | bm cloud mount # or: bm cloud setup for bisync
111 |
112 | # Work with projects (create folders as needed)
113 | cd ~/basic-memory-cloud/
114 | mkdir my-new-project
115 | echo "# Notes" > my-new-project/readme.md
116 |
117 | # Cloud auto-discovers and registers project
118 | # No additional cloud configuration needed
119 | ```
120 |
121 | This paradigm shift means **mount and bisync are infrastructure concerns**, while **projects are content organization**. Users think about their knowledge, not about cloud plumbing.
122 |
123 | ## What
124 |
125 | This spec affects:
126 |
127 | 1. **Cloud Mode Toggle** (`config.py`, `async_client.py`):
128 | - Add `cloud_mode` flag to `~/.basic-memory/config.json`
129 | - Set/unset `BASIC_MEMORY_PROXY_URL` based on cloud mode
130 | - `bm cloud login` enables cloud mode, `logout` disables it
131 | - All CLI commands respect cloud mode via existing async_client
132 |
133 | 2. **Unified CLI Commands**:
134 | - **Remove**: `bm cloud project` commands (duplicate of `bm project`)
135 | - **Enhance**: `bm sync` co-opted for bisync in cloud mode
136 | - **Keep**: `bm cloud login/logout/status/setup` for mode management
137 | - **Result**: `bm project`, `bm tool`, `bm sync` work in both modes
138 |
139 | 3. **Bisync Integration** (`bisync_commands.py`):
140 | - Remove `--check-access` (no RCLONE_TEST files)
141 | - Sync bucket root (all projects), not single subdirectory
142 | - Project auto-registration before sync
143 | - `bm sync` triggers bisync in cloud mode
144 | - `bm sync --watch` for continuous sync
145 |
146 | 4. **Config Structure**:
147 | ```json
148 | {
149 | "cloud_mode": true,
150 | "cloud_host": "https://cloud.basicmemory.com",
151 | "auth_tokens": {...},
152 | "bisync_config": {
153 | "profile": "balanced",
154 | "sync_dir": "~/basic-memory-cloud-sync"
155 | }
156 | }
157 | ```
158 |
159 | 5. **User Workflows**:
160 | - **Enable cloud**: `bm cloud login` → all commands work remotely
161 | - **Create projects**: `bm project add "name"` creates on cloud
162 | - **Sync files**: `bm sync` runs bisync (all projects)
163 | - **Use tools**: `bm tool write-note` creates notes on cloud
164 | - **Disable cloud**: `bm cloud logout` → back to local mode
165 |
166 | ## Implementation Tasks
167 |
168 | ### Phase 1: Cloud Mode Toggle & Config (Foundation) ✅
169 |
170 | **1.1 Update Config Schema**
171 | - [x] Add `cloud_mode: bool = False` to Config model
172 | - [x] Add `bisync_config: dict` with `profile` and `sync_dir` fields
173 | - [x] Ensure `cloud_host` field exists
174 | - [x] Add config migration for existing users (defaults handle this)
175 |
176 | **1.2 Update async_client.py**
177 | - [x] Read `cloud_mode` from config (not just environment)
178 | - [x] Set `BASIC_MEMORY_PROXY_URL` from config when `cloud_mode=true`
179 | - [x] Priority: env var > config.cloud_host (if cloud_mode) > None (local ASGI)
180 | - [ ] Test both local and cloud mode routing
181 |
182 | **1.3 Update Login/Logout Commands**
183 | - [x] `bm cloud login`: Set `cloud_mode=true` and save config
184 | - [x] `bm cloud login`: Set `BASIC_MEMORY_PROXY_URL` environment variable
185 | - [x] `bm cloud logout`: Set `cloud_mode=false` and save config
186 | - [x] `bm cloud logout`: Clear `BASIC_MEMORY_PROXY_URL` environment variable
187 | - [x] `bm cloud status`: Show current mode (local/cloud), connection status
188 |
189 | **1.4 Skip Initialization in Cloud Mode** ✅
190 | - [x] Update `ensure_initialization()` to check `cloud_mode` and return early
191 | - [x] Document that `config.projects` is only used in local mode
192 | - [x] Cloud manages its own projects via API, no local reconciliation needed
193 |
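The early return in `ensure_initialization()` can be sketched as follows (`AppConfig` stands in for the real config model, and the local-mode reconciliation work is elided):

```python
from dataclasses import dataclass

@dataclass
class AppConfig:
    cloud_mode: bool = False

def ensure_initialization(config: AppConfig) -> str:
    # Cloud manages its own projects via the API; skip local reconciliation
    if config.cloud_mode:
        return "skipped"
    # Local mode: reconcile config.projects with the local database (elided)
    return "initialized"
```
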
194 | ### Phase 2: Bisync Updates (Multi-Project)
195 |
196 | **2.1 Remove RCLONE_TEST Files** ✅
197 | - [x] Update all bisync profiles: `check_access=False`
198 | - [x] Remove RCLONE_TEST creation from `setup_cloud_bisync()`
199 | - [x] Remove RCLONE_TEST upload logic
200 | - [ ] Update documentation
201 |
202 | **2.2 Sync Bucket Root (All Projects)** ✅
203 | - [x] Change remote path from `bucket:/basic-memory` to `bucket:/` in `build_bisync_command()`
204 | - [x] Update `setup_cloud_bisync()` to use bucket root
205 | - [ ] Test with multiple projects
206 |
207 | **2.3 Project Auto-Registration (Bisync)** ✅
208 | - [x] Add `fetch_cloud_projects()` function (GET /proxy/projects/projects)
209 | - [x] Add `scan_local_directories()` function
210 | - [x] Add `create_cloud_project()` function (POST /proxy/projects/projects)
211 | - [x] Integrate into `run_bisync()`: fetch → scan → create missing → sync
212 | - [x] Wait for API 201 response before syncing
213 |
214 | **2.4 Bisync Directory Configuration** ✅
215 | - [x] Add `--dir` parameter to `bm cloud bisync-setup`
216 | - [x] Store bisync directory in config
217 | - [x] Default to `~/basic-memory-cloud-sync/`
218 | - [x] Add `validate_bisync_directory()` safety check
219 | - [x] Update `get_default_mount_path()` to return fixed `~/basic-memory-cloud/`
220 |
221 | **2.5 Sync/Status API Infrastructure** ✅ (commit d48b1dc)
222 | - [x] Create `POST /{project}/project/sync` endpoint for background sync
223 | - [x] Create `POST /{project}/project/status` endpoint for scan-only status
224 | - [x] Create `SyncReportResponse` Pydantic schema
225 | - [x] Refactor CLI `sync` command to use API endpoint
226 | - [x] Refactor CLI `status` command to use API endpoint
227 | - [x] Create `command_utils.py` with shared `run_sync()` function
228 | - [x] Update `notify_container_sync()` to call `run_sync()` for each project
229 | - [x] Update all tests to match new API-based implementation
230 |
231 | ### Phase 3: Sync Command Dual Mode ✅
232 |
233 | **3.1 Update `bm sync` Command** ✅
234 | - [x] Check `config.cloud_mode` at start
235 | - [x] If `cloud_mode=false`: Run existing local sync
236 | - [x] If `cloud_mode=true`: Run bisync
237 | - [x] Add `--watch` parameter for continuous sync
238 | - [x] Add `--interval` parameter (default 60 seconds)
239 | - [x] Error if `--watch` used in local mode with helpful message
240 |
241 | **3.2 Watch Mode for Bisync** ✅
242 | - [x] Implement `run_bisync_watch()` with interval loop
243 | - [x] Add `--interval` parameter (default 60 seconds)
244 | - [x] Handle errors gracefully, continue on failure
245 | - [x] Show sync progress and status
246 |
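The watch loop is a plain interval timer around the sync call. A sketch (the `iterations` cap is added here only so the loop can terminate; the real command runs until interrupted):

```python
import time
from typing import Callable, Optional

def run_bisync_watch(sync_fn: Callable[[], None], interval: int = 60,
                     iterations: Optional[int] = None) -> int:
    """Run sync_fn every `interval` seconds; swallow errors and keep going."""
    failures = 0
    count = 0
    while iterations is None or count < iterations:
        try:
            sync_fn()
        except Exception:
            failures += 1  # the real command logs the error and continues
        count += 1
        if iterations is None or count < iterations:
            time.sleep(interval)
    return failures
```
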
247 | **3.3 Integrity Check Command** ✅
248 | - [x] Implement `bm cloud check` command using `rclone check`
249 | - [x] Read-only operation that verifies file matching
250 | - [x] Error with helpful messages if rclone/bisync not set up
251 | - [x] Support `--one-way` flag for faster checks
252 | - [x] Transparent about rclone implementation
253 | - [x] Suggest `bm sync` to resolve differences
254 |
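A hypothetical builder for the underlying `rclone check` invocation (`--one-way` is a real `rclone check` flag; the exact flag set used by `bm cloud check` is an assumption):

```python
from pathlib import Path
from typing import List

def build_check_command(sync_dir: Path, remote: str, one_way: bool = False) -> List[str]:
    """Build an `rclone check` invocation comparing the local sync dir to cloud."""
    cmd = ["rclone", "check", str(sync_dir), remote]
    if one_way:
        cmd.append("--one-way")  # only verify local files exist remotely (faster)
    return cmd
```
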
255 | **Implementation Notes:**
256 | - `bm sync` adapts to cloud mode automatically - users don't need separate commands
257 | - `bm cloud bisync` kept for power users with full options (--dry-run, --resync, --profile, --verbose)
258 | - `bm cloud check` provides integrity verification without transferring data
259 | - Design philosophy: Simplicity for everyday use, transparency about implementation
260 |
261 | ### Phase 4: Remove Duplicate Commands & Cloud Mode Auth ✅
262 |
263 | **4.0 Cloud Mode Authentication** ✅
264 | - [x] Update `async_client.py` to support dual auth sources
265 | - [x] FastMCP context auth (cloud service mode) via `inject_auth_header()`
266 | - [x] JWT token file auth (CLI cloud mode) via `CLIAuth.get_valid_token()`
267 | - [x] Automatic token refresh for CLI cloud mode
268 | - [x] Remove `BASIC_MEMORY_PROXY_URL` environment variable dependency
269 | - [x] Simplify to use only `config.cloud_mode` + `config.cloud_host`
270 |
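Selecting between the two auth sources reduces to a simple precedence rule, sketched here (`build_auth_headers` is illustrative; the real code injects headers inside the HTTP client):

```python
from typing import Optional

def build_auth_headers(context_token: Optional[str],
                       cli_token: Optional[str]) -> dict:
    """Pick the auth source: FastMCP context auth (cloud service mode) wins,
    then the CLI's stored JWT (cloud mode); neither means local ASGI, no header."""
    token = context_token or cli_token
    return {"Authorization": f"Bearer {token}"} if token else {}
```
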
271 | **4.1 Delete `bm cloud project` Commands** ✅
272 | - [x] Remove `bm cloud project list` (use `bm project list`)
273 | - [x] Remove `bm cloud project add` (use `bm project add`)
274 | - [x] Update `core_commands.py` to remove project_app subcommands
275 | - [x] Keep only: `login`, `logout`, `status`, `setup`, `mount`, `unmount`, bisync commands
276 | - [x] Remove unused imports (Table, generate_permalink, os)
277 | - [x] Clean up environment variable references in login/logout
278 |
279 | **4.2 CLI Command Cloud Mode Integration** ✅
280 | - [x] Add runtime `cloud_mode_enabled` checks to all CLI commands
281 | - [x] Update `list_projects()` to conditionally authenticate based on cloud mode
282 | - [x] Update `remove_project()` to conditionally authenticate based on cloud mode
283 | - [x] Update `run_sync()` to conditionally authenticate based on cloud mode
284 | - [x] Update `get_project_info()` to conditionally authenticate based on cloud mode
285 | - [x] Update `run_status()` to conditionally authenticate based on cloud mode
286 | - [x] Remove auth from `set_default_project()` (local-only command, no cloud version)
287 | - [x] Create CLI integration tests (`test-int/cli/`) to validate both local and cloud modes
288 | - [x] Replace mock-heavy CLI tests with integration tests (deleted 5 mock test files)
289 |
290 | **4.3 OAuth Authentication Fixes** ✅
291 | - [x] Restore missing `SettingsConfigDict` in `BasicMemoryConfig`
292 | - [x] Fix environment variable reading with `BASIC_MEMORY_` prefix
293 | - [x] Fix `.env` file loading
294 | - [x] Fix extra field handling for config files
295 | - [x] Resolve `bm cloud login` OAuth failure ("Something went wrong" error)
296 | - [x] Implement PKCE (Proof Key for Code Exchange) for device flow
297 | - [x] Generate code verifier and SHA256 challenge for device authorization
298 | - [x] Send code_verifier with token polling requests
299 | - [x] Support both PKCE-required and PKCE-optional OAuth clients
300 | - [x] Verify authentication flow works end-to-end with staging and production
301 | - [x] Document WorkOS requirement: redirect URI must be configured even for device flow
302 |
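The PKCE pieces are standard (RFC 7636, S256 method): a random code verifier plus the base64url-encoded SHA-256 of it as the challenge. A self-contained sketch of the generation step:

```python
import base64
import hashlib
import secrets

def generate_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) per RFC 7636 S256."""
    # 32 random bytes -> 43-char base64url verifier (padding stripped)
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge
```

The verifier is sent with the token polling request; the challenge goes in the device authorization request.
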
303 | **4.4 Update Documentation**
304 | - [ ] Update `cloud-cli.md` with cloud mode toggle workflow
305 | - [ ] Document `bm cloud login` → use normal commands
306 | - [ ] Add examples of cloud mode usage
307 | - [ ] Document mount vs bisync directory isolation
308 | - [ ] Add troubleshooting section
309 |
310 | ### Phase 5: Mount Updates ✅
311 |
312 | **5.1 Fixed Mount Directory** ✅
313 | - [x] Change mount path to `~/basic-memory-cloud/` (fixed, no tenant ID)
314 | - [x] Update `get_default_mount_path()` function
315 | - [x] Remove configurability (fixed location)
316 | - [x] Update mount commands to use new path
317 |
318 | **5.2 Mount at Bucket Root** ✅
319 | - [x] Ensure mount uses bucket root (not subdirectory)
320 | - [x] Test with multiple projects
321 | - [x] Verify all projects visible in mount
322 |
323 | **Implementation:** Mount uses the fixed `~/basic-memory-cloud/` directory and mounts the entire bucket root `basic-memory-{tenant_id}:{bucket_name}`, so all projects are visible through a single mount point.
324 |
325 | ### Phase 6: Safety & Validation ✅
326 |
327 | **6.1 Directory Conflict Prevention** ✅
328 | - [x] Implement `validate_bisync_directory()` check
329 | - [x] Detect if bisync dir == mount dir
330 | - [x] Detect if bisync dir is currently mounted
331 | - [x] Show clear error messages with solutions
332 |
333 | **6.2 State Management** ✅
334 | - [x] Use `--workdir` for bisync state
335 | - [x] Store state in `~/.basic-memory/bisync-state/{tenant-id}/`
336 | - [x] Ensure state directory created before bisync
337 |
338 | **Implementation:** `validate_bisync_directory()` prevents conflicts by checking directory equality and mount status. State managed in isolated `~/.basic-memory/bisync-state/{tenant-id}/` directory using `--workdir` flag.
339 |
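Computing and creating the per-tenant state directory is straightforward; a sketch (the `base` parameter is added for testability, while the real code always uses `~/.basic-memory`):

```python
from pathlib import Path
from typing import Optional

def bisync_state_dir(tenant_id: str, base: Optional[Path] = None) -> Path:
    """Return ~/.basic-memory/bisync-state/{tenant-id}, creating it if needed.
    Passed to rclone via --workdir so state never lands in the sync dir."""
    base = base or (Path.home() / ".basic-memory")
    state = base / "bisync-state" / tenant_id
    state.mkdir(parents=True, exist_ok=True)
    return state
```
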
340 | ### Phase 7: Cloud-Side Implementation (Deferred to Cloud Repo)
341 |
342 | **7.1 Project Discovery Service (Cloud)** - Deferred
343 | - [ ] Create `ProjectDiscoveryService` background job
344 | - [ ] Scan `/app/data/` every 2 minutes
345 | - [ ] Auto-register new directories as projects
346 | - [ ] Log discovery events
347 | - [ ] Handle errors gracefully
348 |
349 | **7.2 Project API Updates (Cloud)** - Deferred
350 | - [ ] Ensure `POST /proxy/projects/projects` creates directory synchronously
351 | - [ ] Return 201 with project details
352 | - [ ] Ensure directory ready immediately after creation
353 |
354 | **Note:** Phase 7 is cloud-side work that belongs in the basic-memory-cloud repository. The CLI-side implementation (Phase 2.3 auto-registration) is complete and working - it calls the existing cloud API endpoints.
355 |
356 | ### Phase 8: Testing & Documentation
357 |
358 | **8.1 Test Scenarios**
359 | - [x] Test: Cloud mode toggle (login/logout)
360 | - [x] Test: Local-first project creation (bisync)
361 | - [x] Test: Cloud-first project creation (API)
362 | - [x] Test: Multi-project bidirectional sync
363 | - [x] Test: MCP tools in cloud mode
364 | - [x] Test: Watch mode continuous sync
365 | - [x] Test: Safety profile protection (max_delete implemented)
366 | - [x] Test: No RCLONE_TEST files (check_access=False in all profiles)
367 | - [x] Test: Mount/bisync directory isolation (validate_bisync_directory)
368 | - [x] Test: Integrity check command (bm cloud check)
369 |
370 | **8.2 Documentation**
371 | - [x] Update cloud-cli.md with cloud mode instructions
372 | - [x] Document Dropbox model paradigm
373 | - [x] Update command reference with new commands
374 | - [x] Document `bm sync` dual mode behavior
375 | - [x] Document `bm cloud check` command
376 | - [x] Document directory structure and fixed paths
377 | - [ ] Update README with quick start
378 | - [ ] Create migration guide for existing users
379 | - [ ] Create video/GIF demos
380 |
381 | ### Success Criteria Checklist
382 |
383 | - [x] `bm cloud login` enables cloud mode for all commands
384 | - [x] `bm cloud logout` reverts to local mode
385 | - [x] `bm project`, `bm tool`, `bm sync` work transparently in both modes
386 | - [x] `bm sync` runs bisync in cloud mode, local sync in local mode
387 | - [x] Single sync operation handles all projects bidirectionally
388 | - [x] Local directories auto-create cloud projects via API
389 | - [x] Cloud projects auto-sync to local directories
390 | - [x] No RCLONE_TEST files in user directories
391 | - [x] Bisync profiles provide safety via `max_delete` limits
392 | - [x] `bm sync --watch` enables continuous sync
393 | - [x] No duplicate `bm cloud project` commands (removed)
394 | - [x] `bm cloud check` command for integrity verification
395 | - [ ] Documentation covers cloud mode toggle and workflows
396 | - [ ] Edge cases handled gracefully with clear errors
397 |
398 | ## How (High Level)
399 |
400 | ### Architecture Overview
401 |
402 | **Cloud Mode Toggle:**
403 | ```
404 | ┌─────────────────────────────────────┐
405 | │ bm cloud login │
406 | │ ├─ Authenticate via OAuth │
407 | │ ├─ Set cloud_mode: true in config │
408 | │ └─ Set BASIC_MEMORY_PROXY_URL │
409 | └─────────────────────────────────────┘
410 | ↓
411 | ┌─────────────────────────────────────┐
412 | │ All CLI commands use async_client │
413 | │ ├─ async_client checks proxy URL │
414 | │ ├─ If set: HTTP to cloud │
415 | │ └─ If not: Local ASGI │
416 | └─────────────────────────────────────┘
417 | ↓
418 | ┌─────────────────────────────────────┐
419 | │ bm project add "work" │
420 | │ bm tool write-note ... │
421 | │ bm sync (triggers bisync) │
422 | │ → All work against cloud │
423 | └─────────────────────────────────────┘
424 | ```
425 |
426 | **Storage Hierarchy:**
427 | ```
428 | Cloud Container: Bucket: Local Sync Dir:
429 | /app/data/ (mounted) ←→ production-tenant-{id}/ ←→ ~/basic-memory-cloud-sync/
430 | ├── basic-memory/ ├── basic-memory/ ├── basic-memory/
431 | │ ├── notes/ │ ├── notes/ │ ├── notes/
432 | │ └── concepts/ │ └── concepts/ │ └── concepts/
433 | ├── work-project/ ├── work-project/ ├── work-project/
434 | │ └── tasks/ │ └── tasks/ │ └── tasks/
435 | └── personal/ └── personal/ └── personal/
436 | └── journal/ └── journal/ └── journal/
437 |
438 | Bidirectional sync via rclone bisync
439 | ```
440 |
441 | ### Sync Flow
442 |
443 | **`bm sync` execution (in cloud mode):**
444 |
445 | 1. **Check cloud mode**
446 | ```python
447 | if not config.cloud_mode:
448 | # Run normal local file sync
449 | run_local_sync()
450 | return
451 |
452 | # Cloud mode: Run bisync
453 | ```
454 |
455 | 2. **Fetch cloud projects**
456 | ```python
457 | # GET /proxy/projects/projects (via async_client)
458 | cloud_projects = fetch_cloud_projects()
459 | cloud_project_names = {p["name"] for p in cloud_projects["projects"]}
460 | ```
461 |
462 | 3. **Scan local sync directory**
463 | ```python
464 |    sync_dir = Path(config.bisync_config["sync_dir"]).expanduser()  # ~/basic-memory-cloud-sync
465 |    local_dirs = [d.name for d in sync_dir.iterdir()
466 |                  if d.is_dir() and not d.name.startswith('.')]
467 | ```
468 |
469 | 4. **Create missing cloud projects**
470 | ```python
471 | for dir_name in local_dirs:
472 | if dir_name not in cloud_project_names:
473 | # POST /proxy/projects/projects (via async_client)
474 | create_cloud_project(name=dir_name)
475 | # Blocks until 201 response
476 | ```
477 |
478 | 5. **Run bisync on bucket root**
479 | ```bash
480 | rclone bisync \
481 | ~/basic-memory-cloud-sync \
482 | basic-memory-{tenant}:{bucket} \
483 | --filters-file ~/.basic-memory/.bmignore.rclone \
484 | --conflict-resolve=newer \
485 | --max-delete=25
486 | # Syncs ALL project subdirectories bidirectionally
487 | ```
488 |
489 | 6. **Notify cloud to refresh** (commit d48b1dc)
490 | ```python
491 | # After rclone bisync completes, sync each project's database
492 |    for project in cloud_projects["projects"]:
493 |        # POST /{project}/project/sync (via async_client)
494 |        # Triggers background sync for this project
495 |        await run_sync(project=project["name"])
496 | ```
497 |
498 | ### Key Changes
499 |
500 | **1. Cloud Mode via Config**
501 |
502 | **Config changes:**
503 | ```python
504 | class Config:
505 | cloud_mode: bool = False
506 | cloud_host: str = "https://cloud.basicmemory.com"
507 | bisync_config: dict = {
508 | "profile": "balanced",
509 | "sync_dir": "~/basic-memory-cloud-sync"
510 | }
511 | ```
512 |
513 | **async_client.py behavior:**
514 | ```python
515 | def create_client() -> AsyncClient:
516 | # Check config first, then environment
517 | config = ConfigManager().config
518 | proxy_url = os.getenv("BASIC_MEMORY_PROXY_URL") or \
519 | (config.cloud_host if config.cloud_mode else None)
520 |
521 | if proxy_url:
522 | return AsyncClient(base_url=proxy_url) # HTTP to cloud
523 | else:
524 | return AsyncClient(transport=ASGITransport(...)) # Local ASGI
525 | ```
526 |
527 | **2. Login/Logout Sets Cloud Mode**
528 |
529 | ```python
530 | # bm cloud login
531 | async def login():
532 | # Existing OAuth flow...
533 | success = await auth.login()
534 | if success:
535 | config.cloud_mode = True
536 | config.save()
537 | os.environ["BASIC_MEMORY_PROXY_URL"] = config.cloud_host
538 | ```
539 |
540 | ```python
541 | # bm cloud logout
542 | def logout():
543 | config.cloud_mode = False
544 | config.save()
545 | os.environ.pop("BASIC_MEMORY_PROXY_URL", None)
546 | ```
547 |
548 | **3. Remove Duplicate Commands**
549 |
550 | **Delete:**
551 | - `bm cloud project list` → use `bm project list`
552 | - `bm cloud project add` → use `bm project add`
553 |
554 | **Keep:**
555 | - `bm cloud login` - Enable cloud mode
556 | - `bm cloud logout` - Disable cloud mode
557 | - `bm cloud status` - Show current mode & connection
558 | - `bm cloud setup` - Initial bisync setup
559 | - `bm cloud bisync` - Power-user command with full options
560 | - `bm cloud check` - Verify file integrity between local and cloud
561 |
562 | **4. Sync Command Dual Mode**
563 |
564 | ```python
565 | # bm sync
566 | def sync_command(watch: bool = False, profile: str = "balanced"):
567 | config = ConfigManager().config
568 |
569 | if config.cloud_mode:
570 | # Run bisync for cloud sync
571 | run_bisync(profile=profile, watch=watch)
572 | else:
573 | # Run local file sync
574 | run_local_sync()
575 | ```
576 |
577 | **5. Remove RCLONE_TEST Files**
578 |
579 | ```python
580 | # All profiles: check_access=False
581 | BISYNC_PROFILES = {
582 | "safe": RcloneBisyncProfile(check_access=False, max_delete=10),
583 | "balanced": RcloneBisyncProfile(check_access=False, max_delete=25),
584 | "fast": RcloneBisyncProfile(check_access=False, max_delete=50),
585 | }
586 | ```
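
`RcloneBisyncProfile` can be modeled as a small frozen dataclass that translates into rclone flags (a sketch; the real profile likely carries additional settings such as conflict policy):

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class RcloneBisyncProfile:
    check_access: bool
    max_delete: int

    def to_flags(self) -> List[str]:
        flags = [f"--max-delete={self.max_delete}"]
        if self.check_access:
            flags.append("--check-access")  # would require RCLONE_TEST markers
        return flags
```

With `check_access=False` everywhere, no profile ever asks rclone for RCLONE_TEST markers.
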
587 |
588 | **6. Sync Bucket Root (All Projects)**
589 |
590 | ```python
591 | # Sync entire bucket, not subdirectory
592 | rclone_remote = f"basic-memory-{tenant_id}:{bucket_name}"
593 | ```
594 |
595 | ## How to Evaluate
596 |
597 | ### Test Scenarios
598 |
599 | **1. Cloud Mode Toggle**
600 | ```bash
601 | # Start in local mode
602 | bm project list
603 | # → Shows local projects
604 |
605 | # Enable cloud mode
606 | bm cloud login
607 | # → Authenticates, sets cloud_mode=true
608 |
609 | bm project list
610 | # → Now shows cloud projects (same command!)
611 |
612 | # Disable cloud mode
613 | bm cloud logout
614 |
615 | bm project list
616 | # → Back to local projects
617 | ```
618 |
619 | **Expected:** ✅ Single command works in both modes
620 |
621 | **2. Local-First Project Creation (Cloud Mode)**
622 | ```bash
623 | # Enable cloud mode
624 | bm cloud login
625 |
626 | # Create new project locally in sync dir
627 | mkdir ~/basic-memory-cloud-sync/my-research
628 | echo "# Research Notes" > ~/basic-memory-cloud-sync/my-research/index.md
629 |
630 | # Run sync (triggers bisync in cloud mode)
631 | bm sync
632 |
633 | # Verify:
634 | # - Cloud project created automatically via API
635 | # - Files synced to bucket:/my-research/
636 | # - Cloud database updated
637 | # - `bm project list` shows new project
638 | ```
639 |
640 | **Expected:** ✅ Project visible in cloud project list
641 |
642 | **3. Cloud-First Project Creation**
643 | ```bash
644 | # In cloud mode
645 | bm project add "work-notes"
646 | # → Creates project on cloud (via async_client HTTP)
647 |
648 | # Run sync
649 | bm sync
650 |
651 | # Verify:
652 | # - Local directory ~/basic-memory-cloud-sync/work-notes/ created
653 | # - Files sync bidirectionally
654 | # - Can use `bm tool write-note` to add content remotely
655 | ```
656 |
657 | **Expected:** ✅ Project accessible via all CLI commands
658 |
659 | **4. Multi-Project Bidirectional Sync**
660 | ```bash
661 | # Setup: 3 projects in cloud mode
662 | # Modify files in all 3 locally and remotely
663 |
664 | bm sync
665 |
666 | # Verify:
667 | # - All 3 projects sync simultaneously
668 | # - Changes propagate correctly
669 | # - No cross-project interference
670 | ```
671 |
672 | **Expected:** ✅ All projects in sync state
673 |
674 | **5. MCP Tools Work in Cloud Mode**
675 | ```bash
676 | # In cloud mode
677 | bm tool write-note \
678 | --title "Meeting Notes" \
679 | --folder "work-notes" \
680 | --content "Discussion points..."
681 |
682 | # Verify:
683 | # - Note created on cloud (via async_client HTTP)
684 | # - Next `bm sync` pulls note to local
685 | # - Note appears in ~/basic-memory-cloud-sync/work-notes/
686 | ```
687 |
688 | **Expected:** ✅ Tools work transparently in cloud mode
689 |
690 | **6. Watch Mode Continuous Sync**
691 | ```bash
692 | # In cloud mode
693 | bm sync --watch
694 |
695 | # While running:
696 | # - Create local folder → auto-creates cloud project
697 | # - Edit files locally → syncs to cloud
698 | # - Edit files remotely → syncs to local
699 | # - Create project via API → appears locally
700 |
701 | # Verify:
702 | # - Continuous bidirectional sync
703 | # - New projects handled automatically
704 | # - No manual intervention needed
705 | ```
706 |
707 | **Expected:** ✅ Seamless continuous sync
708 |
709 | **7. Safety Profile Protection**
710 | ```bash
711 | # Create project with 15 files locally
712 | # Delete project from cloud (simulate error)
713 |
714 | bm sync --profile safe
715 |
716 | # Verify:
717 | # - Bisync detects 15 pending deletions
718 | # - Exceeds max_delete=10 limit
719 | # - Aborts with clear error
720 | # - No files deleted locally
721 | ```
722 |
723 | **Expected:** ✅ Safety limit prevents data loss
724 |
725 | **8. No RCLONE_TEST Files**
726 | ```bash
727 | # After setup and multiple syncs
728 | ls -la ~/basic-memory-cloud-sync/
729 |
730 | # Verify:
731 | # - No RCLONE_TEST files
732 | # - No rclone state files (state lives in ~/.basic-memory/bisync-state/)
733 | # - Clean directory structure
734 | ```
735 |
736 | **Expected:** ✅ User directory stays clean
737 |
755 | ## Notes
756 |
757 | ### API Contract
758 |
759 | **Cloud must provide:**
760 |
761 | 1. **Project Management APIs:**
762 | - `GET /proxy/projects/projects` - List all projects
763 | - `POST /proxy/projects/projects` - Create project synchronously
764 | - `POST /proxy/sync` - Trigger cache refresh
765 |
766 | 2. **Project Discovery Service (Background):**
767 | - **Purpose**: Auto-register projects created via mount, direct bucket uploads, or any non-API method
768 | - **Interval**: Every 2 minutes
769 | - **Behavior**:
770 | - Scan `/app/data/` for directories
771 | - Register any directory not already in project database
772 | - Log discovery events
773 | - **Implementation**:
774 | ```python
775 | class ProjectDiscoveryService:
776 | """Background service to auto-discover projects from filesystem."""
777 |
778 | async def run(self):
779 | """Scan /app/data/ and register new project directories."""
780 | data_path = Path("/app/data")
781 |
782 | for dir_path in data_path.iterdir():
783 | # Skip hidden and special directories
784 | if not dir_path.is_dir() or dir_path.name.startswith('.'):
785 | continue
786 |
787 | project_name = dir_path.name
788 |
789 | # Check if project already registered
790 | project = await self.project_repo.get_by_name(project_name)
791 | if not project:
792 | # Auto-register new project
793 | await self.project_repo.create(
794 | name=project_name,
795 | path=str(dir_path)
796 | )
797 | logger.info(f"Auto-discovered project: {project_name}")
798 | ```
799 |
800 | **Project Creation (API-based):**
801 | - API creates `/app/data/{project-name}/` directory
802 | - Registers project in database
803 | - Returns 201 with project details
804 | - Directory ready for bisync immediately
805 |
806 | **Project Creation (Discovery-based):**
807 | - User creates folder via mount: `~/basic-memory-cloud/new-project/`
808 | - Files appear in `/app/data/new-project/` (mounted bucket)
809 | - Discovery service finds directory on next scan (within 2 minutes)
810 | - Auto-registers as project
811 | - User sees project in `bm project list` after discovery
812 |
813 | **Why Both Methods:**
814 | - **API**: Immediate registration when using bisync (client-side scan + API call)
815 | - **Discovery**: Delayed registration when using mount (no API call hook)
816 | - **Result**: Projects created ANY way (API, mount, bisync, WebDAV) eventually registered
817 | - **Trade-off**: 2-minute delay for mount-created projects is acceptable
818 |
819 | ### Mount vs Bisync Directory Isolation
820 |
821 | **Critical Safety Requirement**: Mount and bisync MUST use different directories to prevent conflicts.
822 |
823 | **The Dropbox Model Applied:**
824 |
825 | Both mount and bisync operate at **bucket level** (all projects), following the Dropbox/iCloud paradigm:
826 |
827 | ```
828 | ~/basic-memory-cloud/ # Mount: Read-through cache (like Dropbox folder)
829 | ├── work-notes/
830 | ├── personal/
831 | └── research/
832 |
833 | ~/basic-memory-cloud-sync/ # Bisync: Bidirectional sync (like Dropbox sync folder)
834 | ├── work-notes/
835 | ├── personal/
836 | └── research/
837 | ```
838 |
839 | **Mount Directory (Fixed):**
840 | ```bash
841 | # Fixed location, not configurable
842 | ~/basic-memory-cloud/
843 | ```
844 | - **Scope**: Entire bucket (all projects)
845 | - **Method**: NFS mount via `rclone nfsmount`
846 | - **Behavior**: Read-through cache to cloud bucket
847 | - **Credentials**: One IAM credential set per tenant
848 | - **Process**: One rclone mount process
849 | - **Use Case**: Quick access, browsing, light editing
850 | - **Known Issue**: Obsidian compatibility problems with NFS
851 | - **Not Configurable**: Fixed location prevents user error
852 |
853 | **Why Fixed Location:**
854 | - One mount point per machine (like `/Users/you/Dropbox`)
855 | - Prevents credential proliferation (one credential set, not N)
856 | - Prevents multiple mount processes (resource efficiency)
857 | - Familiar pattern users already understand
858 | - Simple operations: `mount` once, `unmount` once
859 |
860 | **Bisync Directory (User Configurable):**
861 | ```bash
862 | # Default location
863 | ~/basic-memory-cloud-sync/
864 |
865 | # User can override
866 | bm cloud setup --dir ~/my-knowledge-base
867 | ```
868 | - **Scope**: Entire bucket (all projects)
869 | - **Method**: Bidirectional sync via `rclone bisync`
870 | - **Behavior**: Full local copy with periodic sync
871 | - **Credentials**: Same IAM credential set as mount
872 | - **Use Case**: Full offline access, reliable editing, Obsidian support
873 | - **Configurable**: Users may want specific locations (external drive, existing folder structure)
874 |
875 | **Why User Configurable:**
876 | - Users have preferences for where local copies live
877 | - May want sync folder on external drive
878 | - May want to integrate with existing folder structure
879 | - Default works for most, option available for power users
880 |
881 | **Conflict Prevention:**
```python
import subprocess
from pathlib import Path

# BisyncError is the CLI's sync error type, defined elsewhere in basic-memory.

def validate_bisync_directory(bisync_dir: Path) -> None:
    """Ensure bisync directory doesn't conflict with mount."""
    mount_dir = Path.home() / "basic-memory-cloud"

    if bisync_dir.resolve() == mount_dir.resolve():
        raise BisyncError(
            f"Cannot use {bisync_dir} for bisync - it's the mount directory!\n"
            f"Mount and bisync must use different directories.\n\n"
            f"Options:\n"
            f"  1. Use default: ~/basic-memory-cloud-sync/\n"
            f"  2. Specify different directory: --dir ~/my-sync-folder"
        )

    # Check if an rclone mount is active at this location
    result = subprocess.run(["mount"], capture_output=True, text=True)
    if str(bisync_dir) in result.stdout and "rclone" in result.stdout:
        raise BisyncError(
            f"{bisync_dir} is currently mounted via 'bm cloud mount'\n"
            f"Cannot use mounted directory for bisync.\n\n"
            f"Either:\n"
            f"  1. Unmount first: bm cloud unmount\n"
            f"  2. Use different directory for bisync"
        )
```
907 |
908 | **Why This Matters:**
909 | - Mounting and syncing the SAME directory would create infinite loops
910 | - rclone mount → bisync detects changes → syncs to bucket → mount sees changes → triggers bisync → ∞
911 | - Separate directories = clean separation of concerns
912 | - Mount is read-heavy caching layer, bisync is write-heavy bidirectional sync
913 |
914 | ### Future Enhancements
915 |
916 | **Phase 2 (Not in this spec):**
917 | - **Near Real-Time Sync**: Integrate `watch_service.py` with cloud mode
918 | - Watch service detects local changes (already battle-tested)
919 | - Queue changes in memory
920 | - Use `rclone copy` for individual file sync (near instant)
921 | - Example: `rclone copyto ~/sync/project/file.md tenant:{bucket}/project/file.md`
922 | - Fallback to full `rclone bisync` every N seconds for bidirectional changes
923 | - Provides near real-time sync without polling overhead
924 | - Per-project bisync profiles (different safety levels per project)
925 | - Selective project sync (exclude specific projects from sync)
926 | - Project deletion workflow (cascade to cloud/local)
927 | - Conflict resolution UI/CLI
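
The near real-time sync idea in the first Phase 2 bullet could be sketched as follows; the queue handling and remote naming are assumptions, and only the `rclone copyto` subcommand named above is used.

```python
import subprocess

def copyto_cmd(sync_dir: str, bucket: str, rel_path: str) -> list[str]:
    """Build the per-file `rclone copyto` command for one changed file."""
    return ["rclone", "copyto", f"{sync_dir}/{rel_path}", f"tenant:{bucket}/{rel_path}"]

def drain_changes(changes: list[str], sync_dir: str, bucket: str) -> None:
    """Push queued changes one file at a time (near instant, one-way)."""
    while changes:
        rel = changes.pop(0)
        subprocess.run(copyto_cmd(sync_dir, bucket, rel), check=True)
```

A periodic full `rclone bisync` would still run as the fallback to pick up changes made on the cloud side.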
928 |
929 | **Phase 3:**
930 | - Project sharing between tenants
931 | - Incremental backup/restore
932 | - Sync statistics and bandwidth monitoring
933 | - Mobile app integration with cloud mode
934 |
935 | ### Related Specs
936 |
937 | - **SPEC-8**: TigrisFS Integration - Original bisync implementation
938 | - **SPEC-6**: Explicit Project Parameter Architecture - Multi-project foundations
939 | - **SPEC-5**: CLI Cloud Upload via WebDAV - Cloud file operations
940 |
941 | ### Implementation Notes
942 |
943 | **Architectural Simplifications:**
944 | - **Unified CLI**: Eliminated duplicate commands by using mode toggle
945 | - **Single Entry Point**: All commands route through `async_client` which handles mode
946 | - **Config-Driven**: Cloud mode stored in persistent config, not just environment
947 | - **Transparent Routing**: Existing commands work without modification in cloud mode
948 |
949 | **Complexity Trade-offs:**
950 | - Removed: Separate `bm cloud project` command namespace
951 | - Removed: Complex state detection for new projects
952 | - Removed: RCLONE_TEST marker file management
953 | - Added: Simple cloud_mode flag and config integration
954 | - Added: Simple project list comparison before sync
955 | - Relied on: Existing bisync profile safety mechanisms
956 | - Result: Significantly simpler, more maintainable code
957 |
958 | **User Experience:**
959 | - **Mental Model**: "Toggle cloud mode, use normal commands"
960 | - **No Learning Curve**: Same commands work locally and in cloud
961 | - **Minimal Config**: Just login/logout to switch modes
962 | - **Safety**: Profile system gives users control over safety/speed trade-offs
963 | - **"Just Works"**: Create folders anywhere, they sync automatically
964 |
965 | **Migration Path:**
966 | - Existing `bm cloud project` users: Use `bm project` instead
967 | - Existing `bm cloud bisync` becomes `bm sync` in cloud mode
968 | - Config automatically migrates on first `bm cloud login`
969 |
970 |
971 | ## Testing
972 |
973 |
974 | ### Initial Setup (One Time)
975 |
976 | 1. Login to cloud and enable cloud mode:
977 | bm cloud login
978 | # → Authenticates via OAuth
979 | # → Sets cloud_mode=true in config
980 | # → Sets BASIC_MEMORY_PROXY_URL environment variable
981 | # → All CLI commands now route to cloud
982 |
983 | 2. Check cloud mode status:
984 | bm cloud status
985 | # → Shows: Mode: Cloud (enabled)
986 | # → Shows: Host: https://cloud.basicmemory.com
987 | # → Checks cloud health
988 |
989 | 3. Set up bidirectional sync:
990 | bm cloud bisync-setup
991 | # Or with custom directory:
992 | bm cloud bisync-setup --dir ~/my-sync-folder
993 |
994 | # This will:
995 | # → Install rclone (if not already installed)
996 | # → Get tenant info (tenant_id, bucket_name)
997 | # → Generate scoped IAM credentials
998 | # → Configure rclone with credentials
999 | # → Create sync directory (default: ~/basic-memory-cloud-sync/)
1000 | # → Validate no conflict with mount directory
1001 | # → Run initial --resync to establish baseline
1002 |
1003 | ### Normal Usage
1004 |
1005 | 4. Create local project and sync:
1006 | # Create a local project directory
1007 | mkdir ~/basic-memory-cloud-sync/my-research
1008 | echo "# Research Notes" > ~/basic-memory-cloud-sync/my-research/readme.md
1009 |
1010 | # Run sync
1011 | bm cloud bisync
1012 |
1013 | # Auto-magic happens:
1014 | # → Checks for new local directories
1015 | # → Finds "my-research" not in cloud
1016 | # → Creates project on cloud via POST /proxy/projects/projects
1017 | # → Runs bidirectional sync (all projects)
1018 | # → Syncs to bucket root (all projects synced together)
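
The auto-registration scan in step 4 amounts to diffing local top-level folders against the cloud project list. A minimal sketch, with the helper name assumed:

```python
from pathlib import Path

def new_local_projects(sync_dir: Path, cloud_projects: set[str]) -> list[str]:
    """Local top-level folders with no matching cloud project yet."""
    return sorted(
        p.name for p in sync_dir.iterdir()
        if p.is_dir() and not p.name.startswith(".") and p.name not in cloud_projects
    )
```

Each returned name would then be created via the project endpoint before the bisync pass runs.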
1019 |
1020 | 5. Watch mode for continuous sync:
1021 | bm cloud bisync --watch
1022 | # Or with custom interval:
1023 | bm cloud bisync --watch --interval 30
1024 |
1025 | # → Syncs every 60 seconds (or custom interval)
1026 | # → Auto-registers new projects on each run
1027 | # → Press Ctrl+C to stop
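
Watch mode in step 5 is essentially a timed loop around a full sync pass; this sketch adds a `max_runs` escape hatch for testability (the real command runs until Ctrl+C):

```python
import time

def watch_loop(run_sync, interval: float = 60.0, max_runs: int = 0) -> int:
    """Run `run_sync` every `interval` seconds; max_runs=0 means run forever."""
    runs = 0
    while True:
        run_sync()  # one full bisync pass, including project auto-registration
        runs += 1
        if max_runs and runs >= max_runs:
            return runs
        time.sleep(interval)
```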
1028 |
1029 | 6. Check bisync status:
1030 | bm cloud bisync-status
1031 | # → Shows tenant ID
1032 | # → Shows sync directory path
1033 | # → Shows initialization status
1034 | # → Shows last sync time
1035 | # → Lists available profiles (safe/balanced/fast)
1036 |
1037 | 7. Manual sync with different profiles:
1038 | # Safe mode (max 10 deletes, preserves conflicts)
1039 | bm cloud bisync --profile safe
1040 |
1041 | # Balanced mode (max 25 deletes, auto-resolve to newer) - default
1042 | bm cloud bisync --profile balanced
1043 |
1044 | # Fast mode (max 50 deletes, skip verification)
1045 | bm cloud bisync --profile fast
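
The three profiles might map to rclone flags roughly as follows; `--max-delete` and `--conflict-resolve` are real rclone options, but this exact mapping is inferred from the limits described above, not taken from the implementation.

```python
# Illustrative profile table, inferred from the step 7 descriptions.
PROFILES = {
    "safe":     {"max_delete": 10, "conflict_resolve": None},     # preserve both sides
    "balanced": {"max_delete": 25, "conflict_resolve": "newer"},  # default
    "fast":     {"max_delete": 50, "conflict_resolve": "newer"},
}

def bisync_flags(profile: str) -> list[str]:
    """Translate a named profile into rclone bisync safety flags."""
    p = PROFILES[profile]
    flags = [f"--max-delete={p['max_delete']}"]
    if p["conflict_resolve"]:
        flags.append(f"--conflict-resolve={p['conflict_resolve']}")
    return flags
```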
1046 |
1047 | 8. Dry run to preview changes:
1048 | bm cloud bisync --dry-run
1049 | # → Shows what would be synced without making changes
1050 |
1051 | 9. Force resync (if needed):
1052 | bm cloud bisync --resync
1053 | # → Establishes new baseline
1054 | # → Use if sync state is corrupted
1055 |
1056 | 10. Check file integrity:
1057 | bm cloud check
1058 | # → Verifies all files match between local and cloud
1059 | # → Read-only operation (no data transfer)
1060 | # → Shows differences if any found
1061 |
1062 | # Faster one-way check
1063 | bm cloud check --one-way
1064 | # → Only checks for missing files on destination
1065 |
1066 | ### Verify Cloud Mode Integration
1067 |
1068 | 11. Test that all commands work in cloud mode:
1069 | # List cloud projects (not local)
1070 | bm project list
1071 |
1072 | # Create project on cloud
1073 | bm project add "work-notes"
1074 |
1075 | # Use MCP tools against cloud
1076 | bm tool write-note --title "Test" --folder "my-research" --content "Hello"
1077 |
1078 | # All of these work against cloud because cloud_mode=true
1079 |
1080 | 12. Switch back to local mode:
1081 | bm cloud logout
1082 | # → Sets cloud_mode=false
1083 | # → Clears BASIC_MEMORY_PROXY_URL
1084 | # → All commands now work locally again
1085 |
1086 | ### Expected Directory Structure
1087 |
1088 | ~/basic-memory-cloud-sync/ # Your local sync directory
1089 | ├── my-research/ # Auto-created cloud project
1090 | │ ├── readme.md
1091 | │ └── notes.md
1092 | ├── work-notes/ # Another project
1093 | │ └── tasks.md
1094 | └── personal/ # Another project
1095 | └── journal.md
1096 |
1097 | # All sync bidirectionally with:
1098 | bucket:/ # Cloud bucket root
1099 | ├── my-research/
1100 | ├── work-notes/
1101 | └── personal/
1102 |
1103 | ### Key Points to Test
1104 |
1105 | 1. ✅ Cloud mode toggle works (login/logout)
1106 | 2. ✅ Bisync setup validates directory (no conflict with mount)
1107 | 3. ✅ Local directories auto-create cloud projects
1108 | 4. ✅ All projects sync together (bucket root)
1109 | 5. ✅ No RCLONE_TEST files created
1110 | 6. ✅ Changes sync bidirectionally
1111 | 7. ✅ Watch mode continuous sync works
1112 | 8. ✅ Profile safety limits work (max_delete)
1113 | 9. ✅ `bm sync` adapts to cloud mode automatically
1114 | 10. ✅ `bm cloud check` verifies file integrity without side effects
1115 |
```