This is page 18 of 23. Use http://codebase.md/basicmachines-co/basic-memory?lines=true&page={x} to view the full context.
# Directory Structure
```
├── .claude
│ ├── agents
│ │ ├── python-developer.md
│ │ └── system-architect.md
│ └── commands
│ ├── release
│ │ ├── beta.md
│ │ ├── changelog.md
│ │ ├── release-check.md
│ │ └── release.md
│ ├── spec.md
│ └── test-live.md
├── .dockerignore
├── .github
│ ├── dependabot.yml
│ ├── ISSUE_TEMPLATE
│ │ ├── bug_report.md
│ │ ├── config.yml
│ │ ├── documentation.md
│ │ └── feature_request.md
│ └── workflows
│ ├── claude-code-review.yml
│ ├── claude-issue-triage.yml
│ ├── claude.yml
│ ├── dev-release.yml
│ ├── docker.yml
│ ├── pr-title.yml
│ ├── release.yml
│ └── test.yml
├── .gitignore
├── .python-version
├── CHANGELOG.md
├── CITATION.cff
├── CLA.md
├── CLAUDE.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── docker-compose.yml
├── Dockerfile
├── docs
│ ├── ai-assistant-guide-extended.md
│ ├── character-handling.md
│ ├── cloud-cli.md
│ └── Docker.md
├── justfile
├── LICENSE
├── llms-install.md
├── pyproject.toml
├── README.md
├── SECURITY.md
├── smithery.yaml
├── specs
│ ├── SPEC-1 Specification-Driven Development Process.md
│ ├── SPEC-10 Unified Deployment Workflow and Event Tracking.md
│ ├── SPEC-11 Basic Memory API Performance Optimization.md
│ ├── SPEC-12 OpenTelemetry Observability.md
│ ├── SPEC-13 CLI Authentication with Subscription Validation.md
│ ├── SPEC-14 Cloud Git Versioning & GitHub Backup.md
│ ├── SPEC-14- Cloud Git Versioning & GitHub Backup.md
│ ├── SPEC-15 Configuration Persistence via Tigris for Cloud Tenants.md
│ ├── SPEC-16 MCP Cloud Service Consolidation.md
│ ├── SPEC-17 Semantic Search with ChromaDB.md
│ ├── SPEC-18 AI Memory Management Tool.md
│ ├── SPEC-19 Sync Performance and Memory Optimization.md
│ ├── SPEC-2 Slash Commands Reference.md
│ ├── SPEC-3 Agent Definitions.md
│ ├── SPEC-4 Notes Web UI Component Architecture.md
│ ├── SPEC-5 CLI Cloud Upload via WebDAV.md
│ ├── SPEC-6 Explicit Project Parameter Architecture.md
│ ├── SPEC-7 POC to spike Tigris Turso for local access to cloud data.md
│ ├── SPEC-8 TigrisFS Integration.md
│ ├── SPEC-9 Multi-Project Bidirectional Sync Architecture.md
│ ├── SPEC-9 Signed Header Tenant Information.md
│ └── SPEC-9-1 Follow-Ups- Conflict, Sync, and Observability.md
├── src
│ └── basic_memory
│ ├── __init__.py
│ ├── alembic
│ │ ├── alembic.ini
│ │ ├── env.py
│ │ ├── migrations.py
│ │ ├── script.py.mako
│ │ └── versions
│ │ ├── 3dae7c7b1564_initial_schema.py
│ │ ├── 502b60eaa905_remove_required_from_entity_permalink.py
│ │ ├── 5fe1ab1ccebe_add_projects_table.py
│ │ ├── 647e7a75e2cd_project_constraint_fix.py
│ │ ├── 9d9c1cb7d8f5_add_mtime_and_size_columns_to_entity_.py
│ │ ├── a1b2c3d4e5f6_fix_project_foreign_keys.py
│ │ ├── b3c3938bacdb_relation_to_name_unique_index.py
│ │ ├── cc7172b46608_update_search_index_schema.py
│ │ └── e7e1f4367280_add_scan_watermark_tracking_to_project.py
│ ├── api
│ │ ├── __init__.py
│ │ ├── app.py
│ │ ├── routers
│ │ │ ├── __init__.py
│ │ │ ├── directory_router.py
│ │ │ ├── importer_router.py
│ │ │ ├── knowledge_router.py
│ │ │ ├── management_router.py
│ │ │ ├── memory_router.py
│ │ │ ├── project_router.py
│ │ │ ├── prompt_router.py
│ │ │ ├── resource_router.py
│ │ │ ├── search_router.py
│ │ │ └── utils.py
│ │ └── template_loader.py
│ ├── cli
│ │ ├── __init__.py
│ │ ├── app.py
│ │ ├── auth.py
│ │ ├── commands
│ │ │ ├── __init__.py
│ │ │ ├── cloud
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api_client.py
│ │ │ │ ├── bisync_commands.py
│ │ │ │ ├── cloud_utils.py
│ │ │ │ ├── core_commands.py
│ │ │ │ ├── mount_commands.py
│ │ │ │ ├── rclone_config.py
│ │ │ │ ├── rclone_installer.py
│ │ │ │ ├── upload_command.py
│ │ │ │ └── upload.py
│ │ │ ├── command_utils.py
│ │ │ ├── db.py
│ │ │ ├── import_chatgpt.py
│ │ │ ├── import_claude_conversations.py
│ │ │ ├── import_claude_projects.py
│ │ │ ├── import_memory_json.py
│ │ │ ├── mcp.py
│ │ │ ├── project.py
│ │ │ ├── status.py
│ │ │ ├── sync.py
│ │ │ └── tool.py
│ │ └── main.py
│ ├── config.py
│ ├── db.py
│ ├── deps.py
│ ├── file_utils.py
│ ├── ignore_utils.py
│ ├── importers
│ │ ├── __init__.py
│ │ ├── base.py
│ │ ├── chatgpt_importer.py
│ │ ├── claude_conversations_importer.py
│ │ ├── claude_projects_importer.py
│ │ ├── memory_json_importer.py
│ │ └── utils.py
│ ├── markdown
│ │ ├── __init__.py
│ │ ├── entity_parser.py
│ │ ├── markdown_processor.py
│ │ ├── plugins.py
│ │ ├── schemas.py
│ │ └── utils.py
│ ├── mcp
│ │ ├── __init__.py
│ │ ├── async_client.py
│ │ ├── project_context.py
│ │ ├── prompts
│ │ │ ├── __init__.py
│ │ │ ├── ai_assistant_guide.py
│ │ │ ├── continue_conversation.py
│ │ │ ├── recent_activity.py
│ │ │ ├── search.py
│ │ │ └── utils.py
│ │ ├── resources
│ │ │ ├── ai_assistant_guide.md
│ │ │ └── project_info.py
│ │ ├── server.py
│ │ └── tools
│ │ ├── __init__.py
│ │ ├── build_context.py
│ │ ├── canvas.py
│ │ ├── chatgpt_tools.py
│ │ ├── delete_note.py
│ │ ├── edit_note.py
│ │ ├── list_directory.py
│ │ ├── move_note.py
│ │ ├── project_management.py
│ │ ├── read_content.py
│ │ ├── read_note.py
│ │ ├── recent_activity.py
│ │ ├── search.py
│ │ ├── utils.py
│ │ ├── view_note.py
│ │ └── write_note.py
│ ├── models
│ │ ├── __init__.py
│ │ ├── base.py
│ │ ├── knowledge.py
│ │ ├── project.py
│ │ └── search.py
│ ├── repository
│ │ ├── __init__.py
│ │ ├── entity_repository.py
│ │ ├── observation_repository.py
│ │ ├── project_info_repository.py
│ │ ├── project_repository.py
│ │ ├── relation_repository.py
│ │ ├── repository.py
│ │ └── search_repository.py
│ ├── schemas
│ │ ├── __init__.py
│ │ ├── base.py
│ │ ├── cloud.py
│ │ ├── delete.py
│ │ ├── directory.py
│ │ ├── importer.py
│ │ ├── memory.py
│ │ ├── project_info.py
│ │ ├── prompt.py
│ │ ├── request.py
│ │ ├── response.py
│ │ ├── search.py
│ │ └── sync_report.py
│ ├── services
│ │ ├── __init__.py
│ │ ├── context_service.py
│ │ ├── directory_service.py
│ │ ├── entity_service.py
│ │ ├── exceptions.py
│ │ ├── file_service.py
│ │ ├── initialization.py
│ │ ├── link_resolver.py
│ │ ├── project_service.py
│ │ ├── search_service.py
│ │ └── service.py
│ ├── sync
│ │ ├── __init__.py
│ │ ├── background_sync.py
│ │ ├── sync_service.py
│ │ └── watch_service.py
│ ├── templates
│ │ └── prompts
│ │ ├── continue_conversation.hbs
│ │ └── search.hbs
│ └── utils.py
├── test-int
│ ├── BENCHMARKS.md
│ ├── cli
│ │ ├── test_project_commands_integration.py
│ │ ├── test_sync_commands_integration.py
│ │ └── test_version_integration.py
│ ├── conftest.py
│ ├── mcp
│ │ ├── test_build_context_underscore.py
│ │ ├── test_build_context_validation.py
│ │ ├── test_chatgpt_tools_integration.py
│ │ ├── test_default_project_mode_integration.py
│ │ ├── test_delete_note_integration.py
│ │ ├── test_edit_note_integration.py
│ │ ├── test_list_directory_integration.py
│ │ ├── test_move_note_integration.py
│ │ ├── test_project_management_integration.py
│ │ ├── test_project_state_sync_integration.py
│ │ ├── test_read_content_integration.py
│ │ ├── test_read_note_integration.py
│ │ ├── test_search_integration.py
│ │ ├── test_single_project_mcp_integration.py
│ │ └── test_write_note_integration.py
│ ├── test_db_wal_mode.py
│ ├── test_disable_permalinks_integration.py
│ └── test_sync_performance_benchmark.py
├── tests
│ ├── __init__.py
│ ├── api
│ │ ├── conftest.py
│ │ ├── test_async_client.py
│ │ ├── test_continue_conversation_template.py
│ │ ├── test_directory_router.py
│ │ ├── test_importer_router.py
│ │ ├── test_knowledge_router.py
│ │ ├── test_management_router.py
│ │ ├── test_memory_router.py
│ │ ├── test_project_router_operations.py
│ │ ├── test_project_router.py
│ │ ├── test_prompt_router.py
│ │ ├── test_relation_background_resolution.py
│ │ ├── test_resource_router.py
│ │ ├── test_search_router.py
│ │ ├── test_search_template.py
│ │ ├── test_template_loader_helpers.py
│ │ └── test_template_loader.py
│ ├── cli
│ │ ├── conftest.py
│ │ ├── test_bisync_commands.py
│ │ ├── test_cli_tools.py
│ │ ├── test_cloud_authentication.py
│ │ ├── test_cloud_utils.py
│ │ ├── test_ignore_utils.py
│ │ ├── test_import_chatgpt.py
│ │ ├── test_import_claude_conversations.py
│ │ ├── test_import_claude_projects.py
│ │ ├── test_import_memory_json.py
│ │ └── test_upload.py
│ ├── conftest.py
│ ├── db
│ │ └── test_issue_254_foreign_key_constraints.py
│ ├── importers
│ │ ├── test_importer_base.py
│ │ └── test_importer_utils.py
│ ├── markdown
│ │ ├── __init__.py
│ │ ├── test_date_frontmatter_parsing.py
│ │ ├── test_entity_parser_error_handling.py
│ │ ├── test_entity_parser.py
│ │ ├── test_markdown_plugins.py
│ │ ├── test_markdown_processor.py
│ │ ├── test_observation_edge_cases.py
│ │ ├── test_parser_edge_cases.py
│ │ ├── test_relation_edge_cases.py
│ │ └── test_task_detection.py
│ ├── mcp
│ │ ├── conftest.py
│ │ ├── test_obsidian_yaml_formatting.py
│ │ ├── test_permalink_collision_file_overwrite.py
│ │ ├── test_prompts.py
│ │ ├── test_resources.py
│ │ ├── test_tool_build_context.py
│ │ ├── test_tool_canvas.py
│ │ ├── test_tool_delete_note.py
│ │ ├── test_tool_edit_note.py
│ │ ├── test_tool_list_directory.py
│ │ ├── test_tool_move_note.py
│ │ ├── test_tool_read_content.py
│ │ ├── test_tool_read_note.py
│ │ ├── test_tool_recent_activity.py
│ │ ├── test_tool_resource.py
│ │ ├── test_tool_search.py
│ │ ├── test_tool_utils.py
│ │ ├── test_tool_view_note.py
│ │ ├── test_tool_write_note.py
│ │ └── tools
│ │ └── test_chatgpt_tools.py
│ ├── Non-MarkdownFileSupport.pdf
│ ├── repository
│ │ ├── test_entity_repository_upsert.py
│ │ ├── test_entity_repository.py
│ │ ├── test_entity_upsert_issue_187.py
│ │ ├── test_observation_repository.py
│ │ ├── test_project_info_repository.py
│ │ ├── test_project_repository.py
│ │ ├── test_relation_repository.py
│ │ ├── test_repository.py
│ │ ├── test_search_repository_edit_bug_fix.py
│ │ └── test_search_repository.py
│ ├── schemas
│ │ ├── test_base_timeframe_minimum.py
│ │ ├── test_memory_serialization.py
│ │ ├── test_memory_url_validation.py
│ │ ├── test_memory_url.py
│ │ ├── test_schemas.py
│ │ └── test_search.py
│ ├── Screenshot.png
│ ├── services
│ │ ├── test_context_service.py
│ │ ├── test_directory_service.py
│ │ ├── test_entity_service_disable_permalinks.py
│ │ ├── test_entity_service.py
│ │ ├── test_file_service.py
│ │ ├── test_initialization.py
│ │ ├── test_link_resolver.py
│ │ ├── test_project_removal_bug.py
│ │ ├── test_project_service_operations.py
│ │ ├── test_project_service.py
│ │ └── test_search_service.py
│ ├── sync
│ │ ├── test_character_conflicts.py
│ │ ├── test_sync_service_incremental.py
│ │ ├── test_sync_service.py
│ │ ├── test_sync_wikilink_issue.py
│ │ ├── test_tmp_files.py
│ │ ├── test_watch_service_edge_cases.py
│ │ ├── test_watch_service_reload.py
│ │ └── test_watch_service.py
│ ├── test_config.py
│ ├── test_db_migration_deduplication.py
│ ├── test_deps.py
│ ├── test_production_cascade_delete.py
│ └── utils
│ ├── test_file_utils.py
│ ├── test_frontmatter_obsidian_compatible.py
│ ├── test_parse_tags.py
│ ├── test_permalink_formatting.py
│ ├── test_utf8_handling.py
│ └── test_validate_project_path.py
├── uv.lock
├── v0.15.0-RELEASE-DOCS.md
└── v15-docs
├── api-performance.md
├── background-relations.md
├── basic-memory-home.md
├── bug-fixes.md
├── chatgpt-integration.md
├── cloud-authentication.md
├── cloud-bisync.md
├── cloud-mode-usage.md
├── cloud-mount.md
├── default-project-mode.md
├── env-file-removal.md
├── env-var-overrides.md
├── explicit-project-parameter.md
├── gitignore-integration.md
├── project-root-env-var.md
├── README.md
└── sqlite-performance.md
```
# Files
--------------------------------------------------------------------------------
/specs/SPEC-13 CLI Authentication with Subscription Validation.md:
--------------------------------------------------------------------------------
```markdown
1 | ---
2 | title: 'SPEC-13: CLI Authentication with Subscription Validation'
3 | type: spec
4 | permalink: specs/spec-13-cli-auth-subscription-validation
5 | tags:
6 | - authentication
7 | - security
8 | - cli
9 | - subscription
10 | status: draft
11 | created: 2025-10-02
12 | ---
13 |
14 | # SPEC-13: CLI Authentication with Subscription Validation
15 |
16 | ## Why
17 |
18 | The Basic Memory Cloud CLI currently has a security gap in authentication that allows unauthorized access:
19 |
20 | **Current Web Flow (Secure)**:
21 | 1. User signs up via WorkOS AuthKit
22 | 2. User creates Polar subscription
23 | 3. Web app validates subscription before calling `POST /tenants/setup`
24 | 4. Tenant provisioned only after subscription validation ✅
25 |
26 | **Current CLI Flow (Insecure)**:
27 | 1. User signs up via WorkOS AuthKit (OAuth device flow)
28 | 2. User runs `bm cloud login`
29 | 3. CLI receives JWT token from WorkOS
30 | 4. CLI can access all cloud endpoints without subscription check ❌
31 |
32 | **Problem**: Anyone can sign up with WorkOS and immediately access cloud infrastructure via CLI without having an active Polar subscription. This creates:
33 | - Revenue loss (free resource consumption)
34 | - Security risk (unauthorized data access)
35 | - Support burden (users accessing features they haven't paid for)
36 |
37 | **Root Cause**: The CLI authentication flow validates JWT tokens but doesn't verify subscription status before granting access to cloud resources.
38 |
39 | ## What
40 |
41 | Add subscription validation to authentication flow to ensure only users with active Polar subscriptions can access cloud resources across all access methods (CLI, MCP, Web App, Direct API).
42 |
43 | **Affected Components**:
44 |
45 | ### basic-memory-cloud (Cloud Service)
46 | - `apps/cloud/src/basic_memory_cloud/deps.py` - Add subscription validation dependency
47 | - `apps/cloud/src/basic_memory_cloud/services/subscription_service.py` - Add subscription check method
48 | - `apps/cloud/src/basic_memory_cloud/api/tenant_mount.py` - Protect mount endpoints
49 | - `apps/cloud/src/basic_memory_cloud/api/proxy.py` - Protect proxy endpoints
50 |
51 | ### basic-memory (CLI)
52 | - `src/basic_memory/cli/commands/cloud/core_commands.py` - Handle 403 errors
53 | - `src/basic_memory/cli/commands/cloud/api_client.py` - Parse subscription errors
54 | - `docs/cloud-cli.md` - Document subscription requirement
55 |
56 | **Endpoints to Protect**:
57 | - `GET /tenant/mount/info` - Used by CLI bisync setup
58 | - `POST /tenant/mount/credentials` - Used by CLI bisync credentials
59 | - `GET /proxy/{path:path}` - Used by Web App, MCP tools, CLI tools, Direct API
60 | - All other `/proxy/*` endpoints - Centralized access point for all user operations
61 |
62 | ## Complete Authentication Flow Analysis
63 |
64 | ### Overview of All Access Flows
65 |
66 | Basic Memory Cloud has **7 distinct authentication flows**. This spec closes subscription validation gaps in flows 2-4 and 6, which all converge on the `/proxy/*` endpoints.
67 |
68 | ### Flow 1: Polar Webhook → Registration ✅ SECURE
69 | ```
70 | Polar webhook → POST /api/webhooks/polar
71 | → Validates Polar webhook signature
72 | → Creates/updates subscription in database
73 | → No direct user access - webhook only
74 | ```
75 | **Auth**: Polar webhook signature validation
76 | **Subscription Check**: N/A (webhook creates subscriptions)
77 | **Status**: ✅ Secure - webhook validated, no user JWT involved
78 |
79 | ### Flow 2: Web App Login ❌ NEEDS FIX
80 | ```
81 | User → apps/web (Vue.js/Nuxt)
82 | → WorkOS AuthKit magic link authentication
83 | → JWT stored in browser session
84 | → Web app calls /proxy/{project}/... endpoints (memory, directory, projects)
85 | → proxy.py validates JWT but does NOT check subscription
86 | → Access granted without subscription ❌
87 | ```
88 | **Auth**: WorkOS JWT via `CurrentUserProfileHybridJwtDep`
89 | **Subscription Check**: ❌ Missing
90 | **Fixed By**: Task 1.4 (protect `/proxy/*` endpoints)
91 |
92 | ### Flow 3: MCP (Model Context Protocol) ❌ NEEDS FIX
93 | ```
94 | AI Agent (Claude, Cursor, etc.) → https://mcp.basicmemory.com
95 | → AuthKit OAuth device flow
96 | → JWT stored in AI agent
97 | → MCP tools call {cloud_host}/proxy/{endpoint} with Authorization header
98 | → proxy.py validates JWT but does NOT check subscription
99 | → MCP tools can access all cloud resources without subscription ❌
100 | ```
101 | **Auth**: AuthKit JWT via `CurrentUserProfileHybridJwtDep`
102 | **Subscription Check**: ❌ Missing
103 | **Fixed By**: Task 1.4 (protect `/proxy/*` endpoints)
104 |
105 | ### Flow 4: CLI Auth (basic-memory) ❌ NEEDS FIX
106 | ```
107 | User → bm cloud login
108 | → AuthKit OAuth device flow
109 | → JWT stored in ~/.basic-memory/tokens.json
110 | → CLI calls:
111 | - {cloud_host}/tenant/mount/info (for bisync setup)
112 | - {cloud_host}/tenant/mount/credentials (for bisync credentials)
113 | - {cloud_host}/proxy/{endpoint} (for all MCP tools)
114 | → tenant_mount.py and proxy.py validate JWT but do NOT check subscription
115 | → Access granted without subscription ❌
116 | ```
117 | **Auth**: AuthKit JWT via `CurrentUserProfileHybridJwtDep`
118 | **Subscription Check**: ❌ Missing
119 | **Fixed By**: Task 1.3 (protect `/tenant/mount/*`) + Task 1.4 (protect `/proxy/*`)
120 |
121 | ### Flow 5: Cloud CLI (Admin Tasks) ✅ SECURE
122 | ```
123 | Admin → python -m basic_memory_cloud.cli.tenant_cli
124 | → Uses CLIAuth with admin WorkOS OAuth client
125 | → Gets JWT token with admin org membership
126 | → Calls /tenants/* endpoints (create, list, delete tenants)
127 | → tenants.py validates JWT AND admin org membership via AdminUserHybridDep
128 | → Access granted only to admin organization members ✅
129 | ```
130 | **Auth**: AuthKit JWT + Admin org validation via `AdminUserHybridDep`
131 | **Subscription Check**: N/A (admins bypass subscription requirement)
132 | **Status**: ✅ Secure - admin-only endpoints, separate from user flows
133 |
134 | ### Flow 6: Direct API Calls ❌ NEEDS FIX
135 | ```
136 | Any HTTP client → {cloud_host}/proxy/{endpoint}
137 | → Sends Authorization: Bearer {jwt} header
138 | → proxy.py validates JWT but does NOT check subscription
139 | → Direct API access without subscription ❌
140 | ```
141 | **Auth**: WorkOS or AuthKit JWT via `CurrentUserProfileHybridJwtDep`
142 | **Subscription Check**: ❌ Missing
143 | **Fixed By**: Task 1.4 (protect `/proxy/*` endpoints)
144 |
145 | ### Flow 7: Tenant API Instance (Internal) ✅ SECURE
146 | ```
147 | /proxy/* → Tenant API (basic-memory-{tenant_id}.fly.dev)
148 | → Validates signed header from proxy (tenant_id + signature)
149 | → Direct external access will be disabled in production
150 | → Only accessible via /proxy endpoints
151 | ```
152 | **Auth**: Signed header validation from proxy
153 | **Subscription Check**: N/A (internal only, validated at proxy layer)
154 | **Status**: ✅ Secure - validates proxy signature, not directly accessible
155 |
156 | ### Authentication Flow Summary Matrix
157 |
158 | | Flow | Access Method | Current Auth | Subscription Check | Fixed By SPEC-13 |
159 | |------|---------------|--------------|-------------------|------------------|
160 | | 1. Polar Webhook | Polar webhook → `/api/webhooks/polar` | Polar signature | N/A (webhook) | N/A |
161 | | 2. Web App | Browser → `/proxy/*` | WorkOS JWT ✅ | ❌ Missing | ✅ Task 1.4 |
162 | | 3. MCP | AI Agent → `/proxy/*` | AuthKit JWT ✅ | ❌ Missing | ✅ Task 1.4 |
163 | | 4. CLI | `bm cloud` → `/tenant/mount/*` + `/proxy/*` | AuthKit JWT ✅ | ❌ Missing | ✅ Task 1.3 + 1.4 |
164 | | 5. Cloud CLI (Admin) | `tenant_cli` → `/tenants/*` | AuthKit JWT ✅ + Admin org | N/A (admin) | N/A (admin bypass) |
165 | | 6. Direct API | HTTP client → `/proxy/*` | WorkOS/AuthKit JWT ✅ | ❌ Missing | ✅ Task 1.4 |
166 | | 7. Tenant API | Proxy → tenant instance | Proxy signature ✅ | N/A (internal) | N/A |
167 |
168 | ### Key Insights
169 |
170 | 1. **Single Point of Failure**: All user access (Web, MCP, CLI, Direct API) converges on `/proxy/*` endpoints
171 | 2. **Centralized Fix**: Protecting `/proxy/*` with subscription validation closes gaps in flows 2, 3, 4, and 6 simultaneously
172 | 3. **Admin Bypass**: Cloud CLI admin tasks use separate `/tenants/*` endpoints with admin-only access (no subscription needed)
173 | 4. **Defense in Depth**: `/tenant/mount/*` endpoints also protected for CLI bisync operations
174 |
175 | ### Architecture Benefits
176 |
177 | The `/proxy` layer serves as the **single centralized authorization point** for all user access:
178 | - ✅ One place to validate JWT tokens
179 | - ✅ One place to check subscription status
180 | - ✅ One place to handle tenant routing
181 | - ✅ Protects Web App, MCP, CLI, and Direct API simultaneously
182 |
183 | This architecture makes the fix comprehensive and maintainable.
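
The check order at this centralized point can be distilled into a short sketch. This is illustrative only: `validate_jwt`, `check_subscription`, and `lookup_profile` below are hypothetical stand-ins for the cloud service's real dependencies, and the 401/403 shapes mirror the error contract described in this spec.

```python
# Illustrative sketch of the centralized /proxy authorization order.
# validate_jwt / check_subscription / lookup_profile are hypothetical
# stand-ins, not the actual basic-memory-cloud functions.

class AuthError(Exception):
    def __init__(self, status_code, detail):
        self.status_code = status_code
        self.detail = detail

def authorize(token, validate_jwt, check_subscription, lookup_profile):
    """Apply the three checks every user-facing request passes through."""
    user_id = validate_jwt(token)  # bad token -> 401
    if user_id is None:
        raise AuthError(401, "Invalid JWT token. Authentication required.")
    if not check_subscription(user_id):  # missing/expired subscription -> 403
        raise AuthError(403, {
            "error": "subscription_required",
            "message": "Active subscription required",
            "subscribe_url": "https://basicmemory.com/subscribe",
        })
    profile = lookup_profile(user_id)  # unknown user -> 401
    if profile is None:
        raise AuthError(401, "User profile not found")
    return profile
```

Because Web, MCP, CLI, and Direct API traffic all funnel through this same sequence, fixing it once at the proxy closes all four gaps at once.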
184 |
185 | ## How (High Level)
186 |
187 | ### Option A: Database Subscription Check (Recommended)
188 |
189 | **Approach**: Add FastAPI dependency that validates subscription status from database before allowing access.
190 |
191 | **Implementation**:
192 |
193 | 1. **Create Subscription Validation Dependency** (`deps.py`)
194 | ```python
195 | async def get_authorized_cli_user_profile(
196 | credentials: Annotated[HTTPAuthorizationCredentials, Depends(security)],
197 | session: DatabaseSessionDep,
198 | user_profile_repo: UserProfileRepositoryDep,
199 | subscription_service: SubscriptionServiceDep,
200 | ) -> UserProfile:
201 | """
202 | Hybrid authentication with subscription validation for CLI access.
203 |
204 | Validates JWT (WorkOS or AuthKit) and checks for active subscription.
205 | Returns UserProfile if both checks pass.
206 | """
207 | # Try WorkOS JWT first (faster validation path)
208 | try:
209 | user_context = await validate_workos_jwt(credentials.credentials)
210 | except HTTPException:
211 | # Fall back to AuthKit JWT validation
212 | try:
213 | user_context = await validate_authkit_jwt(credentials.credentials)
214 | except HTTPException as e:
215 | raise HTTPException(
216 | status_code=401,
217 | detail="Invalid JWT token. Authentication required.",
218 | ) from e
219 |
220 | # Check subscription status
221 | has_subscription = await subscription_service.check_user_has_active_subscription(
222 | session, user_context.workos_user_id
223 | )
224 |
225 | if not has_subscription:
226 | raise HTTPException(
227 | status_code=403,
228 | detail={
229 | "error": "subscription_required",
230 | "message": "Active subscription required for CLI access",
231 | "subscribe_url": "https://basicmemory.com/subscribe"
232 | }
233 | )
234 |
235 | # Look up and return user profile
236 | user_profile = await user_profile_repo.get_user_profile_by_workos_user_id(
237 | session, user_context.workos_user_id
238 | )
239 | if not user_profile:
240 | raise HTTPException(401, detail="User profile not found")
241 |
242 | return user_profile
243 | ```
244 |
245 | ```python
246 | AuthorizedCLIUserProfileDep = Annotated[UserProfile, Depends(get_authorized_cli_user_profile)]
247 | ```
248 |
249 | 2. **Add Subscription Check Method** (`subscription_service.py`)
250 | ```python
251 | async def check_user_has_active_subscription(
252 | self, session: AsyncSession, workos_user_id: str
253 | ) -> bool:
254 | """Check if user has active subscription."""
255 | # Use existing repository method to get subscription by workos_user_id
256 | # This joins UserProfile -> Subscription in a single query
257 | subscription = await self.subscription_repository.get_subscription_by_workos_user_id(
258 | session, workos_user_id
259 | )
260 |
261 | return subscription is not None and subscription.status == "active"
262 | ```
263 |
264 | 3. **Protect Endpoints** (Replace `CurrentUserProfileHybridJwtDep` with `AuthorizedCLIUserProfileDep`)
265 | ```python
266 | # Before
267 | @router.get("/mount/info")
268 | async def get_mount_info(
269 | user_profile: CurrentUserProfileHybridJwtDep,
270 | session: DatabaseSessionDep,
271 | ):
272 | tenant_id = user_profile.tenant_id
273 | ...
274 |
275 | # After
276 | @router.get("/mount/info")
277 | async def get_mount_info(
278 | user_profile: AuthorizedCLIUserProfileDep, # Now includes subscription check
279 | session: DatabaseSessionDep,
280 | ):
281 | tenant_id = user_profile.tenant_id # No changes needed to endpoint logic
282 | ...
283 | ```
284 |
285 | 4. **Update CLI Error Handling**
286 | ```python
287 | # In core_commands.py login()
288 | try:
289 | success = await auth.login()
290 | if success:
291 | # Test subscription by calling protected endpoint
292 | await make_api_request("GET", f"{host_url}/tenant/mount/info")
293 | except CloudAPIError as e:
294 | if e.status_code == 403 and e.detail.get("error") == "subscription_required":
295 | console.print("[red]Subscription required[/red]")
296 | console.print(f"Subscribe at: {e.detail['subscribe_url']}")
297 | raise typer.Exit(1)
298 | ```
299 |
300 | **Pros**:
301 | - Simple to implement
302 | - Fast (single database query)
303 | - Clear error messages
304 | - Works with existing subscription flow
305 |
306 | **Cons**:
307 | - Database is source of truth (could get out of sync with Polar)
308 | - Adds one extra subscription lookup query per request (lightweight JOIN query)
309 |
310 | ### Option B: WorkOS Organizations
311 |
312 | **Approach**: Add users to "beta-users" organization in WorkOS after subscription creation, validate org membership via JWT claims.
313 |
314 | **Implementation**:
315 | 1. After Polar subscription webhook, add user to WorkOS org via API
316 | 2. Validate `org_id` claim in JWT matches authorized org
317 | 3. Use existing `get_admin_workos_jwt` pattern
318 |
319 | **Pros**:
320 | - WorkOS as single source of truth
321 | - No database queries needed
322 | - More secure (harder to bypass)
323 |
324 | **Cons**:
325 | - More complex (requires WorkOS API integration)
326 | - Requires managing WorkOS org membership
327 | - Less control over error messages
328 | - Additional API calls during registration
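
If Option B were adopted, the per-request authorization check would reduce to comparing a JWT claim against the authorized organization. A minimal sketch, assuming the decoded token exposes an `org_id` claim (the exact claim name would need to be confirmed against WorkOS documentation):

```python
# Hedged sketch of Option B's check: authorize by JWT org membership.
# The "org_id" claim name and the authorized org value are assumptions.

def has_authorized_org(claims: dict, authorized_org_id: str) -> bool:
    """Return True if the decoded JWT claims show membership in the authorized org."""
    return claims.get("org_id") == authorized_org_id
```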
329 |
330 | ### Recommendation
331 |
332 | **Start with Option A (Database Check)** for:
333 | - Faster implementation
334 | - Clearer error messages
335 | - Easier testing
336 | - Existing subscription infrastructure
337 |
338 | **Consider Option B later** if:
339 | - Need tighter security
340 | - Want to reduce database dependency
341 | - Scale requires fewer database queries
342 |
343 | ## How to Evaluate
344 |
345 | ### Success Criteria
346 |
347 | **1. Unauthorized Users Blocked**
348 | - [ ] User without subscription cannot complete `bm cloud login`
349 | - [ ] User without subscription receives clear error with subscribe link
350 | - [ ] User without subscription cannot run `bm cloud setup`
351 | - [ ] User without subscription cannot run `bm sync` in cloud mode
352 |
353 | **2. Authorized Users Work**
354 | - [ ] User with active subscription can login successfully
355 | - [ ] User with active subscription can setup bisync
356 | - [ ] User with active subscription can sync files
357 | - [ ] User with active subscription can use all MCP tools via proxy
358 |
359 | **3. Subscription State Changes**
360 | - [ ] Expired subscription blocks access with clear error
361 | - [ ] Renewed subscription immediately restores access
362 | - [ ] Cancelled subscription blocks access after grace period
363 |
364 | **4. Error Messages**
365 | - [ ] 403 errors include "subscription_required" error code
366 | - [ ] Error messages include subscribe URL
367 | - [ ] CLI displays user-friendly messages
368 | - [ ] Errors logged appropriately for debugging
369 |
370 | **5. No Regressions**
371 | - [ ] Web app login/subscription flow unaffected
372 | - [ ] Admin endpoints still work (bypass check)
373 | - [ ] Tenant provisioning workflow unchanged
374 | - [ ] Performance not degraded
375 |
376 | ### Test Cases
377 |
378 | **Manual Testing**:
379 | ```bash
380 | # Test 1: Unauthorized user
381 | 1. Create new WorkOS account (no subscription)
382 | 2. Run `bm cloud login`
383 | 3. Verify: Login succeeds but shows subscription required error
384 | 4. Verify: Cannot run `bm cloud setup`
385 | 5. Verify: Clear error message with subscribe link
386 |
387 | # Test 2: Authorized user
388 | 1. Use account with active Polar subscription
389 | 2. Run `bm cloud login`
390 | 3. Verify: Login succeeds without errors
391 | 4. Run `bm cloud setup`
392 | 5. Verify: Setup completes successfully
393 | 6. Run `bm sync`
394 | 7. Verify: Sync works normally
395 |
396 | # Test 3: Subscription expiration
397 | 1. Use account with active subscription
398 | 2. Manually expire subscription in database
399 | 3. Run `bm cloud login`
400 | 4. Verify: Blocked with clear error
401 | 5. Renew subscription
402 | 6. Run `bm cloud login` again
403 | 7. Verify: Access restored
404 | ```
405 |
406 | **Automated Tests**:
407 | ```python
408 | # Test subscription validation dependency
409 | async def test_authorized_user_allowed(
410 | db_session,
411 | user_profile_repo,
412 | subscription_service,
413 | mock_jwt_credentials
414 | ):
415 | # Create user with active subscription
416 | user_profile = await create_user_with_subscription(db_session, status="active")
417 |
418 | # Mock JWT credentials for the user
419 | credentials = mock_jwt_credentials(user_profile.workos_user_id)
420 |
421 | # Should not raise exception
422 | result = await get_authorized_cli_user_profile(
423 | credentials, db_session, user_profile_repo, subscription_service
424 | )
425 | assert result.id == user_profile.id
426 | assert result.workos_user_id == user_profile.workos_user_id
427 |
428 | async def test_unauthorized_user_blocked(
429 | db_session,
430 | user_profile_repo,
431 | subscription_service,
432 | mock_jwt_credentials
433 | ):
434 | # Create user without subscription
435 | user_profile = await create_user_without_subscription(db_session)
436 | credentials = mock_jwt_credentials(user_profile.workos_user_id)
437 |
438 | # Should raise 403
439 | with pytest.raises(HTTPException) as exc:
440 | await get_authorized_cli_user_profile(
441 | credentials, db_session, user_profile_repo, subscription_service
442 | )
443 |
444 | assert exc.value.status_code == 403
445 | assert exc.value.detail["error"] == "subscription_required"
446 |
447 | async def test_inactive_subscription_blocked(
448 | db_session,
449 | user_profile_repo,
450 | subscription_service,
451 | mock_jwt_credentials
452 | ):
453 | # Create user with cancelled/inactive subscription
454 | user_profile = await create_user_with_subscription(db_session, status="cancelled")
455 | credentials = mock_jwt_credentials(user_profile.workos_user_id)
456 |
457 | # Should raise 403
458 | with pytest.raises(HTTPException) as exc:
459 | await get_authorized_cli_user_profile(
460 | credentials, db_session, user_profile_repo, subscription_service
461 | )
462 |
463 | assert exc.value.status_code == 403
464 | assert exc.value.detail["error"] == "subscription_required"
465 | ```
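
The tests above rely on `mock_jwt_credentials` and `create_user_*` helpers that this spec does not define. One minimal, hypothetical shape for the credentials helper (real tests would pair it with monkeypatching the JWT validators so the fake token resolves to the given user):

```python
from types import SimpleNamespace

def make_mock_jwt_credentials(workos_user_id: str) -> SimpleNamespace:
    """Stand-in for HTTPAuthorizationCredentials carrying a fake bearer token.

    Hypothetical helper: real tests would also monkeypatch
    validate_workos_jwt to resolve this token to the given workos_user_id.
    """
    return SimpleNamespace(scheme="Bearer", credentials=f"fake-jwt-{workos_user_id}")
```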
466 |
467 | ## Implementation Tasks
468 |
469 | ### Phase 1: Cloud Service (basic-memory-cloud)
470 |
471 | #### Task 1.1: Add subscription check method to SubscriptionService ✅
472 | **File**: `apps/cloud/src/basic_memory_cloud/services/subscription_service.py`
473 |
474 | - [x] Add method `check_subscription(session: AsyncSession, workos_user_id: str) -> bool`
475 | - [x] Use existing `self.subscription_repository.get_subscription_by_workos_user_id(session, workos_user_id)`
476 | - [x] Check both `status == "active"` AND `current_period_end >= now()`
477 | - [x] Log both values when check fails
478 | - [x] Add docstring explaining the method
479 | - [x] Run `just typecheck` to verify types
480 |
481 | **Actual implementation**:
482 | ```python
483 | async def check_subscription(
484 | self, session: AsyncSession, workos_user_id: str
485 | ) -> bool:
486 | """Check if user has active subscription with valid period."""
487 | subscription = await self.subscription_repository.get_subscription_by_workos_user_id(
488 | session, workos_user_id
489 | )
490 |
491 | if subscription is None:
492 | return False
493 |
494 | if subscription.status != "active":
495 | logger.warning("Subscription inactive", workos_user_id=workos_user_id,
496 | status=subscription.status, current_period_end=subscription.current_period_end)
497 | return False
498 |
499 | now = datetime.now(timezone.utc)
500 | if subscription.current_period_end is None or subscription.current_period_end < now:
501 | logger.warning("Subscription expired", workos_user_id=workos_user_id,
502 | status=subscription.status, current_period_end=subscription.current_period_end)
503 | return False
504 |
505 | return True
506 | ```
507 |
508 | #### Task 1.2: Add subscription validation dependency ✅
509 | **File**: `apps/cloud/src/basic_memory_cloud/deps.py`
510 |
511 | - [x] Import necessary types at top of file (if not already present)
512 | - [x] Add `get_authorized_cli_user_profile()` async function
513 | - [x] Implement hybrid JWT validation (WorkOS first, AuthKit fallback)
514 | - [x] Add subscription check using `subscription_service.check_subscription()`
515 | - [x] Raise `HTTPException(403)` with structured error detail if no active subscription
516 | - [x] Look up and return `UserProfile` after validation
517 | - [x] Add `AuthorizedCLIUserProfileDep` type annotation
518 | - [x] Use `settings.subscription_url` from config (env var)
519 | - [x] Run `just typecheck` to verify types
520 |
521 | **Expected code**:
522 | ```python
523 | async def get_authorized_cli_user_profile(
524 | credentials: Annotated[HTTPAuthorizationCredentials, Depends(security)],
525 | session: DatabaseSessionDep,
526 | user_profile_repo: UserProfileRepositoryDep,
527 | subscription_service: SubscriptionServiceDep,
528 | ) -> UserProfile:
529 | """
530 | Hybrid authentication with subscription validation for CLI access.
531 |
532 | Validates JWT (WorkOS or AuthKit) and checks for active subscription.
533 | Returns UserProfile if both checks pass.
534 |
535 | Raises:
536 | HTTPException(401): Invalid JWT token
537 | HTTPException(403): No active subscription
538 | """
539 | # Try WorkOS JWT first (faster validation path)
540 | try:
541 | user_context = await validate_workos_jwt(credentials.credentials)
542 | except HTTPException:
543 | # Fall back to AuthKit JWT validation
544 | try:
545 | user_context = await validate_authkit_jwt(credentials.credentials)
546 | except HTTPException as e:
547 | raise HTTPException(
548 | status_code=401,
549 | detail="Invalid JWT token. Authentication required.",
550 | ) from e
551 |
552 | # Check subscription status
553 |     has_subscription = await subscription_service.check_subscription(
554 |         session, user_context.workos_user_id
555 |     )
556 |
557 | if not has_subscription:
558 | logger.warning(
559 | "CLI access denied: no active subscription",
560 | workos_user_id=user_context.workos_user_id,
561 | )
562 | raise HTTPException(
563 | status_code=403,
564 | detail={
565 | "error": "subscription_required",
566 | "message": "Active subscription required for CLI access",
567 | "subscribe_url": "https://basicmemory.com/subscribe"
568 | }
569 | )
570 |
571 | # Look up and return user profile
572 | user_profile = await user_profile_repo.get_user_profile_by_workos_user_id(
573 | session, user_context.workos_user_id
574 | )
575 | if not user_profile:
576 | logger.error(
577 | "User profile not found after successful auth",
578 | workos_user_id=user_context.workos_user_id,
579 | )
580 | raise HTTPException(401, detail="User profile not found")
581 |
582 | logger.info(
583 | "CLI access granted",
584 | workos_user_id=user_context.workos_user_id,
585 | user_profile_id=str(user_profile.id),
586 | )
587 | return user_profile
588 |
589 |
590 | AuthorizedCLIUserProfileDep = Annotated[UserProfile, Depends(get_authorized_cli_user_profile)]
591 | ```
592 |
593 | #### Task 1.3: Protect tenant mount endpoints ✅
594 | **File**: `apps/cloud/src/basic_memory_cloud/api/tenant_mount.py`
595 |
596 | - [x] Update import: add `AuthorizedCLIUserProfileDep` from `..deps`
597 | - [x] Replace `user_profile: CurrentUserProfileHybridJwtDep` with `user_profile: AuthorizedCLIUserProfileDep` in:
598 | - [x] `get_tenant_mount_info()` (line ~23)
599 | - [x] `create_tenant_mount_credentials()` (line ~88)
600 | - [x] `revoke_tenant_mount_credentials()` (line ~244)
601 | - [x] `list_tenant_mount_credentials()` (line ~326)
602 | - [x] Verify no other code changes needed (parameter name and usage stays the same)
603 | - [x] Run `just typecheck` to verify types
604 |
605 | #### Task 1.4: Protect proxy endpoints ✅
606 | **File**: `apps/cloud/src/basic_memory_cloud/api/proxy.py`
607 |
608 | - [x] Update import: add `AuthorizedCLIUserProfileDep` from `..deps`
609 | - [x] Replace `user_profile: CurrentUserProfileHybridJwtDep` with `user_profile: AuthorizedCLIUserProfileDep` in:
610 | - [x] `check_tenant_health()` (line ~21)
611 | - [x] `proxy_to_tenant()` (line ~63)
612 | - [x] Verify no other code changes needed (parameter name and usage stays the same)
613 | - [x] Run `just typecheck` to verify types
614 |
615 | **Why Keep /proxy Architecture:**
616 |
617 | The proxy layer is valuable because it:
618 | 1. **Centralizes authorization** - Single place for JWT + subscription validation (closes both CLI and MCP auth gaps)
619 | 2. **Handles tenant routing** - Maps tenant_id → fly_app_name without exposing infrastructure details
620 | 3. **Abstracts infrastructure** - MCP and CLI don't need to know about Fly.io naming conventions
621 | 4. **Enables features** - Can add rate limiting, caching, request logging, etc. at proxy layer
622 | 5. **Supports both flows** - CLI tools and MCP tools both use /proxy endpoints
623 |
624 | The extra HTTP hop adds minimal latency (<10ms) and is worth it for the architectural benefits.
625 |
626 | **Performance Note:** The cloud app has Redis available, so subscription status can be cached to reduce database queries if needed. The initial implementation uses a direct database query (simple, with acceptable ~5-10ms latency).
627 |
628 | #### Task 1.5: Add unit tests for subscription service
629 | **File**: `apps/cloud/tests/services/test_subscription_service.py` (create if doesn't exist)
630 |
631 | - [ ] Create test file if it doesn't exist
632 | - [ ] Add test: `test_check_subscription_returns_true_for_active()`
633 |   - Create user with active subscription
634 |   - Call `check_subscription()`
635 |   - Assert returns `True`
636 | - [ ] Add test: `test_check_subscription_returns_false_for_pending()`
637 |   - Create user with pending subscription
638 |   - Assert returns `False`
639 | - [ ] Add test: `test_check_subscription_returns_false_for_cancelled()`
640 |   - Create user with cancelled subscription
641 |   - Assert returns `False`
642 | - [ ] Add test: `test_check_subscription_returns_false_for_no_subscription()`
643 |   - Create user without subscription
644 |   - Assert returns `False`
645 | - [ ] Run `just test` to verify tests pass
646 |
647 | #### Task 1.6: Add integration tests for dependency
648 | **File**: `apps/cloud/tests/test_deps.py` (create if doesn't exist)
649 |
650 | - [ ] Create test file if it doesn't exist
651 | - [ ] Add fixtures for mocking JWT credentials
652 | - [ ] Add test: `test_authorized_cli_user_profile_with_active_subscription()`
653 | - Mock valid JWT + active subscription
654 | - Call dependency
655 | - Assert returns UserProfile
656 | - [ ] Add test: `test_authorized_cli_user_profile_without_subscription_raises_403()`
657 | - Mock valid JWT + no subscription
658 | - Assert raises HTTPException(403) with correct error detail
659 | - [ ] Add test: `test_authorized_cli_user_profile_with_inactive_subscription_raises_403()`
660 | - Mock valid JWT + cancelled subscription
661 | - Assert raises HTTPException(403)
662 | - [ ] Add test: `test_authorized_cli_user_profile_with_invalid_jwt_raises_401()`
663 | - Mock invalid JWT
664 | - Assert raises HTTPException(401)
665 | - [ ] Run `just test` to verify tests pass
666 |
667 | #### Task 1.7: Deploy and verify cloud service
668 | - [ ] Run `just check` to verify all quality checks pass
669 | - [ ] Commit changes with message: "feat: add subscription validation to CLI endpoints"
670 | - [ ] Deploy to preview environment: `flyctl deploy --config apps/cloud/fly.toml`
671 | - [ ] Test manually:
672 | - [ ] Call `/tenant/mount/info` with valid JWT but no subscription → expect 403
673 | - [ ] Call `/tenant/mount/info` with valid JWT and active subscription → expect 200
674 | - [ ] Verify error response structure matches spec
675 |
676 | ### Phase 2: CLI (basic-memory)
677 |
678 | #### Task 2.1: Review and understand CLI authentication flow
679 | **Files**: `src/basic_memory/cli/commands/cloud/`
680 |
681 | - [ ] Read `core_commands.py` to understand current login flow
682 | - [ ] Read `api_client.py` to understand current error handling
683 | - [ ] Identify where 403 errors should be caught
684 | - [ ] Identify what error messages should be displayed
685 | - [ ] Document current behavior in spec if needed
686 |
687 | #### Task 2.2: Update API client error handling
688 | **File**: `src/basic_memory/cli/commands/cloud/api_client.py`
689 |
690 | - [ ] Add custom exception class `SubscriptionRequiredError` (or similar)
691 | - [ ] Update HTTP error handling to parse 403 responses
692 | - [ ] Extract `error`, `message`, and `subscribe_url` from error detail
693 | - [ ] Raise specific exception for subscription_required errors
694 | - [ ] Run `just typecheck` in basic-memory repo to verify types
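A minimal sketch of what Task 2.2 could produce. Names like `SubscriptionRequiredError` and `raise_for_subscription_error` are suggestions for illustration, not the final API:

```python
class SubscriptionRequiredError(Exception):
    """Raised when the cloud API returns 403 with error == "subscription_required"."""

    def __init__(self, message: str, subscribe_url: str):
        super().__init__(message)
        self.message = message
        self.subscribe_url = subscribe_url


def raise_for_subscription_error(status_code: int, detail: dict) -> None:
    """Translate the structured 403 error detail into a typed exception."""
    if status_code == 403 and detail.get("error") == "subscription_required":
        raise SubscriptionRequiredError(
            message=detail.get("message", "Active subscription required"),
            subscribe_url=detail.get("subscribe_url", "https://basicmemory.com/subscribe"),
        )
```

The API client would call this after each response; other 403s fall through to the existing generic error handling.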
695 |
696 | #### Task 2.3: Update CLI login command error handling
697 | **File**: `src/basic_memory/cli/commands/cloud/core_commands.py`
698 |
699 | - [ ] Import the subscription error exception
700 | - [ ] Wrap login flow with try/except for subscription errors
701 | - [ ] Display user-friendly error message with rich console
702 | - [ ] Show subscribe URL prominently
703 | - [ ] Provide actionable next steps
704 | - [ ] Run `just typecheck` to verify types
705 |
706 | **Expected error handling**:
707 | ```python
708 | try:
709 | # Existing login logic
710 | success = await auth.login()
711 | if success:
712 | # Test access to protected endpoint
713 | await api_client.test_connection()
714 | except SubscriptionRequiredError as e:
715 | console.print("\n[red]✗ Subscription Required[/red]\n")
716 | console.print(f"[yellow]{e.message}[/yellow]\n")
717 | console.print(f"Subscribe at: [blue underline]{e.subscribe_url}[/blue underline]\n")
718 | console.print("[dim]Once you have an active subscription, run [bold]bm cloud login[/bold] again.[/dim]")
719 | raise typer.Exit(1)
720 | ```
721 |
722 | #### Task 2.4: Update CLI tests
723 | **File**: `tests/cli/test_cloud_commands.py`
724 |
725 | - [ ] Add test: `test_login_without_subscription_shows_error()`
726 | - Mock 403 subscription_required response
727 | - Call login command
728 | - Assert error message displayed
729 | - Assert subscribe URL shown
730 | - [ ] Add test: `test_login_with_subscription_succeeds()`
731 | - Mock successful authentication + subscription check
732 | - Call login command
733 | - Assert success message
734 | - [ ] Run `just test` to verify tests pass
735 |
736 | #### Task 2.5: Update CLI documentation
737 | **File**: `docs/cloud-cli.md` (in basic-memory-docs repo)
738 |
739 | - [ ] Add "Prerequisites" section if not present
740 | - [ ] Document subscription requirement
741 | - [ ] Add "Troubleshooting" section
742 | - [ ] Document "Subscription Required" error
743 | - [ ] Provide subscribe URL
744 | - [ ] Add FAQ entry about subscription errors
745 | - [ ] Build docs locally to verify formatting
746 |
747 | ### Phase 3: End-to-End Testing
748 |
749 | #### Task 3.1: Create test user accounts
750 | **Prerequisites**: Access to WorkOS admin and database
751 |
752 | - [ ] Create test user WITHOUT subscription:
753 | - [ ] Sign up via WorkOS AuthKit
754 | - [ ] Get workos_user_id from database
755 | - [ ] Verify no subscription record exists
756 | - [ ] Save credentials for testing
757 | - [ ] Create test user WITH active subscription:
758 | - [ ] Sign up via WorkOS AuthKit
759 | - [ ] Create subscription via Polar or dev endpoint
760 | - [ ] Verify subscription.status = "active" in database
761 | - [ ] Save credentials for testing
762 |
763 | #### Task 3.2: Manual testing - User without subscription
764 | **Environment**: Preview/staging deployment
765 |
766 | - [ ] Run `bm cloud login` with no-subscription user
767 | - [ ] Verify: Login shows "Subscription Required" error
768 | - [ ] Verify: Subscribe URL is displayed
769 | - [ ] Verify: Cannot run `bm cloud setup`
770 | - [ ] Verify: Cannot call `/tenant/mount/info` directly via curl
771 | - [ ] Document any issues found
772 |
773 | #### Task 3.3: Manual testing - User with active subscription
774 | **Environment**: Preview/staging deployment
775 |
776 | - [ ] Run `bm cloud login` with active-subscription user
777 | - [ ] Verify: Login succeeds without errors
778 | - [ ] Verify: Can run `bm cloud setup`
779 | - [ ] Verify: Can call `/tenant/mount/info` successfully
780 | - [ ] Verify: Can call `/proxy/*` endpoints successfully
781 | - [ ] Document any issues found
782 |
783 | #### Task 3.4: Test subscription state transitions
784 | **Environment**: Preview/staging deployment + database access
785 |
786 | - [ ] Start with active subscription user
787 | - [ ] Verify: All operations work
788 | - [ ] Update subscription.status to "cancelled" in database
789 | - [ ] Verify: Login now shows "Subscription Required" error
790 | - [ ] Verify: Existing tokens are rejected with 403
791 | - [ ] Update subscription.status back to "active"
792 | - [ ] Verify: Access restored immediately
793 | - [ ] Document any issues found
794 |
795 | #### Task 3.5: Integration test suite
796 | **File**: `apps/cloud/tests/integration/test_cli_subscription_flow.py` (create if doesn't exist)
797 |
798 | - [ ] Create integration test file
799 | - [ ] Add test: `test_cli_flow_without_subscription()`
800 | - Simulate full CLI flow without subscription
801 | - Assert 403 at appropriate points
802 | - [ ] Add test: `test_cli_flow_with_active_subscription()`
803 | - Simulate full CLI flow with active subscription
804 | - Assert all operations succeed
805 | - [ ] Add test: `test_subscription_expiration_blocks_access()`
806 | - Start with active subscription
807 | - Change status to cancelled
808 | - Assert access denied
809 | - [ ] Run tests in CI/CD pipeline
810 | - [ ] Document test coverage
811 |
812 | #### Task 3.6: Load/performance testing (optional)
813 | **Environment**: Staging environment
814 |
815 | - [ ] Test subscription check performance under load
816 | - [ ] Measure latency added by subscription check
817 | - [ ] Verify database query performance
818 | - [ ] Document any performance concerns
819 | - [ ] Optimize if needed
820 |
821 | ## Implementation Summary Checklist
822 |
823 | Use this high-level checklist to track overall progress:
824 |
825 | ### Phase 1: Cloud Service 🔄
826 | - [x] Add subscription check method to SubscriptionService
827 | - [x] Add subscription validation dependency to deps.py
828 | - [x] Add subscription_url config (env var)
829 | - [x] Protect tenant mount endpoints (4 endpoints)
830 | - [x] Protect proxy endpoints (2 endpoints)
831 | - [ ] Add unit tests for subscription service
832 | - [ ] Add integration tests for dependency
833 | - [ ] Deploy and verify cloud service
834 |
835 | ### Phase 2: CLI Updates 🔄
836 | - [ ] Review CLI authentication flow
837 | - [ ] Update API client error handling
838 | - [ ] Update CLI login command error handling
839 | - [ ] Add CLI tests
840 | - [ ] Update CLI documentation
841 |
842 | ### Phase 3: End-to-End Testing 🧪
843 | - [ ] Create test user accounts
844 | - [ ] Manual testing - user without subscription
845 | - [ ] Manual testing - user with active subscription
846 | - [ ] Test subscription state transitions
847 | - [ ] Integration test suite
848 | - [ ] Load/performance testing (optional)
849 |
850 | ## Questions to Resolve
851 |
852 | ### Resolved ✅
853 |
854 | 1. **Admin Access**
855 | - ✅ **Decision**: Admin users bypass subscription check
856 | - **Rationale**: Admin endpoints already use `AdminUserHybridDep`, which is separate from CLI user endpoints
857 | - **Implementation**: No changes needed to admin endpoints
858 |
859 | 2. **Subscription Check Implementation**
860 | - ✅ **Decision**: Use Option A (Database Check)
861 | - **Rationale**: Simpler, faster to implement, works with existing infrastructure
862 | - **Implementation**: Single JOIN query via `get_subscription_by_workos_user_id()`
863 |
864 | 3. **Dependency Return Type**
865 | - ✅ **Decision**: Return `UserProfile` (not `UserContext`)
866 | - **Rationale**: Drop-in compatibility with existing endpoints, no refactoring needed
867 | - **Implementation**: `AuthorizedCLIUserProfileDep` returns `UserProfile`
868 |
869 | ### To Be Resolved ⏳
870 |
871 | 1. **Subscription Check Frequency**
872 | - **Options**:
873 | - Check on every API call (slower, more secure) ✅ **RECOMMENDED**
874 | - Cache subscription status (faster, risk of stale data)
875 | - Check only on login/setup (fast, but allows expired subscriptions temporarily)
876 | - **Recommendation**: Check on every call via dependency injection (simple, secure, acceptable performance)
877 | - **Impact**: ~5-10ms per request (single indexed JOIN query)
878 |
879 | 2. **Grace Period**
880 | - **Options**:
881 | - No grace period - immediate block when status != "active" ✅ **RECOMMENDED**
882 | - 7-day grace period after period_end
883 | - 14-day grace period after period_end
884 | - **Recommendation**: No grace period initially, add later if needed based on customer feedback
885 | - **Implementation**: Check `subscription.status == "active"` only (ignore period_end initially)
886 |
887 | 3. **Subscription Expiration Handling**
888 | - **Question**: Should we check `current_period_end < now()` in addition to `status == "active"`?
889 | - **Options**:
890 | - Only check status field (rely on Polar webhooks to update status) ✅ **RECOMMENDED**
891 | - Check both status and current_period_end (more defensive)
892 | - **Recommendation**: Only check status field, assume Polar webhooks keep it current
893 | - **Risk**: If webhooks fail, expired subscriptions might retain access until webhook succeeds
894 |
895 | 4. **Subscribe URL**
896 | - **Question**: What's the actual subscription URL?
897 | - **Current**: Spec uses `https://basicmemory.com/subscribe`
898 | - **Action Required**: Verify correct URL before implementation
899 |
900 | 5. **Dev Mode / Testing Bypass**
901 | - **Question**: Support bypass for development/testing?
902 | - **Options**:
903 | - Environment variable: `DISABLE_SUBSCRIPTION_CHECK=true`
904 | - Always enforce (more realistic testing) ✅ **RECOMMENDED**
905 | - **Recommendation**: No bypass - use test users with real subscriptions for realistic testing
906 | - **Implementation**: Create dev endpoint to activate subscriptions for testing
907 |
908 | ## Related Specs
909 |
910 | - SPEC-9: Multi-Project Bidirectional Sync Architecture (CLI affected by this change)
911 | - SPEC-8: TigrisFS Integration (Mount endpoints protected)
912 |
913 | ## Notes
914 |
915 | - This spec prioritizes security over convenience - better to block unauthorized access than risk revenue loss
916 | - Clear error messages are critical - users should understand why they're blocked and how to resolve it
917 | - Consider adding telemetry to track subscription_required errors for monitoring signup conversion
918 |
```
--------------------------------------------------------------------------------
/specs/SPEC-19 Sync Performance and Memory Optimization.md:
--------------------------------------------------------------------------------
```markdown
1 | ---
2 | title: 'SPEC-19: Sync Performance and Memory Optimization'
3 | type: spec
4 | permalink: specs/spec-19-sync-performance-optimization
5 | tags:
6 | - performance
7 | - memory
8 | - sync
9 | - optimization
10 | - core
11 | status: draft
12 | ---
13 |
14 | # SPEC-19: Sync Performance and Memory Optimization
15 |
16 | ## Why
17 |
18 | ### Problem Statement
19 |
20 | Current sync implementation causes Out-of-Memory (OOM) kills and poor performance on production systems:
21 |
22 | **Evidence from Production**:
23 | - **Tenant-6d2ff1a3**: OOM killed on 1GB machine
24 | - Files: 2,621 total (31 PDFs, 80MB binary data)
25 | - Memory: 1.5-1.7GB peak usage
26 | - Sync duration: 15+ minutes
27 | - Error: `Out of memory: Killed process 693 (python)`
28 |
29 | **Root Causes**:
30 |
31 | 1. **Checksum-based scanning loads ALL files into memory**
32 | - `scan_directory()` computes checksums for ALL 2,624 files upfront
33 | - Results stored in multiple dicts (`ScanResult.files`, `SyncReport.checksums`)
34 | - Even unchanged files are fully read and checksummed
35 |
36 | 2. **Large files read entirely for checksums**
37 | - 16MB PDF → Full read into memory → Compute checksum
38 | - No streaming or chunked processing
39 | - TigrisFS caching compounds memory usage
40 |
41 | 3. **Unbounded concurrency**
42 | - All 2,624 files processed simultaneously
43 | - Each file loads full content into memory
44 | - No semaphore limiting concurrent operations
45 |
46 | 4. **Cloud-specific resource leaks**
47 | - aiohttp session leak in keepalive (not in context manager)
48 | - Circuit breaker resets every 30s sync cycle (ineffective)
49 | - Thundering herd: all tenants sync at :00 and :30
50 |
51 | ### Impact
52 |
53 | - **Production stability**: OOM kills are unacceptable
54 | - **User experience**: 15+ minute syncs are too slow
55 | - **Cost**: Forced upgrades from 1GB → 2GB machines ($5-10/mo per tenant)
56 | - **Scalability**: Current approach won't scale to 100+ tenants
57 |
58 | ### Architectural Decision
59 |
60 | **Fix in basic-memory core first, NOT UberSync**
61 |
62 | Rationale:
63 | - Root causes are algorithmic, not architectural
64 | - Benefits all users (CLI + Cloud)
65 | - Lower risk than new centralized service
66 | - Known solutions (rsync/rclone use same pattern)
67 | - Can defer UberSync until metrics prove it necessary
68 |
69 | ## What
70 |
71 | ### Affected Components
72 |
73 | **basic-memory (core)**:
74 | - `src/basic_memory/sync/sync_service.py` - Core sync algorithm (~42KB)
75 | - `src/basic_memory/models.py` - Entity model (add mtime/size columns)
76 | - `src/basic_memory/file_utils.py` - Checksum computation functions
77 | - `src/basic_memory/repository/entity_repository.py` - Database queries
78 | - `alembic/versions/` - Database migration for schema changes
79 |
80 | **basic-memory-cloud (wrapper)**:
81 | - `apps/api/src/basic_memory_cloud_api/sync_worker.py` - Cloud sync wrapper
82 | - Circuit breaker implementation
83 | - Sync coordination logic
84 |
85 | ### Database Schema Changes
86 |
87 | Add to Entity model:
88 | ```python
89 | mtime: float # File modification timestamp
90 | size: int # File size in bytes
91 | ```
92 |
93 | ## How (High Level)
94 |
95 | ### Phase 1: Core Algorithm Fixes (basic-memory)
96 |
97 | **Priority: P0 - Critical**
98 |
99 | #### 1.1 mtime-based Scanning (Issue #383)
100 |
101 | Replace expensive checksum-based scanning with lightweight stat-based comparison:
102 |
103 | ```python
104 | async def scan_directory(self, directory: Path) -> ScanResult:
105 | """Scan using mtime/size instead of checksums"""
106 | result = ScanResult()
107 |
108 | for root, dirnames, filenames in os.walk(str(directory)):
109 | for filename in filenames:
110 |             path = Path(root) / filename
111 |             rel_path = path.relative_to(directory).as_posix()
112 |             stat = path.stat()
113 | # Store lightweight metadata instead of checksum
114 | result.files[rel_path] = {
115 | 'mtime': stat.st_mtime,
116 | 'size': stat.st_size
117 | }
118 |
119 | return result
120 |
121 | async def scan(self, directory: Path):
122 | """Compare mtime/size, only compute checksums for changed files"""
123 | db_state = await self.get_db_file_state() # Include mtime/size
124 | scan_result = await self.scan_directory(directory)
125 |
126 | for file_path, metadata in scan_result.files.items():
127 | db_metadata = db_state.get(file_path)
128 |
129 | # Only compute expensive checksum if mtime/size changed
130 |         if not db_metadata or metadata['mtime'] != db_metadata['mtime'] or metadata['size'] != db_metadata['size']:
131 | checksum = await self._compute_checksum_streaming(file_path)
132 | # Process immediately, don't accumulate in memory
133 | ```
134 |
135 | **Benefits**:
136 | - No file reads during initial scan (just stat calls)
137 | - ~90% reduction in memory usage
138 | - ~10x faster scan phase
139 | - Only checksum files that actually changed
140 |
141 | #### 1.2 Streaming Checksum Computation (Issue #382)
142 |
143 | For large files (>1MB), use chunked reading to avoid loading entire file:
144 |
145 | ```python
146 | async def _compute_checksum_streaming(self, path: Path, chunk_size: int = 65536) -> str:
147 | """Compute checksum using 64KB chunks for large files"""
148 | hasher = hashlib.sha256()
149 |
150 |     loop = asyncio.get_running_loop()
151 |
152 | def read_chunks():
153 | with open(path, 'rb') as f:
154 | while chunk := f.read(chunk_size):
155 | hasher.update(chunk)
156 |
157 | await loop.run_in_executor(None, read_chunks)
158 | return hasher.hexdigest()
159 |
160 | async def _compute_checksum_async(self, file_path: Path) -> str:
161 | """Choose appropriate checksum method based on file size"""
162 | stat = file_path.stat()
163 |
164 | if stat.st_size > 1_048_576: # 1MB threshold
165 | return await self._compute_checksum_streaming(file_path)
166 | else:
167 | # Small files: existing fast path
168 | content = await self._read_file_async(file_path)
169 | return compute_checksum(content)
170 | ```
171 |
172 | **Benefits**:
173 | - Constant memory usage regardless of file size
174 | - 16MB PDF uses 64KB memory (not 16MB)
175 | - Works well with TigrisFS network I/O
176 |
177 | #### 1.3 Bounded Concurrency (Issue #198)
178 |
179 | Add semaphore to limit concurrent file operations, or consider using aiofiles and async reads
180 |
181 | ```python
182 | class SyncService:
183 | def __init__(self, ...):
184 | # ... existing code ...
185 | self._file_semaphore = asyncio.Semaphore(10) # Max 10 concurrent
186 | self._max_tracked_failures = 100 # LRU cache limit
187 |
188 | async def _read_file_async(self, file_path: Path) -> str:
189 | async with self._file_semaphore:
190 |             loop = asyncio.get_running_loop()
191 | return await loop.run_in_executor(
192 | self._thread_pool,
193 | file_path.read_text,
194 | "utf-8"
195 | )
196 |
197 | async def _record_failure(self, path: str, error: str):
198 | # ... existing code ...
199 |
200 |         # LRU-style eviction (assumes self._file_failures is an OrderedDict)
201 |         if len(self._file_failures) > self._max_tracked_failures:
202 |             self._file_failures.popitem(last=False)  # Remove oldest entry
203 | ```
204 |
205 | **Benefits**:
206 | - Maximum 10 files in memory at once (vs all 2,624)
207 | - 90%+ reduction in peak memory usage
208 | - Prevents unbounded memory growth on error-prone projects
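The semaphore's effect can be verified with a small self-contained check, mirroring the unit-test plan of 100 tasks with at most 10 concurrent (a sketch; `measure_peak_concurrency` is an illustrative name):

```python
import asyncio


async def measure_peak_concurrency(num_tasks: int = 100, limit: int = 10) -> int:
    """Run num_tasks workers behind a semaphore and report peak concurrency."""
    sem = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def worker() -> None:
        nonlocal active, peak
        async with sem:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.001)  # stand-in for file I/O
            active -= 1

    await asyncio.gather(*(worker() for _ in range(num_tasks)))
    return peak
```

With `limit=10`, the measured peak never exceeds 10 regardless of how many tasks are scheduled, which is exactly the memory bound the semaphore is meant to provide.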
209 |
210 | ### Phase 2: Cloud-Specific Fixes (basic-memory-cloud)
211 |
212 | **Priority: P1 - High**
213 |
214 | #### 2.1 Fix Resource Leaks
215 |
216 | ```python
217 | # apps/api/src/basic_memory_cloud_api/sync_worker.py
218 |
219 | async def send_keepalive():
220 | """Send keepalive pings using proper session management"""
221 | # Use context manager to ensure cleanup
222 | async with aiohttp.ClientSession(
223 | timeout=aiohttp.ClientTimeout(total=5)
224 | ) as session:
225 | while True:
226 | try:
227 | await session.get(f"https://{fly_app_name}.fly.dev/health")
228 | await asyncio.sleep(10)
229 | except asyncio.CancelledError:
230 | raise # Exit cleanly
231 | except Exception as e:
232 | logger.warning(f"Keepalive failed: {e}")
233 | ```
234 |
235 | #### 2.2 Improve Circuit Breaker
236 |
237 | Track failures across sync cycles instead of resetting every 30s:
238 |
239 | ```python
240 | # Persistent failure tracking
241 | class SyncWorker:
242 | def __init__(self):
243 | self._persistent_failures: Dict[str, int] = {} # file -> failure_count
244 | self._failure_window_start = time.time()
245 |
246 |     async def should_skip_file(self, file_path: str) -> bool:
247 |         # Reset the hourly window so skipped files are eventually retried
248 |         if time.time() - self._failure_window_start >= 3600:
249 |             self._persistent_failures.clear()
250 |             self._failure_window_start = time.time()
251 |         return self._persistent_failures.get(file_path, 0) > 3  # skip after >3 failures
252 | ```
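The thundering-herd root cause (all tenants syncing at :00 and :30) can be addressed with deterministic per-tenant jitter; a sketch under the assumption that a helper like this would be called when scheduling each tenant's sync (the function name is hypothetical):

```python
import hashlib


def sync_offset_seconds(tenant_id: str, window_seconds: int = 30) -> int:
    """Deterministic offset in [0, window_seconds) derived from the tenant id.

    Spreads sync start times across the window instead of every tenant
    firing at the same instant.
    """
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % window_seconds
```

Because the offset is a pure function of the tenant id, it is stable across restarts and requires no coordination between instances.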
253 |
254 | ### Phase 3: Measurement & Decision
255 |
256 | **Priority: P2 - Future**
257 |
258 | After implementing Phases 1-2, collect metrics for 2 weeks:
259 | - Memory usage per tenant sync
260 | - Sync duration (scan + process)
261 | - Concurrent sync load at peak times
262 | - OOM incidents
263 | - Resource costs
264 |
265 | **UberSync Decision Criteria**:
266 |
267 | Build centralized sync service ONLY if metrics show:
268 | - ✅ Core fixes insufficient for >100 tenants
269 | - ✅ Resource contention causing problems
270 | - ✅ Need for tenant tier prioritization (paid > free)
271 | - ✅ Cost savings justify complexity
272 |
273 | Otherwise, defer UberSync as premature optimization.
274 |
275 | ## How to Evaluate
276 |
277 | ### Success Metrics (Phase 1)
278 |
279 | **Memory Usage**:
280 | - ✅ Peak memory <500MB for 2,000+ file projects (was 1.5-1.7GB)
281 | - ✅ Memory usage linear with concurrent files (10 max), not total files
282 | - ✅ Large file memory usage: 64KB chunks (not 16MB)
283 |
284 | **Performance**:
285 | - ✅ Initial scan <30 seconds (was 5+ minutes)
286 | - ✅ Full sync <5 minutes for 2,000+ files (was 15+ minutes)
287 | - ✅ Subsequent syncs <10 seconds (only changed files)
288 |
289 | **Stability**:
290 | - ✅ 2,000+ file projects run on 1GB machines
291 | - ✅ Zero OOM kills in production
292 | - ✅ No degradation with binary files (PDFs, images)
293 |
294 | ### Success Metrics (Phase 2)
295 |
296 | **Resource Management**:
297 | - ✅ Zero aiohttp session leaks (verified via monitoring)
298 | - ✅ Circuit breaker prevents repeated failures (>3 fails = skip for 1 hour)
299 | - ✅ Tenant syncs distributed over 30s window (no thundering herd)
300 |
301 | **Observability**:
302 | - ✅ Logfire traces show memory usage per sync
303 | - ✅ Clear logging of skipped files and reasons
304 | - ✅ Metrics on sync duration, file counts, failure rates
305 |
306 | ### Test Plan
307 |
308 | **Unit Tests** (basic-memory):
309 | - mtime comparison logic
310 | - Streaming checksum correctness
311 | - Semaphore limiting (mock 100 files, verify max 10 concurrent)
312 | - LRU cache eviction
313 | - Checksum computation: streaming vs non-streaming equivalence
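The streaming-vs-non-streaming equivalence check can be sketched with the standard library alone (helper names here are illustrative, mirroring `_compute_checksum_streaming()` and the small-file fast path):

```python
import hashlib
import os
import tempfile


def sha256_streaming(path: str, chunk_size: int = 65536) -> str:
    """Chunked sha256, mirroring the streaming checksum method."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()


def sha256_full_read(path: str) -> str:
    """Whole-file read, mirroring the small-file fast path."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def check_equivalence(size_bytes: int = 2_000_000) -> bool:
    """Both methods must agree on a multi-chunk random file."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(os.urandom(size_bytes))
        return sha256_streaming(path) == sha256_full_read(path)
    finally:
        os.unlink(path)
```

Using a file larger than one chunk exercises the loop boundary; a zero-byte file covers the empty-input edge case.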
314 |
315 | **Integration Tests** (basic-memory):
316 | - Large file handling (create 20MB test file)
317 | - Mixed file types (text + binary)
318 | - Changed file detection via mtime
319 | - Sync with 1,000+ files
320 |
321 | **Load Tests** (basic-memory-cloud):
322 | - Test on tenant-6d2ff1a3 (2,621 files, 31 PDFs)
323 | - Monitor memory during full sync with Logfire
324 | - Measure scan and sync duration
325 | - Run on 1GB machine (downgrade from 2GB to verify)
326 | - Simulate 10 concurrent tenant syncs
327 |
328 | **Regression Tests**:
329 | - Verify existing sync scenarios still work
330 | - CLI sync behavior unchanged
331 | - File watcher integration unaffected
332 |
333 | ### Performance Benchmarks
334 |
335 | Establish baseline, then compare after each phase:
336 |
337 | | Metric | Baseline | Phase 1 Target | Phase 2 Target |
338 | |--------|----------|----------------|----------------|
339 | | Peak Memory (2,600 files) | 1.5-1.7GB | <500MB | <450MB |
340 | | Initial Scan Time | 5+ min | <30 sec | <30 sec |
341 | | Full Sync Time | 15+ min | <5 min | <5 min |
342 | | Subsequent Sync | 2+ min | <10 sec | <10 sec |
343 | | OOM Incidents/Week | 2-3 | 0 | 0 |
344 | | Min RAM Required | 2GB | 1GB | 1GB |
345 |
346 | ## Implementation Phases
347 |
348 | ### Phase 0.5: Database Schema & Streaming Foundation
349 |
350 | **Priority: P0 - Required for Phase 1**
351 |
352 | This phase establishes the foundation for streaming sync with mtime-based change detection.
353 |
354 | **Database Schema Changes**:
355 | - [x] Add `mtime` column to Entity model (REAL type for float timestamp)
356 | - [x] Add `size` column to Entity model (INTEGER type for file size in bytes)
357 | - [x] Create Alembic migration for new columns (nullable initially)
358 | - [x] Add indexes on `(file_path, project_id)` for optimistic upsert performance
359 | - [ ] Backfill existing entities with mtime/size from filesystem
360 |
361 | **Streaming Architecture**:
362 | - [x] Replace `os.walk()` with `os.scandir()` for cached stat info
363 | - [ ] Eliminate `get_db_file_state()` - no upfront SELECT all entities
364 | - [x] Implement streaming iterator `_scan_directory_streaming()`
365 | - [x] Add `get_by_file_path()` optimized query (single file lookup)
366 | - [x] Add `get_all_file_paths()` for deletion detection (paths only, no entities)
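
The deletion-detection idea behind `get_all_file_paths()` reduces to a set difference over paths, which is why no entity objects need to be loaded (illustrative sketch; `find_deleted` is a hypothetical helper, not an existing API):

```python
def find_deleted(db_paths: set[str], seen_paths: set[str]) -> set[str]:
    # Anything the database knows about that the scan didn't see was deleted
    return db_paths - seen_paths

db = {"notes/a.md", "notes/b.md", "docs/c.pdf"}
seen = {"notes/a.md", "docs/c.pdf", "notes/new.md"}
deleted = find_deleted(db, seen)  # {"notes/b.md"}
new = seen - db                   # {"notes/new.md"}
```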
367 |
368 | **Benefits**:
369 | - **50% fewer network calls** on Tigris (scandir returns cached stat)
370 | - **No large dicts in memory** (process files one at a time)
371 | - **Indexed lookups** instead of full table scan
372 | - **Foundation for mtime comparison** (Phase 1)
373 |
374 | **Code Changes**:
375 |
376 | ```python
377 | # Before: Load all entities upfront
378 | db_paths = await self.get_db_file_state() # SELECT * FROM entity WHERE project_id = ?
379 | scan_result = await self.scan_directory() # os.walk() + stat() per file
380 |
381 | # After: Stream and query incrementally
382 | async for file_path, stat_info in self.scan_directory(): # scandir() with cached stat
383 | db_entity = await self.entity_repository.get_by_file_path(rel_path) # Indexed lookup
384 | # Process immediately, no accumulation
385 | ```
386 |
387 | **Files Modified**:
388 | - `src/basic_memory/models.py` - Add mtime/size columns
389 | - `alembic/versions/xxx_add_mtime_size.py` - Migration
390 | - `src/basic_memory/sync/sync_service.py` - Streaming implementation
391 | - `src/basic_memory/repository/entity_repository.py` - Add get_all_file_paths()
392 |
393 | **Migration Strategy**:
394 | ```sql
395 | -- Migration: Add nullable columns
396 | ALTER TABLE entity ADD COLUMN mtime REAL;
397 | ALTER TABLE entity ADD COLUMN size INTEGER;
398 |
399 | -- Backfill from filesystem during first sync after upgrade
400 | -- (Handled in sync_service on first scan)
401 | ```
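
The backfill noted in the migration comment can be sketched as a lazy, per-file fill during the first scan after upgrade (hypothetical helper; the real logic lives in `sync_service`):

```python
import os
import tempfile

def backfill_stat(entity_mtime, entity_size, file_path):
    """Return (mtime, size) for an entity, reading from the filesystem
    only when the migrated columns are still NULL."""
    if entity_mtime is None or entity_size is None:
        st = os.stat(file_path)
        return st.st_mtime, st.st_size
    # Columns already populated: keep the stored values
    return entity_mtime, entity_size
```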
402 |
403 | ### Phase 1: Core Fixes
404 |
405 | **mtime-based scanning**:
406 | - [x] Add mtime/size columns to Entity model (completed in Phase 0.5)
407 | - [x] Database migration (alembic) (completed in Phase 0.5)
408 | - [x] Refactor `scan()` to use streaming architecture with mtime/size comparison
409 | - [x] Update `sync_markdown_file()` and `sync_regular_file()` to store mtime/size in database
410 | - [x] Only compute checksums for changed files (mtime/size differ)
411 | - [x] Unit tests for streaming scan (6 tests passing)
412 | - [ ] Integration test with 1,000 files (defer to benchmarks)
413 |
414 | **Streaming checksums**:
415 | - [x] Implement `_compute_checksum_streaming()` with chunked reading
416 | - [x] Add file size threshold logic (1MB)
417 | - [x] Test with large files (16MB PDF)
418 | - [x] Verify memory usage stays constant
419 | - [x] Test checksum equivalence (streaming vs non-streaming)
420 |
421 | **Bounded concurrency**:
422 | - [x] Add semaphore (10 concurrent) to `_read_file_async()` (already existed)
423 | - [x] Add LRU cache for failures (100 max) (already existed)
424 | - [ ] Review thread pool size configuration
425 | - [ ] Load test with 2,000+ files
426 | - [ ] Verify <500MB peak memory
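
The semaphore and LRU failure cache can be sketched together (a minimal sketch; `BoundedReader` is a hypothetical name, with the limits taken from the checklist above):

```python
import asyncio
from collections import OrderedDict

MAX_CONCURRENT = 10  # matches the spec's semaphore limit
MAX_FAILURES = 100   # matches the spec's LRU failure cache size

class BoundedReader:
    def __init__(self):
        self._sem = asyncio.Semaphore(MAX_CONCURRENT)
        self._failures: OrderedDict[str, str] = OrderedDict()

    def record_failure(self, path: str, reason: str) -> None:
        # LRU eviction keeps the failure cache bounded
        self._failures[path] = reason
        self._failures.move_to_end(path)
        while len(self._failures) > MAX_FAILURES:
            self._failures.popitem(last=False)  # evict oldest entry

    async def read(self, path: str) -> bytes:
        async with self._sem:  # at most 10 reads in flight
            return await asyncio.to_thread(lambda: open(path, "rb").read())
```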
427 |
428 | **Cleanup & Optimization**:
429 | - [x] Eliminate `get_db_file_state()` - no upfront SELECT all entities (streaming architecture complete)
430 | - [x] Consolidate file operations in FileService (eliminate duplicate checksum logic)
431 | - [x] Add aiofiles dependency (already present)
432 | - [x] FileService streaming checksums for files >1MB
433 | - [x] SyncService delegates all file operations to FileService
434 | - [x] Complete true async I/O refactoring - all file operations use aiofiles
435 | - [x] Added `FileService.read_file_content()` using aiofiles
436 | - [x] Removed `SyncService._read_file_async()` wrapper method
437 | - [x] Removed `SyncService._compute_checksum_async()` wrapper method
438 | - [x] Inlined all 7 checksum calls to use `file_service.compute_checksum()` directly
439 | - [x] All file I/O operations now properly consolidated in FileService with non-blocking I/O
440 | - [x] Removed sync_status_service completely (unnecessary complexity and state tracking)
441 | - [x] Removed `sync_status_service.py` and `sync_status` MCP tool
442 | - [x] Removed all `sync_status_tracker` calls from `sync_service.py`
443 | - [x] Removed migration status checks from MCP tools (`write_note`, `read_note`, `build_context`)
444 | - [x] Removed `check_migration_status()` and `wait_for_migration_or_return_status()` from `utils.py`
445 | - [x] Removed all related tests (4 test files deleted)
446 | - [x] All 1184 tests passing
447 |
448 | **Phase 1 Implementation Summary:**
449 |
450 | Phase 1 is now complete with all core fixes implemented and tested:
451 |
452 | 1. **Streaming Architecture** (Phase 0.5 + Phase 1):
453 | - Replaced `os.walk()` with `os.scandir()` for cached stat info
454 | - Eliminated upfront `get_db_file_state()` SELECT query
455 | - Implemented `_scan_directory_streaming()` for incremental processing
456 | - Added indexed `get_by_file_path()` lookups
457 | - Result: 50% fewer network calls on TigrisFS, no large dicts in memory
458 |
459 | 2. **mtime-based Change Detection**:
460 | - Added `mtime` and `size` columns to Entity model
461 | - Alembic migration completed and deployed
462 | - Only compute checksums when mtime/size differs from database
463 | - Result: ~90% reduction in checksum operations during typical syncs
464 |
465 | 3. **True Async I/O with aiofiles**:
466 | - All file operations consolidated in FileService
467 | - `FileService.compute_checksum()`: 64KB chunked reading for constant memory (lines 261-296 of file_service.py)
468 | - `FileService.read_file_content()`: Non-blocking file reads with aiofiles (lines 160-193 of file_service.py)
469 | - Removed all wrapper methods from SyncService (`_read_file_async`, `_compute_checksum_async`)
470 | - Semaphore controls concurrency (max 10 concurrent file operations)
471 | - Result: Constant memory usage regardless of file size, true non-blocking I/O
472 |
473 | 4. **Test Coverage**:
474 | - 41/43 sync tests passing (2 skipped as expected)
475 | - Circuit breaker tests updated for new architecture
476 | - Streaming checksum equivalence verified
477 | - All edge cases covered (large files, concurrent operations, failures)
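
The skip-unchanged decision from point 2 reduces to a pure comparison of stat values against the stored columns (illustrative sketch; `needs_checksum` is a hypothetical name, and the float tolerance is an assumption):

```python
def needs_checksum(db_mtime, db_size, fs_mtime: float, fs_size: int) -> bool:
    # New file (no row) or unmigrated columns: must checksum
    if db_mtime is None or db_size is None:
        return True
    # Unchanged mtime AND size: skip the expensive checksum
    return not (abs(db_mtime - fs_mtime) < 1e-6 and db_size == fs_size)
```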
478 |
479 | **Key Files Modified**:
480 | - `src/basic_memory/models.py` - Added mtime/size columns
481 | - `alembic/versions/xxx_add_mtime_size.py` - Database migration
482 | - `src/basic_memory/sync/sync_service.py` - Streaming implementation, removed wrapper methods
483 | - `src/basic_memory/services/file_service.py` - Added `read_file_content()`, streaming checksums
484 | - `src/basic_memory/repository/entity_repository.py` - Added `get_all_file_paths()`
485 | - `tests/sync/test_sync_service.py` - Updated circuit breaker test mocks
486 |
487 | **Performance Improvements Achieved**:
488 | - Memory usage: Constant per file (64KB chunks) vs full file in memory
489 | - Scan speed: Stat-only scan (no checksums for unchanged files)
490 | - I/O efficiency: True async with aiofiles (no thread pool blocking)
491 | - Network efficiency: 50% fewer calls on TigrisFS via scandir caching
492 | - Architecture: Clean separation of concerns (FileService owns all file I/O)
493 | - Reduced complexity: Removed unnecessary sync_status_service state tracking
494 |
495 | **Observability**:
496 | - [x] Added Logfire instrumentation to `sync_file()` and `sync_markdown_file()`
497 | - [x] Logfire disabled by default via `ignore_no_config = true` in pyproject.toml
498 | - [x] No telemetry in FOSS version unless explicitly configured
499 | - [x] Cloud deployment can enable Logfire for performance monitoring
500 |
501 | **Next Steps**: Phase 1.5 scan watermark optimization for large project performance.
502 |
503 | ### Phase 1.5: Scan Watermark Optimization
504 |
505 | **Priority: P0 - Critical for Large Projects**
506 |
507 | This phase addresses Issue #388 where large projects (1,460+ files) take 7+ minutes for sync operations even when no files have changed.
508 |
509 | **Problem Analysis**:
510 |
511 | From production data (tenant-0a20eb58):
512 | - Total sync time: 420-450 seconds (7+ minutes) with 0 changes
513 | - Scan phase: 321 seconds (75% of total time)
514 | - Per-file cost: 220ms × 1,460 files = 5+ minutes
515 | - Root cause: Network I/O to TigrisFS for stat operations (even with mtime columns)
516 | - 15 concurrent syncs every 30 seconds compounds the problem
517 |
518 | **Current Behavior** (Phase 1):
519 | ```python
520 | async def scan(self, directory: Path):
521 | """Scan filesystem using mtime/size comparison"""
522 | # Still stats ALL 1,460 files every sync cycle
523 | async for file_path, stat_info in self._scan_directory_streaming():
524 | db_entity = await self.entity_repository.get_by_file_path(file_path)
525 | # Compare mtime/size, skip unchanged files
526 | # Only checksum if changed (✅ already optimized)
527 | ```
528 |
529 | **Problem**: Even with mtime optimization, we stat every file on every scan. On TigrisFS (network FUSE mount), this means 1,460 network calls taking 5+ minutes.
530 |
531 | **Solution: Scan Watermark + File Count Detection**
532 |
533 | Track when we last scanned and how many files existed. Use filesystem-level filtering to only examine files modified since last scan.
534 |
535 | **Key Insight**: File count changes signal deletions
536 | - Count same → incremental scan (95% of syncs)
537 | - Count increased → incremental scan picks up the new files (4% of syncs)
538 | - Count decreased → files deleted, need full scan (1% of syncs)
539 |
540 | **Database Schema Changes**:
541 |
542 | Add to Project model:
543 | ```python
544 | last_scan_timestamp: float | None # Unix timestamp of last successful scan start
545 | last_file_count: int | None # Number of files found in last scan
546 | ```
547 |
548 | **Implementation Strategy**:
549 |
550 | ```python
551 | async def scan(self, directory: Path):
552 | """Smart scan using watermark and file count"""
553 | project = await self.project_repository.get_current()
554 |
555 | # Step 1: Quick file count (fast on TigrisFS: 1.4s for 1,460 files)
556 | current_count = await self._quick_count_files(directory)
557 |
558 | # Step 2: Determine scan strategy
559 | if project.last_file_count is None:
560 | # First sync ever → full scan
561 | file_paths = await self._scan_directory_full(directory)
562 | scan_type = "full_initial"
563 |
564 | elif current_count < project.last_file_count:
565 | # Files deleted → need full scan to detect which ones
566 | file_paths = await self._scan_directory_full(directory)
567 | scan_type = "full_deletions"
568 | logger.info(f"File count decreased ({project.last_file_count} → {current_count}), running full scan")
569 |
570 | elif project.last_scan_timestamp is not None:
571 | # Incremental scan: only files modified since last scan
572 | file_paths = await self._scan_directory_modified_since(
573 | directory,
574 | project.last_scan_timestamp
575 | )
576 | scan_type = "incremental"
577 | logger.info(f"Incremental scan since {project.last_scan_timestamp}, found {len(file_paths)} changed files")
578 | else:
579 | # Fallback to full scan
580 | file_paths = await self._scan_directory_full(directory)
581 | scan_type = "full_fallback"
582 |
583 | # Step 3: Process changed files (existing logic)
584 | for file_path in file_paths:
585 | await self._process_file(file_path)
586 |
587 | # Step 4: Update watermark AFTER successful scan
588 | await self.project_repository.update(
589 | project.id,
590 | last_scan_timestamp=time.time(), # Start of THIS scan
591 | last_file_count=current_count
592 | )
593 |
594 | # Step 5: Record metrics
595 | logfire.metric_counter(f"sync.scan.{scan_type}").add(1)
596 | logfire.metric_histogram("sync.scan.files_scanned", unit="files").record(len(file_paths))
597 | ```
598 |
599 | **Helper Methods**:
600 |
601 | ```python
602 | async def _quick_count_files(self, directory: Path) -> int:
603 | """Fast file count using find command"""
604 | # TigrisFS: 1.4s for 1,460 files
605 | result = await asyncio.create_subprocess_shell(
606 | f'find "{directory}" -type f | wc -l',
607 | stdout=asyncio.subprocess.PIPE
608 | )
609 | stdout, _ = await result.communicate()
610 | return int(stdout.strip())
611 |
612 | async def _scan_directory_modified_since(
613 | self,
614 | directory: Path,
615 | since_timestamp: float
616 | ) -> List[str]:
617 | """Use find -newermt for filesystem-level filtering"""
618 | # Convert timestamp to find-compatible format
619 | since_date = datetime.fromtimestamp(since_timestamp).strftime("%Y-%m-%d %H:%M:%S")
620 |
621 | # TigrisFS: 0.2s for 0 changed files (vs 5+ minutes for full scan)
622 | result = await asyncio.create_subprocess_shell(
623 | f'find "{directory}" -type f -newermt "{since_date}"',
624 | stdout=asyncio.subprocess.PIPE
625 | )
626 | stdout, _ = await result.communicate()
627 |
628 | # Convert absolute paths to relative
629 | file_paths = []
630 | for line in stdout.decode().splitlines():
631 | if line:
632 | rel_path = Path(line).relative_to(directory).as_posix()
633 | file_paths.append(rel_path)
634 |
635 | return file_paths
636 | ```
637 |
638 | **TigrisFS Testing Results** (SSH to production-basic-memory-tenant-0a20eb58):
639 |
640 | ```bash
641 | # Full file count
642 | $ time find . -type f | wc -l
643 | 1460
644 | real 0m1.362s # ✅ Acceptable
645 |
646 | # Incremental scan (1 hour window)
647 | $ time find . -type f -newermt "2025-01-20 10:00:00" | wc -l
648 | 0
649 | real 0m0.161s # ✅ 8.5x faster!
650 |
651 | # Incremental scan (24 hours)
652 | $ time find . -type f -newermt "2025-01-19 11:00:00" | wc -l
653 | 0
654 | real 0m0.239s # ✅ 5.7x faster!
655 | ```
656 |
657 | **Conclusion**: `find -newermt` works reliably on TigrisFS, reducing the incremental scan to sub-second times (0.16-0.24s versus 321s for the stat-every-file scan phase).
658 |
659 | **Expected Performance Improvements**:
660 |
661 | | Scenario | Files Changed | Current Time | With Watermark | Speedup |
662 | |----------|---------------|--------------|----------------|---------|
663 | | No changes (common) | 0 | 420s | ~2s | 210x |
664 | | Few changes | 5-10 | 420s | ~5s | 84x |
665 | | Many changes | 100+ | 420s | ~30s | 14x |
666 | | Deletions (rare) | N/A | 420s | 420s | 1x |
667 |
668 | **Full sync breakdown** (1,460 files, 0 changes):
669 | - File count: 1.4s
670 | - Incremental scan: 0.2s
671 | - Database updates: 0.4s
672 | - **Total: ~2s (210-225x faster, from 420-450s)**
673 |
674 | **Metrics to Track**:
675 |
676 | ```python
677 | # Scan type distribution
678 | logfire.metric_counter("sync.scan.full_initial").add(1)
679 | logfire.metric_counter("sync.scan.full_deletions").add(1)
680 | logfire.metric_counter("sync.scan.incremental").add(1)
681 |
682 | # Performance metrics
683 | logfire.metric_histogram("sync.scan.duration", unit="ms").record(scan_ms)
684 | logfire.metric_histogram("sync.scan.files_scanned", unit="files").record(file_count)
685 | logfire.metric_histogram("sync.scan.files_changed", unit="files").record(changed_count)
686 |
687 | # Watermark effectiveness
688 | logfire.metric_histogram("sync.scan.watermark_age", unit="s").record(
689 | time.time() - project.last_scan_timestamp
690 | )
691 | ```
692 |
693 | **Edge Cases Handled**:
694 |
695 | 1. **First sync**: No watermark → full scan (expected)
696 | 2. **Deletions**: File count decreased → full scan (rare but correct)
697 | 3. **Clock skew**: Use scan start time, not end time (captures files created during scan)
698 | 4. **Scan failure**: Don't update watermark on failure (retry will re-scan)
699 | 5. **New files**: Count increased → incremental scan finds them (common, fast)
700 |
701 | **Files to Modify**:
702 | - `src/basic_memory/models.py` - Add last_scan_timestamp, last_file_count to Project
703 | - `alembic/versions/xxx_add_scan_watermark.py` - Migration for new columns
704 | - `src/basic_memory/sync/sync_service.py` - Implement watermark logic
705 | - `src/basic_memory/repository/project_repository.py` - Update methods
706 | - `tests/sync/test_sync_watermark.py` - Test watermark behavior
707 |
708 | **Test Plan**:
709 | - [x] SSH test on TigrisFS confirms `find -newermt` works (completed)
710 | - [x] Unit tests for scan strategy selection (4 tests)
711 | - [x] Unit tests for file count detection (integrated in strategy tests)
712 | - [x] Integration test: verify incremental scan finds changed files (4 tests)
713 | - [x] Integration test: verify deletion detection triggers full scan (2 tests)
714 | - [ ] Load test on tenant-0a20eb58 (1,460 files) - pending production deployment
715 | - [ ] Verify <3s for no-change sync - pending production deployment
716 |
717 | **Implementation Status**: ✅ **COMPLETED**
718 |
719 | **Code Changes** (Commit: `fb16055d`):
720 | - ✅ Added `last_scan_timestamp` and `last_file_count` to Project model
721 | - ✅ Created database migration `e7e1f4367280_add_scan_watermark_tracking_to_project.py`
722 | - ✅ Implemented smart scan strategy selection in `sync_service.py`
723 | - ✅ Added `_quick_count_files()` using `find | wc -l` (~1.4s for 1,460 files)
724 | - ✅ Added `_scan_directory_modified_since()` using `find -newermt` (~0.2s)
725 | - ✅ Added `_scan_directory_full()` wrapper for full scans
726 | - ✅ Watermark update logic after successful sync (uses sync START time)
727 | - ✅ Logfire metrics for scan types and performance tracking
728 |
729 | **Test Coverage** (18 tests in `test_sync_service_incremental.py`):
730 | - ✅ Scan strategy selection (4 tests)
731 | - First sync uses full scan
732 | - File count decreased triggers full scan
733 | - Same file count uses incremental scan
734 | - Increased file count uses incremental scan
735 | - ✅ Incremental scan base cases (4 tests)
736 | - No changes scenario
737 | - Detects new files
738 | - Detects modified files
739 | - Detects multiple changes
740 | - ✅ Deletion detection (2 tests)
741 | - Single file deletion
742 | - Multiple file deletions
743 | - ✅ Move detection (2 tests)
744 | - Moves require full scan (renames don't update mtime)
745 | - Moves detected in full scan via checksum
746 | - ✅ Watermark update (3 tests)
747 | - Watermark updated after successful sync
748 | - Watermark uses sync start time
749 | - File count accuracy
750 | - ✅ Edge cases (3 tests)
751 | - Concurrent file changes
752 | - Empty directory handling
753 | - Respects .gitignore patterns
754 |
755 | **Performance Expectations** (to be verified in production):
756 | - No changes: 420s → ~2s (210x faster)
757 | - Few changes (5-10): 420s → ~5s (84x faster)
758 | - Many changes (100+): 420s → ~30s (14x faster)
759 | - Deletions: 420s → 420s (full scan, rare case)
760 |
761 | **Rollout Strategy**:
762 | 1. ✅ Code complete and tested (18 new tests, all passing)
763 | 2. ✅ Pushed to `phase-0.5-streaming-foundation` branch
764 | 3. ⏳ Windows CI tests running
765 | 4. 📊 Deploy to staging tenant with watermark optimization
766 | 5. 📊 Monitor scan performance metrics via Logfire
767 | 6. 📊 Verify no missed files (compare full vs incremental results)
768 | 7. 📊 Deploy to production tenant-0a20eb58
769 | 8. 📊 Measure actual improvement (expect 420s → 2-3s)
770 |
771 | **Success Criteria**:
772 | - ✅ Implementation complete with comprehensive tests
773 | - [ ] No-change syncs complete in <3 seconds (was 420s) - pending production test
774 | - [ ] Incremental scans (95% of cases) use watermark - pending production test
775 | - [ ] Deletion detection works correctly (full scan when needed) - tested in unit tests ✅
776 | - [ ] No files missed due to watermark logic - tested in unit tests ✅
777 | - [ ] Metrics show scan type distribution matches expectations - pending production test
778 |
779 | **Next Steps**:
780 | 1. Production deployment to tenant-0a20eb58
781 | 2. Measure actual performance improvements
782 | 3. Monitor metrics for 1 week
783 | 4. Phase 2 cloud-specific fixes
784 | 5. Phase 3 production measurement and UberSync decision
785 |
786 | ### Phase 2: Cloud Fixes
787 |
788 | **Resource leaks**:
789 | - [ ] Fix aiohttp session context manager
790 | - [ ] Implement persistent circuit breaker
791 | - [ ] Add memory monitoring/alerts
792 | - [ ] Test on production tenant
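
The session-leak fix is essentially "always close in a `finally`"; the shape can be sketched stdlib-only with a stand-in resource (`FakeSession` is hypothetical and stands in for `aiohttp.ClientSession`, which supports the same `async with` lifecycle):

```python
import asyncio
from contextlib import asynccontextmanager

class FakeSession:
    """Stand-in for aiohttp.ClientSession (hypothetical, for illustration)."""
    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True

@asynccontextmanager
async def managed_session():
    session = FakeSession()
    try:
        yield session
    finally:
        # Guaranteed cleanup even when a request raises: this is the leak fix
        await session.close()

async def fetch_with_cleanup() -> bool:
    async with managed_session() as s:
        pass  # HTTP requests would go here
    return s.closed
```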
793 |
794 | **Sync coordination**:
795 | - [ ] Implement hash-based staggering
796 | - [ ] Add jitter to sync intervals
797 | - [ ] Load test with 10 concurrent tenants
798 | - [ ] Verify no thundering herd
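
Hash-based staggering with jitter can be sketched as a deterministic per-tenant offset within the sync interval (illustrative sketch; `sync_offset` and the 30-second interval constant are assumptions drawn from the sync cadence described in Phase 1.5):

```python
import hashlib
import random

SYNC_INTERVAL = 30  # seconds, per the 30-second sync cadence

def sync_offset(tenant_id: str, jitter: float = 2.0) -> float:
    """Deterministic per-tenant offset within the interval, plus random
    jitter, so tenants don't all start syncing at the same instant."""
    digest = hashlib.sha256(tenant_id.encode()).digest()
    base = int.from_bytes(digest[:4], "big") % SYNC_INTERVAL
    return base + random.uniform(0, jitter)
```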
799 |
800 | ### Phase 3: Measurement
801 |
802 | **Deploy to production**:
803 | - [ ] Deploy Phase 1+2 changes
804 | - [ ] Downgrade tenant-6d2ff1a3 to 1GB
805 | - [ ] Monitor for OOM incidents
806 |
807 | **Collect metrics**:
808 | - [ ] Memory usage patterns
809 | - [ ] Sync duration distributions
810 | - [ ] Concurrent sync load
811 | - [ ] Cost analysis
812 |
813 | **UberSync decision**:
814 | - [ ] Review metrics against decision criteria
815 | - [ ] Document findings
816 | - [ ] Create SPEC-18 for UberSync if needed
817 |
818 | ## Related Issues
819 |
820 | ### basic-memory (core)
821 | - [#383](https://github.com/basicmachines-co/basic-memory/issues/383) - Refactor sync to use mtime-based scanning
822 | - [#382](https://github.com/basicmachines-co/basic-memory/issues/382) - Optimize memory for large file syncs
823 | - [#371](https://github.com/basicmachines-co/basic-memory/issues/371) - aiofiles for non-blocking I/O (future)
824 |
825 | ### basic-memory-cloud
826 | - [#198](https://github.com/basicmachines-co/basic-memory-cloud/issues/198) - Memory optimization for sync worker
827 | - [#189](https://github.com/basicmachines-co/basic-memory-cloud/issues/189) - Circuit breaker for infinite retry loops
828 |
829 | ## References
830 |
831 | **Standard sync tools using mtime**:
832 | - rsync: Uses mtime-based comparison by default, only checksums on `--checksum` flag
833 | - rclone: Default is mtime/size, `--checksum` mode optional
834 | - syncthing: Block-level sync with mtime tracking
835 |
836 | **fsnotify polling** (future consideration):
837 | - [fsnotify/fsnotify#9](https://github.com/fsnotify/fsnotify/issues/9) - Polling mode for network filesystems
838 |
839 | ## Notes
840 |
841 | ### Why Not UberSync Now?
842 |
843 | **Premature Optimization**:
844 | - Current problems are algorithmic, not architectural
845 | - No evidence that multi-tenant coordination is the issue
846 | - Single tenant OOM proves algorithm is the problem
847 |
848 | **Benefits of Core-First Approach**:
849 | - ✅ Helps all users (CLI + Cloud)
850 | - ✅ Lower risk (no new service)
851 | - ✅ Clear path (issues specify fixes)
852 | - ✅ Can defer UberSync until proven necessary
853 |
854 | **When UberSync Makes Sense**:
855 | - >100 active tenants causing resource contention
856 | - Need for tenant tier prioritization (paid > free)
857 | - Centralized observability requirements
858 | - Cost optimization at scale
859 |
860 | ### Migration Strategy
861 |
862 | **Backward Compatibility**:
863 | - New mtime/size columns nullable initially
864 | - Existing entities sync normally (compute mtime on first scan)
865 | - No breaking changes to MCP API
866 | - CLI behavior unchanged
867 |
868 | **Rollout**:
869 | 1. Deploy to staging with test tenant
870 | 2. Validate memory/performance improvements
871 | 3. Deploy to production (blue-green)
872 | 4. Monitor for 1 week
873 | 5. Downgrade tenant machines if successful
874 |
875 | ## Further Considerations
876 |
877 | ### Version Control System (VCS) Integration
878 |
879 | **Context:** Users frequently request git versioning, and large projects with PDFs/images pose memory challenges.
880 |
881 | #### Git-Based Sync
882 |
883 | **Approach:** Use git for change detection instead of custom mtime comparison.
884 |
885 | ```python
886 | # Git automatically tracks changes
887 | repo = git.Repo(project_path)
888 | repo.git.add(A=True)
889 | diff = repo.index.diff('HEAD')
890 |
891 | for change in diff:
892 | if change.change_type == 'M': # Modified
893 | await sync_file(change.b_path)
894 | ```
895 |
896 | **Pros:**
897 | - ✅ Proven, battle-tested change detection
898 | - ✅ Built-in rename/move detection (similarity index)
899 | - ✅ Efficient for cloud sync (git protocol over HTTP)
900 | - ✅ Could enable version history as bonus feature
901 | - ✅ Users want git integration anyway
902 |
903 | **Cons:**
904 | - ❌ User confusion (`.git` folder in knowledge base)
905 | - ❌ Conflicts with existing git repos (submodule complexity)
906 | - ❌ Adds dependency (git binary or dulwich/pygit2)
907 | - ❌ Less control over sync logic
908 | - ❌ Doesn't solve large file problem (PDFs still checksummed)
909 | - ❌ Git LFS adds complexity
910 |
911 | #### Jujutsu (jj) Alternative
912 |
913 | **Why jj is compelling:**
914 |
915 | 1. **Working Copy as Source of Truth**
916 | - Git: Staging area is intermediate state
917 | - Jujutsu: Working copy IS a commit
918 | - Aligns with "files are source of truth" philosophy!
919 |
920 | 2. **Automatic Change Tracking**
921 | - No manual staging required
922 | - Working copy changes tracked automatically
923 | - Better fit for sync operations vs git's commit-centric model
924 |
925 | 3. **Conflict Handling**
926 | - User edits + sync changes both preserved
927 | - Operation log vs linear history
928 | - Built for operations, not just history
929 |
930 | **Cons:**
931 | - ❌ New/immature (2020 vs git's 2005)
932 | - ❌ Not universally available
933 | - ❌ Steeper learning curve for users
934 | - ❌ No LFS equivalent yet
935 | - ❌ Still doesn't solve large file checksumming
936 |
937 | #### Git Index Format (Hybrid Approach)
938 |
939 | **Best of both worlds:** Use git's index format without full git repo.
940 |
941 | ```python
942 | from dulwich.index import Index # Pure Python
943 |
944 | # Use git index format for tracking
945 | idx = Index(project_path / '.basic-memory' / 'index')
946 |
947 | for file in files:
948 | stat = file.stat()
949 | if idx.get(file) and idx[file].mtime == stat.st_mtime:
950 | continue # Unchanged (git's proven logic)
951 |
952 | await sync_file(file)
953 | idx[file] = (stat.st_mtime, stat.st_size, sha)
954 | ```
955 |
956 | **Pros:**
957 | - ✅ Git's proven change detection logic
958 | - ✅ No user-visible `.git` folder
959 | - ✅ No git dependency (pure Python)
960 | - ✅ Full control over sync
961 |
962 | **Cons:**
963 | - ❌ Adds dependency (dulwich)
964 | - ❌ Doesn't solve large files
965 | - ❌ No built-in versioning
966 |
967 | ### Large File Handling
968 |
969 | **Problem:** PDFs/images cause memory issues regardless of VCS choice.
970 |
971 | **Solutions (Phase 1+):**
972 |
973 | **1. Skip Checksums for Large Files**
974 | ```python
975 | if stat.st_size > 10_000_000: # 10MB threshold
976 | checksum = None # Use mtime/size only
977 | logger.info(f"Skipping checksum for {file_path}")
978 | ```
979 |
980 | **2. Partial Hashing**
981 | ```python
982 | if file.suffix in ['.pdf', '.jpg', '.png']:
983 | # Hash first/last 64KB instead of entire file
984 | checksum = hash_partial(file, chunk_size=65536)
985 | ```
986 |
987 | **3. External Blob Storage**
988 | ```python
989 | if stat.st_size > 10_000_000:
990 | blob_id = await upload_to_tigris_blob(file)
991 | entity.blob_id = blob_id
992 | entity.file_path = None # Not in main sync
993 | ```
994 |
995 | ### Recommendation & Timeline
996 |
997 | **Phase 0.5-1 (Now):** Custom streaming + mtime
998 | - ✅ Solves urgent memory issues
999 | - ✅ No dependencies
1000 | - ✅ Full control
1001 | - ✅ Skip checksums for large files (>10MB)
1002 | - ✅ Proven pattern (rsync/rclone)
1003 |
1004 | **Phase 2 (After metrics):** Git index format exploration
1005 | ```python
1006 | # Optional: Use git index for tracking if beneficial
1007 | from dulwich.index import Index
1008 | # No git repo, just index file format
1009 | ```
1010 |
1011 | **Future (User feature):** User-facing versioning
1012 | ```python
1013 | # Let users opt into VCS:
1014 | basic-memory config set versioning git
1015 | basic-memory config set versioning jj
1016 | basic-memory config set versioning none # Current behavior
1017 |
1018 | # Integrate with their chosen workflow
1019 | # Not forced upon them
1020 | ```
1021 |
1022 | **Rationale:**
1023 | 1. **Don't block on VCS decision** - Memory issues are P0
1024 | 2. **Learn from deployment** - See actual usage patterns
1025 | 3. **Keep options open** - Can add git/jj later
1026 | 4. **Files as source of truth** - Core philosophy preserved
1027 | 5. **Large files need attention regardless** - VCS won't solve that
1028 |
1029 | **Decision Point:**
1030 | - If Phase 0.5/1 achieves memory targets → VCS integration deferred
1031 | - If users strongly request versioning → Add as opt-in feature
1032 | - If change detection becomes bottleneck → Explore git index format
1033 |
1034 | ## Agent Assignment
1035 |
1036 | **Phase 1 Implementation**: `python-developer` agent
1037 | - Expertise in FastAPI, async Python, database migrations
1038 | - Handles basic-memory core changes
1039 |
1040 | **Phase 2 Implementation**: `python-developer` agent
1041 | - Same agent continues with cloud-specific fixes
1042 | - Maintains consistency across phases
1043 |
1044 | **Phase 3 Review**: `system-architect` agent
1045 | - Analyzes metrics and makes UberSync decision
1046 | - Creates SPEC-18 if centralized service needed
1047 |
```
--------------------------------------------------------------------------------
/specs/SPEC-9 Multi-Project Bidirectional Sync Architecture.md:
--------------------------------------------------------------------------------
```markdown
1 | ---
2 | title: 'SPEC-9: Multi-Project Bidirectional Sync Architecture'
3 | type: spec
4 | permalink: specs/spec-9-multi-project-bisync
5 | tags:
6 | - cloud
7 | - bisync
8 | - architecture
9 | - multi-project
10 | ---
11 |
12 | # SPEC-9: Multi-Project Bidirectional Sync Architecture
13 |
14 | ## Status: ✅ Implementation Complete
15 |
16 | **Completed Phases:**
17 | - ✅ Phase 1: Cloud Mode Toggle & Config
18 | - ✅ Phase 2: Bisync Updates (Multi-Project)
19 | - ✅ Phase 3: Sync Command Dual Mode
20 | - ✅ Phase 4: Remove Duplicate Commands & Cloud Mode Auth
21 | - ✅ Phase 5: Mount Updates
22 | - ✅ Phase 6: Safety & Validation
23 | - ⏸️ Phase 7: Cloud-Side Implementation (Deferred to cloud repo)
24 | - ✅ Phase 8.1: Testing (All test scenarios validated)
25 | - ✅ Phase 8.2: Documentation (Core docs complete, demos pending)
26 |
27 | **Key Achievements:**
28 | - Unified CLI: `bm sync`, `bm project`, `bm tool` work transparently in both local and cloud modes
29 | - Multi-project sync: Single `bm sync` operation handles all projects bidirectionally
30 | - Cloud mode toggle: `bm cloud login` / `bm cloud logout` switches modes seamlessly
31 | - Integrity checking: `bm cloud check` verifies file matching without data transfer
32 | - Directory isolation: Mount and bisync use separate directories with conflict prevention
33 | - Clean UX: No RCLONE_TEST files, clear error messages, transparent implementation
34 |
35 | ## Why
36 |
37 | **Current State:**
38 | SPEC-8 implemented rclone bisync for cloud file synchronization, but has several architectural limitations:
39 | 1. Syncs only a single project subdirectory (`bucket:/basic-memory`)
40 | 2. Requires separate `bm cloud` command namespace, duplicating existing CLI commands
41 | 3. Users must learn different commands for local vs cloud operations
42 | 4. RCLONE_TEST marker files clutter user directories
43 |
44 | **Problems:**
45 | 1. **Duplicate Commands**: `bm project` vs `bm cloud project`, `bm tool` vs (no cloud equivalent)
46 | 2. **Inconsistent UX**: Same operations require different command syntax depending on mode
47 | 3. **Single Project Sync**: Users can only sync one project at a time
48 | 4. **Manual Coordination**: Creating new projects requires manual setup on both the local and cloud sides
49 | 5. **Confusing Artifacts**: RCLONE_TEST marker files confuse users
50 |
51 | **Goals:**
52 | - **Unified CLI**: All existing `bm` commands work in both local and cloud mode via toggle
53 | - **Multi-Project Sync**: Single sync operation handles all projects bidirectionally
54 | - **Simple Mode Switch**: `bm cloud login` enables cloud mode, `logout` returns to local
55 | - **Automatic Registration**: Projects auto-register on both local and cloud sides
56 | - **Clean UX**: Remove unnecessary safety checks and confusing artifacts
57 |
58 | ## Cloud Access Paradigm: The Dropbox Model
59 |
60 | **Mental Model Shift:**
61 |
62 | Basic Memory cloud access follows the **Dropbox/iCloud paradigm** - not a per-project cloud connection model.
63 |
64 | **What This Means:**
65 |
66 | ```
67 | Traditional Project-Based Model (❌ Not This):
68 | bm cloud mount --project work # Mount individual project
69 | bm cloud mount --project personal # Mount another project
70 | bm cloud sync --project research # Sync specific project
71 | → Multiple connections, multiple credentials, complex management
72 |
73 | Dropbox Model (✅ This):
74 | bm cloud mount # One mount, all projects
75 | bm sync # One sync, all projects
76 | ~/basic-memory-cloud/ # One folder, all content
77 | → Single connection, organized by folders (projects)
78 | ```
79 |
80 | **Key Principles:**
81 |
82 | 1. **Mount/Bisync = Access Methods, Not Project Tools**
83 | - Mount: Read-through cache to cloud (like Dropbox folder)
84 | - Bisync: Bidirectional sync with cloud (like Dropbox sync)
85 | - Both operate at **bucket level** (all projects)
86 |
87 | 2. **Projects = Organization Within Cloud Space**
88 | - Projects are folders within your cloud storage
89 | - Creating a folder creates a project (auto-discovered)
90 | - Projects are managed via `bm project` commands
91 |
92 | 3. **One Cloud Space Per Machine**
93 | - One set of IAM credentials per tenant
94 | - One mount point: `~/basic-memory-cloud/`
95 | - One bisync directory: `~/basic-memory-cloud-sync/` (default)
96 | - All projects accessible through this single entry point
97 |
98 | 4. **Why This Works Better**
99 | - **Credential Management**: One credential set, not N sets per project
100 | - **Resource Efficiency**: One rclone process, not N processes
101 | - **Familiar Pattern**: Users already understand Dropbox/iCloud
102 | - **Operational Simplicity**: `mount` once, `unmount` once
103 | - **Scales Naturally**: Add projects by creating folders, not reconfiguring cloud access
104 |
105 | **User Journey:**
106 |
107 | ```bash
108 | # Set up cloud access (once)
109 | bm cloud login
110 | bm cloud mount # or: bm cloud setup for bisync
111 |
112 | # Work with projects (create folders as needed)
113 | cd ~/basic-memory-cloud/
114 | mkdir my-new-project
115 | echo "# Notes" > my-new-project/readme.md
116 |
117 | # Cloud auto-discovers and registers project
118 | # No additional cloud configuration needed
119 | ```
120 |
121 | This paradigm shift means **mount and bisync are infrastructure concerns**, while **projects are content organization**. Users think about their knowledge, not about cloud plumbing.
122 |
123 | ## What
124 |
125 | This spec affects:
126 |
127 | 1. **Cloud Mode Toggle** (`config.py`, `async_client.py`):
128 | - Add `cloud_mode` flag to `~/.basic-memory/config.json`
129 | - Set/unset `BASIC_MEMORY_PROXY_URL` based on cloud mode
130 | - `bm cloud login` enables cloud mode, `logout` disables it
131 | - All CLI commands respect cloud mode via existing async_client
132 |
133 | 2. **Unified CLI Commands**:
134 | - **Remove**: `bm cloud project` commands (duplicate of `bm project`)
135 | - **Enhance**: `bm sync` co-opted for bisync in cloud mode
136 | - **Keep**: `bm cloud login/logout/status/setup` for mode management
137 | - **Result**: `bm project`, `bm tool`, `bm sync` work in both modes
138 |
139 | 3. **Bisync Integration** (`bisync_commands.py`):
140 | - Remove `--check-access` (no RCLONE_TEST files)
141 | - Sync bucket root (all projects), not single subdirectory
142 | - Project auto-registration before sync
143 | - `bm sync` triggers bisync in cloud mode
144 | - `bm sync --watch` for continuous sync
145 |
146 | 4. **Config Structure**:
147 | ```json
148 | {
149 | "cloud_mode": true,
150 | "cloud_host": "https://cloud.basicmemory.com",
151 | "auth_tokens": {...},
152 | "bisync_config": {
153 | "profile": "balanced",
154 | "sync_dir": "~/basic-memory-cloud-sync"
155 | }
156 | }
157 | ```
158 |
159 | 5. **User Workflows**:
160 | - **Enable cloud**: `bm cloud login` → all commands work remotely
161 | - **Create projects**: `bm project add "name"` creates on cloud
162 | - **Sync files**: `bm sync` runs bisync (all projects)
163 | - **Use tools**: `bm tool write-note` creates notes on cloud
164 | - **Disable cloud**: `bm cloud logout` → back to local mode
165 |
166 | ## Implementation Tasks
167 |
168 | ### Phase 1: Cloud Mode Toggle & Config (Foundation) ✅
169 |
170 | **1.1 Update Config Schema**
171 | - [x] Add `cloud_mode: bool = False` to Config model
172 | - [x] Add `bisync_config: dict` with `profile` and `sync_dir` fields
173 | - [x] Ensure `cloud_host` field exists
174 | - [x] Add config migration for existing users (defaults handle this)
175 |
176 | **1.2 Update async_client.py**
177 | - [x] Read `cloud_mode` from config (not just environment)
178 | - [x] Set `BASIC_MEMORY_PROXY_URL` from config when `cloud_mode=true`
179 | - [x] Priority: env var > config.cloud_host (if cloud_mode) > None (local ASGI)
180 | - [ ] Test both local and cloud mode routing
181 |
182 | **1.3 Update Login/Logout Commands**
183 | - [x] `bm cloud login`: Set `cloud_mode=true` and save config
184 | - [x] `bm cloud login`: Set `BASIC_MEMORY_PROXY_URL` environment variable
185 | - [x] `bm cloud logout`: Set `cloud_mode=false` and save config
186 | - [x] `bm cloud logout`: Clear `BASIC_MEMORY_PROXY_URL` environment variable
187 | - [x] `bm cloud status`: Show current mode (local/cloud), connection status
188 |
189 | **1.4 Skip Initialization in Cloud Mode** ✅
190 | - [x] Update `ensure_initialization()` to check `cloud_mode` and return early
191 | - [x] Document that `config.projects` is only used in local mode
192 | - [x] Cloud manages its own projects via API, no local reconciliation needed
193 |
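The early return in `ensure_initialization()` can be sketched as follows (`AppConfig` stands in for the real config model, and the local-mode reconciliation work is elided):

```python
from dataclasses import dataclass

@dataclass
class AppConfig:
    cloud_mode: bool = False

def ensure_initialization(config: AppConfig) -> str:
    # Cloud manages its own projects via the API; skip local reconciliation
    if config.cloud_mode:
        return "skipped"
    # Local mode: reconcile config.projects with the local database (elided)
    return "initialized"
```
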
194 | ### Phase 2: Bisync Updates (Multi-Project)
195 |
196 | **2.1 Remove RCLONE_TEST Files** ✅
197 | - [x] Update all bisync profiles: `check_access=False`
198 | - [x] Remove RCLONE_TEST creation from `setup_cloud_bisync()`
199 | - [x] Remove RCLONE_TEST upload logic
200 | - [ ] Update documentation
201 |
202 | **2.2 Sync Bucket Root (All Projects)** ✅
203 | - [x] Change remote path from `bucket:/basic-memory` to `bucket:/` in `build_bisync_command()`
204 | - [x] Update `setup_cloud_bisync()` to use bucket root
205 | - [ ] Test with multiple projects
206 |
207 | **2.3 Project Auto-Registration (Bisync)** ✅
208 | - [x] Add `fetch_cloud_projects()` function (GET /proxy/projects/projects)
209 | - [x] Add `scan_local_directories()` function
210 | - [x] Add `create_cloud_project()` function (POST /proxy/projects/projects)
211 | - [x] Integrate into `run_bisync()`: fetch → scan → create missing → sync
212 | - [x] Wait for API 201 response before syncing
213 |
214 | **2.4 Bisync Directory Configuration** ✅
215 | - [x] Add `--dir` parameter to `bm cloud bisync-setup`
216 | - [x] Store bisync directory in config
217 | - [x] Default to `~/basic-memory-cloud-sync/`
218 | - [x] Add `validate_bisync_directory()` safety check
219 | - [x] Update `get_default_mount_path()` to return fixed `~/basic-memory-cloud/`
220 |
221 | **2.5 Sync/Status API Infrastructure** ✅ (commit d48b1dc)
222 | - [x] Create `POST /{project}/project/sync` endpoint for background sync
223 | - [x] Create `POST /{project}/project/status` endpoint for scan-only status
224 | - [x] Create `SyncReportResponse` Pydantic schema
225 | - [x] Refactor CLI `sync` command to use API endpoint
226 | - [x] Refactor CLI `status` command to use API endpoint
227 | - [x] Create `command_utils.py` with shared `run_sync()` function
228 | - [x] Update `notify_container_sync()` to call `run_sync()` for each project
229 | - [x] Update all tests to match new API-based implementation
230 |
231 | ### Phase 3: Sync Command Dual Mode ✅
232 |
233 | **3.1 Update `bm sync` Command** ✅
234 | - [x] Check `config.cloud_mode` at start
235 | - [x] If `cloud_mode=false`: Run existing local sync
236 | - [x] If `cloud_mode=true`: Run bisync
237 | - [x] Add `--watch` parameter for continuous sync
238 | - [x] Add `--interval` parameter (default 60 seconds)
239 | - [x] Error if `--watch` used in local mode with helpful message
240 |
241 | **3.2 Watch Mode for Bisync** ✅
242 | - [x] Implement `run_bisync_watch()` with interval loop
243 | - [x] Add `--interval` parameter (default 60 seconds)
244 | - [x] Handle errors gracefully, continue on failure
245 | - [x] Show sync progress and status
246 |
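The watch loop is a plain interval timer around the sync call. A sketch (the `iterations` cap is added here only so the loop can terminate; the real command runs until interrupted):

```python
import time
from typing import Callable, Optional

def run_bisync_watch(sync_fn: Callable[[], None], interval: int = 60,
                     iterations: Optional[int] = None) -> int:
    """Run sync_fn every `interval` seconds; swallow errors and keep going."""
    failures = 0
    count = 0
    while iterations is None or count < iterations:
        try:
            sync_fn()
        except Exception:
            failures += 1  # the real command logs the error and continues
        count += 1
        if iterations is None or count < iterations:
            time.sleep(interval)
    return failures
```
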
247 | **3.3 Integrity Check Command** ✅
248 | - [x] Implement `bm cloud check` command using `rclone check`
249 | - [x] Read-only operation that verifies file matching
250 | - [x] Error with helpful messages if rclone/bisync not set up
251 | - [x] Support `--one-way` flag for faster checks
252 | - [x] Transparent about rclone implementation
253 | - [x] Suggest `bm sync` to resolve differences
254 |
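A hypothetical builder for the underlying `rclone check` invocation (`--one-way` is a real `rclone check` flag; the exact flag set used by `bm cloud check` is an assumption):

```python
from pathlib import Path
from typing import List

def build_check_command(sync_dir: Path, remote: str, one_way: bool = False) -> List[str]:
    """Build an `rclone check` invocation comparing the local sync dir to cloud."""
    cmd = ["rclone", "check", str(sync_dir), remote]
    if one_way:
        cmd.append("--one-way")  # only verify local files exist remotely (faster)
    return cmd
```
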
255 | **Implementation Notes:**
256 | - `bm sync` adapts to cloud mode automatically - users don't need separate commands
257 | - `bm cloud bisync` kept for power users with full options (--dry-run, --resync, --profile, --verbose)
258 | - `bm cloud check` provides integrity verification without transferring data
259 | - Design philosophy: Simplicity for everyday use, transparency about implementation
260 |
261 | ### Phase 4: Remove Duplicate Commands & Cloud Mode Auth ✅
262 |
263 | **4.0 Cloud Mode Authentication** ✅
264 | - [x] Update `async_client.py` to support dual auth sources
265 | - [x] FastMCP context auth (cloud service mode) via `inject_auth_header()`
266 | - [x] JWT token file auth (CLI cloud mode) via `CLIAuth.get_valid_token()`
267 | - [x] Automatic token refresh for CLI cloud mode
268 | - [x] Remove `BASIC_MEMORY_PROXY_URL` environment variable dependency
269 | - [x] Simplify to use only `config.cloud_mode` + `config.cloud_host`
270 |
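Selecting between the two auth sources reduces to a simple precedence rule, sketched here (`build_auth_headers` is illustrative; the real code injects headers inside the HTTP client):

```python
from typing import Optional

def build_auth_headers(context_token: Optional[str],
                       cli_token: Optional[str]) -> dict:
    """Pick the auth source: FastMCP context auth (cloud service mode) wins,
    then the CLI's stored JWT (cloud mode); neither means local ASGI, no header."""
    token = context_token or cli_token
    return {"Authorization": f"Bearer {token}"} if token else {}
```
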
271 | **4.1 Delete `bm cloud project` Commands** ✅
272 | - [x] Remove `bm cloud project list` (use `bm project list`)
273 | - [x] Remove `bm cloud project add` (use `bm project add`)
274 | - [x] Update `core_commands.py` to remove project_app subcommands
275 | - [x] Keep only: `login`, `logout`, `status`, `setup`, `mount`, `unmount`, bisync commands
276 | - [x] Remove unused imports (Table, generate_permalink, os)
277 | - [x] Clean up environment variable references in login/logout
278 |
279 | **4.2 CLI Command Cloud Mode Integration** ✅
280 | - [x] Add runtime `cloud_mode_enabled` checks to all CLI commands
281 | - [x] Update `list_projects()` to conditionally authenticate based on cloud mode
282 | - [x] Update `remove_project()` to conditionally authenticate based on cloud mode
283 | - [x] Update `run_sync()` to conditionally authenticate based on cloud mode
284 | - [x] Update `get_project_info()` to conditionally authenticate based on cloud mode
285 | - [x] Update `run_status()` to conditionally authenticate based on cloud mode
286 | - [x] Remove auth from `set_default_project()` (local-only command, no cloud version)
287 | - [x] Create CLI integration tests (`test-int/cli/`) to validate both local and cloud modes
288 | - [x] Replace mock-heavy CLI tests with integration tests (deleted 5 mock test files)
289 |
290 | **4.3 OAuth Authentication Fixes** ✅
291 | - [x] Restore missing `SettingsConfigDict` in `BasicMemoryConfig`
292 | - [x] Fix environment variable reading with `BASIC_MEMORY_` prefix
293 | - [x] Fix `.env` file loading
294 | - [x] Fix extra field handling for config files
295 | - [x] Resolve `bm cloud login` OAuth failure ("Something went wrong" error)
296 | - [x] Implement PKCE (Proof Key for Code Exchange) for device flow
297 | - [x] Generate code verifier and SHA256 challenge for device authorization
298 | - [x] Send code_verifier with token polling requests
299 | - [x] Support both PKCE-required and PKCE-optional OAuth clients
300 | - [x] Verify authentication flow works end-to-end with staging and production
301 | - [x] Document WorkOS requirement: redirect URI must be configured even for device flow
302 |
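The PKCE pieces are standard (RFC 7636, S256 method): a random code verifier plus the base64url-encoded SHA-256 of it as the challenge. A self-contained sketch of the generation step:

```python
import base64
import hashlib
import secrets

def generate_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) per RFC 7636 S256."""
    # 32 random bytes -> 43-char base64url verifier (padding stripped)
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge
```

The verifier is sent with the token polling request; the challenge goes in the device authorization request.
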
303 | **4.4 Update Documentation**
304 | - [ ] Update `cloud-cli.md` with cloud mode toggle workflow
305 | - [ ] Document `bm cloud login` → use normal commands
306 | - [ ] Add examples of cloud mode usage
307 | - [ ] Document mount vs bisync directory isolation
308 | - [ ] Add troubleshooting section
309 |
310 | ### Phase 5: Mount Updates ✅
311 |
312 | **5.1 Fixed Mount Directory** ✅
313 | - [x] Change mount path to `~/basic-memory-cloud/` (fixed, no tenant ID)
314 | - [x] Update `get_default_mount_path()` function
315 | - [x] Remove configurability (fixed location)
316 | - [x] Update mount commands to use new path
317 |
318 | **5.2 Mount at Bucket Root** ✅
319 | - [x] Ensure mount uses bucket root (not subdirectory)
320 | - [x] Test with multiple projects
321 | - [x] Verify all projects visible in mount
322 |
323 | **Implementation:** Mount uses the fixed `~/basic-memory-cloud/` directory and mounts the entire bucket root `basic-memory-{tenant_id}:{bucket_name}`, so all projects are visible through a single mount point.
324 |
325 | ### Phase 6: Safety & Validation ✅
326 |
327 | **6.1 Directory Conflict Prevention** ✅
328 | - [x] Implement `validate_bisync_directory()` check
329 | - [x] Detect if bisync dir == mount dir
330 | - [x] Detect if bisync dir is currently mounted
331 | - [x] Show clear error messages with solutions
332 |
333 | **6.2 State Management** ✅
334 | - [x] Use `--workdir` for bisync state
335 | - [x] Store state in `~/.basic-memory/bisync-state/{tenant-id}/`
336 | - [x] Ensure state directory created before bisync
337 |
338 | **Implementation:** `validate_bisync_directory()` prevents conflicts by checking directory equality and mount status. State managed in isolated `~/.basic-memory/bisync-state/{tenant-id}/` directory using `--workdir` flag.
339 |
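Computing and creating the per-tenant state directory is straightforward; a sketch (the `base` parameter is added for testability, while the real code always uses `~/.basic-memory`):

```python
from pathlib import Path
from typing import Optional

def bisync_state_dir(tenant_id: str, base: Optional[Path] = None) -> Path:
    """Return ~/.basic-memory/bisync-state/{tenant-id}, creating it if needed.
    Passed to rclone via --workdir so state never lands in the sync dir."""
    base = base or (Path.home() / ".basic-memory")
    state = base / "bisync-state" / tenant_id
    state.mkdir(parents=True, exist_ok=True)
    return state
```
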
340 | ### Phase 7: Cloud-Side Implementation (Deferred to Cloud Repo)
341 |
342 | **7.1 Project Discovery Service (Cloud)** - Deferred
343 | - [ ] Create `ProjectDiscoveryService` background job
344 | - [ ] Scan `/app/data/` every 2 minutes
345 | - [ ] Auto-register new directories as projects
346 | - [ ] Log discovery events
347 | - [ ] Handle errors gracefully
348 |
349 | **7.2 Project API Updates (Cloud)** - Deferred
350 | - [ ] Ensure `POST /proxy/projects/projects` creates directory synchronously
351 | - [ ] Return 201 with project details
352 | - [ ] Ensure directory ready immediately after creation
353 |
354 | **Note:** Phase 7 is cloud-side work that belongs in the basic-memory-cloud repository. The CLI-side implementation (Phase 2.3 auto-registration) is complete and working - it calls the existing cloud API endpoints.
355 |
356 | ### Phase 8: Testing & Documentation
357 |
358 | **8.1 Test Scenarios**
359 | - [x] Test: Cloud mode toggle (login/logout)
360 | - [x] Test: Local-first project creation (bisync)
361 | - [x] Test: Cloud-first project creation (API)
362 | - [x] Test: Multi-project bidirectional sync
363 | - [x] Test: MCP tools in cloud mode
364 | - [x] Test: Watch mode continuous sync
365 | - [x] Test: Safety profile protection (max_delete implemented)
366 | - [x] Test: No RCLONE_TEST files (check_access=False in all profiles)
367 | - [x] Test: Mount/bisync directory isolation (validate_bisync_directory)
368 | - [x] Test: Integrity check command (bm cloud check)
369 |
370 | **8.2 Documentation**
371 | - [x] Update cloud-cli.md with cloud mode instructions
372 | - [x] Document Dropbox model paradigm
373 | - [x] Update command reference with new commands
374 | - [x] Document `bm sync` dual mode behavior
375 | - [x] Document `bm cloud check` command
376 | - [x] Document directory structure and fixed paths
377 | - [ ] Update README with quick start
378 | - [ ] Create migration guide for existing users
379 | - [ ] Create video/GIF demos
380 |
381 | ### Success Criteria Checklist
382 |
383 | - [x] `bm cloud login` enables cloud mode for all commands
384 | - [x] `bm cloud logout` reverts to local mode
385 | - [x] `bm project`, `bm tool`, `bm sync` work transparently in both modes
386 | - [x] `bm sync` runs bisync in cloud mode, local sync in local mode
387 | - [x] Single sync operation handles all projects bidirectionally
388 | - [x] Local directories auto-create cloud projects via API
389 | - [x] Cloud projects auto-sync to local directories
390 | - [x] No RCLONE_TEST files in user directories
391 | - [x] Bisync profiles provide safety via `max_delete` limits
392 | - [x] `bm sync --watch` enables continuous sync
393 | - [x] No duplicate `bm cloud project` commands (removed)
394 | - [x] `bm cloud check` command for integrity verification
395 | - [ ] Documentation covers cloud mode toggle and workflows
396 | - [ ] Edge cases handled gracefully with clear errors
397 |
398 | ## How (High Level)
399 |
400 | ### Architecture Overview
401 |
402 | **Cloud Mode Toggle:**
403 | ```
404 | ┌─────────────────────────────────────┐
405 | │ bm cloud login │
406 | │ ├─ Authenticate via OAuth │
407 | │ ├─ Set cloud_mode: true in config │
408 | │ └─ Set BASIC_MEMORY_PROXY_URL │
409 | └─────────────────────────────────────┘
410 | ↓
411 | ┌─────────────────────────────────────┐
412 | │ All CLI commands use async_client │
413 | │ ├─ async_client checks proxy URL │
414 | │ ├─ If set: HTTP to cloud │
415 | │ └─ If not: Local ASGI │
416 | └─────────────────────────────────────┘
417 | ↓
418 | ┌─────────────────────────────────────┐
419 | │ bm project add "work" │
420 | │ bm tool write-note ... │
421 | │ bm sync (triggers bisync) │
422 | │ → All work against cloud │
423 | └─────────────────────────────────────┘
424 | ```
425 |
426 | **Storage Hierarchy:**
427 | ```
428 | Cloud Container: Bucket: Local Sync Dir:
429 | /app/data/ (mounted) ←→ production-tenant-{id}/ ←→ ~/basic-memory-cloud-sync/
430 | ├── basic-memory/ ├── basic-memory/ ├── basic-memory/
431 | │ ├── notes/ │ ├── notes/ │ ├── notes/
432 | │ └── concepts/ │ └── concepts/ │ └── concepts/
433 | ├── work-project/ ├── work-project/ ├── work-project/
434 | │ └── tasks/ │ └── tasks/ │ └── tasks/
435 | └── personal/ └── personal/ └── personal/
436 | └── journal/ └── journal/ └── journal/
437 |
438 | Bidirectional sync via rclone bisync
439 | ```
440 |
441 | ### Sync Flow
442 |
443 | **`bm sync` execution (in cloud mode):**
444 |
445 | 1. **Check cloud mode**
446 | ```python
447 | if not config.cloud_mode:
448 | # Run normal local file sync
449 | run_local_sync()
450 | return
451 |
452 | # Cloud mode: Run bisync
453 | ```
454 |
455 | 2. **Fetch cloud projects**
456 | ```python
457 | # GET /proxy/projects/projects (via async_client)
458 | cloud_projects = fetch_cloud_projects()
459 | cloud_project_names = {p["name"] for p in cloud_projects["projects"]}
460 | ```
461 |
462 | 3. **Scan local sync directory**
463 | ```python
464 |    sync_dir = Path(config.bisync_config["sync_dir"]).expanduser()  # ~/basic-memory-cloud-sync
465 |    local_dirs = [d.name for d in sync_dir.iterdir()
466 |                  if d.is_dir() and not d.name.startswith('.')]
467 | ```
468 |
469 | 4. **Create missing cloud projects**
470 | ```python
471 | for dir_name in local_dirs:
472 | if dir_name not in cloud_project_names:
473 | # POST /proxy/projects/projects (via async_client)
474 | create_cloud_project(name=dir_name)
475 | # Blocks until 201 response
476 | ```
477 |
478 | 5. **Run bisync on bucket root**
479 | ```bash
480 | rclone bisync \
481 | ~/basic-memory-cloud-sync \
482 | basic-memory-{tenant}:{bucket} \
483 | --filters-file ~/.basic-memory/.bmignore.rclone \
484 | --conflict-resolve=newer \
485 | --max-delete=25
486 | # Syncs ALL project subdirectories bidirectionally
487 | ```
488 |
489 | 6. **Notify cloud to refresh** (commit d48b1dc)
490 | ```python
491 | # After rclone bisync completes, sync each project's database
492 |    for project in cloud_projects["projects"]:
493 |        # POST /{project}/project/sync (via async_client)
494 |        # Triggers background sync for this project
495 |        await run_sync(project=project["name"])
496 | ```
497 |
498 | ### Key Changes
499 |
500 | **1. Cloud Mode via Config**
501 |
502 | **Config changes:**
503 | ```python
504 | class Config:
505 | cloud_mode: bool = False
506 | cloud_host: str = "https://cloud.basicmemory.com"
507 | bisync_config: dict = {
508 | "profile": "balanced",
509 | "sync_dir": "~/basic-memory-cloud-sync"
510 | }
511 | ```
512 |
513 | **async_client.py behavior:**
514 | ```python
515 | def create_client() -> AsyncClient:
516 | # Check config first, then environment
517 | config = ConfigManager().config
518 | proxy_url = os.getenv("BASIC_MEMORY_PROXY_URL") or \
519 | (config.cloud_host if config.cloud_mode else None)
520 |
521 | if proxy_url:
522 | return AsyncClient(base_url=proxy_url) # HTTP to cloud
523 | else:
524 | return AsyncClient(transport=ASGITransport(...)) # Local ASGI
525 | ```
526 |
527 | **2. Login/Logout Sets Cloud Mode**
528 |
529 | ```python
530 | # bm cloud login
531 | async def login():
532 | # Existing OAuth flow...
533 | success = await auth.login()
534 | if success:
535 | config.cloud_mode = True
536 | config.save()
537 | os.environ["BASIC_MEMORY_PROXY_URL"] = config.cloud_host
538 | ```
539 |
540 | ```python
541 | # bm cloud logout
542 | def logout():
543 | config.cloud_mode = False
544 | config.save()
545 | os.environ.pop("BASIC_MEMORY_PROXY_URL", None)
546 | ```
547 |
548 | **3. Remove Duplicate Commands**
549 |
550 | **Delete:**
551 | - `bm cloud project list` → use `bm project list`
552 | - `bm cloud project add` → use `bm project add`
553 |
554 | **Keep:**
555 | - `bm cloud login` - Enable cloud mode
556 | - `bm cloud logout` - Disable cloud mode
557 | - `bm cloud status` - Show current mode & connection
558 | - `bm cloud setup` - Initial bisync setup
559 | - `bm cloud bisync` - Power-user command with full options
560 | - `bm cloud check` - Verify file integrity between local and cloud
561 |
562 | **4. Sync Command Dual Mode**
563 |
564 | ```python
565 | # bm sync
566 | def sync_command(watch: bool = False, profile: str = "balanced"):
567 | config = ConfigManager().config
568 |
569 | if config.cloud_mode:
570 | # Run bisync for cloud sync
571 | run_bisync(profile=profile, watch=watch)
572 | else:
573 | # Run local file sync
574 | run_local_sync()
575 | ```
576 |
577 | **5. Remove RCLONE_TEST Files**
578 |
579 | ```python
580 | # All profiles: check_access=False
581 | BISYNC_PROFILES = {
582 | "safe": RcloneBisyncProfile(check_access=False, max_delete=10),
583 | "balanced": RcloneBisyncProfile(check_access=False, max_delete=25),
584 | "fast": RcloneBisyncProfile(check_access=False, max_delete=50),
585 | }
586 | ```
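
`RcloneBisyncProfile` can be modeled as a small frozen dataclass that translates into rclone flags (a sketch; the real profile likely carries additional settings such as conflict policy):

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class RcloneBisyncProfile:
    check_access: bool
    max_delete: int

    def to_flags(self) -> List[str]:
        flags = [f"--max-delete={self.max_delete}"]
        if self.check_access:
            flags.append("--check-access")  # would require RCLONE_TEST markers
        return flags
```

With `check_access=False` everywhere, no profile ever asks rclone for RCLONE_TEST markers.
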
587 |
588 | **6. Sync Bucket Root (All Projects)**
589 |
590 | ```python
591 | # Sync entire bucket, not subdirectory
592 | rclone_remote = f"basic-memory-{tenant_id}:{bucket_name}"
593 | ```
594 |
595 | ## How to Evaluate
596 |
597 | ### Test Scenarios
598 |
599 | **1. Cloud Mode Toggle**
600 | ```bash
601 | # Start in local mode
602 | bm project list
603 | # → Shows local projects
604 |
605 | # Enable cloud mode
606 | bm cloud login
607 | # → Authenticates, sets cloud_mode=true
608 |
609 | bm project list
610 | # → Now shows cloud projects (same command!)
611 |
612 | # Disable cloud mode
613 | bm cloud logout
614 |
615 | bm project list
616 | # → Back to local projects
617 | ```
618 |
619 | **Expected:** ✅ Single command works in both modes
620 |
621 | **2. Local-First Project Creation (Cloud Mode)**
622 | ```bash
623 | # Enable cloud mode
624 | bm cloud login
625 |
626 | # Create new project locally in sync dir
627 | mkdir ~/basic-memory-cloud-sync/my-research
628 | echo "# Research Notes" > ~/basic-memory-cloud-sync/my-research/index.md
629 |
630 | # Run sync (triggers bisync in cloud mode)
631 | bm sync
632 |
633 | # Verify:
634 | # - Cloud project created automatically via API
635 | # - Files synced to bucket:/my-research/
636 | # - Cloud database updated
637 | # - `bm project list` shows new project
638 | ```
639 |
640 | **Expected:** ✅ Project visible in cloud project list
641 |
642 | **3. Cloud-First Project Creation**
643 | ```bash
644 | # In cloud mode
645 | bm project add "work-notes"
646 | # → Creates project on cloud (via async_client HTTP)
647 |
648 | # Run sync
649 | bm sync
650 |
651 | # Verify:
652 | # - Local directory ~/basic-memory-cloud-sync/work-notes/ created
653 | # - Files sync bidirectionally
654 | # - Can use `bm tool write-note` to add content remotely
655 | ```
656 |
657 | **Expected:** ✅ Project accessible via all CLI commands
658 |
659 | **4. Multi-Project Bidirectional Sync**
660 | ```bash
661 | # Setup: 3 projects in cloud mode
662 | # Modify files in all 3 locally and remotely
663 |
664 | bm sync
665 |
666 | # Verify:
667 | # - All 3 projects sync simultaneously
668 | # - Changes propagate correctly
669 | # - No cross-project interference
670 | ```
671 |
672 | **Expected:** ✅ All projects in sync state
673 |
674 | **5. MCP Tools Work in Cloud Mode**
675 | ```bash
676 | # In cloud mode
677 | bm tool write-note \
678 | --title "Meeting Notes" \
679 | --folder "work-notes" \
680 | --content "Discussion points..."
681 |
682 | # Verify:
683 | # - Note created on cloud (via async_client HTTP)
684 | # - Next `bm sync` pulls note to local
685 | # - Note appears in ~/basic-memory-cloud-sync/work-notes/
686 | ```
687 |
688 | **Expected:** ✅ Tools work transparently in cloud mode
689 |
690 | **6. Watch Mode Continuous Sync**
691 | ```bash
692 | # In cloud mode
693 | bm sync --watch
694 |
695 | # While running:
696 | # - Create local folder → auto-creates cloud project
697 | # - Edit files locally → syncs to cloud
698 | # - Edit files remotely → syncs to local
699 | # - Create project via API → appears locally
700 |
701 | # Verify:
702 | # - Continuous bidirectional sync
703 | # - New projects handled automatically
704 | # - No manual intervention needed
705 | ```
706 |
707 | **Expected:** ✅ Seamless continuous sync
708 |
709 | **7. Safety Profile Protection**
710 | ```bash
711 | # Create project with 15 files locally
712 | # Delete project from cloud (simulate error)
713 |
714 | bm sync --profile safe
715 |
716 | # Verify:
717 | # - Bisync detects 15 pending deletions
718 | # - Exceeds max_delete=10 limit
719 | # - Aborts with clear error
720 | # - No files deleted locally
721 | ```
722 |
723 | **Expected:** ✅ Safety limit prevents data loss
724 |
725 | **8. No RCLONE_TEST Files**
726 | ```bash
727 | # After setup and multiple syncs
728 | ls -la ~/basic-memory-cloud-sync/
729 |
730 | # Verify:
731 | # - No RCLONE_TEST files
732 | # - No rclone state files (state lives in ~/.basic-memory/bisync-state/)
733 | # - Clean directory structure
734 | ```
735 |
736 | **Expected:** ✅ User directory stays clean
737 |
755 | ## Notes
756 |
757 | ### API Contract
758 |
759 | **Cloud must provide:**
760 |
761 | 1. **Project Management APIs:**
762 | - `GET /proxy/projects/projects` - List all projects
763 | - `POST /proxy/projects/projects` - Create project synchronously
764 | - `POST /proxy/sync` - Trigger cache refresh
765 |
766 | 2. **Project Discovery Service (Background):**
767 | - **Purpose**: Auto-register projects created via mount, direct bucket uploads, or any non-API method
768 | - **Interval**: Every 2 minutes
769 | - **Behavior**:
770 | - Scan `/app/data/` for directories
771 | - Register any directory not already in project database
772 | - Log discovery events
773 | - **Implementation**:
774 | ```python
775 | class ProjectDiscoveryService:
776 | """Background service to auto-discover projects from filesystem."""
777 |
778 | async def run(self):
779 | """Scan /app/data/ and register new project directories."""
780 | data_path = Path("/app/data")
781 |
782 | for dir_path in data_path.iterdir():
783 | # Skip hidden and special directories
784 | if not dir_path.is_dir() or dir_path.name.startswith('.'):
785 | continue
786 |
787 | project_name = dir_path.name
788 |
789 | # Check if project already registered
790 | project = await self.project_repo.get_by_name(project_name)
791 | if not project:
792 | # Auto-register new project
793 | await self.project_repo.create(
794 | name=project_name,
795 | path=str(dir_path)
796 | )
797 | logger.info(f"Auto-discovered project: {project_name}")
798 | ```
799 |
800 | **Project Creation (API-based):**
801 | - API creates `/app/data/{project-name}/` directory
802 | - Registers project in database
803 | - Returns 201 with project details
804 | - Directory ready for bisync immediately
805 |
806 | **Project Creation (Discovery-based):**
807 | - User creates folder via mount: `~/basic-memory-cloud/new-project/`
808 | - Files appear in `/app/data/new-project/` (mounted bucket)
809 | - Discovery service finds directory on next scan (within 2 minutes)
810 | - Auto-registers as project
811 | - User sees project in `bm project list` after discovery
812 |
813 | **Why Both Methods:**
814 | - **API**: Immediate registration when using bisync (client-side scan + API call)
815 | - **Discovery**: Delayed registration when using mount (no API call hook)
816 | - **Result**: Projects created ANY way (API, mount, bisync, WebDAV) eventually registered
817 | - **Trade-off**: 2-minute delay for mount-created projects is acceptable
818 |
819 | ### Mount vs Bisync Directory Isolation
820 |
821 | **Critical Safety Requirement**: Mount and bisync MUST use different directories to prevent conflicts.
822 |
823 | **The Dropbox Model Applied:**
824 |
825 | Both mount and bisync operate at **bucket level** (all projects), following the Dropbox/iCloud paradigm:
826 |
827 | ```
828 | ~/basic-memory-cloud/ # Mount: Read-through cache (like Dropbox folder)
829 | ├── work-notes/
830 | ├── personal/
831 | └── research/
832 |
833 | ~/basic-memory-cloud-sync/ # Bisync: Bidirectional sync (like Dropbox sync folder)
834 | ├── work-notes/
835 | ├── personal/
836 | └── research/
837 | ```
838 |
839 | **Mount Directory (Fixed):**
840 | ```bash
841 | # Fixed location, not configurable
842 | ~/basic-memory-cloud/
843 | ```
844 | - **Scope**: Entire bucket (all projects)
845 | - **Method**: NFS mount via `rclone nfsmount`
846 | - **Behavior**: Read-through cache to cloud bucket
847 | - **Credentials**: One IAM credential set per tenant
848 | - **Process**: One rclone mount process
849 | - **Use Case**: Quick access, browsing, light editing
850 | - **Known Issue**: Obsidian compatibility problems with NFS
851 | - **Not Configurable**: Fixed location prevents user error
852 |
853 | **Why Fixed Location:**
854 | - One mount point per machine (like `/Users/you/Dropbox`)
855 | - Prevents credential proliferation (one credential set, not N)
856 | - Prevents multiple mount processes (resource efficiency)
857 | - Familiar pattern users already understand
858 | - Simple operations: `mount` once, `unmount` once
859 |
860 | **Bisync Directory (User Configurable):**
861 | ```bash
862 | # Default location
863 | ~/basic-memory-cloud-sync/
864 |
865 | # User can override
866 | bm cloud setup --dir ~/my-knowledge-base
867 | ```
868 | - **Scope**: Entire bucket (all projects)
869 | - **Method**: Bidirectional sync via `rclone bisync`
870 | - **Behavior**: Full local copy with periodic sync
871 | - **Credentials**: Same IAM credential set as mount
872 | - **Use Case**: Full offline access, reliable editing, Obsidian support
873 | - **Configurable**: Users may want specific locations (external drive, existing folder structure)
874 |
875 | **Why User Configurable:**
876 | - Users have preferences for where local copies live
877 | - May want sync folder on external drive
878 | - May want to integrate with existing folder structure
879 | - Default works for most, option available for power users
880 |
881 | **Conflict Prevention:**
```python
import subprocess
from pathlib import Path

# BisyncError is the CLI's sync error type, defined elsewhere in basic-memory.

def validate_bisync_directory(bisync_dir: Path) -> None:
    """Ensure bisync directory doesn't conflict with mount."""
    mount_dir = Path.home() / "basic-memory-cloud"

    if bisync_dir.resolve() == mount_dir.resolve():
        raise BisyncError(
            f"Cannot use {bisync_dir} for bisync - it's the mount directory!\n"
            f"Mount and bisync must use different directories.\n\n"
            f"Options:\n"
            f"  1. Use default: ~/basic-memory-cloud-sync/\n"
            f"  2. Specify different directory: --dir ~/my-sync-folder"
        )

    # Check if an rclone mount is active at this location
    result = subprocess.run(["mount"], capture_output=True, text=True)
    if str(bisync_dir) in result.stdout and "rclone" in result.stdout:
        raise BisyncError(
            f"{bisync_dir} is currently mounted via 'bm cloud mount'\n"
            f"Cannot use mounted directory for bisync.\n\n"
            f"Either:\n"
            f"  1. Unmount first: bm cloud unmount\n"
            f"  2. Use different directory for bisync"
        )
```
907 |
908 | **Why This Matters:**
909 | - Mounting and syncing the SAME directory would create infinite loops
910 | - rclone mount → bisync detects changes → syncs to bucket → mount sees changes → triggers bisync → ∞
911 | - Separate directories = clean separation of concerns
912 | - Mount is read-heavy caching layer, bisync is write-heavy bidirectional sync
913 |
914 | ### Future Enhancements
915 |
916 | **Phase 2 (Not in this spec):**
917 | - **Near Real-Time Sync**: Integrate `watch_service.py` with cloud mode
918 | - Watch service detects local changes (already battle-tested)
919 | - Queue changes in memory
920 | - Use `rclone copy` for individual file sync (near instant)
921 | - Example: `rclone copyto ~/sync/project/file.md tenant:{bucket}/project/file.md`
922 | - Fallback to full `rclone bisync` every N seconds for bidirectional changes
923 | - Provides near real-time sync without polling overhead
924 | - Per-project bisync profiles (different safety levels per project)
925 | - Selective project sync (exclude specific projects from sync)
926 | - Project deletion workflow (cascade to cloud/local)
927 | - Conflict resolution UI/CLI
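
The near real-time sync idea in the first Phase 2 bullet could be sketched as follows; the queue handling and remote naming are assumptions, and only the `rclone copyto` subcommand named above is used.

```python
import subprocess

def copyto_cmd(sync_dir: str, bucket: str, rel_path: str) -> list[str]:
    """Build the per-file `rclone copyto` command for one changed file."""
    return ["rclone", "copyto", f"{sync_dir}/{rel_path}", f"tenant:{bucket}/{rel_path}"]

def drain_changes(changes: list[str], sync_dir: str, bucket: str) -> None:
    """Push queued changes one file at a time (near instant, one-way)."""
    while changes:
        rel = changes.pop(0)
        subprocess.run(copyto_cmd(sync_dir, bucket, rel), check=True)
```

A periodic full `rclone bisync` would still run as the fallback to pick up changes made on the cloud side.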
928 |
929 | **Phase 3:**
930 | - Project sharing between tenants
931 | - Incremental backup/restore
932 | - Sync statistics and bandwidth monitoring
933 | - Mobile app integration with cloud mode
934 |
935 | ### Related Specs
936 |
937 | - **SPEC-8**: TigrisFS Integration - Original bisync implementation
938 | - **SPEC-6**: Explicit Project Parameter Architecture - Multi-project foundations
939 | - **SPEC-5**: CLI Cloud Upload via WebDAV - Cloud file operations
940 |
941 | ### Implementation Notes
942 |
943 | **Architectural Simplifications:**
944 | - **Unified CLI**: Eliminated duplicate commands by using mode toggle
945 | - **Single Entry Point**: All commands route through `async_client` which handles mode
946 | - **Config-Driven**: Cloud mode stored in persistent config, not just environment
947 | - **Transparent Routing**: Existing commands work without modification in cloud mode
948 |
949 | **Complexity Trade-offs:**
950 | - Removed: Separate `bm cloud project` command namespace
951 | - Removed: Complex state detection for new projects
952 | - Removed: RCLONE_TEST marker file management
953 | - Added: Simple cloud_mode flag and config integration
954 | - Added: Simple project list comparison before sync
955 | - Relied on: Existing bisync profile safety mechanisms
956 | - Result: Significantly simpler, more maintainable code
957 |
958 | **User Experience:**
959 | - **Mental Model**: "Toggle cloud mode, use normal commands"
960 | - **No Learning Curve**: Same commands work locally and in cloud
961 | - **Minimal Config**: Just login/logout to switch modes
962 | - **Safety**: Profile system gives users control over safety/speed trade-offs
963 | - **"Just Works"**: Create folders anywhere, they sync automatically
964 |
965 | **Migration Path:**
966 | - Existing `bm cloud project` users: Use `bm project` instead
967 | - Existing `bm cloud bisync` becomes `bm sync` in cloud mode
968 | - Config automatically migrates on first `bm cloud login`
969 |
970 |
971 | ## Testing
972 |
973 |
974 | ### Initial Setup (One Time)
975 |
976 | 1. Login to cloud and enable cloud mode:
977 | bm cloud login
978 | # → Authenticates via OAuth
979 | # → Sets cloud_mode=true in config
980 | # → Sets BASIC_MEMORY_PROXY_URL environment variable
981 | # → All CLI commands now route to cloud
982 |
983 | 2. Check cloud mode status:
984 | bm cloud status
985 | # → Shows: Mode: Cloud (enabled)
986 | # → Shows: Host: https://cloud.basicmemory.com
987 | # → Checks cloud health
988 |
989 | 3. Set up bidirectional sync:
990 | bm cloud bisync-setup
991 | # Or with custom directory:
992 | bm cloud bisync-setup --dir ~/my-sync-folder
993 |
994 | # This will:
995 | # → Install rclone (if not already installed)
996 | # → Get tenant info (tenant_id, bucket_name)
997 | # → Generate scoped IAM credentials
998 | # → Configure rclone with credentials
999 | # → Create sync directory (default: ~/basic-memory-cloud-sync/)
1000 | # → Validate no conflict with mount directory
1001 | # → Run initial --resync to establish baseline
1002 |
1003 | ### Normal Usage
1004 |
1005 | 4. Create local project and sync:
1006 | # Create a local project directory
1007 | mkdir ~/basic-memory-cloud-sync/my-research
1008 | echo "# Research Notes" > ~/basic-memory-cloud-sync/my-research/readme.md
1009 |
1010 | # Run sync
1011 | bm cloud bisync
1012 |
1013 | # Auto-magic happens:
1014 | # → Checks for new local directories
1015 | # → Finds "my-research" not in cloud
1016 | # → Creates project on cloud via POST /proxy/projects/projects
1017 | # → Runs bidirectional sync (all projects)
1018 | # → Syncs to bucket root (all projects synced together)
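
The auto-registration scan in step 4 amounts to diffing local top-level folders against the cloud project list. A minimal sketch, with the helper name assumed:

```python
from pathlib import Path

def new_local_projects(sync_dir: Path, cloud_projects: set[str]) -> list[str]:
    """Local top-level folders with no matching cloud project yet."""
    return sorted(
        p.name for p in sync_dir.iterdir()
        if p.is_dir() and not p.name.startswith(".") and p.name not in cloud_projects
    )
```

Each returned name would then be created via the project endpoint before the bisync pass runs.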
1019 |
1020 | 5. Watch mode for continuous sync:
1021 | bm cloud bisync --watch
1022 | # Or with custom interval:
1023 | bm cloud bisync --watch --interval 30
1024 |
1025 | # → Syncs every 60 seconds (or custom interval)
1026 | # → Auto-registers new projects on each run
1027 | # → Press Ctrl+C to stop
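
Watch mode in step 5 is essentially a timed loop around a full sync pass; this sketch adds a `max_runs` escape hatch for testability (the real command runs until Ctrl+C):

```python
import time

def watch_loop(run_sync, interval: float = 60.0, max_runs: int = 0) -> int:
    """Run `run_sync` every `interval` seconds; max_runs=0 means run forever."""
    runs = 0
    while True:
        run_sync()  # one full bisync pass, including project auto-registration
        runs += 1
        if max_runs and runs >= max_runs:
            return runs
        time.sleep(interval)
```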
1028 |
1029 | 6. Check bisync status:
1030 | bm cloud bisync-status
1031 | # → Shows tenant ID
1032 | # → Shows sync directory path
1033 | # → Shows initialization status
1034 | # → Shows last sync time
1035 | # → Lists available profiles (safe/balanced/fast)
1036 |
1037 | 7. Manual sync with different profiles:
1038 | # Safe mode (max 10 deletes, preserves conflicts)
1039 | bm cloud bisync --profile safe
1040 |
1041 | # Balanced mode (max 25 deletes, auto-resolve to newer) - default
1042 | bm cloud bisync --profile balanced
1043 |
1044 | # Fast mode (max 50 deletes, skip verification)
1045 | bm cloud bisync --profile fast
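
The three profiles might map to rclone flags roughly as follows; `--max-delete` and `--conflict-resolve` are real rclone options, but this exact mapping is inferred from the limits described above, not taken from the implementation.

```python
# Illustrative profile table, inferred from the step 7 descriptions.
PROFILES = {
    "safe":     {"max_delete": 10, "conflict_resolve": None},     # preserve both sides
    "balanced": {"max_delete": 25, "conflict_resolve": "newer"},  # default
    "fast":     {"max_delete": 50, "conflict_resolve": "newer"},
}

def bisync_flags(profile: str) -> list[str]:
    """Translate a named profile into rclone bisync safety flags."""
    p = PROFILES[profile]
    flags = [f"--max-delete={p['max_delete']}"]
    if p["conflict_resolve"]:
        flags.append(f"--conflict-resolve={p['conflict_resolve']}")
    return flags
```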
1046 |
1047 | 8. Dry run to preview changes:
1048 | bm cloud bisync --dry-run
1049 | # → Shows what would be synced without making changes
1050 |
1051 | 9. Force resync (if needed):
1052 | bm cloud bisync --resync
1053 | # → Establishes new baseline
1054 | # → Use if sync state is corrupted
1055 |
1056 | 10. Check file integrity:
1057 | bm cloud check
1058 | # → Verifies all files match between local and cloud
1059 | # → Read-only operation (no data transfer)
1060 | # → Shows differences if any found
1061 |
1062 | # Faster one-way check
1063 | bm cloud check --one-way
1064 | # → Only checks for missing files on destination
1065 |
1066 | ### Verify Cloud Mode Integration
1067 |
1068 | 11. Test that all commands work in cloud mode:
1069 | # List cloud projects (not local)
1070 | bm project list
1071 |
1072 | # Create project on cloud
1073 | bm project add "work-notes"
1074 |
1075 | # Use MCP tools against cloud
1076 | bm tool write-note --title "Test" --folder "my-research" --content "Hello"
1077 |
1078 | # All of these work against cloud because cloud_mode=true
1079 |
1080 | 12. Switch back to local mode:
1081 | bm cloud logout
1082 | # → Sets cloud_mode=false
1083 | # → Clears BASIC_MEMORY_PROXY_URL
1084 | # → All commands now work locally again
1085 |
1086 | ### Expected Directory Structure
1087 |
1088 | ~/basic-memory-cloud-sync/ # Your local sync directory
1089 | ├── my-research/ # Auto-created cloud project
1090 | │ ├── readme.md
1091 | │ └── notes.md
1092 | ├── work-notes/ # Another project
1093 | │ └── tasks.md
1094 | └── personal/ # Another project
1095 | └── journal.md
1096 |
1097 | # All sync bidirectionally with:
1098 | bucket:/ # Cloud bucket root
1099 | ├── my-research/
1100 | ├── work-notes/
1101 | └── personal/
1102 |
1103 | ### Key Points to Test
1104 |
1105 | 1. ✅ Cloud mode toggle works (login/logout)
1106 | 2. ✅ Bisync setup validates directory (no conflict with mount)
1107 | 3. ✅ Local directories auto-create cloud projects
1108 | 4. ✅ All projects sync together (bucket root)
1109 | 5. ✅ No RCLONE_TEST files created
1110 | 6. ✅ Changes sync bidirectionally
1111 | 7. ✅ Watch mode continuous sync works
1112 | 8. ✅ Profile safety limits work (max_delete)
1113 | 9. ✅ `bm sync` adapts to cloud mode automatically
1114 | 10. ✅ `bm cloud check` verifies file integrity without side effects
1115 |
```