# Directory Structure
```
├── .dockerignore
├── .DS_Store
├── .gitignore
├── Dockerfile
├── LICENSE
├── package-lock.json
├── package.json
├── readme.md
└── src
├── bear-mcp-server.js
├── create-index.js
├── lib
│ └── explore-database.js
└── utils.js
```
# Files
--------------------------------------------------------------------------------
/.dockerignore:
--------------------------------------------------------------------------------
```
1 | node_modules
2 | npm-debug.log
3 | .DS_Store
4 |
```
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
```
1 | node_modules
2 | .DS_Store
3 |
4 | # Vector index files
5 | src/note_vectors.index
6 | src/note_vectors.json
7 |
```
--------------------------------------------------------------------------------
/readme.md:
--------------------------------------------------------------------------------
```markdown
1 | # Bear Notes MCP Server with RAG
2 |
3 | Looking to supercharge your Bear Notes experience with AI assistants? This little gem connects your personal knowledge base to AI systems using semantic search and RAG (Retrieval-Augmented Generation).
4 |
5 | I built this because I wanted my AI assistants to actually understand what's in my notes, not just perform simple text matching. The result is rather sweet, if I do say so myself.
6 |
7 | ## Getting Started
8 |
9 | Setting up is straightforward:
10 |
11 | ```bash
12 | git clone [your-repo-url]
13 | cd bear-mcp-server
14 | npm install
15 | ```
16 |
17 | Make the scripts executable (because permissions matter):
18 |
19 | ```bash
20 | chmod +x src/bear-mcp-server.js
21 | chmod +x src/create-index.js
22 | ```
23 |
24 | ## First Things First: Index Your Notes
25 |
26 | Before diving in, you'll need to create vector embeddings of your notes:
27 |
28 | ```bash
29 | npm run index
30 | ```
31 |
32 | Fair warning: this might take a few minutes if you're a prolific note-taker like me. It's converting all your notes into mathematical vectors that capture their meaning— clever stuff 😉.
33 |
34 | ## Configuration
35 |
36 | Update your MCP configuration file:
37 |
38 | ```json
39 | {
40 | "mcpServers": {
41 | "bear-notes": {
42 | "command": "node",
43 | "args": [
44 | "/absolute/path/to/bear-mcp-server/src/bear-mcp-server.js"
45 | ],
46 | "env": {
47 | "BEAR_DATABASE_PATH": "/Users/yourusername/Library/Group Containers/9K33E3U3T4.net.shinyfrog.net.bear/Application Data/database.sqlite"
48 | }
49 | }
50 | }
51 | }
52 | ```
53 |
54 | > 🚨 _Remember to replace the path with your actual installation location. No prizes for using the example path verbatim, I'm afraid._
55 |
56 | ## What Makes This Special?
57 |
58 | - **Semantic Search**: Find notes based on meaning, not just keywords. Ask about "productivity systems" and it'll find your notes on GTD and Pomodoro, even if they don't contain those exact words.
59 |
60 | - **RAG Support**: Your AI assistants can now pull in relevant context from your notes, even when you haven't explicitly mentioned them.
61 |
62 | - **All Local Processing**: Everything runs on your machine. No data leaves your computer, no API keys needed, no internet dependency (after initial setup).
63 |
64 | - **Graceful Fallbacks**: If semantic search isn't available for whatever reason, it'll quietly fall back to traditional search. Belt and braces.
65 |
66 | ## How It Works
67 |
68 | ### The Clever Bits
69 |
70 | This server uses the Xenova implementation of transformers.js with the all-MiniLM-L6-v2 model:
71 |
72 | - It creates 384-dimensional vectors that capture the semantic essence of your notes
73 | - All processing happens locally on your machine
74 | - The first startup might be a tad slow while the model loads, but it's zippy after that
75 |
76 | ### The Flow
77 |
78 | 1. Your query gets converted into a vector using the transformer model
79 | 2. This vector is compared to the pre-indexed vectors of your notes
80 | 3. Notes with similar meanings are returned, regardless of exact keyword matches
81 | 4. AI assistants use these relevant notes as context for their responses
82 |
83 | ## Project Structure
84 |
85 | Nothing too complex here:
86 |
87 | ```
88 | bear-mcp-server/
89 | ├── package.json
90 | ├── readme.md
91 | └── src/
92 | ├── bear-mcp-server.js # Main MCP server
93 | ├── create-index.js # Script to index notes
94 | ├── utils.js # Utility functions
95 | ├── lib/ # Additional utilities and diagnostic scripts
96 | │ └── explore-database.js # Database exploration and diagnostic tool
97 | ├── note_vectors.index # Generated vector index (after indexing)
98 | └── note_vectors.json # Note ID mapping (after indexing)
99 | ```
100 |
101 | ## Available Tools for AI Assistants
102 |
103 | AI assistants connecting to this server can use these tools:
104 |
105 | 1. **search_notes**: Find notes that match a query
106 | - Parameters: `query` (required), `limit` (optional, default: 10), `semantic` (optional, default: true)
107 |
108 | 2. **get_note**: Fetch a specific note by its ID
109 | - Parameters: `id` (required)
110 |
111 | 3. **get_tags**: List all tags used in your Bear Notes
112 |
113 | 4. **retrieve_for_rag**: Get notes semantically similar to a query, specifically formatted for RAG
114 | - Parameters: `query` (required), `limit` (optional, default: 5)
115 |
116 | ## Requirements
117 |
118 | - Node.js version 16 or higher
119 | - Bear Notes for macOS
120 | - An MCP-compatible AI assistant client
121 |
122 | ## Limitations & Caveats
123 |
124 | - Read-only access to Bear Notes (we're not modifying your precious notes)
125 | - macOS only (sorry Windows and Linux folks)
126 | - If you add loads of new notes, you'll want to rebuild the index with `npm run index`
127 | - First startup is a bit like waiting for the kettle to boil while the embedding model loads
128 |
129 | ## Troubleshooting
130 |
131 | If things go wonky:
132 |
133 | 1. Double-check your Bear database path
134 | 2. Make sure you've run the indexing process with `npm run index`
135 | 3. Check permissions on the Bear Notes database
136 | 4. Verify the server scripts are executable
137 | 5. Look for error messages in the logs
138 |
139 | When in doubt, try turning it off and on again. Works more often than we'd like to admit.
140 |
141 | ## 🐳 Running with Docker (Optional)
142 |
143 | Prefer containers? You can run everything inside Docker too.
144 |
145 | ### 1. Build the Docker image
146 |
147 | ```bash
148 | docker build -t bear-mcp-server .
149 | ```
150 |
151 | ### 2. Index your notes
152 |
153 | You'll still need to run the indexing step before anything useful happens:
154 |
155 | ```bash
156 | docker run \
157 | -v /path/to/your/NoteDatabase.sqlite:/app/database.sqlite \
158 | -e BEAR_DATABASE_PATH=/app/database.sqlite \
159 | bear-mcp-server \
160 | npm run index
161 | ```
162 |
163 | > 🛠 Replace `/path/to/your/NoteDatabase.sqlite` with the actual path to your Bear database.
164 |
165 | ### 3. Start the server
166 |
167 | Once indexed, fire it up:
168 |
169 | ```bash
170 | docker run \
171 | -v /path/to/your/NoteDatabase.sqlite:/app/database.sqlite \
172 | -e BEAR_DATABASE_PATH=/app/database.sqlite \
173 | -p 8000:8000 \
174 | bear-mcp-server
175 | ```
176 |
177 | Boom—your AI assistant is now running in a container and talking to your notes.
178 |
179 | ## License
180 |
181 | MIT (Feel free to tinker, share, and improve)
```
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
```dockerfile
1 | # Use the official Node.js 16 LTS image as the base
2 | FROM node:16-slim
3 |
4 | # Set the working directory
5 | WORKDIR /app
6 |
7 | # Copy package files and install dependencies
8 | COPY package*.json ./
9 | RUN npm install --production
10 |
11 | # Copy the rest of the application code
12 | COPY . .
13 |
14 | # Make the server script executable
15 | RUN chmod +x src/bear-mcp-server.js
16 |
17 | # Define the default command to run the server
18 | CMD ["node", "src/bear-mcp-server.js"]
19 |
```
--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------
```json
1 | {
2 | "name": "bear-mcp-server",
3 | "version": "1.0.0",
4 | "description": "Model Context Protocol server for Bear Notes with RAG capabilities",
5 | "main": "src/bear-mcp-server.js",
6 | "type": "module",
7 | "scripts": {
8 | "start": "node src/bear-mcp-server.js",
9 | "index": "node src/create-index.js",
10 | "test": "node src/lib/explore-database.js"
11 | },
12 | "dependencies": {
13 | "@modelcontextprotocol/sdk": "latest",
14 | "sqlite3": "latest",
15 | "@xenova/transformers": "^2.15.0",
16 | "faiss-node": "^0.5.1"
17 | },
18 | "engines": {
19 | "node": ">=16.0.0"
20 | }
21 | }
```
--------------------------------------------------------------------------------
/src/create-index.js:
--------------------------------------------------------------------------------
```javascript
1 | #!/usr/bin/env node
2 |
3 | import { getDbPath, createDb, initEmbedder, createEmbedding } from './utils.js';
4 | // Fix for CommonJS module import in ESM
5 | import faissNode from 'faiss-node';
6 | const { IndexFlatL2 } = faissNode;
7 |
8 | import fs from 'fs/promises';
9 | import path from 'path';
10 | import { fileURLToPath } from 'url';
11 |
12 | // Get current file path for ES modules
13 | const __filename = fileURLToPath(import.meta.url);
14 | const __dirname = path.dirname(__filename);
15 |
16 | // Path to save the vector index
17 | const INDEX_PATH = path.join(__dirname, 'note_vectors');
18 |
19 | // Main indexing function
20 | async function createVectorIndex() {
21 | console.log('Starting to create vector index for Bear Notes...');
22 |
23 | // Initialize the embedding model
24 | const modelInitialized = await initEmbedder();
25 | if (!modelInitialized) {
26 | console.error('Failed to initialize embedding model');
27 | process.exit(1);
28 | }
29 |
30 | // Connect to the database
31 | const dbPath = getDbPath();
32 | const db = createDb(dbPath);
33 |
34 | try {
35 | // Get all non-trashed notes
36 | const notes = await db.allAsync(`
37 | SELECT
38 | ZUNIQUEIDENTIFIER as id,
39 | ZTITLE as title,
40 | ZTEXT as content
41 | FROM ZSFNOTE
42 | WHERE ZTRASHED = 0
43 | `);
44 |
45 | console.log(`Found ${notes.length} notes to index`);
46 |
47 | // Create vectors for all notes
48 | const noteIds = [];
49 | const dimension = 384; // Dimension of the all-MiniLM-L6-v2 model
50 |
51 | // Create FAISS index
52 | const index = new IndexFlatL2(dimension);
53 |
54 | // Process notes in batches to avoid memory issues
55 | for (let i = 0; i < notes.length; i++) {
56 | const note = notes[i];
57 |
58 | // Create a combined text for embedding
59 | const textToEmbed = `${note.title}\n${note.content || ''}`.trim();
60 |
61 | if (textToEmbed) {
62 | try {
63 | // Create embedding for the note
64 | const embedding = await createEmbedding(textToEmbed);
65 |
66 | // Add to index
67 | index.add(embedding);
68 |
69 | // Store note ID
70 | noteIds.push(note.id);
71 |
72 | if ((i + 1) % 50 === 0 || i === notes.length - 1) {
73 | console.log(`Indexed ${i + 1} of ${notes.length} notes`);
74 | }
75 | } catch (error) {
76 | console.error(`Error embedding note ${note.id}:`, error.message);
77 | }
78 | }
79 | }
80 |
81 | console.log(`Successfully created embeddings for ${noteIds.length} notes`);
82 |
83 | // Create mapping from index positions to note IDs
84 | const noteIdMap = {};
85 | for (let i = 0; i < noteIds.length; i++) {
86 | noteIdMap[i] = noteIds[i];
87 | }
88 |
89 | // Save the index and mapping
90 | index.write(`${INDEX_PATH}.index`);
91 | await fs.writeFile(`${INDEX_PATH}.json`, JSON.stringify(noteIdMap));
92 |
93 | console.log(`Vector index saved to ${INDEX_PATH}`);
94 | } catch (error) {
95 | console.error('Error creating vector index:', error);
96 | } finally {
97 | // Close the database connection
98 | db.close();
99 | }
100 | }
101 |
102 | // Run the indexing
103 | createVectorIndex().then(() => {
104 | console.log('Indexing complete');
105 | process.exit(0);
106 | }).catch(error => {
107 | console.error('Indexing failed:', error);
108 | process.exit(1);
109 | });
```
--------------------------------------------------------------------------------
/src/bear-mcp-server.js:
--------------------------------------------------------------------------------
```javascript
1 | #!/usr/bin/env node
2 |
3 | import { Server } from '@modelcontextprotocol/sdk/server/index.js';
4 | import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
5 | import {
6 | CallToolRequestSchema,
7 | ErrorCode,
8 | ListToolsRequestSchema,
9 | McpError,
10 | } from '@modelcontextprotocol/sdk/types.js';
11 | import {
12 | getDbPath,
13 | createDb,
14 | searchNotes,
15 | retrieveNote,
16 | getAllTags,
17 | loadVectorIndex,
18 | initEmbedder,
19 | retrieveForRAG
20 | } from './utils.js';
21 |
22 | // Initialize dependencies
23 | async function initialize() {
24 | console.error('Initializing Bear Notes MCP server...');
25 |
26 | // Initialize database connection
27 | const dbPath = getDbPath();
28 | const db = createDb(dbPath);
29 |
30 | // Initialize embedding model
31 | const modelInitialized = await initEmbedder();
32 | if (!modelInitialized) {
33 | console.error('Warning: Embedding model initialization failed, semantic search will not be available');
34 | }
35 |
36 | // Load vector index
37 | const indexLoaded = await loadVectorIndex();
38 | if (!indexLoaded) {
39 | console.error('Warning: Vector index not found, semantic search will not be available');
40 | console.error('Run "npm run index" to create the vector index');
41 | }
42 |
43 | return { db, hasSemanticSearch: modelInitialized && indexLoaded };
44 | }
45 |
46 | // Main function
47 | async function main() {
48 | // Initialize components
49 | const { db, hasSemanticSearch } = await initialize();
50 |
51 | // Create MCP server
52 | const server = new Server(
53 | {
54 | name: 'bear-notes',
55 | version: '1.0.0',
56 | },
57 | {
58 | capabilities: {
59 | tools: {},
60 | }
61 | }
62 | );
63 |
64 | // Register the list tools handler
65 | server.setRequestHandler(ListToolsRequestSchema, async () => {
66 | const tools = [
67 | {
68 | name: 'search_notes',
69 | description: 'Search for notes in Bear that match a query',
70 | inputSchema: {
71 | type: 'object',
72 | properties: {
73 | query: {
74 | type: 'string',
75 | description: 'Search query to find matching notes',
76 | },
77 | limit: {
78 | type: 'number',
79 | description: 'Maximum number of results to return (default: 10)',
80 | },
81 | semantic: {
82 | type: 'boolean',
83 | description: 'Use semantic search instead of keyword search (default: true)',
84 | }
85 | },
86 | required: ['query'],
87 | },
88 | },
89 | {
90 | name: 'get_note',
91 | description: 'Retrieve a specific note by its ID',
92 | inputSchema: {
93 | type: 'object',
94 | properties: {
95 | id: {
96 | type: 'string',
97 | description: 'Unique identifier of the note to retrieve',
98 | },
99 | },
100 | required: ['id'],
101 | },
102 | },
103 | {
104 | name: 'get_tags',
105 | description: 'Get all tags used in Bear Notes',
106 | inputSchema: {
107 | type: 'object',
108 | properties: {},
109 | },
110 | }
111 | ];
112 |
113 | // Add RAG tool if semantic search is available
114 | if (hasSemanticSearch) {
115 | tools.push({
116 | name: 'retrieve_for_rag',
117 | description: 'Retrieve notes that are semantically similar to a query for RAG',
118 | inputSchema: {
119 | type: 'object',
120 | properties: {
121 | query: {
122 | type: 'string',
123 | description: 'Query for which to find relevant notes',
124 | },
125 | limit: {
126 | type: 'number',
127 | description: 'Maximum number of notes to retrieve (default: 5)',
128 | },
129 | },
130 | required: ['query'],
131 | },
132 | });
133 | }
134 |
135 | return { tools };
136 | });
137 |
138 | // Register the call tool handler
139 | server.setRequestHandler(CallToolRequestSchema, async (request) => {
140 | if (request.params.name === 'search_notes') {
141 | const { query, limit = 10, semantic = true } = request.params.arguments;
142 | const useSemanticSearch = semantic && hasSemanticSearch;
143 |
144 | try {
145 | const notes = await searchNotes(db, query, limit, useSemanticSearch);
146 | return {
147 | toolResult: {
148 | notes,
149 | searchMethod: useSemanticSearch ? 'semantic' : 'keyword'
150 | }
151 | };
152 | } catch (error) {
153 | return {
154 | toolResult: {
155 | error: `Search failed: ${error.message}`,
156 | searchMethod: 'keyword',
157 | notes: []
158 | }
159 | };
160 | }
161 | }
162 |
163 | if (request.params.name === 'get_note') {
164 | const { id } = request.params.arguments;
165 | try {
166 | const note = await retrieveNote(db, id);
167 | return { toolResult: { note } };
168 | } catch (error) {
169 | return { toolResult: { error: error.message } };
170 | }
171 | }
172 |
173 | if (request.params.name === 'get_tags') {
174 | try {
175 | const tags = await getAllTags(db);
176 | return { toolResult: { tags } };
177 | } catch (error) {
178 | return { toolResult: { error: error.message } };
179 | }
180 | }
181 |
182 | if (request.params.name === 'retrieve_for_rag' && hasSemanticSearch) {
183 | const { query, limit = 5 } = request.params.arguments;
184 | try {
185 | const context = await retrieveForRAG(db, query, limit);
186 | return {
187 | toolResult: {
188 | context,
189 | query
190 | }
191 | };
192 | } catch (error) {
193 | return {
194 | toolResult: {
195 | error: `RAG retrieval failed: ${error.message}`,
196 | context: []
197 | }
198 | };
199 | }
200 | }
201 |
202 | throw new McpError(ErrorCode.MethodNotFound, 'Tool not found');
203 | });
204 |
205 | // Use stdio transport instead of HTTP
206 | const transport = new StdioServerTransport();
207 |
208 | // Start the server with stdio transport
209 | await server.connect(transport);
210 |
211 | // Handle process termination
212 | ['SIGINT', 'SIGTERM', 'SIGHUP'].forEach(signal => {
213 | process.on(signal, () => {
214 | console.error(`Received ${signal}, shutting down Bear Notes MCP server...`);
215 | db.close(() => {
216 | console.error('Database connection closed.');
217 | process.exit(0);
218 | });
219 | });
220 | });
221 |
222 | // Important: Log to stderr for debugging, not stdout
223 | console.error('Bear Notes MCP server ready');
224 | }
225 |
226 | // Run the main function
227 | main().catch(error => {
228 | console.error('Server error:', error);
229 | process.exit(1);
230 | });
```
--------------------------------------------------------------------------------
/src/lib/explore-database.js:
--------------------------------------------------------------------------------
```javascript
1 | #!/usr/bin/env node
2 |
3 | import sqlite3 from 'sqlite3';
4 | import { promisify } from 'util';
5 | import path from 'path';
6 | import os from 'os';
7 |
8 | // Default path to Bear's database
9 | const defaultDBPath = path.join(
10 | os.homedir(),
11 | 'Library/Group Containers/9K33E3U3T4.net.shinyfrog.bear/Application Data/database.sqlite'
12 | );
13 |
14 | // Get the database path from environment variable or use default
15 | const dbPath = process.env.BEAR_DATABASE_PATH || defaultDBPath;
16 |
17 | console.log(`Examining Bear database at: ${dbPath}`);
18 |
19 | // Connect to the database
20 | const db = new sqlite3.Database(dbPath, sqlite3.OPEN_READONLY, (err) => {
21 | if (err) {
22 | console.error('Error connecting to Bear database:', err.message);
23 | process.exit(1);
24 | }
25 | console.log('Connected to Bear Notes database successfully');
26 | });
27 |
28 | // Promisify database methods
29 | db.allAsync = promisify(db.all).bind(db);
30 | db.getAsync = promisify(db.get).bind(db);
31 |
32 | async function examineDatabase() {
33 | try {
34 | // List all tables in the database
35 | const tables = await db.allAsync(`
36 | SELECT name FROM sqlite_master
37 | WHERE type='table'
38 | ORDER BY name;
39 | `);
40 |
41 | console.log('\n--- All Tables in Bear Database ---');
42 | tables.forEach(table => console.log(table.name));
43 |
44 | // Find tables related to tags
45 | const tagTables = tables.filter(table =>
46 | table.name.toLowerCase().includes('tag') ||
47 | table.name.toLowerCase().includes('z_')
48 | );
49 |
50 | console.log('\n--- Potential Tag-Related Tables ---');
51 | tagTables.forEach(table => console.log(table.name));
52 |
53 | // Detect Z_* junction tables which often connect many-to-many relationships
54 | const junctionTables = tables.filter(table =>
55 | table.name.startsWith('Z_') &&
56 | !table.name.includes('FTS')
57 | );
58 |
59 | console.log('\n--- Junction Tables (Z_*) ---');
60 | junctionTables.forEach(table => console.log(table.name));
61 |
62 | // Get schema for each tag-related table
63 | console.log('\n--- Schema Details for Tag-Related Tables ---');
64 | for (const table of tagTables) {
65 | const schema = await db.allAsync(`PRAGMA table_info(${table.name})`);
66 | console.log(`\nTable: ${table.name}`);
67 | schema.forEach(col => {
68 | console.log(` - ${col.name} (${col.type})`);
69 | });
70 | }
71 |
72 | // Check if Z_7TAGS exists and suggest alternatives
73 | const hasZ7Tags = tables.some(table => table.name === 'Z_7TAGS');
74 | if (!hasZ7Tags) {
75 | console.log('\n--- Z_7TAGS Table Not Found ---');
76 |
77 | // Look for possible alternative junction tables between notes and tags
78 | console.log('\nPossible alternatives for note-tag relationships:');
79 | for (const table of junctionTables) {
80 | try {
81 | // Get the first few rows to sample the data
82 | const sampleData = await db.allAsync(`SELECT * FROM ${table.name} LIMIT 5`);
83 | if (sampleData && sampleData.length > 0) {
84 | console.log(`\nTable ${table.name} contents (sample):`);
85 | console.log(JSON.stringify(sampleData, null, 2));
86 | }
87 | } catch (error) {
88 | console.error(`Error reading from ${table.name}:`, error.message);
89 | }
90 | }
91 |
92 | // Look specifically at the ZSFNOTETAG table structure and contents
93 | if (tables.some(table => table.name === 'ZSFNOTETAG')) {
94 | try {
95 | console.log('\nExamining ZSFNOTETAG table structure:');
96 | const noteTagSchema = await db.allAsync(`PRAGMA table_info(ZSFNOTETAG)`);
97 | noteTagSchema.forEach(col => {
98 | console.log(` - ${col.name} (${col.type})`);
99 | });
100 |
101 | // Sample some data from the note tag table
102 | const noteTagSample = await db.allAsync(`SELECT * FROM ZSFNOTETAG LIMIT 5`);
103 | console.log('\nZSFNOTETAG sample data:');
104 | console.log(JSON.stringify(noteTagSample, null, 2));
105 | } catch (error) {
106 | console.error('Error examining ZSFNOTETAG:', error.message);
107 | }
108 | }
109 |
110 | // Look for ZSFNOTE structure to understand how notes are stored
111 | if (tables.some(table => table.name === 'ZSFNOTE')) {
112 | try {
113 | console.log('\nExamining ZSFNOTE table structure:');
114 | const noteSchema = await db.allAsync(`PRAGMA table_info(ZSFNOTE)`);
115 | noteSchema.forEach(col => {
116 | console.log(` - ${col.name} (${col.type})`);
117 | });
118 | } catch (error) {
119 | console.error('Error examining ZSFNOTE:', error.message);
120 | }
121 | }
122 | }
123 |
124 | // Try actual query used in the code to see what error it produces
125 | try {
126 | console.log('\n--- Testing the Problematic Query ---');
127 | // Get a sample note ID first
128 | const sampleNote = await db.getAsync(`
129 | SELECT ZUNIQUEIDENTIFIER as id FROM ZSFNOTE LIMIT 1
130 | `);
131 |
132 | if (sampleNote) {
133 | try {
134 | const tags = await db.allAsync(`
135 | SELECT ZT.ZTITLE as tag_name
136 | FROM Z_5TAGS ZNT
137 | JOIN ZSFNOTETAG ZT ON ZT.Z_PK = ZNT.Z_13TAGS
138 | JOIN ZSFNOTE ZN ON ZN.Z_PK = ZNT.Z_5NOTES
139 | WHERE ZN.ZUNIQUEIDENTIFIER = ?
140 | `, [sampleNote.id]);
141 |
142 | console.log('Query succeeded with results:', tags);
143 | } catch (error) {
144 | console.error('The problematic query failed with error:', error.message);
145 |
146 | // Try to identify the correct join pattern
147 | console.log('\nAttempting to find the correct table relationship...');
148 |
149 | for (const jTable of junctionTables) {
150 | // Skip large tables for performance reasons
151 | const count = await db.getAsync(`SELECT COUNT(*) as count FROM ${jTable.name}`);
152 | if (count.count > 1000) {
153 | console.log(`Skipping large table ${jTable.name} with ${count.count} rows`);
154 | continue;
155 | }
156 |
157 | const schema = await db.allAsync(`PRAGMA table_info(${jTable.name})`);
158 | const columns = schema.map(col => col.name);
159 |
160 | // Look for columns that might connect to notes and tags
161 | const noteCols = columns.filter(col => col.includes('NOTE') || col.includes('NOTES'));
162 | const tagCols = columns.filter(col => col.includes('TAG') || col.includes('TAGS'));
163 |
164 | if (noteCols.length > 0 && tagCols.length > 0) {
165 | console.log(`\nPotential junction table: ${jTable.name}`);
166 | console.log(` Note columns: ${noteCols.join(', ')}`);
167 | console.log(` Tag columns: ${tagCols.join(', ')}`);
168 |
169 | // Try a sample query with this table
170 | try {
171 | const noteCol = noteCols[0];
172 | const tagCol = tagCols[0];
173 |
174 | const testQuery = `
175 | SELECT ZT.ZTITLE as tag_name
176 | FROM ${jTable.name} J
177 | JOIN ZSFNOTETAG ZT ON ZT.Z_PK = J.${tagCol}
178 | JOIN ZSFNOTE ZN ON ZN.Z_PK = J.${noteCol}
179 | WHERE ZN.ZUNIQUEIDENTIFIER = ?
180 | LIMIT 5
181 | `;
182 |
183 | console.log(`Trying query: ${testQuery}`);
184 | const testResult = await db.allAsync(testQuery, [sampleNote.id]);
185 |
186 | console.log(`Test query succeeded! Found ${testResult.length} tags:`, testResult);
187 |
188 | // Print the full working query for implementation
189 | console.log('\nWORKING QUERY:');
190 | console.log(`
191 | SELECT ZT.ZTITLE as tag_name
192 | FROM ${jTable.name} J
193 | JOIN ZSFNOTETAG ZT ON ZT.Z_PK = J.${tagCol}
194 | JOIN ZSFNOTE ZN ON ZN.Z_PK = J.${noteCol}
195 | WHERE ZN.ZUNIQUEIDENTIFIER = ?
196 | `);
197 | } catch (testError) {
198 | console.log(`Test query failed: ${testError.message}`);
199 | }
200 | }
201 | }
202 | }
203 | } else {
204 | console.log('No notes found in the database');
205 | }
206 | } catch (queryError) {
207 | console.error('Error running test query:', queryError.message);
208 | }
209 |
210 | } catch (error) {
211 | console.error('Error examining database:', error.message);
212 | } finally {
213 | db.close(() => {
214 | console.log('\nDatabase connection closed.');
215 | });
216 | }
217 | }
218 |
219 | examineDatabase();
220 |
```
--------------------------------------------------------------------------------
/src/utils.js:
--------------------------------------------------------------------------------
```javascript
1 | import sqlite3 from 'sqlite3';
2 | import path from 'path';
3 | import os from 'os';
4 | import { promisify } from 'util';
5 | import { fileURLToPath } from 'url';
6 | import fs from 'fs/promises';
7 | import { pipeline } from '@xenova/transformers';
8 | // Fix for CommonJS module import in ESM
9 | import faissNode from 'faiss-node';
10 | const { IndexFlatL2 } = faissNode;
11 |
12 | // Get current file path for ES modules
13 | const __filename = fileURLToPath(import.meta.url);
14 | const __dirname = path.dirname(__filename);
15 |
16 | // Setup SQLite with verbose mode
17 | const sqlite = sqlite3.verbose();
18 | const { Database } = sqlite;
19 |
20 | // Default path to Bear's database
21 | const defaultDBPath = path.join(
22 | os.homedir(),
23 | 'Library/Group Containers/9K33E3U3T4.net.shinyfrog.bear/Application Data/database.sqlite'
24 | );
25 |
26 | // Path to the vector index - store in src directory
27 | const INDEX_PATH = path.join(__dirname, 'note_vectors');
28 |
29 | // Embedding model name
30 | const EMBEDDING_MODEL = 'Xenova/all-MiniLM-L6-v2';
31 |
32 | // Global variables for embedding model and vector index
33 | let embedder = null;
34 | let vectorIndex = null;
35 | let noteIdMap = null;
36 |
37 | // Get the database path from environment variable or use default
38 | export const getDbPath = () => process.env.BEAR_DATABASE_PATH || defaultDBPath;
39 |
40 | // Create and configure database connection
41 | export const createDb = (dbPath) => {
42 | const db = new Database(dbPath, sqlite3.OPEN_READONLY, (err) => {
43 | if (err) {
44 | console.error('Error connecting to Bear database:', err.message);
45 | process.exit(1);
46 | }
47 | console.error('Connected to Bear Notes database at:', dbPath);
48 | });
49 |
50 | // Promisify database methods
51 | db.allAsync = promisify(db.all).bind(db);
52 | db.getAsync = promisify(db.get).bind(db);
53 |
54 | return db;
55 | };
56 |
57 | // Initialize the embedding model
58 | export const initEmbedder = async () => {
59 | if (!embedder) {
60 | try {
61 | // Using Xenova's implementation of transformers
62 | console.error(`Initializing embedding model (${EMBEDDING_MODEL})...`);
63 | embedder = await pipeline('feature-extraction', EMBEDDING_MODEL);
64 | console.error('Embedding model initialized');
65 | return true;
66 | } catch (error) {
67 | console.error('Error initializing embedding model:', error);
68 | return false;
69 | }
70 | }
71 | return true;
72 | };
73 |
74 | // Load the vector index
75 | export const loadVectorIndex = async () => {
76 | try {
77 | if (!vectorIndex) {
78 | // Check if index exists
79 | try {
80 | await fs.access(`${INDEX_PATH}.index`);
81 |
82 | // Load index using the direct file reading method
83 | vectorIndex = IndexFlatL2.read(`${INDEX_PATH}.index`);
84 |
85 | const idMapData = await fs.readFile(`${INDEX_PATH}.json`, 'utf8');
86 | noteIdMap = JSON.parse(idMapData);
87 |
88 | console.error(`Loaded vector index with ${vectorIndex.ntotal} vectors`);
89 | return true;
90 | } catch (error) {
91 | console.error('Vector index not found. Please run indexing first:', error.message);
92 | return false;
93 | }
94 | }
95 | return true;
96 | } catch (error) {
97 | console.error('Error loading vector index:', error);
98 | return false;
99 | }
100 | };
101 |
102 | // Create text embeddings
103 | export const createEmbedding = async (text) => {
104 | if (!embedder) {
105 | const initialized = await initEmbedder();
106 | if (!initialized) {
107 | throw new Error('Failed to initialize embedding model');
108 | }
109 | }
110 |
111 | try {
112 | // Generate embeddings using Xenova transformers
113 | const result = await embedder(text, {
114 | pooling: 'mean',
115 | normalize: true
116 | });
117 |
118 | // Return the embedding as a regular array
119 | return Array.from(result.data);
120 | } catch (error) {
121 | console.error('Error creating embedding:', error);
122 | throw error;
123 | }
124 | };
125 |
126 | // Search for notes using semantic search
127 | export const semanticSearch = async (db, query, limit = 10) => {
128 | try {
129 | // Ensure vector index is loaded
130 | if (!vectorIndex || !noteIdMap) {
131 | const loaded = await loadVectorIndex();
132 | if (!loaded) {
133 | throw new Error('Vector index not available. Please run indexing first.');
134 | }
135 | }
136 |
137 | // Create embedding for the query
138 | const queryEmbedding = await createEmbedding(query);
139 |
140 | // Search in vector index
141 | const { labels, distances } = vectorIndex.search(queryEmbedding, limit);
142 |
143 | // Get note IDs from the results
144 | const noteIds = labels.map(idx => noteIdMap[idx]).filter(id => id);
145 |
146 | if (noteIds.length === 0) {
147 | return [];
148 | }
149 |
150 | // Prepare placeholders for SQL query
151 | const placeholders = noteIds.map(() => '?').join(',');
152 |
153 | // Get full note details from database
154 | const notes = await db.allAsync(`
155 | SELECT
156 | ZUNIQUEIDENTIFIER as id,
157 | ZTITLE as title,
158 | ZTEXT as content,
159 | ZSUBTITLE as subtitle,
160 | ZCREATIONDATE as creation_date
161 | FROM ZSFNOTE
162 | WHERE ZUNIQUEIDENTIFIER IN (${placeholders}) AND ZTRASHED = 0
163 | ORDER BY ZMODIFICATIONDATE DESC
164 | `, noteIds);
165 |
166 | // Get tags for each note
167 | for (const note of notes) {
168 | try {
169 | const tags = await db.allAsync(`
170 | SELECT ZT.ZTITLE as tag_name
171 | FROM Z_5TAGS ZNT
172 | JOIN ZSFNOTETAG ZT ON ZT.Z_PK = ZNT.Z_13TAGS
173 | JOIN ZSFNOTE ZN ON ZN.Z_PK = ZNT.Z_5NOTES
174 | WHERE ZN.ZUNIQUEIDENTIFIER = ?
175 | `, [note.id]);
176 | note.tags = tags.map(t => t.tag_name);
177 | } catch (tagError) {
178 | console.error(`Error fetching tags for note ${note.id}:`, tagError.message);
179 | note.tags = [];
180 | }
181 |
182 | // Convert Apple's timestamp (seconds since 2001-01-01) to standard timestamp
183 | if (note.creation_date) {
184 | // Apple's reference date is 2001-01-01, so add seconds to get UNIX timestamp
185 | note.creation_date = new Date((note.creation_date + 978307200) * 1000).toISOString();
186 | }
187 |
188 | // Store the semantic similarity score (lower distance is better)
189 | const idx = noteIds.indexOf(note.id);
190 | note.score = idx >= 0 ? 1 - distances[idx] : 0;
191 | }
192 |
193 | // Sort by similarity score
194 | return notes.sort((a, b) => b.score - a.score);
195 | } catch (error) {
196 | console.error('Semantic search error:', error);
197 | throw error;
198 | }
199 | };
200 |
201 | // Fallback to keyword search if vector search fails
202 | export const searchNotes = async (db, query, limit = 10, useSemanticSearch = true) => {
203 | try {
204 | // Try semantic search first if enabled
205 | if (useSemanticSearch) {
206 | try {
207 | const semanticResults = await semanticSearch(db, query, limit);
208 | if (semanticResults && semanticResults.length > 0) {
209 | return semanticResults;
210 | }
211 | } catch (error) {
212 | console.error('Semantic search failed, falling back to keyword search:', error.message);
213 | }
214 | }
215 |
216 | // Fallback to keyword search
217 | const notes = await db.allAsync(`
218 | SELECT
219 | ZUNIQUEIDENTIFIER as id,
220 | ZTITLE as title,
221 | ZTEXT as content,
222 | ZSUBTITLE as subtitle,
223 | ZCREATIONDATE as creation_date
224 | FROM ZSFNOTE
225 | WHERE ZTRASHED = 0 AND (ZTITLE LIKE ? OR ZTEXT LIKE ?)
226 | ORDER BY ZMODIFICATIONDATE DESC
227 | LIMIT ?
228 | `, [`%${query}%`, `%${query}%`, limit]);
229 |
230 | // Get tags for each note
231 | for (const note of notes) {
232 | try {
233 | const tags = await db.allAsync(`
234 | SELECT ZT.ZTITLE as tag_name
235 | FROM Z_5TAGS ZNT
236 | JOIN ZSFNOTETAG ZT ON ZT.Z_PK = ZNT.Z_13TAGS
237 | JOIN ZSFNOTE ZN ON ZN.Z_PK = ZNT.Z_5NOTES
238 | WHERE ZN.ZUNIQUEIDENTIFIER = ?
239 | `, [note.id]);
240 |
241 | note.tags = tags.map(t => t.tag_name);
242 | } catch (tagError) {
243 | console.error(`Error fetching tags for note ${note.id}:`, tagError.message);
244 | note.tags = [];
245 | }
246 |
247 | // Convert Apple's timestamp (seconds since 2001-01-01) to standard timestamp
248 | if (note.creation_date) {
249 | // Apple's reference date is 2001-01-01, so add seconds to get UNIX timestamp
250 | note.creation_date = new Date((note.creation_date + 978307200) * 1000).toISOString();
251 | }
252 | }
253 |
254 | return notes;
255 | } catch (error) {
256 | console.error('Search error:', error);
257 | throw error;
258 | }
259 | };
260 |
261 | // Retrieve a specific note by ID
262 | export const retrieveNote = async (db, id) => {
263 | try {
264 | if (!id) {
265 | throw new Error('Note ID is required');
266 | }
267 |
268 | // Get the note by ID
269 | const note = await db.getAsync(`
270 | SELECT
271 | ZUNIQUEIDENTIFIER as id,
272 | ZTITLE as title,
273 | ZTEXT as content,
274 | ZSUBTITLE as subtitle,
275 | ZCREATIONDATE as creation_date
276 | FROM ZSFNOTE
277 | WHERE ZUNIQUEIDENTIFIER = ? AND ZTRASHED = 0
278 | `, [id]);
279 |
280 | if (!note) {
281 | throw new Error('Note not found');
282 | }
283 |
284 | // Get tags for the note
285 | try {
286 | const tags = await db.allAsync(`
287 | SELECT ZT.ZTITLE as tag_name
288 | FROM Z_5TAGS ZNT
289 | JOIN ZSFNOTETAG ZT ON ZT.Z_PK = ZNT.Z_13TAGS
290 | JOIN ZSFNOTE ZN ON ZN.Z_PK = ZNT.Z_5NOTES
291 | WHERE ZN.ZUNIQUEIDENTIFIER = ?
292 | `, [note.id]);
293 | note.tags = tags.map(t => t.tag_name);
294 | } catch (tagError) {
295 | console.error(`Error fetching tags for note ${note.id}:`, tagError.message);
296 | note.tags = [];
297 | }
298 |
299 | // Convert Apple's timestamp (seconds since 2001-01-01) to standard timestamp
300 | if (note.creation_date) {
301 | // Apple's reference date is 2001-01-01, so add seconds to get UNIX timestamp
302 | note.creation_date = new Date((note.creation_date + 978307200) * 1000).toISOString();
303 | }
304 |
305 | return note;
306 | } catch (error) {
307 | console.error('Retrieve error:', error);
308 | throw error;
309 | }
310 | };
311 |
312 | // Get all tags
313 | export const getAllTags = async (db) => {
314 | try {
315 | const tags = await db.allAsync('SELECT ZTITLE as name FROM ZSFNOTETAG');
316 | return tags.map(tag => tag.name);
317 | } catch (error) {
318 | console.error('Get tags error:', error);
319 | throw error;
320 | }
321 | };
322 |
323 | // RAG function to retrieve notes that are semantically similar to a query
324 | export const retrieveForRAG = async (db, query, limit = 5) => {
325 | try {
326 | // Get semantically similar notes
327 | const notes = await semanticSearch(db, query, limit);
328 |
329 | // Format for RAG context
330 | return notes.map(note => ({
331 | id: note.id,
332 | title: note.title,
333 | content: note.content,
334 | tags: note.tags,
335 | score: note.score
336 | }));
337 | } catch (error) {
338 | console.error('RAG retrieval error:', error);
339 | // Fallback to keyword search
340 | const notes = await searchNotes(db, query, limit, false);
341 | return notes.map(note => ({
342 | id: note.id,
343 | title: note.title,
344 | content: note.content,
345 | tags: note.tags
346 | }));
347 | }
348 | };
```