# Directory Structure
```
├── .dockerignore
├── .DS_Store
├── .gitignore
├── Dockerfile
├── LICENSE
├── package-lock.json
├── package.json
├── readme.md
└── src
    ├── bear-mcp-server.js
    ├── create-index.js
    ├── lib
    │   └── explore-database.js
    └── utils.js
```
# Files
--------------------------------------------------------------------------------
/.dockerignore:
--------------------------------------------------------------------------------
```
node_modules
npm-debug.log
.DS_Store
```
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
```
node_modules
.DS_Store
# Vector index files
src/note_vectors.index
src/note_vectors.json
```
--------------------------------------------------------------------------------
/readme.md:
--------------------------------------------------------------------------------
```markdown
# Bear Notes MCP Server with RAG
Looking to supercharge your Bear Notes experience with AI assistants? This little gem connects your personal knowledge base to AI systems using semantic search and RAG (Retrieval-Augmented Generation).
I built this because I wanted my AI assistants to actually understand what's in my notes, not just perform simple text matching. The result is rather sweet, if I do say so myself.
## Getting Started
Setting up is straightforward:
```bash
git clone [your-repo-url]
cd bear-mcp-server
npm install
```
Make the scripts executable (because permissions matter):
```bash
chmod +x src/bear-mcp-server.js
chmod +x src/create-index.js
```
## First Things First: Index Your Notes
Before diving in, you'll need to create vector embeddings of your notes:
```bash
npm run index
```
Fair warning: this might take a few minutes if you're a prolific note-taker like me. It's converting all your notes into mathematical vectors that capture their meaning. Clever stuff 😉.
## Configuration
Update your MCP configuration file:
```json
{
"mcpServers": {
"bear-notes": {
"command": "node",
"args": [
"/absolute/path/to/bear-mcp-server/src/bear-mcp-server.js"
],
"env": {
"BEAR_DATABASE_PATH": "/Users/yourusername/Library/Group Containers/9K33E3U3T4.net.shinyfrog.bear/Application Data/database.sqlite"
}
}
}
}
```
> 🚨 _Remember to replace the path with your actual installation location. No prizes for using the example path verbatim, I'm afraid._
## What Makes This Special?
- **Semantic Search**: Find notes based on meaning, not just keywords. Ask about "productivity systems" and it'll find your notes on GTD and Pomodoro, even if they don't contain those exact words.
- **RAG Support**: Your AI assistants can now pull in relevant context from your notes, even when you haven't explicitly mentioned them.
- **All Local Processing**: Everything runs on your machine. No data leaves your computer, no API keys needed, no internet dependency (after initial setup).
- **Graceful Fallbacks**: If semantic search isn't available for whatever reason, it'll quietly fall back to traditional search. Belt and braces.
## How It Works
### The Clever Bits
This server uses the Xenova implementation of transformers.js with the all-MiniLM-L6-v2 model:
- It creates 384-dimensional vectors that capture the semantic essence of your notes
- All processing happens locally on your machine
- The first startup might be a tad slow while the model loads, but it's zippy after that
### The Flow
1. Your query gets converted into a vector using the transformer model
2. This vector is compared to the pre-indexed vectors of your notes
3. Notes with similar meanings are returned, regardless of exact keyword matches
4. AI assistants use these relevant notes as context for their responses
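For the curious, here's a rough sketch of steps 2 and 3 using toy vectors. It is not the server's actual code: `normalize`, `squaredL2`, and `rankNotes` are made-up names, and the three-dimensional vectors stand in for the real 384-dimensional ones. It assumes a flat L2 index that reports squared distances, in which case unit-length vectors give cosine similarity as `1 - distance / 2`.

```javascript
// Scale a vector to unit length, as the embedder does when normalizing.
function normalize(v) {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return v.map(x => x / norm);
}

// Squared Euclidean distance between two vectors of equal length.
function squaredL2(a, b) {
  return a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0);
}

// Rank notes by distance to the query (smallest first) and attach a
// cosine-style score: for unit vectors, cos = 1 - d / 2.
function rankNotes(queryVec, notes, limit) {
  const q = normalize(queryVec);
  return notes
    .map(note => {
      const distance = squaredL2(q, normalize(note.vector));
      return { id: note.id, distance, score: 1 - distance / 2 };
    })
    .sort((a, b) => a.distance - b.distance)
    .slice(0, limit);
}

// Toy 3-dimensional "embeddings" (hypothetical data).
const notes = [
  { id: 'gtd', vector: [0.9, 0.1, 0.0] },
  { id: 'recipes', vector: [0.0, 0.2, 0.9] },
  { id: 'pomodoro', vector: [0.8, 0.3, 0.1] },
];
const ranked = rankNotes([1, 0.2, 0], notes, 2);
console.log(ranked.map(r => r.id)); // [ 'gtd', 'pomodoro' ]
```

The two productivity notes outrank the recipe note because their vectors point in nearly the same direction as the query, whatever words they happen to contain.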
## Project Structure
Nothing too complex here:
```
bear-mcp-server/
├── package.json
├── readme.md
└── src/
    ├── bear-mcp-server.js      # Main MCP server
    ├── create-index.js         # Script to index notes
    ├── utils.js                # Utility functions
    ├── lib/                    # Additional utilities and diagnostic scripts
    │   └── explore-database.js # Database exploration and diagnostic tool
    ├── note_vectors.index      # Generated vector index (after indexing)
    └── note_vectors.json       # Note ID mapping (after indexing)
```
## Available Tools for AI Assistants
AI assistants connecting to this server can use these tools:
1. **search_notes**: Find notes that match a query
- Parameters: `query` (required), `limit` (optional, default: 10), `semantic` (optional, default: true)
2. **get_note**: Fetch a specific note by its ID
- Parameters: `id` (required)
3. **get_tags**: List all tags used in your Bear Notes
4. **retrieve_for_rag**: Get notes semantically similar to a query, specifically formatted for RAG (only offered when the embedding model and vector index both loaded)
- Parameters: `query` (required), `limit` (optional, default: 5)
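Your MCP client normally builds these calls for you, but if you're wondering what one looks like on the wire, here's a small sketch. It assumes the standard MCP `tools/call` method over JSON-RPC 2.0; `buildToolCall` is a hypothetical helper, not part of this project or the SDK.

```javascript
// Build a JSON-RPC 2.0 request for MCP's "tools/call" method.
// `id` is any request identifier chosen by the client.
function buildToolCall(id, name, args = {}) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

// Example: ask for up to five semantically similar notes.
const request = buildToolCall(1, 'search_notes', {
  query: 'productivity systems',
  limit: 5,
  semantic: true,
});
console.log(JSON.stringify(request, null, 2));
```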
## Requirements
- Node.js version 16 or higher
- Bear Notes for macOS
- An MCP-compatible AI assistant client
## Limitations & Caveats
- Read-only access to Bear Notes (we're not modifying your precious notes)
- macOS only (sorry Windows and Linux folks)
- If you add loads of new notes, you'll want to rebuild the index with `npm run index`
- First startup is a bit like waiting for the kettle to boil while the embedding model loads
## Troubleshooting
If things go wonky:
1. Double-check your Bear database path
2. Make sure you've run the indexing process with `npm run index`
3. Check permissions on the Bear Notes database
4. Verify the server scripts are executable
5. Look for error messages in the logs
When in doubt, try turning it off and on again. Works more often than we'd like to admit.
## 🐳 Running with Docker (Optional)
Prefer containers? You can run everything inside Docker too.
### 1. Build the Docker image
```bash
docker build -t bear-mcp-server .
```
### 2. Index your notes
You'll still need to run the indexing step before anything useful happens:
```bash
docker run \
-v /path/to/your/NoteDatabase.sqlite:/app/database.sqlite \
-e BEAR_DATABASE_PATH=/app/database.sqlite \
bear-mcp-server \
npm run index
```
> 🛠 Replace `/path/to/your/NoteDatabase.sqlite` with the actual path to your Bear database.
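### 3. Start the server
Once indexed, fire it up:
```bash
docker run -i \
-v /path/to/your/NoteDatabase.sqlite:/app/database.sqlite \
-e BEAR_DATABASE_PATH=/app/database.sqlite \
bear-mcp-server
```
The server speaks MCP over stdio rather than HTTP, so there's no port to publish; `-i` keeps stdin open so your MCP client can talk to it. One catch: index files written by a separate `docker run` stay in that container's filesystem, so a fresh container starts without them and quietly falls back to keyword search unless you persist the index (for example with a volume) or build it into the image.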
## License
MIT (Feel free to tinker, share, and improve)
```
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
```dockerfile
# Use the official Node.js 16 LTS image as the base
FROM node:16-slim
# Set the working directory
WORKDIR /app
# Copy package files and install dependencies
COPY package*.json ./
RUN npm install --production
# Copy the rest of the application code
COPY . .
# Make the server script executable
RUN chmod +x src/bear-mcp-server.js
# Define the default command to run the server
CMD ["node", "src/bear-mcp-server.js"]
```
--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------
```json
{
"name": "bear-mcp-server",
"version": "1.0.0",
"description": "Model Context Protocol server for Bear Notes with RAG capabilities",
"main": "src/bear-mcp-server.js",
"type": "module",
"scripts": {
"start": "node src/bear-mcp-server.js",
"index": "node src/create-index.js",
"test": "node src/lib/explore-database.js"
},
"dependencies": {
"@modelcontextprotocol/sdk": "latest",
"sqlite3": "latest",
"@xenova/transformers": "^2.15.0",
"faiss-node": "^0.5.1"
},
"engines": {
"node": ">=16.0.0"
}
}
```
--------------------------------------------------------------------------------
/src/create-index.js:
--------------------------------------------------------------------------------
```javascript
#!/usr/bin/env node
import { getDbPath, createDb, initEmbedder, createEmbedding } from './utils.js';
// Fix for CommonJS module import in ESM
import faissNode from 'faiss-node';
const { IndexFlatL2 } = faissNode;
import fs from 'fs/promises';
import path from 'path';
import { fileURLToPath } from 'url';
// Get current file path for ES modules
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
// Path to save the vector index
const INDEX_PATH = path.join(__dirname, 'note_vectors');
// Main indexing function
async function createVectorIndex() {
console.log('Starting to create vector index for Bear Notes...');
// Initialize the embedding model
const modelInitialized = await initEmbedder();
if (!modelInitialized) {
console.error('Failed to initialize embedding model');
process.exit(1);
}
// Connect to the database
const dbPath = getDbPath();
const db = createDb(dbPath);
try {
// Get all non-trashed notes
const notes = await db.allAsync(`
SELECT
ZUNIQUEIDENTIFIER as id,
ZTITLE as title,
ZTEXT as content
FROM ZSFNOTE
WHERE ZTRASHED = 0
`);
console.log(`Found ${notes.length} notes to index`);
// Create vectors for all notes
const noteIds = [];
const dimension = 384; // Dimension of the all-MiniLM-L6-v2 model
// Create FAISS index
const index = new IndexFlatL2(dimension);
// Embed notes one at a time to keep memory use low
for (let i = 0; i < notes.length; i++) {
const note = notes[i];
// Create a combined text for embedding
const textToEmbed = `${note.title}\n${note.content || ''}`.trim();
if (textToEmbed) {
try {
// Create embedding for the note
const embedding = await createEmbedding(textToEmbed);
// Add to index
index.add(embedding);
// Store note ID
noteIds.push(note.id);
if ((i + 1) % 50 === 0 || i === notes.length - 1) {
console.log(`Indexed ${i + 1} of ${notes.length} notes`);
}
} catch (error) {
console.error(`Error embedding note ${note.id}:`, error.message);
}
}
}
console.log(`Successfully created embeddings for ${noteIds.length} notes`);
// Create mapping from index positions to note IDs
const noteIdMap = {};
for (let i = 0; i < noteIds.length; i++) {
noteIdMap[i] = noteIds[i];
}
// Save the index and mapping
index.write(`${INDEX_PATH}.index`);
await fs.writeFile(`${INDEX_PATH}.json`, JSON.stringify(noteIdMap));
console.log(`Vector index saved to ${INDEX_PATH}`);
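} catch (error) {
console.error('Error creating vector index:', error.message);
// Rethrow so the run is reported as failed (non-zero exit) rather than "Indexing complete"
throw error;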
} finally {
// Close the database connection
db.close();
}
}
// Run the indexing
createVectorIndex().then(() => {
console.log('Indexing complete');
process.exit(0);
}).catch(error => {
console.error('Indexing failed:', error);
process.exit(1);
});
```
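The indexing loop above embeds one note per iteration. If you wanted real batches, a minimal sketch might look like the following. `chunk` and `embedInBatches` are hypothetical helpers, and `embed` is assumed to be any async function mapping text to a vector (for instance `createEmbedding` from `utils.js`). Whether batching is actually faster depends on the embedding backend, so treat this as a starting point only.

```javascript
// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Embed notes batch by batch. Within a batch the embed calls are started
// together; batches themselves run one after another to bound memory use.
async function embedInBatches(notes, embed, size = 16) {
  const results = [];
  for (const batch of chunk(notes, size)) {
    const vectors = await Promise.all(
      batch.map(note => embed(`${note.title || ''}\n${note.content || ''}`.trim()))
    );
    batch.forEach((note, i) => results.push({ id: note.id, vector: vectors[i] }));
  }
  return results;
}

console.log(chunk([1, 2, 3, 4, 5], 2)); // [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]
```

Results come back in the same order as the input notes, which matters because the index-position-to-note-ID mapping relies on insertion order.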
--------------------------------------------------------------------------------
/src/bear-mcp-server.js:
--------------------------------------------------------------------------------
```javascript
#!/usr/bin/env node
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ErrorCode,
ListToolsRequestSchema,
McpError,
} from '@modelcontextprotocol/sdk/types.js';
import {
getDbPath,
createDb,
searchNotes,
retrieveNote,
getAllTags,
loadVectorIndex,
initEmbedder,
retrieveForRAG
} from './utils.js';
// Initialize dependencies
async function initialize() {
console.error('Initializing Bear Notes MCP server...');
// Initialize database connection
const dbPath = getDbPath();
const db = createDb(dbPath);
// Initialize embedding model
const modelInitialized = await initEmbedder();
if (!modelInitialized) {
console.error('Warning: Embedding model initialization failed, semantic search will not be available');
}
// Load vector index
const indexLoaded = await loadVectorIndex();
if (!indexLoaded) {
console.error('Warning: Vector index not found, semantic search will not be available');
console.error('Run "npm run index" to create the vector index');
}
return { db, hasSemanticSearch: modelInitialized && indexLoaded };
}
// Main function
async function main() {
// Initialize components
const { db, hasSemanticSearch } = await initialize();
// Create MCP server
const server = new Server(
{
name: 'bear-notes',
version: '1.0.0',
},
{
capabilities: {
tools: {},
}
}
);
// Register the list tools handler
server.setRequestHandler(ListToolsRequestSchema, async () => {
const tools = [
{
name: 'search_notes',
description: 'Search for notes in Bear that match a query',
inputSchema: {
type: 'object',
properties: {
query: {
type: 'string',
description: 'Search query to find matching notes',
},
limit: {
type: 'number',
description: 'Maximum number of results to return (default: 10)',
},
semantic: {
type: 'boolean',
description: 'Use semantic search instead of keyword search (default: true)',
}
},
required: ['query'],
},
},
{
name: 'get_note',
description: 'Retrieve a specific note by its ID',
inputSchema: {
type: 'object',
properties: {
id: {
type: 'string',
description: 'Unique identifier of the note to retrieve',
},
},
required: ['id'],
},
},
{
name: 'get_tags',
description: 'Get all tags used in Bear Notes',
inputSchema: {
type: 'object',
properties: {},
},
}
];
// Add RAG tool if semantic search is available
if (hasSemanticSearch) {
tools.push({
name: 'retrieve_for_rag',
description: 'Retrieve notes that are semantically similar to a query for RAG',
inputSchema: {
type: 'object',
properties: {
query: {
type: 'string',
description: 'Query for which to find relevant notes',
},
limit: {
type: 'number',
description: 'Maximum number of notes to retrieve (default: 5)',
},
},
required: ['query'],
},
});
}
return { tools };
});
// Register the call tool handler
server.setRequestHandler(CallToolRequestSchema, async (request) => {
if (request.params.name === 'search_notes') {
const { query, limit = 10, semantic = true } = request.params.arguments ?? {};
const useSemanticSearch = semantic && hasSemanticSearch;
try {
const notes = await searchNotes(db, query, limit, useSemanticSearch);
return {
toolResult: {
notes,
searchMethod: useSemanticSearch ? 'semantic' : 'keyword'
}
};
} catch (error) {
return {
toolResult: {
error: `Search failed: ${error.message}`,
searchMethod: 'keyword',
notes: []
}
};
}
}
if (request.params.name === 'get_note') {
const { id } = request.params.arguments ?? {};
try {
const note = await retrieveNote(db, id);
return { toolResult: { note } };
} catch (error) {
return { toolResult: { error: error.message } };
}
}
if (request.params.name === 'get_tags') {
try {
const tags = await getAllTags(db);
return { toolResult: { tags } };
} catch (error) {
return { toolResult: { error: error.message } };
}
}
if (request.params.name === 'retrieve_for_rag' && hasSemanticSearch) {
const { query, limit = 5 } = request.params.arguments ?? {};
try {
const context = await retrieveForRAG(db, query, limit);
return {
toolResult: {
context,
query
}
};
} catch (error) {
return {
toolResult: {
error: `RAG retrieval failed: ${error.message}`,
context: []
}
};
}
}
throw new McpError(ErrorCode.MethodNotFound, 'Tool not found');
});
// Use stdio transport instead of HTTP
const transport = new StdioServerTransport();
// Start the server with stdio transport
await server.connect(transport);
// Handle process termination
['SIGINT', 'SIGTERM', 'SIGHUP'].forEach(signal => {
process.on(signal, () => {
console.error(`Received ${signal}, shutting down Bear Notes MCP server...`);
db.close(() => {
console.error('Database connection closed.');
process.exit(0);
});
});
});
// Important: Log to stderr for debugging, not stdout
console.error('Bear Notes MCP server ready');
}
// Run the main function
main().catch(error => {
console.error('Server error:', error);
process.exit(1);
});
```
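The handlers above return their payload under a `toolResult` key. The MCP specification describes tool results as a `content` array of typed items plus an optional `isError` flag, and some clients may only read that shape. The sketch below shows one way to wrap a payload accordingly. `toToolResponse` is a hypothetical helper, not part of the SDK, and whether a given client still accepts `toolResult` is an assumption to verify against the SDK version in use.

```javascript
// Wrap an arbitrary JSON-serializable payload as a single text content item.
function toToolResponse(payload, isError = false) {
  return {
    content: [{ type: 'text', text: JSON.stringify(payload, null, 2) }],
    isError,
  };
}

// A successful search result and a failure, in the same envelope.
const ok = toToolResponse({ notes: [], searchMethod: 'keyword' });
const failed = toToolResponse({ error: 'Search failed: database is locked' }, true);
console.log(ok.content[0].type, failed.isError);
```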
--------------------------------------------------------------------------------
/src/lib/explore-database.js:
--------------------------------------------------------------------------------
```javascript
#!/usr/bin/env node
import sqlite3 from 'sqlite3';
import { promisify } from 'util';
import path from 'path';
import os from 'os';
// Default path to Bear's database
const defaultDBPath = path.join(
os.homedir(),
'Library/Group Containers/9K33E3U3T4.net.shinyfrog.bear/Application Data/database.sqlite'
);
// Get the database path from environment variable or use default
const dbPath = process.env.BEAR_DATABASE_PATH || defaultDBPath;
console.log(`Examining Bear database at: ${dbPath}`);
// Connect to the database
const db = new sqlite3.Database(dbPath, sqlite3.OPEN_READONLY, (err) => {
if (err) {
console.error('Error connecting to Bear database:', err.message);
process.exit(1);
}
console.log('Connected to Bear Notes database successfully');
});
// Promisify database methods
db.allAsync = promisify(db.all).bind(db);
db.getAsync = promisify(db.get).bind(db);
async function examineDatabase() {
try {
// List all tables in the database
const tables = await db.allAsync(`
SELECT name FROM sqlite_master
WHERE type='table'
ORDER BY name;
`);
console.log('\n--- All Tables in Bear Database ---');
tables.forEach(table => console.log(table.name));
// Find tables related to tags
const tagTables = tables.filter(table =>
table.name.toLowerCase().includes('tag') ||
table.name.toLowerCase().includes('z_')
);
console.log('\n--- Potential Tag-Related Tables ---');
tagTables.forEach(table => console.log(table.name));
// Detect Z_* junction tables which often connect many-to-many relationships
const junctionTables = tables.filter(table =>
table.name.startsWith('Z_') &&
!table.name.includes('FTS')
);
console.log('\n--- Junction Tables (Z_*) ---');
junctionTables.forEach(table => console.log(table.name));
// Get schema for each tag-related table
console.log('\n--- Schema Details for Tag-Related Tables ---');
for (const table of tagTables) {
const schema = await db.allAsync(`PRAGMA table_info(${table.name})`);
console.log(`\nTable: ${table.name}`);
schema.forEach(col => {
console.log(` - ${col.name} (${col.type})`);
});
}
// Check if Z_7TAGS exists and suggest alternatives
const hasZ7Tags = tables.some(table => table.name === 'Z_7TAGS');
if (!hasZ7Tags) {
console.log('\n--- Z_7TAGS Table Not Found ---');
// Look for possible alternative junction tables between notes and tags
console.log('\nPossible alternatives for note-tag relationships:');
for (const table of junctionTables) {
try {
// Get the first few rows to sample the data
const sampleData = await db.allAsync(`SELECT * FROM ${table.name} LIMIT 5`);
if (sampleData && sampleData.length > 0) {
console.log(`\nTable ${table.name} contents (sample):`);
console.log(JSON.stringify(sampleData, null, 2));
}
} catch (error) {
console.error(`Error reading from ${table.name}:`, error.message);
}
}
// Look specifically at the ZSFNOTETAG table structure and contents
if (tables.some(table => table.name === 'ZSFNOTETAG')) {
try {
console.log('\nExamining ZSFNOTETAG table structure:');
const noteTagSchema = await db.allAsync(`PRAGMA table_info(ZSFNOTETAG)`);
noteTagSchema.forEach(col => {
console.log(` - ${col.name} (${col.type})`);
});
// Sample some data from the note tag table
const noteTagSample = await db.allAsync(`SELECT * FROM ZSFNOTETAG LIMIT 5`);
console.log('\nZSFNOTETAG sample data:');
console.log(JSON.stringify(noteTagSample, null, 2));
} catch (error) {
console.error('Error examining ZSFNOTETAG:', error.message);
}
}
// Look for ZSFNOTE structure to understand how notes are stored
if (tables.some(table => table.name === 'ZSFNOTE')) {
try {
console.log('\nExamining ZSFNOTE table structure:');
const noteSchema = await db.allAsync(`PRAGMA table_info(ZSFNOTE)`);
noteSchema.forEach(col => {
console.log(` - ${col.name} (${col.type})`);
});
} catch (error) {
console.error('Error examining ZSFNOTE:', error.message);
}
}
}
// Try actual query used in the code to see what error it produces
try {
console.log('\n--- Testing the Problematic Query ---');
// Get a sample note ID first
const sampleNote = await db.getAsync(`
SELECT ZUNIQUEIDENTIFIER as id FROM ZSFNOTE LIMIT 1
`);
if (sampleNote) {
try {
const tags = await db.allAsync(`
SELECT ZT.ZTITLE as tag_name
FROM Z_5TAGS ZNT
JOIN ZSFNOTETAG ZT ON ZT.Z_PK = ZNT.Z_13TAGS
JOIN ZSFNOTE ZN ON ZN.Z_PK = ZNT.Z_5NOTES
WHERE ZN.ZUNIQUEIDENTIFIER = ?
`, [sampleNote.id]);
console.log('Query succeeded with results:', tags);
} catch (error) {
console.error('The problematic query failed with error:', error.message);
// Try to identify the correct join pattern
console.log('\nAttempting to find the correct table relationship...');
for (const jTable of junctionTables) {
// Skip large tables for performance reasons
const count = await db.getAsync(`SELECT COUNT(*) as count FROM ${jTable.name}`);
if (count.count > 1000) {
console.log(`Skipping large table ${jTable.name} with ${count.count} rows`);
continue;
}
const schema = await db.allAsync(`PRAGMA table_info(${jTable.name})`);
const columns = schema.map(col => col.name);
// Look for columns that might connect to notes and tags
const noteCols = columns.filter(col => col.includes('NOTE') || col.includes('NOTES'));
const tagCols = columns.filter(col => col.includes('TAG') || col.includes('TAGS'));
if (noteCols.length > 0 && tagCols.length > 0) {
console.log(`\nPotential junction table: ${jTable.name}`);
console.log(` Note columns: ${noteCols.join(', ')}`);
console.log(` Tag columns: ${tagCols.join(', ')}`);
// Try a sample query with this table
try {
const noteCol = noteCols[0];
const tagCol = tagCols[0];
const testQuery = `
SELECT ZT.ZTITLE as tag_name
FROM ${jTable.name} J
JOIN ZSFNOTETAG ZT ON ZT.Z_PK = J.${tagCol}
JOIN ZSFNOTE ZN ON ZN.Z_PK = J.${noteCol}
WHERE ZN.ZUNIQUEIDENTIFIER = ?
LIMIT 5
`;
console.log(`Trying query: ${testQuery}`);
const testResult = await db.allAsync(testQuery, [sampleNote.id]);
console.log(`Test query succeeded! Found ${testResult.length} tags:`, testResult);
// Print the full working query for implementation
console.log('\nWORKING QUERY:');
console.log(`
SELECT ZT.ZTITLE as tag_name
FROM ${jTable.name} J
JOIN ZSFNOTETAG ZT ON ZT.Z_PK = J.${tagCol}
JOIN ZSFNOTE ZN ON ZN.Z_PK = J.${noteCol}
WHERE ZN.ZUNIQUEIDENTIFIER = ?
`);
} catch (testError) {
console.log(`Test query failed: ${testError.message}`);
}
}
}
}
} else {
console.log('No notes found in the database');
}
} catch (queryError) {
console.error('Error running test query:', queryError.message);
}
} catch (error) {
console.error('Error examining database:', error.message);
} finally {
db.close(() => {
console.log('\nDatabase connection closed.');
});
}
}
examineDatabase();
```
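The diagnostic script interpolates table and column names straight into SQL. That is harmless for Bear's own schema, but identifiers cannot be bound as `?` parameters, so a defensive variant could quote them first. The helper below is a sketch under that assumption; `quoteIdent` is a hypothetical name and uses standard SQL double-quote escaping, which SQLite accepts.

```javascript
// Quote an SQL identifier: wrap in double quotes and double any embedded quote.
function quoteIdent(name) {
  if (typeof name !== 'string' || name.length === 0) {
    throw new TypeError('identifier must be a non-empty string');
  }
  return `"${name.replace(/"/g, '""')}"`;
}

// Example: build a sampling query for a table discovered at runtime.
const tableName = 'Z_5TAGS';
const sampleQuery = `SELECT * FROM ${quoteIdent(tableName)} LIMIT 5`;
console.log(sampleQuery); // SELECT * FROM "Z_5TAGS" LIMIT 5
```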
--------------------------------------------------------------------------------
/src/utils.js:
--------------------------------------------------------------------------------
```javascript
import sqlite3 from 'sqlite3';
import path from 'path';
import os from 'os';
import { promisify } from 'util';
import { fileURLToPath } from 'url';
import fs from 'fs/promises';
import { pipeline } from '@xenova/transformers';
// Fix for CommonJS module import in ESM
import faissNode from 'faiss-node';
const { IndexFlatL2 } = faissNode;
// Get current file path for ES modules
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
// Setup SQLite with verbose mode
const sqlite = sqlite3.verbose();
const { Database } = sqlite;
// Default path to Bear's database
const defaultDBPath = path.join(
os.homedir(),
'Library/Group Containers/9K33E3U3T4.net.shinyfrog.bear/Application Data/database.sqlite'
);
// Path to the vector index - store in src directory
const INDEX_PATH = path.join(__dirname, 'note_vectors');
// Embedding model name
const EMBEDDING_MODEL = 'Xenova/all-MiniLM-L6-v2';
// Global variables for embedding model and vector index
let embedder = null;
let vectorIndex = null;
let noteIdMap = null;
// Get the database path from environment variable or use default
export const getDbPath = () => process.env.BEAR_DATABASE_PATH || defaultDBPath;
// Create and configure database connection
export const createDb = (dbPath) => {
const db = new Database(dbPath, sqlite3.OPEN_READONLY, (err) => {
if (err) {
console.error('Error connecting to Bear database:', err.message);
process.exit(1);
}
console.error('Connected to Bear Notes database at:', dbPath);
});
// Promisify database methods
db.allAsync = promisify(db.all).bind(db);
db.getAsync = promisify(db.get).bind(db);
return db;
};
// Initialize the embedding model
export const initEmbedder = async () => {
if (!embedder) {
try {
// Using Xenova's implementation of transformers
console.error(`Initializing embedding model (${EMBEDDING_MODEL})...`);
embedder = await pipeline('feature-extraction', EMBEDDING_MODEL);
console.error('Embedding model initialized');
return true;
} catch (error) {
console.error('Error initializing embedding model:', error);
return false;
}
}
return true;
};
// Load the vector index
export const loadVectorIndex = async () => {
try {
if (!vectorIndex) {
// Check if index exists
try {
await fs.access(`${INDEX_PATH}.index`);
// Load index using the direct file reading method
vectorIndex = IndexFlatL2.read(`${INDEX_PATH}.index`);
const idMapData = await fs.readFile(`${INDEX_PATH}.json`, 'utf8');
noteIdMap = JSON.parse(idMapData);
// faiss-node exposes the vector count as a method, not a property
console.error(`Loaded vector index with ${vectorIndex.ntotal()} vectors`);
return true;
} catch (error) {
console.error('Vector index not found. Please run indexing first:', error.message);
return false;
}
}
return true;
} catch (error) {
console.error('Error loading vector index:', error);
return false;
}
};
// Create text embeddings
export const createEmbedding = async (text) => {
if (!embedder) {
const initialized = await initEmbedder();
if (!initialized) {
throw new Error('Failed to initialize embedding model');
}
}
try {
// Generate embeddings using Xenova transformers
const result = await embedder(text, {
pooling: 'mean',
normalize: true
});
// Return the embedding as a regular array
return Array.from(result.data);
} catch (error) {
console.error('Error creating embedding:', error);
throw error;
}
};
// Search for notes using semantic search
export const semanticSearch = async (db, query, limit = 10) => {
try {
// Ensure vector index is loaded
if (!vectorIndex || !noteIdMap) {
const loaded = await loadVectorIndex();
if (!loaded) {
throw new Error('Vector index not available. Please run indexing first.');
}
}
// Create embedding for the query
const queryEmbedding = await createEmbedding(query);
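// Search in vector index; faiss-node rejects k larger than the number of
// indexed vectors, so clamp the requested limit first
const k = Math.min(limit, vectorIndex.ntotal());
if (k === 0) {
return [];
}
const { labels, distances } = vectorIndex.search(queryEmbedding, k);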
// Get note IDs from the results
const noteIds = labels.map(idx => noteIdMap[idx]).filter(id => id);
if (noteIds.length === 0) {
return [];
}
// Prepare placeholders for SQL query
const placeholders = noteIds.map(() => '?').join(',');
// Get full note details from database
const notes = await db.allAsync(`
SELECT
ZUNIQUEIDENTIFIER as id,
ZTITLE as title,
ZTEXT as content,
ZSUBTITLE as subtitle,
ZCREATIONDATE as creation_date
FROM ZSFNOTE
WHERE ZUNIQUEIDENTIFIER IN (${placeholders}) AND ZTRASHED = 0
ORDER BY ZMODIFICATIONDATE DESC
`, noteIds);
// Get tags for each note
for (const note of notes) {
try {
const tags = await db.allAsync(`
SELECT ZT.ZTITLE as tag_name
FROM Z_5TAGS ZNT
JOIN ZSFNOTETAG ZT ON ZT.Z_PK = ZNT.Z_13TAGS
JOIN ZSFNOTE ZN ON ZN.Z_PK = ZNT.Z_5NOTES
WHERE ZN.ZUNIQUEIDENTIFIER = ?
`, [note.id]);
note.tags = tags.map(t => t.tag_name);
} catch (tagError) {
console.error(`Error fetching tags for note ${note.id}:`, tagError.message);
note.tags = [];
}
// Convert Apple's timestamp (seconds since 2001-01-01) to standard timestamp
if (note.creation_date) {
// Apple's reference date is 2001-01-01, so add seconds to get UNIX timestamp
note.creation_date = new Date((note.creation_date + 978307200) * 1000).toISOString();
}
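// Store a similarity score (higher is better). IndexFlatL2 reports squared L2
// distances; for unit-length embeddings that equals 2 - 2*cos, so cos = 1 - d / 2.
// Look the position up via labels so it stays aligned with distances.
const idx = labels.findIndex(label => noteIdMap[label] === note.id);
note.score = idx >= 0 ? 1 - distances[idx] / 2 : 0;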
}
// Sort by similarity score
return notes.sort((a, b) => b.score - a.score);
} catch (error) {
console.error('Semantic search error:', error);
throw error;
}
};
// Search notes: try semantic search first (when enabled), then fall back to keyword search
export const searchNotes = async (db, query, limit = 10, useSemanticSearch = true) => {
try {
// Try semantic search first if enabled
if (useSemanticSearch) {
try {
const semanticResults = await semanticSearch(db, query, limit);
if (semanticResults && semanticResults.length > 0) {
return semanticResults;
}
} catch (error) {
console.error('Semantic search failed, falling back to keyword search:', error.message);
}
}
// Fallback to keyword search
const notes = await db.allAsync(`
SELECT
ZUNIQUEIDENTIFIER as id,
ZTITLE as title,
ZTEXT as content,
ZSUBTITLE as subtitle,
ZCREATIONDATE as creation_date
FROM ZSFNOTE
WHERE ZTRASHED = 0 AND (ZTITLE LIKE ? OR ZTEXT LIKE ?)
ORDER BY ZMODIFICATIONDATE DESC
LIMIT ?
`, [`%${query}%`, `%${query}%`, limit]);
// Get tags for each note
for (const note of notes) {
try {
const tags = await db.allAsync(`
SELECT ZT.ZTITLE as tag_name
FROM Z_5TAGS ZNT
JOIN ZSFNOTETAG ZT ON ZT.Z_PK = ZNT.Z_13TAGS
JOIN ZSFNOTE ZN ON ZN.Z_PK = ZNT.Z_5NOTES
WHERE ZN.ZUNIQUEIDENTIFIER = ?
`, [note.id]);
note.tags = tags.map(t => t.tag_name);
} catch (tagError) {
console.error(`Error fetching tags for note ${note.id}:`, tagError.message);
note.tags = [];
}
// Convert Apple's timestamp (seconds since 2001-01-01) to standard timestamp
if (note.creation_date) {
// Apple's reference date is 2001-01-01, so add seconds to get UNIX timestamp
note.creation_date = new Date((note.creation_date + 978307200) * 1000).toISOString();
}
}
return notes;
} catch (error) {
console.error('Search error:', error);
throw error;
}
};
// Retrieve a specific note by ID
export const retrieveNote = async (db, id) => {
try {
if (!id) {
throw new Error('Note ID is required');
}
// Get the note by ID
const note = await db.getAsync(`
SELECT
ZUNIQUEIDENTIFIER as id,
ZTITLE as title,
ZTEXT as content,
ZSUBTITLE as subtitle,
ZCREATIONDATE as creation_date
FROM ZSFNOTE
WHERE ZUNIQUEIDENTIFIER = ? AND ZTRASHED = 0
`, [id]);
if (!note) {
throw new Error('Note not found');
}
// Get tags for the note
try {
const tags = await db.allAsync(`
SELECT ZT.ZTITLE as tag_name
FROM Z_5TAGS ZNT
JOIN ZSFNOTETAG ZT ON ZT.Z_PK = ZNT.Z_13TAGS
JOIN ZSFNOTE ZN ON ZN.Z_PK = ZNT.Z_5NOTES
WHERE ZN.ZUNIQUEIDENTIFIER = ?
`, [note.id]);
note.tags = tags.map(t => t.tag_name);
} catch (tagError) {
console.error(`Error fetching tags for note ${note.id}:`, tagError.message);
note.tags = [];
}
// Convert Apple's timestamp (seconds since 2001-01-01) to standard timestamp
if (note.creation_date) {
// Apple's reference date is 2001-01-01, so add seconds to get UNIX timestamp
note.creation_date = new Date((note.creation_date + 978307200) * 1000).toISOString();
}
return note;
} catch (error) {
console.error('Retrieve error:', error);
throw error;
}
};
// Get all tags
export const getAllTags = async (db) => {
try {
const tags = await db.allAsync('SELECT ZTITLE as name FROM ZSFNOTETAG');
return tags.map(tag => tag.name);
} catch (error) {
console.error('Get tags error:', error);
throw error;
}
};
// RAG function to retrieve notes that are semantically similar to a query
export const retrieveForRAG = async (db, query, limit = 5) => {
try {
// Get semantically similar notes
const notes = await semanticSearch(db, query, limit);
// Format for RAG context
return notes.map(note => ({
id: note.id,
title: note.title,
content: note.content,
tags: note.tags,
score: note.score
}));
} catch (error) {
console.error('RAG retrieval error:', error);
// Fallback to keyword search
const notes = await searchNotes(db, query, limit, false);
return notes.map(note => ({
id: note.id,
title: note.title,
content: note.content,
tags: note.tags
}));
}
};
```
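The same timestamp conversion appears three times in `utils.js`. A small shared helper would remove the duplication, sketched below. `appleTimestampToISO` and `APPLE_EPOCH_OFFSET_SECONDS` are hypothetical names; the offset itself is the one already used above (seconds between 1970-01-01 and 2001-01-01 UTC).

```javascript
// Seconds between the Unix epoch (1970-01-01) and Apple's reference date (2001-01-01).
const APPLE_EPOCH_OFFSET_SECONDS = 978307200;

// Convert seconds since 2001-01-01 UTC into an ISO 8601 string.
// Returns null when no timestamp is stored.
function appleTimestampToISO(seconds) {
  if (seconds === null || seconds === undefined) {
    return null;
  }
  return new Date((seconds + APPLE_EPOCH_OFFSET_SECONDS) * 1000).toISOString();
}

console.log(appleTimestampToISO(0)); // 2001-01-01T00:00:00.000Z
```

Unlike the inline `if (note.creation_date)` checks, this version also converts a stored value of `0`, which is a small behavioural difference to keep in mind.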