# Directory Structure

```
├── .chroma_env.example
├── .gitignore
├── Cargo.toml
├── LICENSE
├── PROMPT.md
├── README.md
└── src
    ├── client.rs
    ├── config.rs
    ├── lib.rs
    ├── main.rs
    └── tools.rs
```

# Files

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Generated by Cargo
/target/

# Temporary test results
/test-results/

# Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
# More information here https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
Cargo.lock

# These are backup files generated by rustfmt
**/*.rs.bk

# MSVC Windows builds of rustc generate these, which store debugging information
*.pdb

# Environment variables file
.chroma_env

# IDE files
.idea/
.vscode/

# macOS files
.DS_Store

```

--------------------------------------------------------------------------------
/.chroma_env.example:
--------------------------------------------------------------------------------

```
# ChromaDB Client Configuration
# Uncomment and set the values as needed

# Client type: http, cloud, persistent, ephemeral
# CHROMA_CLIENT_TYPE=ephemeral

# Directory for persistent client data (only used with persistent client)
# CHROMA_DATA_DIR=/path/to/data

# HTTP client configuration
# CHROMA_HOST=localhost
# CHROMA_PORT=8000
# CHROMA_SSL=true
# CHROMA_CUSTOM_AUTH_CREDENTIALS=username:password

# Cloud client configuration
# CHROMA_TENANT=my-tenant
# CHROMA_DATABASE=my-database
# CHROMA_API_KEY=my-api-key

```
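
The example file above is a list of commented-out `KEY=VALUE` lines. As a rough illustration of how such a file is read, here is a std-only sketch; `parse_env_lines` is a hypothetical helper, not part of this repository. The project itself loads the file through the `dotenv` crate, which also handles cases this sketch ignores, such as quoted values.

```rust
use std::collections::HashMap;

/// Parses dotenv-style text into key/value pairs (simplified sketch).
/// Blank lines and lines starting with `#` are skipped, so a file that is
/// entirely commented out yields an empty map.
pub fn parse_env_lines(text: &str) -> HashMap<String, String> {
    let mut vars = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Split on the first `=` only, so values may themselves contain `=`.
        if let Some((key, value)) = line.split_once('=') {
            vars.insert(key.trim().to_string(), value.trim().to_string());
        }
    }
    vars
}
```

Uncommenting a line such as `CHROMA_PORT=8000` is what makes it visible to a parser like this one; commented lines contribute nothing.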

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
# 🧠 mcp.chroma

A ChromaDB MCP server for vector embeddings, collections, and document management.

[![Rust](https://img.shields.io/badge/Rust-000000?style=for-the-badge&logo=rust&logoColor=white)](https://www.rust-lang.org/)
[![MCP](https://img.shields.io/badge/MCP-Protocol-blue?style=for-the-badge)](https://modelcontextprotocol.io/)
[![ChromaDB](https://img.shields.io/badge/ChromaDB-Vector_Database-purple?style=for-the-badge)](https://www.trychroma.com/)

## 📋 Overview

This MCP server provides an interface for working with [ChromaDB](https://www.trychroma.com/), a vector database for embeddings. It exposes collection and document operations as tools accessible via the Model Context Protocol (MCP).

## ✨ Features

- 📊 Collection management (create, list, modify, delete)
- 📄 Document operations (add, query, get, update, delete)
- 🧠 Thought processing for session management
- 🔌 Multiple client types (HTTP, Cloud, Persistent, Ephemeral)

## 🚀 Installation

Clone the repository and build with Cargo:

```bash
git clone https://github.com/yourusername/mcp-chroma.git
cd mcp-chroma
cargo build --release
```

## 🛠️ Usage

### Setting Up Environment

Create a `.chroma_env` file in your project directory with the configuration parameters:

```
CHROMA_CLIENT_TYPE=ephemeral
CHROMA_HOST=localhost
CHROMA_PORT=8000
```

### Running the Server

```bash
# Run with default configuration
./mcp-chroma

# Run with specific client type
./mcp-chroma --client-type http --host localhost --port 8000

# Run with persistent storage
./mcp-chroma --client-type persistent --data-dir ./chroma_data
```

### Available Client Types

1. **Ephemeral**: In-memory client (default)
2. **Persistent**: Local storage client with persistence
3. **HTTP**: Remote client via HTTP
4. **Cloud**: Managed cloud client

## ⚙️ Configuration Options

| Option | Environment Variable | Description | Default |
|--------|---------------------|-------------|---------|
| `--client-type` | `CHROMA_CLIENT_TYPE` | Type of client (ephemeral, persistent, http, cloud) | ephemeral |
| `--data-dir` | `CHROMA_DATA_DIR` | Directory for persistent storage | None |
| `--host` | `CHROMA_HOST` | Host for HTTP client | None |
| `--port` | `CHROMA_PORT` | Port for HTTP client | None |
| `--custom-auth-credentials` | `CHROMA_CUSTOM_AUTH_CREDENTIALS` | Credentials for HTTP client (`username:password`) | None |
| `--ssl` | `CHROMA_SSL` | Use SSL for HTTP client | true |
| `--tenant` | `CHROMA_TENANT` | Tenant for cloud client | None |
| `--database` | `CHROMA_DATABASE` | Database for cloud client | None |
| `--api-key` | `CHROMA_API_KEY` | API key for cloud client | None |
| `--dotenv-path` | `CHROMA_DOTENV_PATH` | Path to .env file | .chroma_env |

## 🧰 Tools

### Collection Tools

- `chroma_list_collections`: List all collections
- `chroma_create_collection`: Create a new collection
- `chroma_peek_collection`: Preview documents in a collection
- `chroma_get_collection_info`: Get metadata about a collection
- `chroma_get_collection_count`: Count documents in a collection
- `chroma_modify_collection`: Update collection properties
- `chroma_delete_collection`: Delete a collection

### Document Tools

- `chroma_add_documents`: Add documents to a collection
- `chroma_query_documents`: Search for similar documents
- `chroma_get_documents`: Retrieve documents from a collection
- `chroma_update_documents`: Update existing documents
- `chroma_delete_documents`: Delete documents from a collection

### Thought Processing

- `process_thought`: Process thoughts in an ongoing session

## 📝 Examples

### Creating a Collection

```json
{
  "collection_name": "my_documents",
  "metadata": {
    "description": "A collection of example documents"
  }
}
```

### Querying Documents

```json
{
  "collection_name": "my_documents",
  "query_texts": ["What are the benefits of vector databases?"],
  "n_results": 3
}
```

## 🔧 Integration with Claude

You can use MCP-Chroma with Claude by setting up a configuration like:

```json
{
  "mcpServers": {
    "chroma": {
      "command": "mcp-chroma",
      "args": [
        "--client-type",
        "http",
        "--host",
        "localhost",
        "--port",
        "8000"
      ],
      "env": {
        "CHROMA_API_KEY": "<YOUR_API_KEY>"
      }
    }
  }
}
```

## 🖥️ Integration with Cursor

To use MCP-Chroma with Cursor, add the following to your `.vscode/mcp.json` file:

```json
{
  "mcp": {
    "inputs": [
      {
        "type": "promptString",
        "id": "chroma_api_key",
        "description": "ChromaDB API Key",
        "password": true
      }
    ],
    "servers": {
      "chroma": {
        "command": "mcp-chroma",
        "args": [
          "--client-type",
          "http",
          "--host",
          "localhost",
          "--port",
          "8000"
        ],
        "env": {
          "CHROMA_API_KEY": "${input:chroma_api_key}"
        }
      }
    }
  }
}
```

## 📄 License

[MIT License](LICENSE)

```
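
The configuration table in the README lists a flag, an environment variable, and a default for each option. With clap's `env` support, a value given on the command line takes precedence over the environment variable, which in turn takes precedence over the default. A minimal sketch of that lookup order, using a hypothetical `resolve_option` helper that is not part of this repository:

```rust
/// Picks the effective value for one option: the CLI flag first, then the
/// environment variable, then the built-in default.
/// (Illustrative only; in the project this resolution is done by clap.)
pub fn resolve_option(cli: Option<&str>, env: Option<&str>, default: &str) -> String {
    cli.or(env).unwrap_or(default).to_string()
}
```

For example, `resolve_option(None, Some("http"), "ephemeral")` yields `"http"`, while passing `None` for both sources falls back to `"ephemeral"`.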

--------------------------------------------------------------------------------
/src/lib.rs:
--------------------------------------------------------------------------------

```rust
pub mod client;
pub mod tools; 
pub mod config;
```

--------------------------------------------------------------------------------
/PROMPT.md:
--------------------------------------------------------------------------------

```markdown
When debugging, exploring, or performing any operation related to:
- Vector embeddings
- Document collections
- Semantic search
- Vector similarity

Always use the built-in ChromaDB tools (`chroma_*`) for these operations rather than attempting to implement alternative solutions.

Remember that all operations involving embeddings storage, retrieval, and querying should leverage ChromaDB's functionality through the provided tools. 
```

--------------------------------------------------------------------------------
/Cargo.toml:
--------------------------------------------------------------------------------

```toml
[package]
name = "mcp-chroma"
version = "1.0.2"
edition = "2024"
description = "MCP server for ChromaDB"
authors = ["viable"]

[dependencies]
mcp-server = "0.1.0"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
tokio = { version = "1.0", features = ["rt-multi-thread", "macros", "io-util", "sync", "time"] }
anyhow = "1.0"
colored = "3.0"
async-trait = "0.1.88"
mcp-spec = "0.1.0"
thiserror = "2.0.12"
tracing = "0.1"
tracing-subscriber = "0.3"
clap = { version = "4.3", features = ["derive", "env"] }
reqwest = { version = "0.12.15", features = ["json", "native-tls"] }
dotenv = "0.15"
uuid = { version = "1.3", features = ["v4", "serde"] }

[profile.release]
codegen-units = 1
opt-level = 3
panic = "abort"
lto = true
debug = false
strip = true

```

--------------------------------------------------------------------------------
/src/config.rs:
--------------------------------------------------------------------------------

```rust
use clap::Parser;
use std::path::PathBuf;

#[derive(Debug, Parser, Clone)]
#[command(author, version, about, long_about = None)]
pub struct Config {
    #[arg(long, env = "CHROMA_CLIENT_TYPE", value_enum, default_value = "ephemeral")]
    pub client_type: ClientType,

    #[arg(long, env = "CHROMA_DATA_DIR")]
    pub data_dir: Option<PathBuf>,

    #[arg(long, env = "CHROMA_HOST")]
    pub host: Option<String>,

    #[arg(long, env = "CHROMA_PORT")]
    pub port: Option<u16>,

    #[arg(long, env = "CHROMA_CUSTOM_AUTH_CREDENTIALS")]
    pub custom_auth_credentials: Option<String>,

    #[arg(long, env = "CHROMA_TENANT")]
    pub tenant: Option<String>,

    #[arg(long, env = "CHROMA_DATABASE")]
    pub database: Option<String>,

    #[arg(long, env = "CHROMA_API_KEY")]
    pub api_key: Option<String>,

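    // `ArgAction::Set` makes the flag take an explicit value (`--ssl false`);
    // a plain boolean flag that defaults to true could never be switched off on the CLI.
    #[arg(long, env = "CHROMA_SSL", default_value_t = true, action = clap::ArgAction::Set)]
    pub ssl: bool,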

    #[arg(long, env = "CHROMA_DOTENV_PATH", default_value = ".chroma_env")]
    pub dotenv_path: PathBuf,
}

#[derive(Debug, Clone, clap::ValueEnum)]
pub enum ClientType {
    Http,
    Cloud,
    Persistent,
    Ephemeral,
}

impl Config {
    pub fn validate(&self) -> anyhow::Result<()> {
        match self.client_type {
            ClientType::Http => {
                if self.host.is_none() {
                    anyhow::bail!("Host must be provided for HTTP client");
                }
            }
            ClientType::Cloud => {
                if self.tenant.is_none() {
                    anyhow::bail!("Tenant must be provided for cloud client");
                }
                if self.database.is_none() {
                    anyhow::bail!("Database must be provided for cloud client");
                }
                if self.api_key.is_none() {
                    anyhow::bail!("API key must be provided for cloud client");
                }
            }
            ClientType::Persistent => {
                if self.data_dir.is_none() {
                    anyhow::bail!("Data directory must be provided for persistent client");
                }
            }
            ClientType::Ephemeral => {}
        }
        Ok(())
    }
}


```
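
`Config::validate` above stops at the first missing setting. The sketch below shows the same per-client requirements in a std-only form that reports every missing setting at once. It is an alternative for illustration, not the project's implementation: `Settings` and `missing_settings` are hypothetical names, and `Settings` is a trimmed-down stand-in for `Config` without clap.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ClientType {
    Http,
    Cloud,
    Persistent,
    Ephemeral,
}

/// Hypothetical stand-in for `Config`, keeping only the validated fields.
#[derive(Debug, Default)]
pub struct Settings {
    pub host: Option<String>,
    pub data_dir: Option<String>,
    pub tenant: Option<String>,
    pub database: Option<String>,
    pub api_key: Option<String>,
}

/// Returns the names of all required settings that are absent for `client_type`.
pub fn missing_settings(client_type: ClientType, s: &Settings) -> Vec<&'static str> {
    // Each entry pairs a setting name with whether it is present.
    let required: Vec<(&'static str, bool)> = match client_type {
        ClientType::Http => vec![("host", s.host.is_some())],
        ClientType::Cloud => vec![
            ("tenant", s.tenant.is_some()),
            ("database", s.database.is_some()),
            ("api_key", s.api_key.is_some()),
        ],
        ClientType::Persistent => vec![("data_dir", s.data_dir.is_some())],
        ClientType::Ephemeral => vec![],
    };
    required
        .into_iter()
        .filter(|(_, present)| !*present)
        .map(|(name, _)| name)
        .collect()
}
```

An empty result means the settings are sufficient for that client type; the ephemeral client never requires anything.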

--------------------------------------------------------------------------------
/src/client.rs:
--------------------------------------------------------------------------------

```rust
use anyhow::Result;
use serde_json::json;
use std::sync::{Arc, Mutex, MutexGuard};

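/// Placeholder ChromaDB client. Its methods return canned data and do not
/// contact a ChromaDB server; the connection fields are stored but unused.
#[allow(dead_code)]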
#[derive(Debug, Clone)]
pub struct ChromaClient {
    host: String,
    port: u16,
    username: Option<String>,
    password: Option<String>,
}

impl ChromaClient {
    pub fn new(
        host: &str,
        port: u16,
        username: Option<&str>,
        password: Option<&str>,
    ) -> Self {
        Self {
            host: host.to_string(),
            port,
            username: username.map(|s| s.to_string()),
            password: password.map(|s| s.to_string()),
        }
    }

    pub fn list_collections(&self, _limit: Option<usize>, _offset: Option<usize>) -> Result<Vec<String>> {
        Ok(vec!["test_collection".to_string()])
    }

    pub fn create_collection(&self, name: &str, _metadata: Option<serde_json::Value>) -> Result<String> {
        Ok(format!("Created collection: {}", name))
    }

    pub fn get_collection(&self, name: &str) -> Result<Collection> {
        Ok(Collection {
            name: name.to_string(),
        })
    }

    pub fn delete_collection(&self, _name: &str) -> Result<()> {
        Ok(())
    }
}

#[allow(dead_code)]
#[derive(Debug, Clone)]
pub struct Collection {
    pub name: String,
}

impl Collection {
    pub fn add(
        &self,
        _documents: Vec<String>,
        _metadatas: Option<Vec<serde_json::Value>>,
        _ids: Vec<String>,
    ) -> Result<()> {
        Ok(())
    }

    pub fn query(
        &self,
        _query_texts: Vec<String>,
        _n_results: usize,
        _where_filter: Option<serde_json::Value>,
        _where_document: Option<serde_json::Value>,
        _include: Vec<String>,
    ) -> Result<serde_json::Value> {
        Ok(json!({
            "ids": [["doc1", "doc2"]],
            "documents": [["document1", "document2"]],
            "metadatas": [[{"source": "test1"}, {"source": "test2"}]],
            "distances": [[0.1, 0.2]],
        }))
    }

    pub fn get(
        &self,
        _ids: Option<Vec<String>>,
        _where_filter: Option<serde_json::Value>,
        _where_document: Option<serde_json::Value>,
        _include: Vec<String>,
        _limit: Option<usize>,
        _offset: Option<usize>,
    ) -> Result<serde_json::Value> {
        Ok(json!({
            "ids": ["doc1", "doc2"],
            "documents": ["document1", "document2"],
            "metadatas": [{"source": "test1"}, {"source": "test2"}]
        }))
    }

    pub fn update(
        &self,
        _ids: Vec<String>,
        _embeddings: Option<Vec<Vec<f32>>>,
        _metadatas: Option<Vec<serde_json::Value>>,
        _documents: Option<Vec<String>>,
    ) -> Result<()> {
        Ok(())
    }

    pub fn delete(&self, _ids: Vec<String>) -> Result<()> {
        Ok(())
    }

    pub fn count(&self) -> Result<usize> {
        Ok(3)
    }

    pub fn peek(&self, _limit: usize) -> Result<serde_json::Value> {
        Ok(json!({
            "ids": ["doc1", "doc2"],
            "documents": ["document1", "document2"],
            "metadatas": [{"source": "test1"}, {"source": "test2"}]
        }))
    }

    pub fn modify(
        &self,
        _name: Option<String>,
        _metadata: Option<serde_json::Value>,
    ) -> Result<()> {
        Ok(())
    }
}

static CLIENT: Mutex<Option<ChromaClient>> = Mutex::new(None);

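/// Builds the global client from environment variables.
///
/// Note: settings are read from the environment only, so values supplied
/// solely as CLI flags (for example `--host`) are not seen here.
pub fn initialize_client() -> Result<()> {
    let host = std::env::var("CHROMA_HOST").unwrap_or_else(|_| "localhost".to_string());
    let port = std::env::var("CHROMA_PORT")
        .unwrap_or_else(|_| "8000".to_string())
        .parse()
        .unwrap_or(8000);
    // Credentials use the `username:password` form of CHROMA_CUSTOM_AUTH_CREDENTIALS,
    // matching `.chroma_env.example` and `Config::custom_auth_credentials`.
    let credentials = std::env::var("CHROMA_CUSTOM_AUTH_CREDENTIALS").ok();
    let (username, password) = match credentials.as_deref().and_then(|c| c.split_once(':')) {
        Some((user, pass)) => (Some(user.to_string()), Some(pass.to_string())),
        None => (None, None),
    };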

    let client = ChromaClient::new(
        &host, 
        port,
        username.as_deref(),
        password.as_deref(),
    );
    
    let mut global_client = CLIENT.lock().unwrap();
    *global_client = Some(client);
    
    Ok(())
}

pub fn get_client() -> Arc<ChromaClient> {
    let client_guard: MutexGuard<Option<ChromaClient>> = CLIENT.lock().unwrap();
    
    if client_guard.is_none() {
        drop(client_guard);
        initialize_client().expect("Failed to initialize client");
        return get_client();
    }
    
    Arc::new(client_guard.as_ref().unwrap().clone())
}

```
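
`get_client` above keeps the client in a `Mutex<Option<...>>`, initializes it lazily, and clones it into a fresh `Arc` on each call. One possible alternative is `std::sync::OnceLock` (stable since Rust 1.70), which initializes exactly once and hands out the same `Arc` every time. The sketch below assumes a hypothetical `DemoClient` type in place of `ChromaClient`; it is not the project's code.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for `ChromaClient`, reduced to two fields.
#[derive(Debug)]
pub struct DemoClient {
    pub host: String,
    pub port: u16,
}

static CLIENT: OnceLock<Arc<DemoClient>> = OnceLock::new();

/// Returns the shared client, building it from the environment on first use.
pub fn get_client() -> Arc<DemoClient> {
    let client = CLIENT.get_or_init(|| {
        let host = std::env::var("CHROMA_HOST").unwrap_or_else(|_| "localhost".to_string());
        // Fall back to 8000 when the variable is unset or not a valid port.
        let port = std::env::var("CHROMA_PORT")
            .ok()
            .and_then(|p| p.parse().ok())
            .unwrap_or(8000);
        Arc::new(DemoClient { host, port })
    });
    // Cloning the `Arc` only bumps a reference count; the client is not copied.
    Arc::clone(client)
}
```

With this shape there is no lock to hold or drop by hand and no recursive call; repeated calls return handles to the same allocation.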

--------------------------------------------------------------------------------
/src/main.rs:
--------------------------------------------------------------------------------

```rust
mod client;
mod config;
mod tools;

use anyhow::Result;
use clap::Parser;
use config::Config;
use mcp_server::{router::Router, Server, router::RouterService, ByteTransport};
use mcp_spec::{
    content::Content,
    handler::{PromptError, ResourceError, ToolError},
    prompt::Prompt,
    protocol::ServerCapabilities,
    resource::Resource,
    tool::Tool,
};
use serde::{Deserialize, Serialize};
use serde_json::Value;
use std::future::Future;
use std::path::Path;
use std::pin::Pin;
use tokio::io::{stdin, stdout};
use tracing_subscriber::EnvFilter;

#[derive(Clone)]
struct ChromaRouter {}

impl ChromaRouter {
    fn new(_config: Config) -> Self {
        Self {}
    }
    
    async fn call_tool_method<T, R, F, Fut>(&self, args: Value, f: F) -> Result<Value, anyhow::Error> 
    where
        T: for<'de> Deserialize<'de>,
        R: Serialize,
        F: FnOnce(T) -> Fut,
        Fut: Future<Output = Result<R>>,
    {
        let args = serde_json::from_value(args)?;
        let result = f(args).await?;
        serde_json::to_value(result).map_err(Into::into)
    }
    
    async fn dispatch_method(&self, name: &str, args: Value) -> Result<Value, anyhow::Error> {
        match name {
            "chroma_list_collections" => {
                self.call_tool_method(args, tools::chroma_list_collections).await
            }
            "chroma_create_collection" => {
                self.call_tool_method(args, tools::chroma_create_collection).await
            }
            "chroma_peek_collection" => {
                self.call_tool_method(args, tools::chroma_peek_collection).await
            }
            "chroma_get_collection_info" => {
                self.call_tool_method(args, tools::chroma_get_collection_info).await
            }
            "chroma_get_collection_count" => {
                self.call_tool_method(args, tools::chroma_get_collection_count).await
            }
            "chroma_modify_collection" => {
                self.call_tool_method(args, tools::chroma_modify_collection).await
            }
            "chroma_delete_collection" => {
                self.call_tool_method(args, tools::chroma_delete_collection).await
            }
            "chroma_add_documents" => {
                self.call_tool_method(args, tools::chroma_add_documents).await
            }
            "chroma_query_documents" => {
                self.call_tool_method(args, tools::chroma_query_documents).await
            }
            "chroma_get_documents" => {
                self.call_tool_method(args, tools::chroma_get_documents).await
            }
            "chroma_update_documents" => {
                self.call_tool_method(args, tools::chroma_update_documents).await
            }
            "chroma_delete_documents" => {
                self.call_tool_method(args, tools::chroma_delete_documents).await
            }
            "process_thought" => {
                self.call_tool_method(args, tools::process_thought).await
            }
            _ => Err(anyhow::anyhow!("Method not found: {}", name)),
        }
    }
}

impl Router for ChromaRouter {
    fn name(&self) -> String {
        "mcp-chroma".to_string()
    }

    fn instructions(&self) -> String {
        "ChromaDB MCP Server provides tools to work with vector embeddings, collections, and documents.".to_string()
    }

    fn capabilities(&self) -> ServerCapabilities {
        mcp_server::router::CapabilitiesBuilder::new()
            .with_tools(true)
            .build()
    }

    fn list_tools(&self) -> Vec<Tool> {
        tools::get_tool_definitions()
    }

    fn call_tool(
        &self,
        tool_name: &str,
        arguments: Value,
    ) -> Pin<Box<dyn Future<Output = Result<Vec<Content>, ToolError>> + Send + 'static>> {
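        let tool_name = tool_name.to_string();
        // Clone the router up front so the 'static future does not borrow `self`,
        // and so CLI arguments are not re-parsed on every tool call.
        let router = self.clone();

        Box::pin(async move {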
            match router.dispatch_method(&tool_name, arguments).await {
                Ok(value) => {
                    let json_str = serde_json::to_string_pretty(&value)
                        .map_err(|e| ToolError::ExecutionError(e.to_string()))?;
                    Ok(vec![Content::text(json_str)])
                }
                Err(err) => Err(ToolError::ExecutionError(err.to_string())),
            }
        })
    }

    fn list_resources(&self) -> Vec<Resource> {
        vec![]
    }

    fn read_resource(
        &self,
        _uri: &str,
    ) -> Pin<Box<dyn Future<Output = Result<String, ResourceError>> + Send + 'static>> {
        Box::pin(async { Err(ResourceError::NotFound("Resource not found".to_string())) })
    }

    fn list_prompts(&self) -> Vec<Prompt> {
        vec![]
    }

    fn get_prompt(
        &self,
        _prompt_name: &str,
    ) -> Pin<Box<dyn Future<Output = Result<String, PromptError>> + Send + 'static>> {
        Box::pin(async { Err(PromptError::NotFound("Prompt not found".to_string())) })
    }
}

async fn run_server(transport: ByteTransport<tokio::io::Stdin, tokio::io::Stdout>, config: Config) -> Result<()> {
    let router = ChromaRouter::new(config);
    let router_service = RouterService(router);
    let server = Server::new(router_service);
    
    tracing::info!("Starting MCP server with transport");
    server.run(transport).await?;
    
    Ok(())
}

#[tokio::main]
async fn main() -> Result<()> {
    tracing_subscriber::fmt()
        .with_env_filter(EnvFilter::from_default_env().add_directive(tracing::Level::INFO.into()))
        .with_writer(std::io::stderr)
        .init();

    let mut config = Config::parse();

    if Path::new(&config.dotenv_path).exists() {
        tracing::debug!("Loading environment from {}", config.dotenv_path.display());
        dotenv::from_path(&config.dotenv_path)?;
        config = Config::parse();
    } else {
        tracing::warn!("Environment file {} not found, using defaults", config.dotenv_path.display());
    }
    
    config.validate()?;
    client::initialize_client()?;
    run_server(ByteTransport::new(stdin(), stdout()), config).await
}

```
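
`dispatch_method` and `call_tool_method` above follow a common pattern: match on the tool name, decode the arguments into the handler's input type, run the handler, and encode the result. The following is a simplified, synchronous, std-only sketch of that pattern. It swaps serde for `FromStr`/`Display`, and the `double` and `shout` handlers are hypothetical examples rather than tools from this server.

```rust
use std::fmt::Display;
use std::str::FromStr;

/// Parses `raw` into the handler's argument type, runs the handler,
/// and renders the result as text.
pub fn call_handler<T, R, F>(raw: &str, handler: F) -> Result<String, String>
where
    T: FromStr,
    T::Err: Display,
    R: Display,
    F: FnOnce(T) -> Result<R, String>,
{
    let args = raw.trim().parse::<T>().map_err(|e| e.to_string())?;
    let result = handler(args)?;
    Ok(result.to_string())
}

/// Hypothetical handler standing in for a tool function.
pub fn double(n: u32) -> Result<u32, String> {
    n.checked_mul(2).ok_or_else(|| "overflow".to_string())
}

/// Routes a tool name to its handler; unknown names produce an error.
pub fn dispatch(name: &str, raw: &str) -> Result<String, String> {
    match name {
        "double" => call_handler(raw, double),
        "shout" => call_handler(raw, |s: String| Ok::<String, String>(s.to_uppercase())),
        _ => Err(format!("Method not found: {}", name)),
    }
}
```

The generic adapter keeps each match arm to one line, and argument decoding errors surface the same way as handler errors, which mirrors how the router converts both into `ToolError::ExecutionError`.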

--------------------------------------------------------------------------------
/src/tools.rs:
--------------------------------------------------------------------------------

```rust
use crate::client::get_client;
use anyhow::{anyhow, Result};
use serde::{Deserialize, Serialize};
use serde_json::Value;
use mcp_spec::tool::Tool;


#[derive(Debug, Serialize, Deserialize)]
pub struct ListCollectionsRequest {
    pub limit: Option<usize>,
    pub offset: Option<usize>,
}

pub async fn chroma_list_collections(request: ListCollectionsRequest) -> Result<Vec<String>> {
    let client = get_client();
    client.list_collections(request.limit, request.offset)
}

#[derive(Debug, Serialize, Deserialize)]
pub struct CreateCollectionRequest {
    pub collection_name: String,
    pub embedding_function_name: Option<String>,
    pub metadata: Option<Value>,
    pub space: Option<String>,
    pub ef_construction: Option<i32>,
    pub ef_search: Option<i32>,
    pub max_neighbors: Option<i32>,
    pub num_threads: Option<i32>,
    pub batch_size: Option<i32>,
    pub sync_threshold: Option<i32>,
    pub resize_factor: Option<f32>,
}

pub async fn chroma_create_collection(request: CreateCollectionRequest) -> Result<String> {
    let client = get_client();
    client.create_collection(&request.collection_name, request.metadata)
}

#[derive(Debug, Serialize, Deserialize)]
pub struct PeekCollectionRequest {
    pub collection_name: String,
    pub limit: usize,
}

pub async fn chroma_peek_collection(request: PeekCollectionRequest) -> Result<Value> {
    let client = get_client();
    let collection = client.get_collection(&request.collection_name)?;
    collection.peek(request.limit)
}

#[derive(Debug, Serialize, Deserialize)]
pub struct GetCollectionInfoRequest {
    pub collection_name: String,
}

pub async fn chroma_get_collection_info(request: GetCollectionInfoRequest) -> Result<Value> {
    let client = get_client();
    let collection = client.get_collection(&request.collection_name)?;
    let count = collection.count()?;
    let sample_documents = collection.peek(3)?;
    
    Ok(serde_json::json!({
        "name": request.collection_name,
        "count": count,
        "sample_documents": sample_documents
    }))
}

#[derive(Debug, Serialize, Deserialize)]
pub struct GetCollectionCountRequest {
    pub collection_name: String,
}

pub async fn chroma_get_collection_count(request: GetCollectionCountRequest) -> Result<usize> {
    let client = get_client();
    let collection = client.get_collection(&request.collection_name)?;
    collection.count()
}

#[derive(Debug, Serialize, Deserialize)]
pub struct ModifyCollectionRequest {
    pub collection_name: String,
    pub new_name: Option<String>,
    pub new_metadata: Option<Value>,
    pub ef_search: Option<i32>,
    pub num_threads: Option<i32>,
    pub batch_size: Option<i32>,
    pub sync_threshold: Option<i32>,
    pub resize_factor: Option<f32>,
}

pub async fn chroma_modify_collection(request: ModifyCollectionRequest) -> Result<String> {
    let client = get_client();
    let collection = client.get_collection(&request.collection_name)?;
    collection.modify(request.new_name.clone(), request.new_metadata.clone())?;
    
    let mut modified_aspects = Vec::new();
    if request.new_name.is_some() { modified_aspects.push("name"); }
    if request.new_metadata.is_some() { modified_aspects.push("metadata"); }
    if request.ef_search.is_some() || request.num_threads.is_some() || 
       request.batch_size.is_some() || request.sync_threshold.is_some() || 
       request.resize_factor.is_some() { modified_aspects.push("hnsw"); }
    
    Ok(format!("Successfully modified collection {}: updated {}", 
               request.collection_name, 
               modified_aspects.join(" and ")))
}

#[derive(Debug, Serialize, Deserialize)]
pub struct DeleteCollectionRequest {
    pub collection_name: String,
}

pub async fn chroma_delete_collection(request: DeleteCollectionRequest) -> Result<String> {
    let client = get_client();
    client.delete_collection(&request.collection_name)?;
    Ok(format!("Successfully deleted collection {}", request.collection_name))
}


#[derive(Debug, Serialize, Deserialize)]
pub struct AddDocumentsRequest {
    pub collection_name: String,
    pub documents: Vec<String>,
    pub metadatas: Option<Vec<Value>>,
    pub ids: Option<Vec<String>>,
}

pub async fn chroma_add_documents(request: AddDocumentsRequest) -> Result<String> {
    if request.documents.is_empty() {
        return Err(anyhow!("The 'documents' list cannot be empty."));
    }
    
    let client = get_client();
    let collection = client.get_collection(&request.collection_name)?;
    
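    // Generate random UUIDs when no IDs are supplied; positional indices
    // ("0", "1", ...) would collide across successive calls.
    let ids = match request.ids {
        Some(ids) => ids,
        None => request.documents.iter().map(|_| uuid::Uuid::new_v4().to_string()).collect(),
    };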
    
    let documents_len = request.documents.len();
    collection.add(request.documents.clone(), request.metadatas.clone(), ids)?;
    
    Ok(format!("Successfully added {} documents to collection {}", 
               documents_len, 
               request.collection_name))
}

#[derive(Debug, Serialize, Deserialize)]
pub struct QueryDocumentsRequest {
    pub collection_name: String,
    pub query_texts: Vec<String>,
    pub n_results: Option<usize>,
    pub where_filter: Option<Value>,
    pub where_document: Option<Value>,
    pub include: Option<Vec<String>>,
}

pub async fn chroma_query_documents(request: QueryDocumentsRequest) -> Result<Value> {
    if request.query_texts.is_empty() {
        return Err(anyhow!("The 'query_texts' list cannot be empty."));
    }
    
    let client = get_client();
    let collection = client.get_collection(&request.collection_name)?;
    
    let n_results = request.n_results.unwrap_or(5);
    let include = request.include.unwrap_or_else(|| vec!["documents".to_string(), "metadatas".to_string(), "distances".to_string()]);
    
    collection.query(request.query_texts, n_results, request.where_filter, request.where_document, include)
}

#[derive(Debug, Serialize, Deserialize)]
pub struct GetDocumentsRequest {
    pub collection_name: String,
    pub ids: Option<Vec<String>>,
    pub where_filter: Option<Value>,
    pub where_document: Option<Value>,
    pub include: Option<Vec<String>>,
    pub limit: Option<usize>,
    pub offset: Option<usize>,
}

pub async fn chroma_get_documents(request: GetDocumentsRequest) -> Result<Value> {
    let client = get_client();
    let collection = client.get_collection(&request.collection_name)?;
    
    let include = request.include.unwrap_or_else(|| vec!["documents".to_string(), "metadatas".to_string()]);
    
    collection.get(request.ids, request.where_filter, request.where_document, include, request.limit, request.offset)
}

#[derive(Debug, Serialize, Deserialize)]
pub struct UpdateDocumentsRequest {
    pub collection_name: String,
    pub ids: Vec<String>,
    pub embeddings: Option<Vec<Vec<f32>>>,
    pub metadatas: Option<Vec<Value>>,
    pub documents: Option<Vec<String>>,
}

pub async fn chroma_update_documents(request: UpdateDocumentsRequest) -> Result<String> {
    if request.ids.is_empty() {
        return Err(anyhow!("The 'ids' list cannot be empty."));
    }
    
    if request.embeddings.is_none() && request.metadatas.is_none() && request.documents.is_none() {
        return Err(anyhow!("At least one of 'embeddings', 'metadatas', or 'documents' must be provided for update."));
    }
    
    let check_length = |name: &str, len: usize| {
        if len != request.ids.len() {
            return Err(anyhow!("Length of '{}' list must match length of 'ids' list.", name));
        }
        Ok(())
    };
    
    if let Some(ref embeddings) = request.embeddings {
        check_length("embeddings", embeddings.len())?;
    }
    
    if let Some(ref metadatas) = request.metadatas {
        check_length("metadatas", metadatas.len())?;
    }
    
    if let Some(ref documents) = request.documents {
        check_length("documents", documents.len())?;
    }
    
    let client = get_client();
    let collection = client.get_collection(&request.collection_name)?;
    
    collection.update(request.ids.clone(), request.embeddings, request.metadatas, request.documents)?;
    
    Ok(format!(
        "Successfully updated {} documents in collection '{}'",
        request.ids.len(),
        request.collection_name
    ))
}

#[derive(Debug, Serialize, Deserialize)]
pub struct DeleteDocumentsRequest {
    pub collection_name: String,
    pub ids: Vec<String>,
}

pub async fn chroma_delete_documents(request: DeleteDocumentsRequest) -> Result<String> {
    if request.ids.is_empty() {
        return Err(anyhow!("The 'ids' list cannot be empty."));
    }
    
    let client = get_client();
    let collection = client.get_collection(&request.collection_name)?;
    
    collection.delete(request.ids.clone())?;
    
    Ok(format!(
        "Successfully deleted {} documents from collection '{}'",
        request.ids.len(),
        request.collection_name
    ))
}


#[derive(Debug, Serialize, Deserialize)]
pub struct ThoughtData {
    pub session_id: String,
    pub thought: String,
    pub thought_number: usize,
    pub total_thoughts: usize,
    pub next_thought_needed: bool,
    pub is_revision: Option<bool>,
    pub revises_thought: Option<usize>,
    pub branch_from_thought: Option<usize>,
    pub branch_id: Option<String>,
    pub needs_more_thoughts: Option<bool>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct ThoughtResponse {
    pub session_id: String,
    pub thought_number: usize,
    pub total_thoughts: usize,
    pub next_thought_needed: bool,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub error: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub status: Option<String>,
}

fn validate_thought_data(input_data: &ThoughtData) -> Result<()> {
    if input_data.session_id.is_empty() {
        return Err(anyhow!("Invalid session_id: must be a non-empty string"));
    }
    if input_data.thought.is_empty() {
        return Err(anyhow!("Invalid thought: must be a non-empty string"));
    }
    if input_data.thought_number == 0 {
        return Err(anyhow!("Invalid thought_number: must be greater than 0"));
    }
    if input_data.total_thoughts == 0 {
        return Err(anyhow!("Invalid total_thoughts: must be greater than 0"));
    }
    
    Ok(())
}

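/// Validates a thought and echoes its bookkeeping fields back to the caller.
///
/// Validation failures are reported inside the response (`error` set and
/// `status` = "failed") rather than as an `Err`, so this function currently
/// always returns `Ok`.
pub async fn process_thought(input_data: ThoughtData) -> Result<ThoughtResponse> {
    match validate_thought_data(&input_data) {
        Ok(()) => {
            // Raise the total if the caller has already passed its own estimate.
            let total_thoughts = std::cmp::max(input_data.thought_number, input_data.total_thoughts);
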
            Ok(ThoughtResponse {
                session_id: input_data.session_id,
                thought_number: input_data.thought_number,
                total_thoughts,
                next_thought_needed: input_data.next_thought_needed,
                error: None,
                status: None,
            })
        }
        Err(e) => {
            Ok(ThoughtResponse {
                session_id: input_data.session_id,
                thought_number: input_data.thought_number,
                total_thoughts: input_data.total_thoughts,
                next_thought_needed: input_data.next_thought_needed,
                error: Some(e.to_string()),
                status: Some("failed".to_string()),
            })
        }
    }
}

pub fn get_tool_definitions() -> Vec<Tool> {
    let mut tools = Vec::new();
    
    let add_tool = |tools: &mut Vec<Tool>, name: &str, description: &str, schema: Value| {
        tools.push(Tool {
            name: name.to_string(),
            description: description.to_string(),
            input_schema: schema,
        });
    };
    
    add_tool(
        &mut tools,
        "chroma_list_collections",
        "Lists all collections in the ChromaDB instance",
        serde_json::json!({
            "type": "object",
            "properties": {
                "limit": {"type": "integer", "description": "Maximum number of collections to return"},
                "offset": {"type": "integer", "description": "Offset for pagination"}
            }
        })
    );
    
    add_tool(
        &mut tools,
        "chroma_create_collection",
        "Creates a new collection in ChromaDB",
        serde_json::json!({
            "type": "object",
            "required": ["collection_name"],
            "properties": {
                "collection_name": {"type": "string", "description": "Name of the collection to create"},
                "metadata": {"type": "object", "description": "Optional metadata for the collection"},
                "embedding_function_name": {"type": "string", "description": "Name of the embedding function to use"}
            }
        })
    );
    
    add_tool(
        &mut tools,
        "chroma_peek_collection",
        "Shows a sample of documents in a collection",
        serde_json::json!({
            "type": "object",
            "required": ["collection_name", "limit"],
            "properties": {
                "collection_name": {"type": "string", "description": "Name of the collection to peek"},
                "limit": {"type": "integer", "description": "Number of documents to return"}
            }
        })
    );
    
    add_tool(
        &mut tools,
        "chroma_get_collection_info",
        "Gets metadata about a collection",
        serde_json::json!({
            "type": "object",
            "required": ["collection_name"],
            "properties": {
                "collection_name": {"type": "string", "description": "Name of the collection"}
            }
        })
    );
    
    add_tool(
        &mut tools,
        "chroma_get_collection_count",
        "Counts the number of documents in a collection",
        serde_json::json!({
            "type": "object",
            "required": ["collection_name"],
            "properties": {
                "collection_name": {"type": "string", "description": "Name of the collection"}
            }
        })
    );
    
    add_tool(
        &mut tools,
        "chroma_modify_collection",
        "Modifies collection properties",
        serde_json::json!({
            "type": "object",
            "required": ["collection_name"],
            "properties": {
                "collection_name": {"type": "string", "description": "Name of the collection to modify"},
                "new_name": {"type": "string", "description": "New name for the collection"},
                "new_metadata": {"type": "object", "description": "New metadata for the collection"}
            }
        })
    );
    
    add_tool(
        &mut tools,
        "chroma_delete_collection",
        "Deletes a collection",
        serde_json::json!({
            "type": "object",
            "required": ["collection_name"],
            "properties": {
                "collection_name": {"type": "string", "description": "Name of the collection to delete"}
            }
        })
    );
    
    add_tool(
        &mut tools,
        "chroma_add_documents",
        "Adds documents to a collection",
        serde_json::json!({
            "type": "object",
            "required": ["collection_name", "documents"],
            "properties": {
                "collection_name": {"type": "string", "description": "Name of the collection"},
                "documents": {"type": "array", "items": {"type": "string"}, "description": "List of documents to add"},
                "metadatas": {"type": "array", "items": {"type": "object"}, "description": "List of metadata objects for documents"},
                "ids": {"type": "array", "items": {"type": "string"}, "description": "List of IDs for documents"}
            }
        })
    );
    
    add_tool(
        &mut tools,
        "chroma_query_documents",
        "Searches for similar documents in a collection",
        serde_json::json!({
            "type": "object",
            "required": ["collection_name", "query_texts"],
            "properties": {
                "collection_name": {"type": "string", "description": "Name of the collection"},
                "query_texts": {"type": "array", "items": {"type": "string"}, "description": "List of query texts"},
                "n_results": {"type": "integer", "description": "Number of results to return per query"},
                "where_filter": {"type": "object", "description": "Filter by metadata"},
                "where_document": {"type": "object", "description": "Filter by document content"}
            }
        })
    );
    
    add_tool(
        &mut tools,
        "chroma_get_documents",
        "Retrieves documents from a collection",
        serde_json::json!({
            "type": "object",
            "required": ["collection_name"],
            "properties": {
                "collection_name": {"type": "string", "description": "Name of the collection"},
                "ids": {"type": "array", "items": {"type": "string"}, "description": "List of document IDs to retrieve"},
                "where_filter": {"type": "object", "description": "Filter by metadata"},
                "where_document": {"type": "object", "description": "Filter by document content"},
                "limit": {"type": "integer", "description": "Maximum number of documents to return"},
                "offset": {"type": "integer", "description": "Offset for pagination"}
            }
        })
    );
    
    add_tool(
        &mut tools,
        "chroma_update_documents",
        "Updates documents in a collection",
        serde_json::json!({
            "type": "object",
            "required": ["collection_name", "ids"],
            "properties": {
                "collection_name": {"type": "string", "description": "Name of the collection"},
                "ids": {"type": "array", "items": {"type": "string"}, "description": "List of document IDs to update"},
                "documents": {"type": "array", "items": {"type": "string"}, "description": "List of document contents"},
                "metadatas": {"type": "array", "items": {"type": "object"}, "description": "List of metadata objects"}
            }
        })
    );
    
    add_tool(
        &mut tools,
        "chroma_delete_documents",
        "Deletes documents from a collection",
        serde_json::json!({
            "type": "object",
            "required": ["collection_name", "ids"],
            "properties": {
                "collection_name": {"type": "string", "description": "Name of the collection"},
                "ids": {"type": "array", "items": {"type": "string"}, "description": "List of document IDs to delete"}
            }
        })
    );
    
    add_tool(
        &mut tools,
        "process_thought",
        "Processes a thought in an ongoing session",
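        serde_json::json!({
            "type": "object",
            "required": ["session_id", "thought", "thought_number", "total_thoughts", "next_thought_needed"],
            "properties": {
                "session_id": {"type": "string", "description": "Session identifier"},
                "thought": {"type": "string", "description": "Content of the current thought"},
                "thought_number": {"type": "integer", "description": "Number of this thought in the sequence"},
                "total_thoughts": {"type": "integer", "description": "Total expected thoughts"},
                "next_thought_needed": {"type": "boolean", "description": "Whether another thought is needed"},
                "is_revision": {"type": "boolean", "description": "Whether this thought revises an earlier thought"},
                "revises_thought": {"type": "integer", "description": "Number of the thought being revised"},
                "branch_from_thought": {"type": "integer", "description": "Number of the thought this branch starts from"},
                "branch_id": {"type": "string", "description": "Identifier of the branch"},
                "needs_more_thoughts": {"type": "boolean", "description": "Whether more thoughts are needed than currently estimated"}
            }
        })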
    );
    
    tools
}

```
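
The `process_thought` handler in `src/tools.rs` has two behaviors that are easy to miss: it raises `total_thoughts` to at least `thought_number`, and it reports validation failures inside the response instead of returning an `Err`. The block below is a minimal, dependency-free sketch of that logic, not the crate's own code. `Thought`, `Outcome`, `validate`, and `process` are hypothetical synchronous stand-ins that use `String` errors in place of `anyhow`, skip serde, and leave out the optional fields.

```rust
/// Hypothetical stand-in for `ThoughtData`, limited to the required fields.
#[derive(Debug, Clone)]
pub struct Thought {
    pub session_id: String,
    pub thought: String,
    pub thought_number: usize,
    pub total_thoughts: usize,
    pub next_thought_needed: bool,
}

/// Hypothetical stand-in for `ThoughtResponse`.
#[derive(Debug, PartialEq)]
pub struct Outcome {
    pub thought_number: usize,
    pub total_thoughts: usize,
    pub next_thought_needed: bool,
    pub error: Option<String>,
    pub status: Option<String>,
}

/// Applies the same four checks as `validate_thought_data`,
/// using plain `String` errors (an assumption made to avoid `anyhow`).
pub fn validate(t: &Thought) -> Result<(), String> {
    if t.session_id.is_empty() {
        return Err("session_id must not be empty".to_string());
    }
    if t.thought.is_empty() {
        return Err("thought must not be empty".to_string());
    }
    if t.thought_number == 0 {
        return Err("thought_number must be greater than 0".to_string());
    }
    if t.total_thoughts == 0 {
        return Err("total_thoughts must be greater than 0".to_string());
    }
    Ok(())
}

/// Synchronous sketch of the handler: failures become a "failed" outcome.
pub fn process(t: &Thought) -> Outcome {
    match validate(t) {
        Ok(()) => Outcome {
            thought_number: t.thought_number,
            // Never report a total smaller than the current thought number.
            total_thoughts: t.thought_number.max(t.total_thoughts),
            next_thought_needed: t.next_thought_needed,
            error: None,
            status: None,
        },
        Err(message) => Outcome {
            thought_number: t.thought_number,
            // On failure the caller's values are echoed back unchanged.
            total_thoughts: t.total_thoughts,
            next_thought_needed: t.next_thought_needed,
            error: Some(message),
            status: Some("failed".to_string()),
        },
    }
}
```

With this sketch, a thought numbered 5 with an estimated total of 3 comes back with a total of 5, while an empty `session_id` yields an outcome whose `status` is `"failed"` and whose `error` carries the message. A caller therefore has to inspect `error` or `status`, since the result itself is never an `Err`.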