# Directory Structure
```
├── .env.sample
├── .gitignore
├── .gitmodules
├── assets
│   ├── logo-full-b.svg
│   ├── logo-full-w.svg
│   └── logo-full.svg
├── chat
│   ├── .gitignore
│   ├── Dockerfile
│   ├── package-lock.json
│   ├── package.json
│   ├── public
│   │   ├── favicon.ico
│   │   ├── index.html
│   │   ├── logo192.png
│   │   ├── logo512.png
│   │   ├── manifest.json
│   │   └── robots.txt
│   ├── README.md
│   └── src
│       ├── App.css
│       ├── App.js
│       ├── App.test.js
│       ├── ChatApp.tsx
│       ├── index.css
│       ├── index.js
│       ├── logo.svg
│       ├── reportWebVitals.js
│       └── setupTests.js
├── docker-compose-chatgpt.yml
├── docker-compose-mcp.yml
├── docker-compose-ollama.yml
├── electron
│   ├── assets
│   │   ├── css
│   │   │   ├── no-topbar.css
│   │   │   └── style.css
│   │   ├── icons
│   │   │   ├── mac
│   │   │   │   └── favicon.icns
│   │   │   ├── png
│   │   │   │   └── favicon.png
│   │   │   └── win
│   │   │       └── favicon.ico
│   │   └── js
│   │       └── renderer.js
│   ├── index.html
│   ├── main.js
│   ├── package-lock.json
│   ├── package.json
│   ├── preload.js
│   ├── README.md
│   └── src
│       ├── menu.js
│       ├── print.js
│       ├── view.js
│       └── window.js
├── indexer
│   ├── app.py
│   ├── async_loop.py
│   ├── async_queue.py
│   ├── Dockerfile
│   ├── indexer.py
│   ├── requirements.txt
│   ├── singleton.py
│   └── storage.py
├── LICENSE
├── linker
│   ├── app.py
│   ├── Dockerfile
│   ├── requestor.py
│   └── requirements.txt
├── llm
│   ├── app.py
│   ├── async_answer_to_socket.py
│   ├── async_question_to_answer.py
│   ├── async_queue.py
│   ├── async_socket_to_chat.py
│   ├── control_flow_commands.py
│   ├── Dockerfile
│   ├── llm_chain.py
│   ├── minima_embed.py
│   └── requirements.txt
├── mcp-server
│   ├── pyproject.toml
│   ├── README.md
│   ├── src
│   │   └── minima
│   │       ├── __init__.py
│   │       ├── requestor.py
│   │       └── server.py
│   └── uv.lock
├── README.md
├── run_in_copilot.sh
└── run.sh
```
# Files
--------------------------------------------------------------------------------
/.env.sample:
--------------------------------------------------------------------------------
```
LOCAL_FILES_PATH
EMBEDDING_MODEL_ID
EMBEDDING_SIZE
OLLAMA_MODEL
RERANKER_MODEL
USER_ID
PASSWORD
```
--------------------------------------------------------------------------------
/.gitmodules:
--------------------------------------------------------------------------------
```
[submodule "minima-ui"]
    path = minima-ui
    url = [email protected]:pshenok/minima-ui.git
[submodule "aws"]
	path = aws
	url = https://github.com/pshenok/minima-aws.git
```
--------------------------------------------------------------------------------
/chat/.gitignore:
--------------------------------------------------------------------------------
```
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.
# dependencies
/node_modules
/.pnp
.pnp.js
# testing
/coverage
# production
/build
# misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local
npm-debug.log*
yarn-debug.log*
yarn-error.log*
```
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
```
__pycache__/
*.py[cod]
*$py.class
*.log
.cache
*.cover
.python-version
.env
*.env
.venv
.vscode/
local_files/
qdrant_data/
indexer_data/
ollama/
*.db
.envrc
.DS_Store
logs
npm-debug.log*
yarn-debug.log*
yarn-error.log*
firebase-debug.log*
firebase-debug.*.log*
.firebase/
pids
*.pid
*.seed
*.pid.lock
lib-cov
coverage
.nyc_output
.grunt
bower_components
.lock-wscript
build/Release
node_modules/
.npm
.eslintcache
.node_repl_history
*.tgz
.yarn-integrity
.dataconnect
*.local
```
--------------------------------------------------------------------------------
/electron/README.md:
--------------------------------------------------------------------------------
```markdown
## Installation
```bash
npm install
```
## Run
```bash
npm start
```
## Build
Packaged binaries for Windows, Linux, and macOS are written to the `release-builds/` folder.
### For Windows
```bash
npm run package-win
```
### For Linux
```bash
npm run package-linux
```
### For Mac
```bash
npm run package-mac
```
```
--------------------------------------------------------------------------------
/mcp-server/README.md:
--------------------------------------------------------------------------------
```markdown
# minima MCP server
RAG on local files with MCP
Please go through all the steps from the main README,
then just add the following to **/Library/Application\ Support/Claude/claude_desktop_config.json**
```
{
    "mcpServers": {
      "minima": {
        "command": "uv",
        "args": [
          "--directory",
          "/path_to_cloned_minima_project/mcp-server",
          "run",
          "minima"
        ]
      }
    }
  }
```
Afterwards, just open the Claude app and ask it to find context in your local files
```
--------------------------------------------------------------------------------
/chat/README.md:
--------------------------------------------------------------------------------
```markdown
# Getting Started with Create React App
This project was bootstrapped with [Create React App](https://github.com/facebook/create-react-app).
## Available Scripts
In the project directory, you can run:
### `npm start`
Runs the app in the development mode.\
Open [http://localhost:3000](http://localhost:3000) to view it in your browser.
The page will reload when you make changes.\
You may also see any lint errors in the console.
### `npm test`
Launches the test runner in the interactive watch mode.\
See the section about [running tests](https://facebook.github.io/create-react-app/docs/running-tests) for more information.
### `npm run build`
Builds the app for production to the `build` folder.\
It correctly bundles React in production mode and optimizes the build for the best performance.
The build is minified and the filenames include the hashes.\
Your app is ready to be deployed!
See the section about [deployment](https://facebook.github.io/create-react-app/docs/deployment) for more information.
### `npm run eject`
**Note: this is a one-way operation. Once you `eject`, you can't go back!**
If you aren't satisfied with the build tool and configuration choices, you can `eject` at any time. This command will remove the single build dependency from your project.
Instead, it will copy all the configuration files and the transitive dependencies (webpack, Babel, ESLint, etc) right into your project so you have full control over them. All of the commands except `eject` will still work, but they will point to the copied scripts so you can tweak them. At this point you're on your own.
You don't have to ever use `eject`. The curated feature set is suitable for small and middle deployments, and you shouldn't feel obligated to use this feature. However we understand that this tool wouldn't be useful if you couldn't customize it when you are ready for it.
## Learn More
You can learn more in the [Create React App documentation](https://facebook.github.io/create-react-app/docs/getting-started).
To learn React, check out the [React documentation](https://reactjs.org/).
### Code Splitting
This section has moved here: [https://facebook.github.io/create-react-app/docs/code-splitting](https://facebook.github.io/create-react-app/docs/code-splitting)
### Analyzing the Bundle Size
This section has moved here: [https://facebook.github.io/create-react-app/docs/analyzing-the-bundle-size](https://facebook.github.io/create-react-app/docs/analyzing-the-bundle-size)
### Making a Progressive Web App
This section has moved here: [https://facebook.github.io/create-react-app/docs/making-a-progressive-web-app](https://facebook.github.io/create-react-app/docs/making-a-progressive-web-app)
### Advanced Configuration
This section has moved here: [https://facebook.github.io/create-react-app/docs/advanced-configuration](https://facebook.github.io/create-react-app/docs/advanced-configuration)
### Deployment
This section has moved here: [https://facebook.github.io/create-react-app/docs/deployment](https://facebook.github.io/create-react-app/docs/deployment)
### `npm run build` fails to minify
This section has moved here: [https://facebook.github.io/create-react-app/docs/troubleshooting#npm-run-build-fails-to-minify](https://facebook.github.io/create-react-app/docs/troubleshooting#npm-run-build-fails-to-minify)
```
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
```markdown
<p align="center">
  <a href="https://mnma.ai/" target="blank"><img src="assets/logo-full.svg" width="300" alt="MNMA Logo" /></a>
</p>
**Minima** is an open-source RAG that runs on-premises in containers, with the ability to integrate with ChatGPT and MCP. 
Minima can also be used as a fully local RAG.
Minima currently supports three modes:
1. Isolated installation – Operate fully on-premises with containers, free from external dependencies such as ChatGPT or Claude. All neural networks (LLM, reranker, embedding) run on your cloud or PC, ensuring your data remains secure.
2. Custom GPT – Query your local documents using the ChatGPT app or web interface with custom GPTs. The indexer runs on your cloud or local PC, while the primary LLM remains ChatGPT.
3. Anthropic Claude – Use the Anthropic Claude app to query your local documents. The indexer operates on your local PC, while Anthropic Claude serves as the primary LLM.
---
## Running as Containers
1. Create a `.env` file in the project’s root directory (next to `.env.sample`) and copy all environment variables from `.env.sample` into it.
2. Ensure your `.env` file includes the following variables:
<ul>
   <li> LOCAL_FILES_PATH </li>
   <li> EMBEDDING_MODEL_ID </li>
   <li> EMBEDDING_SIZE </li>
   <li> OLLAMA_MODEL </li>
   <li> RERANKER_MODEL </li>
   <li> USER_ID – required for ChatGPT integration, just use your email </li>
   <li> PASSWORD – required for ChatGPT integration, just use any password </li>
</ul>
3. For a fully local installation, use: **docker compose -f docker-compose-ollama.yml --env-file .env up --build**.
4. For a ChatGPT-enabled installation, use: **docker compose -f docker-compose-chatgpt.yml --env-file .env up --build**.
5. For MCP integration (Anthropic Desktop app usage): **docker compose -f docker-compose-mcp.yml --env-file .env up --build**.
6. In case of the ChatGPT-enabled installation, copy the OTP from the terminal where you launched Docker and use [Minima GPT](https://chatgpt.com/g/g-r1MNTSb0Q-minima-local-computer-search).
7. If you use Anthropic Claude, just add the following to **/Library/Application\ Support/Claude/claude_desktop_config.json**
```
{
    "mcpServers": {
      "minima": {
        "command": "uv",
        "args": [
          "--directory",
          "/path_to_cloned_minima_project/mcp-server",
          "run",
          "minima"
        ]
      }
    }
  }
```
   
8. For the fully local installation, go to the `electron` directory (`cd electron`), then run `npm install` and `npm start`, which will launch the Minima Electron app.
9. Ask anything, and you'll get answers based on the local files in your {LOCAL_FILES_PATH} folder.
---
## Variables Explained
**LOCAL_FILES_PATH**: Specify the root folder for indexing (on your cloud or local PC). Indexing is a recursive process, meaning all documents within subfolders of this root folder will also be indexed. Supported file types: .pdf, .xls, .docx, .txt, .md, .csv.
**EMBEDDING_MODEL_ID**: Specify the embedding model to use. Currently, only Sentence Transformer models are supported. Testing has been done with sentence-transformers/all-mpnet-base-v2, but other Sentence Transformer models can be used.
**EMBEDDING_SIZE**: Define the embedding dimension provided by the model, which is needed to configure Qdrant vector storage. Ensure this value matches the actual embedding size of the specified EMBEDDING_MODEL_ID.
**OLLAMA_MODEL**: Set the Ollama model, using an ID available on the Ollama [site](https://ollama.com/search). Please use an LLM model ID here, not an embedding model.
**RERANKER_MODEL**: Specify the reranker model. Currently, we have tested with BAAI rerankers. You can explore all available rerankers using this [link](https://huggingface.co/collections/BAAI/).
**USER_ID**: Just use your email here; this is needed to authenticate the custom GPT to search in your data.
**PASSWORD**: Put any password here; this is used to create a Firebase account for the email specified above.
---
## Examples
**Example of .env file for on-premises/local usage:**
```
LOCAL_FILES_PATH=/Users/davidmayboroda/Downloads/PDFs/
EMBEDDING_MODEL_ID=sentence-transformers/all-mpnet-base-v2
EMBEDDING_SIZE=768
OLLAMA_MODEL=qwen2:0.5b # must be an LLM model ID from the Ollama models page
RERANKER_MODEL=BAAI/bge-reranker-base # please choose any BAAI reranker model
```
To use the chat UI, navigate to **http://localhost:3000**
**Example of .env file for Claude app:**
```
LOCAL_FILES_PATH=/Users/davidmayboroda/Downloads/PDFs/
EMBEDDING_MODEL_ID=sentence-transformers/all-mpnet-base-v2
EMBEDDING_SIZE=768
```
For the Claude app, please apply the changes to the claude_desktop_config.json file as outlined above.
**To use MCP with GitHub Copilot:**
1. Create a `.env` file in the project’s root directory (next to `.env.sample`) and copy all environment variables from `.env.sample` into it.
2. Ensure your .env file includes the following variables:
    - LOCAL_FILES_PATH
    - EMBEDDING_MODEL_ID
    - EMBEDDING_SIZE
      
3. Create or update the `.vscode/mcp.json` with the following configuration:
````json
{
  "servers": {
    "minima": {
      "type": "stdio",
      "command": "path_to_cloned_minima_project/run_in_copilot.sh",
      "args": [
        "path_to_cloned_minima_project"
      ]
    }
  }
}
````
**Example of .env file for ChatGPT custom GPT usage:**
```
LOCAL_FILES_PATH=/Users/davidmayboroda/Downloads/PDFs/
EMBEDDING_MODEL_ID=sentence-transformers/all-mpnet-base-v2
EMBEDDING_SIZE=768
[email protected] # your real email
PASSWORD=password # any password you want
```
Alternatively, you can run Minima using **run.sh**.
---
## Installing via Smithery (MCP usage)
To install Minima for Claude Desktop automatically via [Smithery](https://smithery.ai/protocol/minima):
```bash
npx -y @smithery/cli install minima --client claude
```
**For MCP usage, please make sure that your local machine's Python is >= 3.10 and that 'uv' is installed.**
Minima (https://github.com/dmayboroda/minima) is licensed under the Mozilla Public License v2.0 (MPLv2).
```
--------------------------------------------------------------------------------
/chat/public/robots.txt:
--------------------------------------------------------------------------------
```
# https://www.robotstxt.org/robotstxt.html
User-agent: *
Disallow:
```
--------------------------------------------------------------------------------
/electron/assets/css/style.css:
--------------------------------------------------------------------------------
```css
/* Main */
body {
  margin: 0;
  padding: 0;
  -webkit-user-select: none;
  -webkit-app-region: drag;
}
```
--------------------------------------------------------------------------------
/linker/requirements.txt:
--------------------------------------------------------------------------------
```
httpx
google-cloud-firestore
firebase_admin
asyncio==3.4.3
fastapi==0.111.0
requests==2.32.3
uvicorn[standard]
```
--------------------------------------------------------------------------------
/electron/assets/css/no-topbar.css:
--------------------------------------------------------------------------------
```css
/* No topbar */
#webview {
  position: absolute;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
  display: inline-flex !important;
}
```
--------------------------------------------------------------------------------
/chat/Dockerfile:
--------------------------------------------------------------------------------
```dockerfile
FROM node:20-alpine as build
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["npm", "run", "start"] 
```
--------------------------------------------------------------------------------
/electron/preload.js:
--------------------------------------------------------------------------------
```javascript
const { contextBridge, ipcRenderer } = require("electron");
contextBridge.exposeInMainWorld("electron", {
  print: (arg) => ipcRenderer.invoke("print", arg),
});
```
--------------------------------------------------------------------------------
/llm/control_flow_commands.py:
--------------------------------------------------------------------------------
```python
PREFIX = 'CONTROL FLOW COMMAND:'
CFC_CLIENT_DISCONNECTED = PREFIX + 'CLIENT DISCONNECTED'
CFC_CHAT_STARTED = PREFIX + 'CHAT STARTED'
CFC_CHAT_STOPPED = PREFIX + 'CHAT STOPPED'
```
--------------------------------------------------------------------------------
/mcp-server/src/minima/__init__.py:
--------------------------------------------------------------------------------
```python
from . import server
import asyncio
def main():
    """Main entry point for the package."""
    asyncio.run(server.main())
# Optionally expose other important items at package level
__all__ = ['main', 'server']
```
--------------------------------------------------------------------------------
/indexer/singleton.py:
--------------------------------------------------------------------------------
```python
class Singleton(type):
    _instances = {}
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]
```
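A quick usage sketch of this metaclass: any class that sets `metaclass=Singleton` is constructed once, and every later call returns the cached instance (the `Config` class below is a hypothetical example, not part of the repo):
```python
# Minimal usage sketch; `Config` is a hypothetical example class.
class Config(metaclass=Singleton):
    def __init__(self):
        self.settings = {}

a = Config()
b = Config()
assert a is b  # the second call returned the cached instance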
--------------------------------------------------------------------------------
/chat/src/setupTests.js:
--------------------------------------------------------------------------------
```javascript
// jest-dom adds custom jest matchers for asserting on DOM nodes.
// allows you to do things like:
// expect(element).toHaveTextContent(/react/i)
// learn more: https://github.com/testing-library/jest-dom
import '@testing-library/jest-dom';
```
--------------------------------------------------------------------------------
/chat/src/App.test.js:
--------------------------------------------------------------------------------
```javascript
import { render, screen } from '@testing-library/react';
import App from './App';
test('renders learn react link', () => {
  render(<App />);
  const linkElement = screen.getByText(/learn react/i);
  expect(linkElement).toBeInTheDocument();
});
```
--------------------------------------------------------------------------------
/llm/requirements.txt:
--------------------------------------------------------------------------------
```
requests
ollama
langgraph
langchain
langchain-core
langchain_qdrant
langchain-ollama
langchain_community==0.2.17
langchain-huggingface
sentence-transformers==2.6.0
transformers
asyncio==3.4.3
fastapi==0.111.0
qdrant-client
uvicorn[standard]
python-dotenv
pydantic
```
--------------------------------------------------------------------------------
/run_in_copilot.sh:
--------------------------------------------------------------------------------
```bash
#!/bin/bash
# run_in_copilot.sh
WORKDIR="${1:-$(pwd)}" # Default to current directory if no argument is provided
echo "[run_in_copilot] A working directory be used: $WORKDIR"
docker compose -f "$WORKDIR/docker-compose-mcp.yml" up -d
uv --directory "$WORKDIR/mcp-server" run minima
```
--------------------------------------------------------------------------------
/electron/src/view.js:
--------------------------------------------------------------------------------
```javascript
const electron = require("electron");
const { BrowserView } = electron;
exports.createBrowserView = (mainWindow) => {
  const view = new BrowserView();
  mainWindow.setBrowserView(view);
  view.setBounds({ x: 0, y: 0, width: 1024, height: 768 });
  view.webContents.loadURL("http://localhost:3000/");
};
```
--------------------------------------------------------------------------------
/electron/src/print.js:
--------------------------------------------------------------------------------
```javascript
const { ipcMain, BrowserWindow } = require("electron");
ipcMain.handle("print", async (event, arg) => {
  let printWindow = new BrowserWindow({ autoHideMenuBar: true });
  printWindow.loadURL(arg);
  printWindow.webContents.on("did-finish-load", () => {
    printWindow.webContents.print();
  });
});
```
--------------------------------------------------------------------------------
/linker/Dockerfile:
--------------------------------------------------------------------------------
```dockerfile
FROM python:3.11-slim-buster
WORKDIR /usr/src/app
RUN pip install --upgrade pip
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
ENV PORT 8000
ENV CURRENT_HOST 0.0.0.0
ENV WORKERS 1
CMD ["sh", "-c", "uvicorn app:app --loop asyncio --reload --workers ${WORKERS} --host $CURRENT_HOST --port $PORT --proxy-headers"]
```
--------------------------------------------------------------------------------
/chat/src/reportWebVitals.js:
--------------------------------------------------------------------------------
```javascript
const reportWebVitals = onPerfEntry => {
  if (onPerfEntry && onPerfEntry instanceof Function) {
    import('web-vitals').then(({ getCLS, getFID, getFCP, getLCP, getTTFB }) => {
      getCLS(onPerfEntry);
      getFID(onPerfEntry);
      getFCP(onPerfEntry);
      getLCP(onPerfEntry);
      getTTFB(onPerfEntry);
    });
  }
};
export default reportWebVitals;
```
--------------------------------------------------------------------------------
/indexer/requirements.txt:
--------------------------------------------------------------------------------
```
langchain
langchain-core
langfuse
langchain_qdrant
langchain_community
langchain-huggingface
sentence-transformers==2.6.0
transformers
asyncio==3.4.3
fastapi==0.111.0
qdrant-client
uvicorn[standard]
unstructured[xlsx]
unstructured[pptx]
python-magic
python-dotenv
openpyxl
docx2txt
pymupdf==1.25.1
pydantic
fastapi-utilities
sqlmodel
nltk
unstructured
python-pptx
```
--------------------------------------------------------------------------------
/chat/src/index.css:
--------------------------------------------------------------------------------
```css
body {
  margin: 0;
  font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Roboto', 'Oxygen',
    'Ubuntu', 'Cantarell', 'Fira Sans', 'Droid Sans', 'Helvetica Neue',
    sans-serif;
  -webkit-font-smoothing: antialiased;
  -moz-osx-font-smoothing: grayscale;
}
code {
  font-family: source-code-pro, Menlo, Monaco, Consolas, 'Courier New',
    monospace;
}
```
--------------------------------------------------------------------------------
/mcp-server/pyproject.toml:
--------------------------------------------------------------------------------
```toml
[project]
name = "minima"
version = "0.0.1"
description = "RAG on local files with MCP"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
 "httpx>=0.28.0",
 "mcp>=1.0.0",
 "pydantic>=2.10.2",
]
[[project.authors]]
name = "David Mayboroda"
email = "[email protected]"
[build-system]
requires = [ "hatchling",]
build-backend = "hatchling.build"
[project.scripts]
minima = "minima:main"
```
--------------------------------------------------------------------------------
/electron/index.html:
--------------------------------------------------------------------------------
```html
<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="assets/css/style.css" />
    <link rel="stylesheet" href="assets/css/no-topbar.css">
    <script defer src="assets/js/renderer.js"></script>
    <meta
      http-equiv="Content-Security-Policy"
      content="default-src 'self'; style-src 'self' 'unsafe-inline';"
    />
  </head>
  <body>
    <webview
      id="webview"
      autosize="on"
      src="http://localhost:3000/"
    ></webview>
  </body>
</html>
```
--------------------------------------------------------------------------------
/llm/Dockerfile:
--------------------------------------------------------------------------------
```dockerfile
FROM python:3.11-slim-buster
WORKDIR /usr/src/app
ARG RERANKER_MODEL
RUN pip install --upgrade pip
COPY requirements.txt .
RUN pip install huggingface_hub
RUN huggingface-cli download $RERANKER_MODEL --repo-type model
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
ENV PORT 8000
ENV CURRENT_HOST 0.0.0.0
ENV WORKERS 1
CMD ["sh", "-c", "uvicorn app:app --loop asyncio --reload --workers ${WORKERS} --host $CURRENT_HOST --port $PORT --proxy-headers"]
```
--------------------------------------------------------------------------------
/chat/src/App.js:
--------------------------------------------------------------------------------
```javascript
import React from 'react';
import './App.css';
import ChatApp from './ChatApp.tsx';
import { ToastContainer } from 'react-toastify';
function App() {
    return (
        <div className="App" style={{ display: 'flex', flexDirection: 'column', height: '100%' }}>
            <header className="App-header" style={{height: "100%"}}>
                <ChatApp />
                <ToastContainer />
            </header>
        </div>
    );
}
export default App;
```
--------------------------------------------------------------------------------
/electron/main.js:
--------------------------------------------------------------------------------
```javascript
const { app } = require("electron");
app.allowRendererProcessReuse = true;
let mainWindow;
app.on("ready", () => {
  const window = require("./src/window");
  mainWindow = window.createBrowserWindow(app);
  
  mainWindow.loadURL(`file://${__dirname}/index.html`, { userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36' });
  require("./src/print");
});
app.on("window-all-closed", () => {
  app.quit();
});
```
--------------------------------------------------------------------------------
/indexer/Dockerfile:
--------------------------------------------------------------------------------
```dockerfile
FROM python:3.11-slim-buster
WORKDIR /usr/src/app
RUN pip install --upgrade pip
ARG EMBEDDING_MODEL_ID
RUN pip install huggingface_hub
RUN huggingface-cli download $EMBEDDING_MODEL_ID --repo-type model
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
ENV PORT 8000
ENV CURRENT_HOST 0.0.0.0
ENV WORKERS 1
CMD ["sh", "-c", "uvicorn app:app --loop asyncio --reload --workers ${WORKERS} --host $CURRENT_HOST --port $PORT --proxy-headers"]
```
--------------------------------------------------------------------------------
/chat/public/manifest.json:
--------------------------------------------------------------------------------
```json
{
  "short_name": "React App",
  "name": "Create React App Sample",
  "icons": [
    {
      "src": "favicon.ico",
      "sizes": "64x64 32x32 24x24 16x16",
      "type": "image/x-icon"
    },
    {
      "src": "logo192.png",
      "type": "image/png",
      "sizes": "192x192"
    },
    {
      "src": "logo512.png",
      "type": "image/png",
      "sizes": "512x512"
    }
  ],
  "start_url": ".",
  "display": "standalone",
  "theme_color": "#000000",
  "background_color": "#ffffff"
}
```
--------------------------------------------------------------------------------
/chat/src/index.js:
--------------------------------------------------------------------------------
```javascript
import React from 'react';
import ReactDOM from 'react-dom/client';
import './index.css';
import App from './App';
import reportWebVitals from './reportWebVitals';
const root = ReactDOM.createRoot(document.getElementById('root'));
root.render(
  <React.StrictMode>
    <App />
  </React.StrictMode>
);
// If you want to start measuring performance in your app, pass a function
// to log results (for example: reportWebVitals(console.log))
// or send to an analytics endpoint. Learn more: https://bit.ly/CRA-vitals
reportWebVitals();
```
--------------------------------------------------------------------------------
/chat/src/App.css:
--------------------------------------------------------------------------------
```css
.App {
  text-align: center;
}
.App-logo {
  height: 40vmin;
  pointer-events: none;
}
@media (prefers-reduced-motion: no-preference) {
  .App-logo {
    animation: App-logo-spin infinite 20s linear;
  }
}
.App-header {
  background-color: #282c34;
  min-height: 100vh;
  display: flex;
  flex-direction: column;
  align-items: center;
  justify-content: center;
  font-size: calc(10px + 2vmin);
  color: white;
}
.App-link {
  color: #61dafb;
}
@keyframes App-logo-spin {
  from {
    transform: rotate(0deg);
  }
  to {
    transform: rotate(360deg);
  }
}
```
--------------------------------------------------------------------------------
/llm/async_answer_to_socket.py:
--------------------------------------------------------------------------------
```python
import logging
from fastapi import WebSocket
from async_queue import AsyncQueue
import control_flow_commands as cfc
import starlette.websockets as ws
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llm")
async def loop(response_queue: AsyncQueue, websocket: WebSocket):
    while True:
        data = await response_queue.dequeue()
        if data == cfc.CFC_CLIENT_DISCONNECTED:
            break
        else:
            logger.info(f"Sending data: {data}")
            try:
                await websocket.send_text(data)
            except ws.WebSocketDisconnect:
                break
```
--------------------------------------------------------------------------------
/electron/src/window.js:
--------------------------------------------------------------------------------
```javascript
const path = require("path");
const { BrowserWindow } = require("electron");
exports.createBrowserWindow = () => {
  return new BrowserWindow({
    width: 1024,
    height: 768,
    minWidth: 400, 
    minHeight: 600,
    icon: path.join(__dirname, "../assets/icons/png/favicon.png"), // assets live one level above src/
    backgroundColor: "#fff",
    autoHideMenuBar: true,
    webPreferences: {
      devTools: false,
      contextIsolation: true,
      webviewTag: true,
      preload: path.join(__dirname, "../preload.js"),
      enableRemoteModule: true,
      nodeIntegration: false,
      nativeWindowOpen: true,
      webSecurity: true,
      allowRunningInsecureContent: true
    },
  });
};
```
--------------------------------------------------------------------------------
/run.sh:
--------------------------------------------------------------------------------
```bash
#!/bin/bash
echo "Select an option:"
echo "1) Fully Local Setup"
echo "2) ChatGPT Integration"
echo "3) MCP usage"
echo "4) Quit"
read -p "Enter your choice (1, 2, 3 or 4): " user_choice
case "$user_choice" in
    1)
        echo "Starting fully local setup..."
        docker compose -f docker-compose-ollama.yml --env-file .env up --build
        ;;
    2)
        echo "Starting with ChatGPT integration..."
        docker compose -f docker-compose-chatgpt.yml --env-file .env up --build
        ;;
    3)
        echo "Exiting the script. Goodbye!"
        docker compose -f docker-compose-mcp.yml --env-file .env up --build
        ;;
    4)
        echo "Exiting the script. Goodbye!"
        exit 0
        ;;
    *)
        echo "Invalid input. Please enter 1, 2, or 3."
        ;;
esac
```
--------------------------------------------------------------------------------
/llm/app.py:
--------------------------------------------------------------------------------
```python
import logging
import asyncio
from fastapi import FastAPI
from fastapi import WebSocket
from async_queue import AsyncQueue
import async_socket_to_chat
import async_question_to_answer
import async_answer_to_socket
app = FastAPI()
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llm")
@app.websocket("/llm/")
async def chat_client(websocket: WebSocket):
    question_queue = AsyncQueue()
    response_queue = AsyncQueue()
    answer_to_socket_promise = async_answer_to_socket.loop(response_queue, websocket)
    question_to_answer_promise = async_question_to_answer.loop(question_queue, response_queue)
    socket_to_chat_promise = async_socket_to_chat.loop(websocket, question_queue, response_queue)
    await asyncio.gather(
        answer_to_socket_promise,
        question_to_answer_promise,
        socket_to_chat_promise,
    )
```
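A minimal client sketch for this endpoint, assuming the `8003:8000` port mapping from docker-compose-ollama.yml and the third-party `websockets` package (neither the client nor the package is part of this repo); the control string mirrors `CFC_CHAT_STARTED` from control_flow_commands.py:
```python
# Client sketch: assumes `pip install websockets` and the llm service
# published on localhost:8003 (see docker-compose-ollama.yml).
import asyncio
import json
import websockets

CFC_CHAT_STARTED = "CONTROL FLOW COMMAND:CHAT STARTED"

async def ask(question: str) -> None:
    async with websockets.connect("ws://localhost:8003/llm/") as socket:
        await socket.send(CFC_CHAT_STARTED)  # announce the chat session
        await socket.send(question)          # then send the question itself
        while True:
            message = json.loads(await socket.recv())
            print(message)
            if message.get("type") == "answer":
                break

asyncio.run(ask("What do my notes say about Qdrant?"))
```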
--------------------------------------------------------------------------------
/mcp-server/src/minima/requestor.py:
--------------------------------------------------------------------------------
```python
import httpx
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
REQUEST_DATA_URL = "http://localhost:8001/query"
REQUEST_HEADERS = {
    'Accept': 'application/json',
    'Content-Type': 'application/json'
}
async def request_data(query):
    payload = {
        "query": query
    }
    async with httpx.AsyncClient() as client:
        try:
            logger.info(f"Requesting data from indexer with query: {query}")
            response = await client.post(REQUEST_DATA_URL, 
                                         headers=REQUEST_HEADERS, 
                                         json=payload)
            response.raise_for_status()
            data = response.json()
            logger.info(f"Received data: {data}")
            return data
        except Exception as e:
            logger.error(f"HTTP error: {e}")
            return { "error": str(e) }
```
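A quick smoke-test sketch for this helper, assuming the indexer container is up and published on port 8001 as in docker-compose-mcp.yml, and that the script runs inside the `minima` package environment (e.g. via `uv run`):
```python
# Smoke test for request_data; assumes the indexer is reachable on
# localhost:8001 (docker-compose-mcp.yml publishes 8001:8000).
import asyncio
from minima.requestor import request_data

result = asyncio.run(request_data("invoices from 2023"))
print(result)  # either the indexer's JSON payload or {"error": ...}
```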
--------------------------------------------------------------------------------
/linker/requestor.py:
--------------------------------------------------------------------------------
```python
import httpx
import logging
import asyncio
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
REQUEST_DATA_URL = "http://indexer:8000/query"
REQUEST_HEADERS = {
    'Accept': 'application/json',
    'Content-Type': 'application/json'
}
async def request_data(query):
    payload = {
        "query": query
    }
    async with httpx.AsyncClient() as client:
        try:
            logger.info(f"Requesting data from indexer with query: {query}")
            response = await client.post(REQUEST_DATA_URL, 
                                         headers=REQUEST_HEADERS, 
                                         json=payload)
            response.raise_for_status()
            data = response.json()
            logger.info(f"Received data: {data}")
            return data
        except Exception as e:
            logger.error(f"HTTP error: {e}")
            return { "error": str(e) }
```
--------------------------------------------------------------------------------
/docker-compose-mcp.yml:
--------------------------------------------------------------------------------
```yaml
version: '3.9'
services:
  qdrant:
    image: qdrant/qdrant:latest
    container_name: qdrant
    ports:
      - 6333:6333
      - 6334:6334
    expose:
      - 6333
      - 6334
      - 6335
    volumes:
      - ./qdrant_data:/qdrant/storage
    environment:
      QDRANT__LOG_LEVEL: "INFO"
  indexer:
    build: 
      context: ./indexer
      dockerfile: Dockerfile
      args:
        EMBEDDING_MODEL_ID: ${EMBEDDING_MODEL_ID}
        EMBEDDING_SIZE: ${EMBEDDING_SIZE}
    volumes:
      - ${LOCAL_FILES_PATH}:/usr/src/app/local_files/
      - ./indexer:/usr/src/app
      - ./indexer_data:/indexer/storage
    ports:
      - 8001:8000
    environment:
      - PYTHONPATH=/usr/src
      - PYTHONUNBUFFERED=TRUE
      - LOCAL_FILES_PATH=${LOCAL_FILES_PATH}
      - EMBEDDING_MODEL_ID=${EMBEDDING_MODEL_ID}
      - EMBEDDING_SIZE=${EMBEDDING_SIZE}
      - CONTAINER_PATH=/usr/src/app/local_files/
    depends_on:
      - qdrant
```
--------------------------------------------------------------------------------
/indexer/async_queue.py:
--------------------------------------------------------------------------------
```python
import asyncio
from collections import deque
class AsyncQueueDequeueInterrupted(Exception):
    def __init__(self, message="AsyncQueue dequeue was interrupted"):
        self.message = message
        super().__init__(self.message)
class AsyncQueue:
    def __init__(self):
        self._data = deque([])
        self._presence_of_data = asyncio.Event()
    def enqueue(self, value):
        self._data.append(value)
        if len(self._data) == 1:
            self._presence_of_data.set()
    async def dequeue(self):
        await self._presence_of_data.wait()
        if len(self._data) < 1:
            raise AsyncQueueDequeueInterrupted("AsyncQueue dequeue was interrupted")
        result = self._data.popleft()
        if not self._data:
            self._presence_of_data.clear()
        return result
    def size(self):
        return len(self._data)
    def shutdown(self):
        self._presence_of_data.set()
--------------------------------------------------------------------------------
/llm/async_queue.py:
--------------------------------------------------------------------------------
```python
import asyncio
from collections import deque
class AsyncQueueDequeueInterrupted(Exception):
    def __init__(self, message="AsyncQueue dequeue was interrupted"):
        self.message = message
        super().__init__(self.message)
class AsyncQueue:
    def __init__(self) -> None:
        self._data = deque([])
        self._presence_of_data = asyncio.Event()
    def enqueue(self, value):
        self._data.append(value)
        if len(self._data) == 1:
            self._presence_of_data.set()
    async def dequeue(self):
        await self._presence_of_data.wait()
        if len(self._data) < 1:
            raise AsyncQueueDequeueInterrupted("AsyncQueue dequeue was interrupted")
        result = self._data.popleft()
        if not self._data:
            self._presence_of_data.clear()
        return result
    def size(self):
        result = len(self._data)
        return result
    def shutdown(self):
        self._presence_of_data.set()
```
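A short producer/consumer sketch showing how this queue is meant to be used: `enqueue` is synchronous and can be called from anywhere, while `dequeue` awaits until data is present:
```python
# Producer/consumer sketch for AsyncQueue.
import asyncio
from async_queue import AsyncQueue

async def demo():
    queue = AsyncQueue()
    queue.enqueue("hello")        # synchronous enqueue
    print(await queue.dequeue())  # -> "hello"

    async def producer():
        await asyncio.sleep(0.1)
        queue.enqueue("world")

    asyncio.create_task(producer())
    print(await queue.dequeue())  # waits until the producer enqueues

asyncio.run(demo())
```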
--------------------------------------------------------------------------------
/electron/package.json:
--------------------------------------------------------------------------------
```json
{
  "name": "minima-app",
  "productName": "Minima",
  "version": "1.0.0",
  "description": "Minima is your local AI search app. Chat with your documents, photos and more.",
  "main": "main.js",
  
  "scripts": {
    "start": "electron .",
    "package-mac": "npx electron-packager . --overwrite --platform=darwin --arch=x64 --icon=assets/icons/mac/favicon.icns --prune=true --out=release-builds",
    "package-win": "npx electron-packager . --overwrite --asar=true --platform=win32 --arch=x64 --icon=assets/icons/win/favicon.ico --prune=true --out=release-builds --version-string.CompanyName=CE --version-string.FileDescription=CE --version-string.ProductName=\"Minima\"",
    "package-linux": "npx electron-packager . --overwrite --platform=linux --arch=x64 --icon=assets/icons/png/favicon.png --prune=true --out=release-builds"
   },
  "repository": "https://github.com/dmayboroda/minima",
  "author": "Minima team",
  "license": "MPLv2",
  "devDependencies": {
    "electron": "^22.0.0"
  }
}
```
--------------------------------------------------------------------------------
/chat/package.json:
--------------------------------------------------------------------------------
```json
{
  "name": "chat",
  "version": "0.1.0",
  "private": true,
  "dependencies": {
    "@emotion/react": "^11.13.5",
    "@emotion/styled": "^11.13.5",
    "@mui/material": "^5.16.7",
    "@mui/system": "^6.1.8",
    "@testing-library/jest-dom": "^5.17.0",
    "@testing-library/react": "^13.4.0",
    "@testing-library/user-event": "^13.5.0",
    "antd": "^5.22.7",
    "react": "^18.3.1",
    "react-dom": "^18.3.1",
    "react-scripts": "5.0.1",
    "react-toastify": "^11.0.3",
    "typescript": "^4.9.5",
    "web-vitals": "^2.1.4"
  },
  "scripts": {
    "start": "react-scripts start",
    "build": "react-scripts build",
    "test": "react-scripts test",
    "eject": "react-scripts eject"
  },
  "eslintConfig": {
    "extends": [
      "react-app",
      "react-app/jest"
    ]
  },
  "browserslist": {
    "production": [
      ">0.2%",
      "not dead",
      "not op_mini all"
    ],
    "development": [
      "last 1 chrome version",
      "last 1 firefox version",
      "last 1 safari version"
    ]
  }
}
```
--------------------------------------------------------------------------------
/docker-compose-chatgpt.yml:
--------------------------------------------------------------------------------
```yaml
version: '3.9'
services:
  qdrant:
    image: qdrant/qdrant:latest
    container_name: qdrant
    ports:
      - 6333:6333
      - 6334:6334
    expose:
      - 6333
      - 6334
      - 6335
    volumes:
      - ./qdrant_data:/qdrant/storage
    environment:
      QDRANT__LOG_LEVEL: "INFO"
  indexer:
    build: 
      context: ./indexer
      dockerfile: Dockerfile
      args:
        EMBEDDING_MODEL_ID: ${EMBEDDING_MODEL_ID}
        EMBEDDING_SIZE: ${EMBEDDING_SIZE}
    volumes:
      - ${LOCAL_FILES_PATH}:/usr/src/app/local_files/
      - ./indexer:/usr/src/app
      - ./indexer_data:/indexer/storage
    ports:
      - 8001:8000
    environment:
      - PYTHONPATH=/usr/src
      - PYTHONUNBUFFERED=TRUE
      - LOCAL_FILES_PATH=${LOCAL_FILES_PATH}
      - EMBEDDING_MODEL_ID=${EMBEDDING_MODEL_ID}
      - EMBEDDING_SIZE=${EMBEDDING_SIZE}
      - CONTAINER_PATH=/usr/src/app/local_files/
    depends_on:
      - qdrant
  linker:
    build: ./linker
    volumes:
      - ./linker:/usr/src/app
    ports:
      - 8002:8000
    environment:
      - PYTHONPATH=/usr/src
      - PYTHONUNBUFFERED=TRUE
      - FIRESTORE_COLLECTION_NAME=userTasks
      - TASKS_COLLECTION=tasks
      - USER_ID=${USER_ID}
      - PASSWORD=${PASSWORD}
      - FB_PROJECT=localragex
    depends_on:
      - qdrant
```
--------------------------------------------------------------------------------
/electron/assets/js/renderer.js:
--------------------------------------------------------------------------------
```javascript
const getControlsHeight = () => {
  const controls = document.querySelector("#controls");
  if (controls) {
    return controls.offsetHeight;
  }
  return 0;
};
const calculateLayoutSize = () => {
  const webview = document.querySelector("webview");
  const windowWidth = document.documentElement.clientWidth;
  const windowHeight = document.documentElement.clientHeight;
  const controlsHeight = getControlsHeight();
  const webviewHeight = windowHeight - controlsHeight;
  webview.style.width = windowWidth + "px";
  webview.style.height = webviewHeight + "px";
};
window.addEventListener("DOMContentLoaded", () => {
  calculateLayoutSize();
  // Dynamic resize function (responsive)
  window.onresize = calculateLayoutSize;
  // Home button exists
  if (document.querySelector("#home")) {
    document.querySelector("#home").onclick = () => {
      const home = document.getElementById("webview").getAttribute("data-home");
      document.querySelector("webview").src = home;
    };
  }
  // Print button exists
  if (document.querySelector("#print_button")) {
    document
      .querySelector("#print_button")
      .addEventListener("click", async () => {
        const url = document.querySelector("webview").getAttribute("src");
        // Launch print window
        await window.electron.print(url);
      });
  }
});
```
--------------------------------------------------------------------------------
/llm/async_question_to_answer.py:
--------------------------------------------------------------------------------
```python
import json
import logging
from llm_chain import LLMChain
from async_queue import AsyncQueue
import control_flow_commands as cfc
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("chat")
async def loop(
        questions_queue: AsyncQueue,
        response_queue: AsyncQueue,
):
    llm_chain = LLMChain()
    while True:
        data = await questions_queue.dequeue()
        data = data.replace("\n", "")
        if data == cfc.CFC_CLIENT_DISCONNECTED:
            response_queue.enqueue(
                json.dumps({
                    "reporter": "output_message",
                    "type": "disconnect_message",
                })
            )
            break
        if data == cfc.CFC_CHAT_STARTED:
            response_queue.enqueue(
                json.dumps({
                    "reporter": "output_message",
                    "type": "start_message",
                })
            )
            
        elif data == cfc.CFC_CHAT_STOPPED:
            response_queue.enqueue(
                json.dumps({
                    "reporter": "output_message",
                    "type": "stop_message",
                })
            )
            
        elif data:
            result = llm_chain.invoke(data)
            response_queue.enqueue(
                json.dumps({
                    "reporter": "output_message",
                    "type": "answer",
                    "message": result["answer"],
                    "links": list(result["links"])
                })
            )
```
--------------------------------------------------------------------------------
/llm/async_socket_to_chat.py:
--------------------------------------------------------------------------------
```python
import json
import logging
from fastapi import WebSocket
from async_queue import AsyncQueue
import starlette.websockets as ws
import control_flow_commands as cfc
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llm")
async def loop(
    websocket: WebSocket, 
    questions_queue: AsyncQueue,
    response_queue: AsyncQueue
):
    await websocket.accept()
    while True:
        try:
            message = await websocket.receive_text()
            if message == cfc.CFC_CHAT_STARTED:
                logger.info(f"Start message {message}")
                questions_queue.enqueue(message)
            elif message == cfc.CFC_CHAT_STOPPED:
                logger.info(f"Stop message {message}")
                questions_queue.enqueue(message)
                response_queue.enqueue(json.dumps({
                    "reporter": "input_message",
                    "type": "stop_message",
                    "message": message
                }))
            else:
                logger.info(f"Question: {message}")
                questions_queue.enqueue(message)
                response_queue.enqueue(json.dumps({
                    "reporter": "input_message",
                    "type": "question",
                    "message": message
                }))
                
        except ws.WebSocketDisconnect as e:
            logger.info("Client disconnected")
            questions_queue.enqueue(cfc.CFC_CLIENT_DISCONNECTED)
            response_queue.enqueue(cfc.CFC_CLIENT_DISCONNECTED)
            break
```
--------------------------------------------------------------------------------
/electron/src/menu.js:
--------------------------------------------------------------------------------
```javascript
exports.createTemplate = (name) => {
  let template = [
    {
      label: "Edit",
      submenu: [
        { role: "undo" },
        { role: "redo" },
        { type: "separator" },
        { role: "cut" },
        { role: "copy" },
        { role: "paste" },
        { role: "pasteandmatchstyle" },
        { role: "delete" },
        { role: "selectall" },
      ],
    },
    {
      label: "View",
      submenu: [
        { role: "reload" },
        { role: "forcereload" },
        { role: "toggledevtools" },
        { type: "separator" },
        { role: "resetzoom" },
        { role: "zoomin" },
        { role: "zoomout" },
        { type: "separator" },
        { role: "togglefullscreen" },
      ],
    },
    {
      role: "window",
      submenu: [{ role: "minimize" }, { role: "close" }],
    },
  ];
  if (process.platform === "darwin") {
    template.unshift({
      label: name,
      submenu: [
        { type: "separator" },
        { role: "services", submenu: [] },
        { type: "separator" },
        { role: "hide" },
        { role: "hideothers" },
        { role: "unhide" },
        { type: "separator" },
        { role: "quit" },
      ],
    });
    template[1].submenu.push(
      { type: "separator" },
      {
        label: "Speech",
        submenu: [{ role: "startspeaking" }, { role: "stopspeaking" }],
      }
    );
    template[3].submenu = [
      { role: "close" },
      { role: "minimize" },
      { role: "zoom" },
      { type: "separator" },
      { role: "front" },
    ];
  }
  return template;
};
```
--------------------------------------------------------------------------------
/llm/minima_embed.py:
--------------------------------------------------------------------------------
```python
import requests
import logging
from typing import Any, List
from pydantic import BaseModel
from langchain_core.embeddings import Embeddings
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
REQUEST_DATA_URL = "http://indexer:8000/embedding"
REQUEST_HEADERS = {
    'Accept': 'application/json',
    'Content-Type': 'application/json'
}
class MinimaEmbeddings(BaseModel, Embeddings):
    def __init__(self, **kwargs: Any):
        super().__init__(**kwargs)
    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        results = []
        for text in texts:
            embedding = self.request_data(text)
            if "error" in embedding:
                logger.error(f"Error in embedding: {embedding['error']}")
            else:
                embedding = embedding["result"]
                results.append(embedding)
        return results
    def embed_query(self, text: str) -> list[float]:
        return self.embed_documents([text])[0]
    def request_data(self, query):
        payload = {
            "query": query
        }
        try:
            logger.info(f"Requesting data from indexer with query: {query}")
            response = requests.post(REQUEST_DATA_URL, headers=REQUEST_HEADERS, json=payload)
            response.raise_for_status()
            data = response.json()
            logger.info(f"Received data: {data}")
            return data
        except requests.exceptions.RequestException as e:
            logger.error(f"HTTP error: {e}")
            return {"error": str(e)}
```
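A usage sketch for this embedder; note the hard-coded `http://indexer:8000/embedding` URL only resolves from a container on the same compose network:
```python
# Usage sketch: only works where the `indexer` hostname resolves,
# i.e. inside a container on the same Docker network as the indexer.
embedder = MinimaEmbeddings()
vector = embedder.embed_query("What is Minima?")
print(len(vector))  # should equal EMBEDDING_SIZE, e.g. 768
```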
--------------------------------------------------------------------------------
/chat/public/index.html:
--------------------------------------------------------------------------------
```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <link rel="icon" href="%PUBLIC_URL%/favicon.ico" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <meta name="theme-color" content="#000000" />
    <meta
      name="description"
      content="Web site created using create-react-app"
    />
    <link rel="apple-touch-icon" href="%PUBLIC_URL%/logo192.png" />
    <!--
      manifest.json provides metadata used when your web app is installed on a
      user's mobile device or desktop. See https://developers.google.com/web/fundamentals/web-app-manifest/
    -->
    <link rel="manifest" href="%PUBLIC_URL%/manifest.json" />
    <!--
      Notice the use of %PUBLIC_URL% in the tags above.
      It will be replaced with the URL of the `public` folder during the build.
      Only files inside the `public` folder can be referenced from the HTML.
      Unlike "/favicon.ico" or "favicon.ico", "%PUBLIC_URL%/favicon.ico" will
      work correctly both with client-side routing and a non-root public URL.
      Learn how to configure a non-root public URL by running `npm run build`.
    -->
    <title>React App</title>
  </head>
  <body>
    <noscript>You need to enable JavaScript to run this app.</noscript>
    <div id="root"></div>
    <!--
      This HTML file is a template.
      If you open it directly in the browser, you will see an empty page.
      You can add webfonts, meta tags, or analytics to this file.
      The build step will place the bundled scripts into the <body> tag.
      To begin the development, run `npm start` or `yarn start`.
      To create a production bundle, use `npm run build` or `yarn build`.
    -->
  </body>
</html>
```
--------------------------------------------------------------------------------
/docker-compose-ollama.yml:
--------------------------------------------------------------------------------
```yaml
version: '3.9'
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    ports:
      - 11434:11434
    expose:
      - 11434
    volumes:
      - ./ollama:/root/.ollama
    entrypoint: ["/bin/sh", "-c"]
    environment:
      - OLLAMA_MODEL=${OLLAMA_MODEL}
    command: >
      "ollama serve &
      sleep 10 &&
      ollama pull ${OLLAMA_MODEL} &&
      wait"
  qdrant:
    image: qdrant/qdrant:latest
    container_name: qdrant
    ports:
      - 6333:6333
      - 6334:6334
    expose:
      - 6333
      - 6334
      - 6335
    volumes:
      - ./qdrant_data:/qdrant/storage
    environment:
      QDRANT__LOG_LEVEL: "INFO"
  indexer:
    build: 
      context: ./indexer
      dockerfile: Dockerfile
      args:
        EMBEDDING_MODEL_ID: ${EMBEDDING_MODEL_ID}
        EMBEDDING_SIZE: ${EMBEDDING_SIZE}
    volumes:
      - ${LOCAL_FILES_PATH}:/usr/src/app/local_files/
      - ./indexer:/usr/src/app
      - ./indexer_data:/indexer/storage
    ports:
      - 8001:8000
    environment:
      - PYTHONPATH=/usr/src
      - PYTHONUNBUFFERED=TRUE
      - LOCAL_FILES_PATH=${LOCAL_FILES_PATH}
      - EMBEDDING_MODEL_ID=${EMBEDDING_MODEL_ID}
      - EMBEDDING_SIZE=${EMBEDDING_SIZE}
      - CONTAINER_PATH=/usr/src/app/local_files/
    depends_on:
      - qdrant
  llm:
    build: 
      context: ./llm
      dockerfile: Dockerfile
      args:
        RERANKER_MODEL: ${RERANKER_MODEL}
    volumes:
      - ./llm:/usr/src/app
    ports:
      - 8003:8000
    environment:
      - PYTHONPATH=/usr/src
      - PYTHONUNBUFFERED=TRUE
      - OLLAMA_MODEL=${OLLAMA_MODEL}
      - RERANKER_MODEL=${RERANKER_MODEL}
      - LOCAL_FILES_PATH=${LOCAL_FILES_PATH}
      - CONTAINER_PATH=/usr/src/app/local_files/
    depends_on:
      - ollama
      - qdrant
      - indexer
  chat:
    build: ./chat
    volumes:
      - ./chat:/usr/src/app
    ports:
      - 3000:3000
    depends_on:
      - ollama
      - qdrant
      - llm
```
--------------------------------------------------------------------------------
/indexer/async_loop.py:
--------------------------------------------------------------------------------
```python
import os
import uuid
import asyncio
import logging
from indexer import Indexer
from concurrent.futures import ThreadPoolExecutor
logger = logging.getLogger(__name__)
executor = ThreadPoolExecutor()
CONTAINER_PATH = os.environ.get("CONTAINER_PATH")
AVAILABLE_EXTENSIONS = [".pdf", ".xls", ".xlsx", ".doc", ".docx", ".txt", ".md", ".csv", ".ppt", ".pptx"]
async def crawl_loop(async_queue):
    logger.info(f"Starting crawl loop with path: {CONTAINER_PATH}")
    existing_file_paths: list[str] = []
    for root, _, files in os.walk(CONTAINER_PATH):
        logger.info(f"Processing folder: {root}")
        for file in files:
            if not any(file.endswith(ext) for ext in AVAILABLE_EXTENSIONS):
                logger.info(f"Skipping file: {file}")
                continue
            path = os.path.join(root, file)
            message = {
                "path": path,
                "file_id": str(uuid.uuid4()),
                "last_updated_seconds": round(os.path.getmtime(path)),
                "type": "file"
            }
            existing_file_paths.append(path)
            async_queue.enqueue(message)
            logger.info(f"File enqueue: {path}")
    # Enqueue the aggregate and stop messages once, after the whole walk,
    # so indexing is not cut short after the first folder.
    aggregate_message = {
        "existing_file_paths": existing_file_paths,
        "type": "all_files"
    }
    async_queue.enqueue(aggregate_message)
    async_queue.enqueue({"type": "stop"})
async def index_loop(async_queue, indexer: Indexer):
    loop = asyncio.get_running_loop()
    logger.info("Starting index loop")
    while True:
        if async_queue.size() == 0:
            logger.info("No files to index. Indexing stopped, all files indexed.")
            await asyncio.sleep(1)
            continue
        message = await async_queue.dequeue()
        logger.info(f"Processing message: {message}")
        try:
            if message["type"] == "file":
                await loop.run_in_executor(executor, indexer.index, message)
            elif message["type"] == "all_files":
                await loop.run_in_executor(executor, indexer.purge, message)
            elif message["type"] == "stop":
                break
        except Exception as e:
            logger.error(f"Error in processing message: {e}")
            logger.error(f"Failed to process message: {message}")
        await asyncio.sleep(1)
```
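Note: `crawl_loop` calls `async_queue.enqueue(...)` synchronously while `index_loop` awaits `dequeue()`, so `AsyncQueue` (defined in `indexer/async_queue.py`, not reproduced in this section) is presumably a thin wrapper over `asyncio.Queue`. A minimal sketch consistent with the three calls used above; only the method names are taken from the code, the rest is assumption:
```python
import asyncio

class AsyncQueue:
    """Illustrative sketch of the queue interface used by crawl_loop/index_loop;
    not the repository's actual implementation."""

    def __init__(self):
        self._queue = asyncio.Queue()

    def enqueue(self, message: dict) -> None:
        # put_nowait allows enqueuing without awaiting, as crawl_loop does
        self._queue.put_nowait(message)

    async def dequeue(self) -> dict:
        return await self._queue.get()

    def size(self) -> int:
        return self._queue.qsize()
```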
--------------------------------------------------------------------------------
/chat/src/logo.svg:
--------------------------------------------------------------------------------
```
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 841.9 595.3"><g fill="#61DAFB"><path d="M666.3 296.5c0-32.5-40.7-63.3-103.1-82.4 14.4-63.6 8-114.2-20.2-130.4-6.5-3.8-14.1-5.6-22.4-5.6v22.3c4.6 0 8.3.9 11.4 2.6 13.6 7.8 19.5 37.5 14.9 75.7-1.1 9.4-2.9 19.3-5.1 29.4-19.6-4.8-41-8.5-63.5-10.9-13.5-18.5-27.5-35.3-41.6-50 32.6-30.3 63.2-46.9 84-46.9V78c-27.5 0-63.5 19.6-99.9 53.6-36.4-33.8-72.4-53.2-99.9-53.2v22.3c20.7 0 51.4 16.5 84 46.6-14 14.7-28 31.4-41.3 49.9-22.6 2.4-44 6.1-63.6 11-2.3-10-4-19.7-5.2-29-4.7-38.2 1.1-67.9 14.6-75.8 3-1.8 6.9-2.6 11.5-2.6V78.5c-8.4 0-16 1.8-22.6 5.6-28.1 16.2-34.4 66.7-19.9 130.1-62.2 19.2-102.7 49.9-102.7 82.3 0 32.5 40.7 63.3 103.1 82.4-14.4 63.6-8 114.2 20.2 130.4 6.5 3.8 14.1 5.6 22.5 5.6 27.5 0 63.5-19.6 99.9-53.6 36.4 33.8 72.4 53.2 99.9 53.2 8.4 0 16-1.8 22.6-5.6 28.1-16.2 34.4-66.7 19.9-130.1 62-19.1 102.5-49.9 102.5-82.3zm-130.2-66.7c-3.7 12.9-8.3 26.2-13.5 39.5-4.1-8-8.4-16-13.1-24-4.6-8-9.5-15.8-14.4-23.4 14.2 2.1 27.9 4.7 41 7.9zm-45.8 106.5c-7.8 13.5-15.8 26.3-24.1 38.2-14.9 1.3-30 2-45.2 2-15.1 0-30.2-.7-45-1.9-8.3-11.9-16.4-24.6-24.2-38-7.6-13.1-14.5-26.4-20.8-39.8 6.2-13.4 13.2-26.8 20.7-39.9 7.8-13.5 15.8-26.3 24.1-38.2 14.9-1.3 30-2 45.2-2 15.1 0 30.2.7 45 1.9 8.3 11.9 16.4 24.6 24.2 38 7.6 13.1 14.5 26.4 20.8 39.8-6.3 13.4-13.2 26.8-20.7 39.9zm32.3-13c5.4 13.4 10 26.8 13.8 39.8-13.1 3.2-26.9 5.9-41.2 8 4.9-7.7 9.8-15.6 14.4-23.7 4.6-8 8.9-16.1 13-24.1zM421.2 430c-9.3-9.6-18.6-20.3-27.8-32 9 .4 18.2.7 27.5.7 9.4 0 18.7-.2 27.8-.7-9 11.7-18.3 22.4-27.5 32zm-74.4-58.9c-14.2-2.1-27.9-4.7-41-7.9 3.7-12.9 8.3-26.2 13.5-39.5 4.1 8 8.4 16 13.1 24 4.7 8 9.5 15.8 14.4 23.4zM420.7 163c9.3 9.6 18.6 20.3 27.8 32-9-.4-18.2-.7-27.5-.7-9.4 0-18.7.2-27.8.7 9-11.7 18.3-22.4 27.5-32zm-74 58.9c-4.9 7.7-9.8 15.6-14.4 23.7-4.6 8-8.9 16-13 24-5.4-13.4-10-26.8-13.8-39.8 13.1-3.1 26.9-5.8 41.2-7.9zm-90.5 125.2c-35.4-15.1-58.3-34.9-58.3-50.6 0-15.7 22.9-35.6 58.3-50.6 8.6-3.7 18-7 27.7-10.1 5.7 19.6 13.2 40 22.5 60.9-9.2 20.8-16.6 41.1-22.2 60.6-9.9-3.1-19.3-6.5-28-10.2zM310 490c-13.6-7.8-19.5-37.5-14.9-75.7 1.1-9.4 2.9-19.3 5.1-29.4 19.6 4.8 41 8.5 63.5 10.9 13.5 18.5 27.5 35.3 41.6 50-32.6 30.3-63.2 46.9-84 46.9-4.5-.1-8.3-1-11.3-2.7zm237.2-76.2c4.7 38.2-1.1 67.9-14.6 75.8-3 1.8-6.9 2.6-11.5 2.6-20.7 0-51.4-16.5-84-46.6 14-14.7 28-31.4 41.3-49.9 22.6-2.4 44-6.1 63.6-11 2.3 10.1 4.1 19.8 5.2 29.1zm38.5-66.7c-8.6 3.7-18 7-27.7 10.1-5.7-19.6-13.2-40-22.5-60.9 9.2-20.8 16.6-41.1 22.2-60.6 9.9 3.1 19.3 6.5 28.1 10.2 35.4 15.1 58.3 34.9 58.3 50.6-.1 15.7-23 35.6-58.4 50.6zM320.8 78.4z"/><circle cx="420.9" cy="296.5" r="45.7"/><path d="M520.5 78.1z"/></g></svg>
```
--------------------------------------------------------------------------------
/indexer/app.py:
--------------------------------------------------------------------------------
```python
import nltk
import logging
import asyncio
from indexer import Indexer
from pydantic import BaseModel
from storage import MinimaStore
from async_queue import AsyncQueue
from fastapi import FastAPI, APIRouter
from contextlib import asynccontextmanager
from fastapi_utilities import repeat_every
from async_loop import index_loop, crawl_loop
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
indexer = Indexer()
router = APIRouter()
async_queue = AsyncQueue()
MinimaStore.create_db_and_tables()
def init_loader_dependencies():
    nltk.download('punkt')
    nltk.download('punkt_tab')
    nltk.download('wordnet')
    nltk.download('omw-1.4')
    nltk.download('averaged_perceptron_tagger_eng')
init_loader_dependencies()
class Query(BaseModel):
    query: str
@router.post(
    "/query", 
    response_description='Query local data storage',
)
async def query(request: Query):
    logger.info(f"Received query: {query}")
    try:
        result = indexer.find(request.query)
        logger.info(f"Found {len(result)} results for query: {query}")
        logger.info(f"Results: {result}")
        return {"result": result}
    except Exception as e:
        logger.error(f"Error in processing query: {e}")
        return {"error": str(e)}
@router.post(
    "/embedding", 
    response_description='Get embedding for a query',
)
async def embedding(request: Query):
    logger.info(f"Received embedding request: {request}")
    try:
        result = indexer.embed(request.query)
        logger.info(f"Found {len(result)} results for query: {request.query}")
        return {"result": result}
    except Exception as e:
        logger.error(f"Error in processing embedding: {e}")
        return {"error": str(e)}    
@asynccontextmanager
async def lifespan(app: FastAPI):
    tasks = [
        asyncio.create_task(crawl_loop(async_queue)),
        asyncio.create_task(index_loop(async_queue, indexer))
    ]
    await schedule_reindexing()
    try:
        yield
    finally:
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
def create_app() -> FastAPI:
    app = FastAPI(
        openapi_url="/indexer/openapi.json",
        docs_url="/indexer/docs",
        lifespan=lifespan
    )
    app.include_router(router)
    return app
async def trigger_re_indexer():
    logger.info("Reindexing triggered")
    try:
        await asyncio.gather(
            crawl_loop(async_queue),
            index_loop(async_queue, indexer)
        )
        logger.info("reindexing finished")
    except Exception as e:
        logger.error(f"error in scheduled reindexing {e}")
@repeat_every(seconds=60*20)
async def schedule_reindexing():
    await trigger_re_indexer()
app = create_app()
```
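Once the stack is up, the `/query` endpoint can be exercised directly. A minimal sketch using `requests`, assuming the default compose mapping of the indexer to port 8001:
```python
import requests

# Assumes the indexer container is published on localhost:8001 (see docker-compose)
response = requests.post(
    "http://localhost:8001/query",
    json={"query": "what does the contract say about termination?"},
)
# On success: {"result": {"links": [...], "output": "..."}}; on failure: {"error": "..."}
print(response.json())
```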
--------------------------------------------------------------------------------
/linker/app.py:
--------------------------------------------------------------------------------
```python
import os
import logging
import asyncio
import random
import string
from fastapi import FastAPI
from requestor import request_data
from contextlib import asynccontextmanager
import json
import requests
from requests.exceptions import HTTPError
from google.oauth2.credentials import Credentials
from google.cloud.firestore import Client
def sign_in_with_email_and_password(email, password):
    request_url = "https://signinaction-xl7gclbspq-uc.a.run.app"
    headers = {"content-type": "application/json; charset=UTF-8"}
    data = json.dumps({"login": email, "password": password})
    req = requests.post(request_url, headers=headers, data=data)
    try:
        req.raise_for_status()
    except HTTPError:
        logger.error(f"Sign-in failed: {req.status_code} {req.text}")
        raise
    return req.json()
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
USERS_COLLECTION_NAME = "users_otp"
COLLECTION_NAME = os.environ.get("FIRESTORE_COLLECTION_NAME")
TASKS_COLLECTION = os.environ.get("TASKS_COLLECTION")
USER_ID = os.environ.get("USER_ID")
PASSWORD = os.environ.get("PASSWORD")
FB_PROJECT = os.environ.get("FB_PROJECT")
response = sign_in_with_email_and_password(USER_ID, PASSWORD)
creds = Credentials(response["idToken"], response["refreshToken"])
# noinspection PyTypeChecker
db = Client(FB_PROJECT, creds)
async def poll_firestore():
    logger.info(f"Polling Firestore collection: {COLLECTION_NAME}")
    random_otp = ''.join(random.choices(string.ascii_uppercase + string.digits, k=16))
    doc_ref = db.collection(USERS_COLLECTION_NAME).document(USER_ID)
    try:
        if doc_ref.get().exists:
            doc_ref.update({'otp': random_otp})
        else:
            doc_ref.create({'otp': random_otp})
    except Exception as e:
        logger.error(f"Failed to store OTP: {e}")
    
    logger.info(f"OTP for this computer in Minima GPT: {random_otp}")
    while True:
        try:
            docs = db.collection(COLLECTION_NAME).document(USER_ID).collection(TASKS_COLLECTION).stream()
            for doc in docs:
                data = doc.to_dict()
                if data['status'] == 'PENDING':
                    response = await request_data(data['request'])
                    if 'error' not in response:
                        logger.info(f"Updating Firestore document: {doc.id}")
                        doc_ref = db.collection(COLLECTION_NAME).document(USER_ID).collection(TASKS_COLLECTION).document(doc.id)
                        doc_ref.update({
                            'status': 'COMPLETED',
                            'links': response['result']['links'],
                            'result': response['result']['output']
                        })
                    else:
                        logger.error(f"Error in processing request: {response['error']}")
            await asyncio.sleep(0.5)
        except Exception as e:
            logger.error(f"Error in polling Firestore collection: {e}")
            await asyncio.sleep(0.5)
@asynccontextmanager
async def lifespan(app: FastAPI):
    logger.info("Starting Firestore polling")
    poll_task = asyncio.create_task(poll_firestore())
    yield
    poll_task.cancel()
    await asyncio.gather(poll_task, return_exceptions=True)
def create_app() -> FastAPI:
    app = FastAPI(
        openapi_url="/linker/openapi.json",
        docs_url="/linker/docs",
        lifespan=lifespan
    )
    return app
app = create_app()
```
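The polling loop reads `status` and `request` from each task document and writes back `status`, `links`, and `result`. A hypothetical document shape inferred from those reads and writes; the field values are illustrative:
```python
# Hypothetical Firestore task documents, inferred from poll_firestore
pending_task = {
    "status": "PENDING",                    # picked up by the polling loop
    "request": "find my notes on planning", # forwarded to request_data
}
completed_task = {
    "status": "COMPLETED",
    "request": "find my notes on planning",
    "links": ["file:///path/to/notes.md"],  # from response["result"]["links"]
    "result": "answer text from the local indexer",  # from response["result"]["output"]
}
```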
--------------------------------------------------------------------------------
/mcp-server/src/minima/server.py:
--------------------------------------------------------------------------------
```python
import logging
from typing import Annotated
from .requestor import request_data
from pydantic import BaseModel, Field
from mcp.server.stdio import stdio_server
from mcp.shared.exceptions import McpError
from mcp.server import NotificationOptions, Server
from mcp.server.models import InitializationOptions
from mcp.types import (
    GetPromptResult,
    Prompt,
    PromptArgument,
    PromptMessage,
    TextContent,
    Tool,
    INVALID_PARAMS,
    INTERNAL_ERROR,
)
logging.basicConfig(
    level=logging.DEBUG, 
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler("app.log"),
        logging.StreamHandler()
    ]
)
server = Server("minima")
class Query(BaseModel):
    text: Annotated[
        str, 
        Field(description="context to find")
    ]
@server.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="minima-query",
            description="Find a context in local files (PDF, CSV, DOCX, MD, TXT)",
            inputSchema=Query.model_json_schema(),
        )
    ]
    
@server.list_prompts()
async def list_prompts() -> list[Prompt]:
    logging.info("List of prompts")
    return [
        Prompt(
            name="minima-query",
            description="Find a context in a local files",
            arguments=[
                PromptArgument(
                    name="context", description="Context to search", required=True
                )
            ]
        )            
    ]
    
@server.call_tool()
async def call_tool(name, arguments: dict) -> list[TextContent]:
    if name != "minima-query":
        logging.error(f"Unknown tool: {name}")
        raise ValueError(f"Unknown tool: {name}")
    logging.info("Calling tools")
    try:
        args = Query(**arguments)
    except ValueError as e:
        logging.error(str(e))
        raise McpError(INVALID_PARAMS, str(e))
        
    context = args.text
    logging.info(f"Context: {context}")
    if not context:
        logging.error("Context is required")
        raise McpError(INVALID_PARAMS, "Context is required")
    output = await request_data(context)
    if "error" in output:
        logging.error(output["error"])
        raise McpError(INTERNAL_ERROR, output["error"])
    
    logging.info(f"Get prompt: {output}")    
    output = output['result']['output']
    #links = output['result']['links']
    return [TextContent(type="text", text=output)]
    
@server.get_prompt()
async def get_prompt(name: str, arguments: dict | None) -> GetPromptResult:
    if not arguments or "context" not in arguments:
        logging.error("Context is required")
        raise McpError(INVALID_PARAMS, "Context is required")
        
    context = arguments["text"]
    output = await request_data(context)
    if "error" in output:
        error = output["error"]
        logging.error(error)
        return GetPromptResult(
            description=f"Faild to find a {context}",
            messages=[
                PromptMessage(
                    role="user", 
                    content=TextContent(type="text", text=error),
                )
            ]
        )
    logging.info(f"Get prompt: {output}")    
    output = output['result']['output']
    return GetPromptResult(
        description=f"Found content for this {context}",
        messages=[
            PromptMessage(
                role="user", 
                content=TextContent(type="text", text=output)
            )
        ]
    )
async def main():
    async with stdio_server() as (read_stream, write_stream):
        await server.run(
            read_stream,
            write_stream,
            InitializationOptions(
                server_name="minima",
                server_version="0.0.1",
                capabilities=server.get_capabilities(
                    notification_options=NotificationOptions(),
                    experimental_capabilities={},
                ),
            ),
        )
```
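`request_data` is imported from `requestor.py` (not reproduced in this section); the handlers above rely only on it returning either an `"error"` key or a nested `"result"` dict. A sketch of that contract using `aiohttp`; the URL and transport are assumptions, not the module's actual implementation:
```python
import aiohttp

async def request_data(query: str) -> dict:
    """Sketch of the contract used by call_tool/get_prompt: returns either
    {"error": ...} or {"result": {"output": ..., "links": ...}}."""
    try:
        async with aiohttp.ClientSession() as session:
            async with session.post(
                "http://localhost:8001/query",  # assumed indexer endpoint
                json={"query": query},
            ) as resp:
                return await resp.json()
    except aiohttp.ClientError as e:
        return {"error": str(e)}
```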
--------------------------------------------------------------------------------
/indexer/storage.py:
--------------------------------------------------------------------------------
```python
import logging
from sqlmodel import Field, Session, SQLModel, create_engine, select
from singleton import Singleton
from enum import Enum
logger = logging.getLogger(__name__)
class IndexingStatus(Enum):
    new_file = 1
    need_reindexing = 2
    no_need_reindexing = 3
class MinimaDoc(SQLModel, table=True):
    fpath: str = Field(primary_key=True)
    last_updated_seconds: int | None = Field(default=None, index=True)
class MinimaDocUpdate(SQLModel):
    fpath: str | None = None
    last_updated_seconds: int | None = None
sqlite_file_name = "/indexer/storage/database.db"
sqlite_url = f"sqlite:///{sqlite_file_name}"
connect_args = {"check_same_thread": False}
engine = create_engine(sqlite_url, connect_args=connect_args)
class MinimaStore(metaclass=Singleton):
    @staticmethod
    def create_db_and_tables():
        SQLModel.metadata.create_all(engine)
    @staticmethod
    def delete_m_doc(fpath: str) -> None:
        with Session(engine) as session:
            statement = select(MinimaDoc).where(MinimaDoc.fpath == fpath)
            results = session.exec(statement)
            doc = results.one()
            session.delete(doc)
            session.commit()
            print("doc deleted:", doc)
    @staticmethod
    def select_m_doc(fpath: str) -> MinimaDoc:
        with Session(engine) as session:
            statement = select(MinimaDoc).where(MinimaDoc.fpath == fpath)
            results = session.exec(statement)
            doc = results.one()
            print("doc:", doc)
            return doc
    @staticmethod
    def find_removed_files(existing_file_paths: set[str]):
        removed_files: list[str] = []
        with Session(engine) as session:
            statement = select(MinimaDoc)
            results = session.exec(statement)
            logger.debug(f"find_removed_files count found {results}")
            for doc in results:
                logger.debug(f"find_removed_files file {doc.fpath} checking to remove")
                if doc.fpath not in existing_file_paths:
                    logger.debug(f"find_removed_files file {doc.fpath} does not exist anymore, removing")
                    removed_files.append(doc.fpath)
        for fpath in removed_files:
            MinimaStore.delete_m_doc(fpath)
        return removed_files
    @staticmethod
    def check_needs_indexing(fpath: str, last_updated_seconds: int) -> IndexingStatus:
        indexing_status: IndexingStatus = IndexingStatus.no_need_reindexing
        try:
            with Session(engine) as session:
                statement = select(MinimaDoc).where(MinimaDoc.fpath == fpath)
                results = session.exec(statement)
                doc = results.first()
                if doc is not None:
                    logger.debug(
                        f"file {fpath} new last updated={last_updated_seconds} old last updated: {doc.last_updated_seconds}"
                    )
                    if doc.last_updated_seconds < last_updated_seconds:
                        indexing_status = IndexingStatus.need_reindexing
                        logger.debug(f"file {fpath} needs indexing, timestamp changed")
                        doc_update = MinimaDocUpdate(fpath=fpath, last_updated_seconds=last_updated_seconds)
                        doc_data = doc_update.model_dump(exclude_unset=True)
                        doc.sqlmodel_update(doc_data)
                        session.add(doc)
                        session.commit()
                    else:
                        logger.debug(f"file {fpath} doesn't need indexing, timestamp same")
                else:
                    doc = MinimaDoc(fpath=fpath, last_updated_seconds=last_updated_seconds)
                    session.add(doc)
                    session.commit()
                    logger.debug(f"file {fpath} needs indexing, new file")
                    indexing_status = IndexingStatus.new_file
            return indexing_status
        except Exception as e:
            logger.error(f"error updating file in the store {e}, skipping indexing")
            return IndexingStatus.no_need_reindexing
```
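`check_needs_indexing` is the single decision point consulted before a file is (re)indexed. A small sketch of the three outcomes, assuming the sqlite path above is writable; the file path and timestamps are illustrative:
```python
import time

MinimaStore.create_db_and_tables()
now = int(time.time())

# First sighting: row is created
assert MinimaStore.check_needs_indexing("/docs/report.pdf", now) == IndexingStatus.new_file
# Same timestamp: nothing to do
assert MinimaStore.check_needs_indexing("/docs/report.pdf", now) == IndexingStatus.no_need_reindexing
# Newer timestamp: row updated, file flagged for reindexing
assert MinimaStore.check_needs_indexing("/docs/report.pdf", now + 60) == IndexingStatus.need_reindexing
```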
--------------------------------------------------------------------------------
/assets/logo-full-w.svg:
--------------------------------------------------------------------------------
```
<svg width="280" height="354" viewBox="0 0 280 354" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M277.218 94.6656C276.083 88.5047 274.217 82.4376 272.178 76.5164C270.667 72.1068 266.641 70.2512 262.189 71.0852C260.445 71.4084 258.64 71.669 257.017 72.3362C246.917 76.4956 236.674 80.3631 226.807 85.0541C203.645 96.0729 183.038 110.907 165.545 130.099C162.523 133.424 159.805 137.042 157.036 140.596C155.992 141.952 155.495 144.047 154.197 144.797C153.203 145.371 151.347 144.36 149.867 144.057C149.796 144.036 149.704 144.057 149.633 144.036C141.896 142.546 134.199 142.973 126.573 144.766C125.163 145.1 124.524 144.672 123.855 143.557C122.739 141.733 121.685 139.825 120.295 138.23C111.533 128.128 102.447 118.371 92.0928 109.917C70.3503 92.1846 45.7074 80.311 19.4825 71.6064C13.3675 69.5841 8.40846 72.3883 6.55264 78.8307C0.883763 98.5019 -1.5501 118.496 1.02574 138.991C2.95255 154.367 7.69858 168.691 15.923 181.7C25.9424 197.525 38.6289 210.337 55.6254 218.155C56.4773 218.551 57.2683 219.083 58.0998 219.552C55.311 221.95 52.5932 224.118 50.0579 226.495C47.0765 229.268 47.1982 231.353 50.6157 233.625C53.6479 235.648 56.8322 237.472 60.0672 239.15C72.8957 245.77 85.6126 252.661 98.6744 258.748C108.278 263.231 117.608 267.933 125.883 274.729C134.665 281.933 144.624 281.599 153.67 274.667C158.953 270.622 164.561 266.807 170.524 263.982C184.296 257.466 198.26 251.399 211.535 243.789C217.021 240.631 222.649 237.712 228.136 234.553C231.533 232.614 231.928 230.06 229.423 227.016C228.389 225.755 227.142 224.671 225.955 223.566C224.586 222.294 223.166 221.074 221.453 219.552C222.183 219.156 222.507 218.947 222.862 218.791C231.675 214.997 239.25 209.336 246.481 202.967C260.881 190.249 270.698 174.466 275.778 155.847C281.275 135.686 281.021 115.129 277.218 94.6656ZM69.4376 199.328C51.4067 200.913 37.7162 193.094 26.6929 178.886C13.8137 162.29 9.34144 143.098 9.2096 122.416C9.13862 110.73 11.0249 99.315 13.479 87.9626C13.7325 86.7534 13.9151 85.5233 14.199 84.314C14.9495 81.1137 16.8763 79.8419 19.7563 81.3326C24.2691 83.6677 28.9138 86.0132 32.8586 89.2031C54.0738 106.299 69.3159 127.816 76.871 154.607C80.0553 165.897 79.8525 177.416 76.1307 188.643C75.005 192.062 73.1492 195.232 71.4759 198.432C71.2021 198.964 70.1576 199.266 69.4376 199.328ZM128.043 246.145C123.895 241.996 118.642 241.61 113.257 241.152C108.724 240.766 104.11 240.109 99.7595 238.785C91.4032 236.252 85.5416 230.623 82.3877 222.127C82.1241 221.418 81.11 220.73 80.3393 220.553C77.5099 219.927 75.1267 218.603 73.6664 216.049C73.0782 215.017 72.8247 213.475 73.1188 212.359C73.2608 211.796 74.9746 211.4 75.9786 211.379C82.0227 211.265 88.087 210.993 94.1109 211.306C113.004 212.276 125.254 226.891 129.047 243.247C129.3 244.342 129.422 245.457 129.666 246.896C128.875 246.541 128.357 246.458 128.043 246.145ZM142.981 268.495C138.62 271.195 132.485 267.891 131.968 262.637C131.724 260.26 133.235 259.103 134.898 258.311C136.46 257.57 138.255 257.341 139.938 256.893C144.188 257.195 146.713 258.644 147.372 261.407C147.889 263.565 145.972 266.65 142.981 268.495ZM205.876 214.986C204.487 217.769 202.692 220.146 199.152 220.271C198.402 220.303 197.307 221.23 197.013 222.002C192.834 233.135 181.872 240.287 170.453 240.766C165.241 240.985 159.906 240.683 155.14 243.508C153.568 244.446 152.128 245.603 150.404 246.802C150.688 241.319 152.716 236.617 154.836 232.009C160.616 219.5 170.849 213.433 183.728 211.41C186.101 211.035 188.545 210.983 190.958 210.941C195.065 210.879 199.173 210.921 203.29 210.941C205.947 210.941 207.083 212.578 205.876 214.986ZM265.972 153.21C261.561 166.345 254.928 178.125 244.97 187.59C237.1 195.075 227.537 199.068 216.788 
199.85C215.662 199.933 214.526 199.902 213.401 199.996C207.914 200.475 209.304 201.153 206.707 195.888C198.899 180.032 199.649 163.885 205.247 147.622C212.457 126.679 224.606 109.177 240.599 94.3737C245.964 89.4116 251.774 85.1271 258.366 82.0206C262.139 80.238 264.187 80.9886 265.353 85.1063C268.751 97.0216 270.535 109.197 270.282 123.125C270.515 132.403 269.41 142.984 265.972 153.21Z" fill="white"/>
<path d="M198.179 85.6275C193.625 92.7787 187.794 99.1586 182.064 105.434C171.122 117.422 161.032 129.974 154.501 145.152C149.735 156.223 148.964 167.898 150.232 179.782C150.546 182.795 151.246 185.776 151.773 188.768C151.621 188.873 151.469 188.977 151.317 189.081C146.611 183.525 141.743 178.104 137.241 172.381C130.892 164.281 126.359 155.066 123.662 145.068C120.376 132.934 119.788 120.56 121.989 108.113C123.865 97.4698 126.664 87.1807 132.13 77.7778C139.401 65.2683 149.065 54.8124 158.588 44.1794C165.352 36.6215 172.075 28.9699 178.16 20.8283C182.561 14.9176 185.594 8.04777 185.502 0C187.977 4.38875 190.512 8.7358 192.895 13.1767C198.077 22.8819 202.651 32.8374 204.527 43.8771C207.012 58.5444 206.281 72.8991 198.179 85.6275Z" fill="white"/>
<path d="M80.1795 320H94.5429V353.855H85.6376V327.835L72.0881 353.855H69.4548L55.9053 327.835V353.855H47V320H61.3634L70.7475 338.378L80.1795 320Z" fill="white"/>
<path d="M99.2253 320H113.732L129.053 345.972V320H138.054V353.807H123.404L108.226 327.835V353.807H99.2253V320Z" fill="white"/>
<path d="M175.841 320H190.204V353.807H181.299V327.835L167.75 353.807H165.116L151.567 327.835V353.807H142.662V320H157.025L166.409 338.378L175.841 320Z" fill="white"/>
<path d="M222.85 320.193L233 354H223.903L221.557 346.165H203.555L201.209 354H192.112L202.262 320.193H222.85ZM216.003 327.593H209.109L205.949 338.33H219.163L216.003 327.593Z" fill="white"/>
</svg>
```
--------------------------------------------------------------------------------
/assets/logo-full.svg:
--------------------------------------------------------------------------------
```
<svg width="280" height="354" viewBox="0 0 280 354" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M277.218 94.6656C276.083 88.5047 274.217 82.4376 272.178 76.5164C270.667 72.1068 266.641 70.2512 262.189 71.0852C260.445 71.4084 258.64 71.669 257.017 72.3362C246.917 76.4956 236.674 80.3631 226.807 85.0541C203.645 96.0729 183.038 110.907 165.545 130.099C162.523 133.424 159.805 137.042 157.036 140.596C155.992 141.952 155.495 144.047 154.197 144.797C153.203 145.371 151.347 144.36 149.867 144.057C149.796 144.036 149.704 144.057 149.633 144.036C141.896 142.546 134.199 142.973 126.573 144.766C125.163 145.1 124.524 144.672 123.855 143.557C122.739 141.733 121.685 139.825 120.295 138.23C111.533 128.128 102.447 118.371 92.0928 109.917C70.3503 92.1846 45.7074 80.311 19.4825 71.6064C13.3675 69.5841 8.40846 72.3883 6.55264 78.8307C0.883763 98.5019 -1.5501 118.496 1.02574 138.991C2.95255 154.367 7.69858 168.691 15.923 181.7C25.9424 197.525 38.6289 210.337 55.6254 218.155C56.4773 218.551 57.2683 219.083 58.0998 219.552C55.311 221.95 52.5932 224.118 50.0579 226.495C47.0765 229.268 47.1982 231.353 50.6157 233.625C53.6479 235.648 56.8322 237.472 60.0672 239.15C72.8957 245.77 85.6126 252.661 98.6744 258.748C108.278 263.231 117.608 267.933 125.883 274.729C134.665 281.933 144.624 281.599 153.67 274.667C158.953 270.622 164.561 266.807 170.524 263.982C184.296 257.466 198.26 251.399 211.535 243.789C217.021 240.631 222.649 237.712 228.136 234.553C231.533 232.614 231.928 230.06 229.423 227.016C228.389 225.755 227.142 224.671 225.955 223.566C224.586 222.294 223.166 221.074 221.453 219.552C222.183 219.156 222.507 218.947 222.862 218.791C231.675 214.997 239.25 209.336 246.481 202.967C260.881 190.249 270.698 174.466 275.778 155.847C281.275 135.686 281.021 115.129 277.218 94.6656ZM69.4376 199.328C51.4067 200.913 37.7162 193.094 26.6929 178.886C13.8137 162.29 9.34144 143.098 9.2096 122.416C9.13862 110.73 11.0249 99.315 13.479 87.9626C13.7325 86.7534 13.9151 85.5233 14.199 84.314C14.9495 81.1137 16.8763 79.8419 19.7563 81.3326C24.2691 83.6677 28.9138 86.0132 32.8586 89.2031C54.0738 106.299 69.3159 127.816 76.871 154.607C80.0553 165.897 79.8525 177.416 76.1307 188.643C75.005 192.062 73.1492 195.232 71.4759 198.432C71.2021 198.964 70.1576 199.266 69.4376 199.328ZM128.043 246.145C123.895 241.996 118.642 241.61 113.257 241.152C108.724 240.766 104.11 240.109 99.7595 238.785C91.4032 236.252 85.5416 230.623 82.3877 222.127C82.1241 221.418 81.11 220.73 80.3393 220.553C77.5099 219.927 75.1267 218.603 73.6664 216.049C73.0782 215.017 72.8247 213.475 73.1188 212.359C73.2608 211.796 74.9746 211.4 75.9786 211.379C82.0227 211.265 88.087 210.993 94.1109 211.306C113.004 212.276 125.254 226.891 129.047 243.247C129.3 244.342 129.422 245.457 129.666 246.896C128.875 246.541 128.357 246.458 128.043 246.145ZM142.981 268.495C138.62 271.195 132.485 267.891 131.968 262.637C131.724 260.26 133.235 259.103 134.898 258.311C136.46 257.57 138.255 257.341 139.938 256.893C144.188 257.195 146.713 258.644 147.372 261.407C147.889 263.565 145.972 266.65 142.981 268.495ZM205.876 214.986C204.487 217.769 202.692 220.146 199.152 220.271C198.402 220.303 197.307 221.23 197.013 222.002C192.834 233.135 181.872 240.287 170.453 240.766C165.241 240.985 159.906 240.683 155.14 243.508C153.568 244.446 152.128 245.603 150.404 246.802C150.688 241.319 152.716 236.617 154.836 232.009C160.616 219.5 170.849 213.433 183.728 211.41C186.101 211.035 188.545 210.983 190.958 210.941C195.065 210.879 199.173 210.921 203.29 210.941C205.947 210.941 207.083 212.578 205.876 214.986ZM265.972 153.21C261.561 166.345 254.928 178.125 244.97 187.59C237.1 195.075 227.537 199.068 216.788 
199.85C215.662 199.933 214.526 199.902 213.401 199.996C207.914 200.475 209.304 201.153 206.707 195.888C198.899 180.032 199.649 163.885 205.247 147.622C212.457 126.679 224.606 109.177 240.599 94.3737C245.964 89.4116 251.774 85.1271 258.366 82.0206C262.139 80.238 264.187 80.9886 265.353 85.1063C268.751 97.0216 270.535 109.197 270.282 123.125C270.515 132.403 269.41 142.984 265.972 153.21Z" fill="#D8FD87"/>
<path d="M198.179 85.6275C193.625 92.7787 187.794 99.1586 182.064 105.434C171.122 117.422 161.032 129.974 154.501 145.152C149.735 156.223 148.964 167.898 150.232 179.782C150.546 182.795 151.246 185.776 151.773 188.768C151.621 188.873 151.469 188.977 151.317 189.081C146.611 183.525 141.743 178.104 137.241 172.381C130.892 164.281 126.359 155.066 123.662 145.068C120.376 132.934 119.788 120.56 121.989 108.113C123.865 97.4698 126.664 87.1807 132.13 77.7778C139.401 65.2683 149.065 54.8124 158.588 44.1794C165.352 36.6215 172.075 28.9699 178.16 20.8283C182.561 14.9176 185.594 8.04777 185.502 0C187.977 4.38875 190.512 8.7358 192.895 13.1767C198.077 22.8819 202.651 32.8374 204.527 43.8771C207.012 58.5444 206.281 72.8991 198.179 85.6275Z" fill="#D8FD87"/>
<path d="M80.1795 320H94.5429V353.855H85.6376V327.835L72.0881 353.855H69.4548L55.9053 327.835V353.855H47V320H61.3634L70.7475 338.378L80.1795 320Z" fill="white"/>
<path d="M99.2253 320H113.732L129.053 345.972V320H138.054V353.807H123.404L108.226 327.835V353.807H99.2253V320Z" fill="white"/>
<path d="M175.841 320H190.204V353.807H181.299V327.835L167.75 353.807H165.116L151.567 327.835V353.807H142.662V320H157.025L166.409 338.378L175.841 320Z" fill="white"/>
<path d="M222.85 320.193L233 354H223.903L221.557 346.165H203.555L201.209 354H192.112L202.262 320.193H222.85ZM216.003 327.593H209.109L205.949 338.33H219.163L216.003 327.593Z" fill="white"/>
</svg>
```
--------------------------------------------------------------------------------
/assets/logo-full-b.svg:
--------------------------------------------------------------------------------
```
<svg width="280" height="354" viewBox="0 0 280 354" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M277.218 94.6656C276.083 88.5047 274.217 82.4376 272.178 76.5164C270.667 72.1068 266.641 70.2512 262.189 71.0852C260.445 71.4084 258.64 71.669 257.017 72.3362C246.917 76.4956 236.674 80.3631 226.807 85.0541C203.645 96.0729 183.038 110.907 165.545 130.099C162.523 133.424 159.805 137.042 157.036 140.596C155.992 141.952 155.495 144.047 154.197 144.797C153.203 145.371 151.347 144.36 149.867 144.057C149.796 144.036 149.704 144.057 149.633 144.036C141.896 142.546 134.199 142.973 126.573 144.766C125.163 145.1 124.524 144.672 123.855 143.557C122.739 141.733 121.685 139.825 120.295 138.23C111.533 128.128 102.447 118.371 92.0928 109.917C70.3503 92.1846 45.7074 80.311 19.4825 71.6064C13.3675 69.5841 8.40846 72.3883 6.55264 78.8307C0.883763 98.5019 -1.5501 118.496 1.02574 138.991C2.95255 154.367 7.69858 168.691 15.923 181.7C25.9424 197.525 38.6289 210.337 55.6254 218.155C56.4773 218.551 57.2683 219.083 58.0998 219.552C55.311 221.95 52.5932 224.118 50.0579 226.495C47.0765 229.268 47.1982 231.353 50.6157 233.625C53.6479 235.648 56.8322 237.472 60.0672 239.15C72.8957 245.77 85.6126 252.661 98.6744 258.748C108.278 263.231 117.608 267.933 125.883 274.729C134.665 281.933 144.624 281.599 153.67 274.667C158.953 270.622 164.561 266.807 170.524 263.982C184.296 257.466 198.26 251.399 211.535 243.789C217.021 240.631 222.649 237.712 228.136 234.553C231.533 232.614 231.928 230.06 229.423 227.016C228.389 225.755 227.142 224.671 225.955 223.566C224.586 222.294 223.166 221.074 221.453 219.552C222.183 219.156 222.507 218.947 222.862 218.791C231.675 214.997 239.25 209.336 246.481 202.967C260.881 190.249 270.698 174.466 275.778 155.847C281.275 135.686 281.021 115.129 277.218 94.6656ZM69.4376 199.328C51.4067 200.913 37.7162 193.094 26.6929 178.886C13.8137 162.29 9.34144 143.098 9.2096 122.416C9.13862 110.73 11.0249 99.315 13.479 87.9626C13.7325 86.7534 13.9151 85.5233 14.199 84.314C14.9495 81.1137 16.8763 79.8419 19.7563 81.3326C24.2691 83.6677 28.9138 86.0132 32.8586 89.2031C54.0738 106.299 69.3159 127.816 76.871 154.607C80.0553 165.897 79.8525 177.416 76.1307 188.643C75.005 192.062 73.1492 195.232 71.4759 198.432C71.2021 198.964 70.1576 199.266 69.4376 199.328ZM128.043 246.145C123.895 241.996 118.642 241.61 113.257 241.152C108.724 240.766 104.11 240.109 99.7595 238.785C91.4032 236.252 85.5416 230.623 82.3877 222.127C82.1241 221.418 81.11 220.73 80.3393 220.553C77.5099 219.927 75.1267 218.603 73.6664 216.049C73.0782 215.017 72.8247 213.475 73.1188 212.359C73.2608 211.796 74.9746 211.4 75.9786 211.379C82.0227 211.265 88.087 210.993 94.1109 211.306C113.004 212.276 125.254 226.891 129.047 243.247C129.3 244.342 129.422 245.457 129.666 246.896C128.875 246.541 128.357 246.458 128.043 246.145ZM142.981 268.495C138.62 271.195 132.485 267.891 131.968 262.637C131.724 260.26 133.235 259.103 134.898 258.311C136.46 257.57 138.255 257.341 139.938 256.893C144.188 257.195 146.713 258.644 147.372 261.407C147.889 263.565 145.972 266.65 142.981 268.495ZM205.876 214.986C204.487 217.769 202.692 220.146 199.152 220.271C198.402 220.303 197.307 221.23 197.013 222.002C192.834 233.135 181.872 240.287 170.453 240.766C165.241 240.985 159.906 240.683 155.14 243.508C153.568 244.446 152.128 245.603 150.404 246.802C150.688 241.319 152.716 236.617 154.836 232.009C160.616 219.5 170.849 213.433 183.728 211.41C186.101 211.035 188.545 210.983 190.958 210.941C195.065 210.879 199.173 210.921 203.29 210.941C205.947 210.941 207.083 212.578 205.876 214.986ZM265.972 153.21C261.561 166.345 254.928 178.125 244.97 187.59C237.1 195.075 227.537 199.068 216.788 
199.85C215.662 199.933 214.526 199.902 213.401 199.996C207.914 200.475 209.304 201.153 206.707 195.888C198.899 180.032 199.649 163.885 205.247 147.622C212.457 126.679 224.606 109.177 240.599 94.3737C245.964 89.4116 251.774 85.1271 258.366 82.0206C262.139 80.238 264.187 80.9886 265.353 85.1063C268.751 97.0216 270.535 109.197 270.282 123.125C270.515 132.403 269.41 142.984 265.972 153.21Z" fill="#1F1F1F"/>
<path d="M198.179 85.6275C193.625 92.7787 187.794 99.1586 182.064 105.434C171.122 117.422 161.032 129.974 154.501 145.152C149.735 156.223 148.964 167.898 150.232 179.782C150.546 182.795 151.246 185.776 151.773 188.768C151.621 188.873 151.469 188.977 151.317 189.081C146.611 183.525 141.743 178.104 137.241 172.381C130.892 164.281 126.359 155.066 123.662 145.068C120.376 132.934 119.788 120.56 121.989 108.113C123.865 97.4698 126.664 87.1807 132.13 77.7778C139.401 65.2683 149.065 54.8124 158.588 44.1794C165.352 36.6215 172.075 28.9699 178.16 20.8283C182.561 14.9176 185.594 8.04777 185.502 0C187.977 4.38875 190.512 8.7358 192.895 13.1767C198.077 22.8819 202.651 32.8374 204.527 43.8771C207.012 58.5444 206.281 72.8991 198.179 85.6275Z" fill="#1F1F1F"/>
<path d="M80.1795 320H94.5429V353.855H85.6376V327.835L72.0881 353.855H69.4548L55.9053 327.835V353.855H47V320H61.3634L70.7475 338.378L80.1795 320Z" fill="#1F1F1F"/>
<path d="M99.2253 320H113.732L129.053 345.972V320H138.054V353.807H123.404L108.226 327.835V353.807H99.2253V320Z" fill="#1F1F1F"/>
<path d="M175.841 320H190.204V353.807H181.299V327.835L167.75 353.807H165.116L151.567 327.835V353.807H142.662V320H157.025L166.409 338.378L175.841 320Z" fill="#1F1F1F"/>
<path d="M222.85 320.193L233 354H223.903L221.557 346.165H203.555L201.209 354H192.112L202.262 320.193H222.85ZM216.003 327.593H209.109L205.949 338.33H219.163L216.003 327.593Z" fill="#1F1F1F"/>
</svg>
```
--------------------------------------------------------------------------------
/indexer/indexer.py:
--------------------------------------------------------------------------------
```python
import os
import uuid
import torch
import logging
import time
from dataclasses import dataclass
from typing import Any, List, Dict
from pathlib import Path
from qdrant_client import QdrantClient
from langchain_qdrant import QdrantVectorStore
from langchain_huggingface import HuggingFaceEmbeddings
from qdrant_client.http.models import Distance, VectorParams, Filter, FieldCondition, MatchValue
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import (
    TextLoader,
    CSVLoader,
    Docx2txtLoader,
    UnstructuredExcelLoader,
    PyMuPDFLoader,
    UnstructuredPowerPointLoader,
)
from storage import MinimaStore, IndexingStatus
logger = logging.getLogger(__name__)
@dataclass
class Config:
    EXTENSIONS_TO_LOADERS = {
        ".pdf": PyMuPDFLoader,
        ".pptx": UnstructuredPowerPointLoader,
        ".ppt": UnstructuredPowerPointLoader,
        ".xls": UnstructuredExcelLoader,
        ".xlsx": UnstructuredExcelLoader,
        ".docx": Docx2txtLoader,
        ".doc": Docx2txtLoader,
        ".txt": TextLoader,
        ".md": TextLoader,
        ".csv": CSVLoader,
    }
    
    DEVICE = torch.device(
        "mps" if torch.backends.mps.is_available() else
        "cuda" if torch.cuda.is_available() else
        "cpu"
    )
    
    START_INDEXING = os.environ.get("START_INDEXING")
    LOCAL_FILES_PATH = os.environ.get("LOCAL_FILES_PATH")
    CONTAINER_PATH = os.environ.get("CONTAINER_PATH")
    QDRANT_COLLECTION = "mnm_storage"
    QDRANT_BOOTSTRAP = "qdrant"
    EMBEDDING_MODEL_ID = os.environ.get("EMBEDDING_MODEL_ID")
    EMBEDDING_SIZE = os.environ.get("EMBEDDING_SIZE")
    
    CHUNK_SIZE = 500
    CHUNK_OVERLAP = 200
class Indexer:
    def __init__(self):
        self.config = Config()
        self.qdrant = self._initialize_qdrant()
        self.embed_model = self._initialize_embeddings()
        self.document_store = self._setup_collection()
        self.text_splitter = self._initialize_text_splitter()
    def _initialize_qdrant(self) -> QdrantClient:
        return QdrantClient(host=self.config.QDRANT_BOOTSTRAP)
    def _initialize_embeddings(self) -> HuggingFaceEmbeddings:
        return HuggingFaceEmbeddings(
            model_name=self.config.EMBEDDING_MODEL_ID,
            model_kwargs={'device': self.config.DEVICE},
            encode_kwargs={'normalize_embeddings': False}
        )
    def _initialize_text_splitter(self) -> RecursiveCharacterTextSplitter:
        return RecursiveCharacterTextSplitter(
            chunk_size=self.config.CHUNK_SIZE,
            chunk_overlap=self.config.CHUNK_OVERLAP
        )
    def _setup_collection(self) -> QdrantVectorStore:
        if not self.qdrant.collection_exists(self.config.QDRANT_COLLECTION):
            self.qdrant.create_collection(
                collection_name=self.config.QDRANT_COLLECTION,
                vectors_config=VectorParams(
                    size=int(self.config.EMBEDDING_SIZE),  # env var arrives as a string
                    distance=Distance.COSINE
                ),
            )
        self.qdrant.create_payload_index(
            collection_name=self.config.QDRANT_COLLECTION,
            field_name="fpath",
            field_schema="keyword"
        )
        return QdrantVectorStore(
            client=self.qdrant,
            collection_name=self.config.QDRANT_COLLECTION,
            embedding=self.embed_model,
        )
    def _create_loader(self, file_path: str):
        file_extension = Path(file_path).suffix.lower()
        loader_class = self.config.EXTENSIONS_TO_LOADERS.get(file_extension)
        
        if not loader_class:
            raise ValueError(f"Unsupported file type: {file_extension}")
        
        return loader_class(file_path=file_path)
    def _process_file(self, loader) -> List[str]:
        try:
            documents = loader.load_and_split(self.text_splitter)
            if not documents:
                logger.warning(f"No documents loaded from {loader.file_path}")
                return []
            for doc in documents:
                doc.metadata['file_path'] = loader.file_path
            uuids = [str(uuid.uuid4()) for _ in range(len(documents))]
            ids = self.document_store.add_documents(documents=documents, ids=uuids)
            
            logger.info(f"Successfully processed {len(ids)} documents from {loader.file_path}")
            return ids
            
        except Exception as e:
            logger.error(f"Error processing file {loader.file_path}: {str(e)}")
            return []
    def index(self, message: Dict[str, Any]) -> None:
        start = time.time()
        path, file_id, last_updated_seconds = message["path"], message["file_id"], message["last_updated_seconds"]
        logger.info(f"Processing file: {path} (ID: {file_id})")
        indexing_status: IndexingStatus = MinimaStore.check_needs_indexing(fpath=path, last_updated_seconds=last_updated_seconds)
        if indexing_status != IndexingStatus.no_need_reindexing:
            logger.info(f"Indexing needed for {path} with status: {indexing_status}")
            try:
                if indexing_status == IndexingStatus.need_reindexing:
                    logger.info(f"Removing {path} from index storage for reindexing")
                    self.remove_from_storage(files_to_remove=[path])
                loader = self._create_loader(path)
                ids = self._process_file(loader)
                if ids:
                    logger.info(f"Successfully indexed {path} with IDs: {ids}")
            except Exception as e:
                logger.error(f"Failed to index file {path}: {str(e)}")
        else:
            logger.info(f"Skipping {path}, no indexing required. timestamp didn't change")
        end = time.time()
        logger.info(f"Processing took {end - start} seconds for file {path}")
    def purge(self, message: Dict[str, Any]) -> None:
        existing_file_paths: list[str] = message["existing_file_paths"]
        files_to_remove = MinimaStore.find_removed_files(existing_file_paths=set(existing_file_paths))
        if len(files_to_remove) > 0:
            logger.info(f"purge processing removing old files {files_to_remove}")
            self.remove_from_storage(files_to_remove)
        else:
            logger.info("Nothing to purge")
    def remove_from_storage(self, files_to_remove: list[str]):
        filter_conditions = Filter(
            must=[
                FieldCondition(
                    key="fpath",
                    match=MatchValue(value=fpath)
                )
                for fpath in files_to_remove
            ]
        )
        response = self.qdrant.delete(
            collection_name=self.config.QDRANT_COLLECTION,
            points_selector=filter_conditions,
            wait=True
        )
        logger.info(f"Delete response for {len(files_to_remove)} for files: {files_to_remove} is: {response}")
    def find(self, query: str) -> Dict[str, Any]:
        try:
            logger.info(f"Searching for: {query}")
            found = self.document_store.search(query, search_type="similarity")
            
            if not found:
                logger.info("No results found")
                return {"links": set(), "output": ""}
            links = set()
            results = []
            
            for item in found:
                path = item.metadata["file_path"].replace(
                    self.config.CONTAINER_PATH,
                    self.config.LOCAL_FILES_PATH
                )
                links.add(f"file://{path}")
                results.append(item.page_content)
            output = {
                "links": links,
                "output": ". ".join(results)
            }
            
            logger.info(f"Found {len(found)} results")
            return output
            
        except Exception as e:
            logger.error(f"Search failed: {str(e)}")
            return {"error": "Unable to find anything for the given query"}
    def embed(self, query: str):
        return self.embed_model.embed_query(query)
```
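`Indexer.index` consumes the same message dict that `crawl_loop` enqueues. A minimal standalone call for reference; the path is illustrative, and constructing the class requires a reachable Qdrant host named `qdrant` plus the embedding env vars:
```python
import os
import uuid

# Illustrative message matching what crawl_loop produces
path = "/usr/src/app/local_files/example.md"
indexer = Indexer()
indexer.index({
    "path": path,
    "file_id": str(uuid.uuid4()),
    "last_updated_seconds": round(os.path.getmtime(path)),
    "type": "file",
})
```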
--------------------------------------------------------------------------------
/llm/llm_chain.py:
--------------------------------------------------------------------------------
```python
import os
import uuid
import torch
import datetime
import logging
from dataclasses import dataclass
from typing import Sequence, Optional
from langchain.schema import Document
from qdrant_client import QdrantClient
from langchain_ollama import ChatOllama
from minima_embed import MinimaEmbeddings
from langgraph.graph import START, StateGraph
from langchain_qdrant import QdrantVectorStore
from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages
from typing_extensions import Annotated, TypedDict
from langgraph.checkpoint.memory import MemorySaver
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.messages import AIMessage, HumanMessage
from langchain.chains.retrieval import create_retrieval_chain
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CrossEncoderReranker
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_community.cross_encoders.huggingface import HuggingFaceCrossEncoder
from langchain.chains.history_aware_retriever import create_history_aware_retriever
logger = logging.getLogger(__name__)
CONTEXTUALIZE_Q_SYSTEM_PROMPT = (
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question, "
    "just reformulate it if needed and otherwise return it as is."
)
SYSTEM_PROMPT = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)
QUERY_ENHANCEMENT_PROMPT = (
    "You are an expert at converting user questions into queries. "
    "You have access to a user's files. "
    "Perform query expansion. "
    "Just return one expanded query, do not add any other text. "
    "If there are acronyms or words you are not familiar with, do not try to rephrase them. "
    "Do not change the original meaning of the question and do not add any additional information."
)
class ParaphrasedQuery(BaseModel):
    paraphrased_query: str = Field(
        ...,
        description="A unique paraphrasing of the original question.",
    )
@dataclass
class LLMConfig:
    """Configuration settings for the LLM Chain"""
    qdrant_collection: str = "mnm_storage"
    qdrant_host: str = "qdrant"
    ollama_url: str = "http://ollama:11434"
    ollama_model: str = os.environ.get("OLLAMA_MODEL")
    rerank_model: str = os.environ.get("RERANKER_MODEL")
    temperature: float = 0.5
    device: torch.device = torch.device(
        "mps" if torch.backends.mps.is_available() else
        "cuda" if torch.cuda.is_available() else
        "cpu"
    )
@dataclass
class LocalConfig:
    LOCAL_FILES_PATH = os.environ.get("LOCAL_FILES_PATH")
    CONTAINER_PATH = os.environ.get("CONTAINER_PATH")
class State(TypedDict):
    """State definition for the LLM Chain"""
    input: str
    chat_history: Annotated[Sequence[BaseMessage], add_messages]
    context: str
    answer: str
    init_query: str
class LLMChain:
    """A chain for processing LLM queries with context awareness and retrieval capabilities"""
    def __init__(self, config: Optional[LLMConfig] = None):
        """Initialize the LLM Chain with optional custom configuration"""
        self.localConfig = LocalConfig()
        self.config = config or LLMConfig()
        self.llm = self._setup_llm()
        self.document_store = self._setup_document_store()
        self.chain = self._setup_chain()
        self.graph = self._create_graph()
    def _setup_llm(self) -> ChatOllama:
        """Initialize the LLM model"""
        return ChatOllama(
            base_url=self.config.ollama_url,
            model=self.config.ollama_model,
            temperature=self.config.temperature
        )
    def _setup_document_store(self) -> QdrantVectorStore:
        """Initialize the document store with vector embeddings"""
        qdrant = QdrantClient(host=self.config.qdrant_host)
        embed_model = MinimaEmbeddings()
        return QdrantVectorStore(
            client=qdrant,
            collection_name=self.config.qdrant_collection,
            embedding=embed_model
        )
    def _setup_chain(self):
        """Set up the retrieval and QA chain"""
        # Initialize retriever with reranking
        base_retriever = self.document_store.as_retriever()
        reranker = HuggingFaceCrossEncoder(
            model_name=self.config.rerank_model,
            model_kwargs={'device': self.config.device},
        )
        compression_retriever = ContextualCompressionRetriever(
            base_compressor=CrossEncoderReranker(model=reranker, top_n=3),
            base_retriever=base_retriever
        )
        # Create history-aware retriever
        contextualize_prompt = ChatPromptTemplate.from_messages([
            ("system", CONTEXTUALIZE_Q_SYSTEM_PROMPT),
            MessagesPlaceholder("chat_history"),
            ("human", "{input}"),
        ])
        history_aware_retriever = create_history_aware_retriever(
            self.llm, compression_retriever, contextualize_prompt
        )
        # Create QA chain
        qa_prompt = ChatPromptTemplate.from_messages([
            ("system", SYSTEM_PROMPT),
            MessagesPlaceholder("chat_history"),
            ("human", "{input}"),
        ])
        qa_chain = create_stuff_documents_chain(self.llm, qa_prompt)
        retrieval_chain = create_retrieval_chain(history_aware_retriever, qa_chain)
        return retrieval_chain
    def _create_graph(self) -> StateGraph:
        """Create the processing graph"""
        workflow = StateGraph(state_schema=State)
        workflow.add_node("enhance", self._enhance_query)
        workflow.add_node("retrieval", self._call_model)
        workflow.add_edge(START, "enhance")
        workflow.add_edge("enhance", "retrieval")
        return workflow.compile(checkpointer=MemorySaver())
    def _enhance_query(self, state: State) -> State:
        """Enhance the query using the LLM"""
        prompt_enhancement = ChatPromptTemplate.from_messages([
            ("system", QUERY_ENHANCEMENT_PROMPT),
            ("human", "{input}"),
        ])
        query_enhancement = prompt_enhancement | self.llm
        enhanced_query = query_enhancement.invoke({
            "input": state["input"]
        })
        logger.info(f"Enhanced query: {enhanced_query}")
        state["init_query"] = state["input"]
        state["input"] = enhanced_query.content
        return state
    def _call_model(self, state: State) -> dict:
        """Process the query through the model"""
        logger.info(f"Processing query: {state['init_query']}")
        logger.info(f"Enhanced query: {state['input']}")
        response = self.chain.invoke(state)
        logger.info(f"Received response: {response['answer']}")
        return {
            "chat_history": [
                HumanMessage(state["init_query"]),
                AIMessage(response["answer"]),
            ],
            "context": response["context"],
            "answer": response["answer"],
        }
    
    def invoke(self, message: str) -> dict:
        """
        Process a user message and return the response
        
        Args:
            message: The user's input message
            
        Returns:
            dict: Contains the model's response or error information
        """
        try:
            logger.info(f"Processing query: {message}")
            config = {
                "configurable": {
                    "thread_id": uuid.uuid4(),
                    "thread_ts": datetime.datetime.now().isoformat()
                }   
            }
            result = self.graph.invoke(
                {"input": message},
                config=config
            )
            logger.info(f"OUTPUT: {result}")
            links = set()
            for ctx in result["context"]:
                doc: Document = ctx
                path = doc.metadata["file_path"].replace(
                    self.localConfig.CONTAINER_PATH,
                    self.localConfig.LOCAL_FILES_PATH
                )
                links.add(f"file://{path}")
            return {"answer": result["answer"], "links": links}
        except Exception as e:
            logger.error(f"Error processing query", exc_info=True)
            return {"error": str(e), "status": "error"}
```
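A minimal driver for the chain outside the websocket wiring, assuming `OLLAMA_MODEL` and `RERANKER_MODEL` are set and Qdrant/Ollama are reachable under their compose hostnames:
```python
# Sketch: exercising LLMChain directly
chain = LLMChain()
result = chain.invoke("summarize my notes about the onboarding process")
if "error" in result:
    print("failed:", result["error"])
else:
    print(result["answer"])
    for link in result["links"]:
        print("source:", link)
```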
--------------------------------------------------------------------------------
/chat/src/ChatApp.tsx:
--------------------------------------------------------------------------------
```typescript
import React, { useState, useEffect } from 'react';
import {
    Layout,
    Typography,
    List as AntList,
    Input,
    ConfigProvider,
    Switch,
    theme,
    Button,
} from 'antd';
import { ArrowRightOutlined } from '@ant-design/icons';
import {ToastContainer, toast, Bounce} from 'react-toastify';
const { Header, Content, Footer } = Layout;
const { TextArea } = Input;
const { Link: AntLink, Paragraph, Title } = Typography;
const { defaultAlgorithm, darkAlgorithm } = theme;
interface Message {
    type: 'answer' | 'question' | 'full';
    reporter: 'output_message' | 'user';
    message: string;
    links: string[];
}
const ChatApp: React.FC = () => {
    const [ws, setWs] = useState<WebSocket | null>(null);
    const [input, setInput] = useState<string>('');
    const [messages, setMessages] = useState<Message[]>([]);
    const [isDarkMode, setIsDarkMode] = useState(false);
    // Toggle light/dark theme
    const toggleTheme = () => setIsDarkMode((prev) => !prev);
    // WebSocket Setup
    useEffect(() => {
        const webSocket = new WebSocket('ws://localhost:8003/llm/');
        webSocket.onmessage = (event) => {
            const message_curr: Message = JSON.parse(event.data);
            if (message_curr.reporter === 'output_message') {
                setMessages((messages_prev) => {
                    if (messages_prev.length === 0) return [message_curr];
                    const last = messages_prev[messages_prev.length - 1];
                    // If last message is question or 'full', append new
                    if (last.type === 'question' || last.type === 'full') {
                        return [...messages_prev, message_curr];
                    }
                    // If incoming message is 'full', replace last message
                    if (message_curr.type === 'full') {
                        return [...messages_prev.slice(0, -1), message_curr];
                    }
                    // Otherwise, merge partial message
                    return [
                        ...messages_prev.slice(0, -1),
                        {
                            ...last,
                            message: last.message + message_curr.message,
                        },
                    ];
                });
            }
        };
        setWs(webSocket);
        return () => {
            webSocket.close();
        };
    }, []);
    // Send message
    const sendMessage = (): void => {
        try {
            if (ws && input.trim()) {
                ws.send(input);
                setMessages((prev) => [
                    ...prev,
                    {
                        type: 'question',
                        reporter: 'user',
                        message: input,
                        links: [],
                    },
                ]);
                setInput('');
            }
        } catch (e) {
            console.error(e);
        }
    };
    async function handleLinkClick(link: string) {
        await navigator.clipboard.writeText(link);
        toast('Link copied!', {
            position: "top-right",
            autoClose: 1000,
            hideProgressBar: true,
            closeOnClick: true,
            pauseOnHover: true,
            draggable: false,
            progress: undefined,
            theme: "light",
            transition: Bounce,
        });
    }
    return (
        <ConfigProvider
            theme={{
                algorithm: isDarkMode ? darkAlgorithm : defaultAlgorithm,
                token: {
                    borderRadius: 2,
                },
            }}
        >
            <Layout
                style={{
                    width: '100%',
                    height: '100vh',
                    margin: '0 auto',
                    display: 'flex',
                    flexDirection: 'column',
                    overflow: 'hidden',
                }}
            >
                {/* Header with Theme Toggle */}
                <Header
                    style={{
                        backgroundImage: isDarkMode
                            ? 'linear-gradient(45deg, #10161A, #394B59)' // Dark gradient
                            : 'linear-gradient(45deg, #2f3f48, #586770)', // Light gradient
                        borderBottomLeftRadius: 2,
                        borderBottomRightRadius: 2,
                        display: 'flex',
                        alignItems: 'center',
                        justifyContent: 'space-between',
                        padding: '0 16px',
                    }}
                >
                    <Title level={4} style={{ margin: 0, color: 'white' }}>
                        Minima
                    </Title>
                    <Switch
                        checked={isDarkMode}
                        onChange={toggleTheme}
                        checkedChildren="Dark"
                        unCheckedChildren="Light"
                    />
                </Header>
                {/* Messages */}
                <Content style={{ padding: '16px', display: 'flex', flexDirection: 'column' }}>
                    <AntList
                        style={{
                            flexGrow: 1,
                            marginBottom: 16,
                            border: '1px solid #ccc',
                            borderRadius: 4,
                            overflowY: 'auto',
                            padding: '16px',
                        }}
                    >
                        {messages.map((msg, index) => {
                            const isUser = msg.reporter === 'user';
                            return (
                                <AntList.Item
                                    key={index}
                                    style={{
                                        display: 'flex',
                                        flexDirection: 'column',
                                        alignItems: isUser ? 'flex-end' : 'flex-start',
                                        border: 'none',
                                    }}
                                >
                                    <div
                                        style={{
                                            maxWidth: '60%',
                                            borderRadius: 16,
                                            padding: '8px 16px',
                                            wordBreak: 'break-word',
                                            textAlign: isUser ? 'right' : 'left',
                                            backgroundImage: isUser
                                                ? 'linear-gradient(120deg, #1a62aa, #007bff)'
                                                : 'linear-gradient(120deg, #abcbe8, #7bade0)',
                                            color: isUser ? 'white' : 'black',
                                        }}
                                    >
                                        <Paragraph
                                            style={{
                                                margin: 0,
                                                color: 'inherit',
                                                fontSize: '1rem',
                                                fontWeight: 500,
                                                lineHeight: '1.4',
                                            }}
                                        >
                                            {msg.message}
                                        </Paragraph>
                                        {/* Links, if any */}
                                        {msg.links?.length > 0 && (
                                            <div style={{ marginTop: 4 }}>
                                                {msg.links.map((link, linkIndex) => (
                                                    <React.Fragment key={linkIndex}>
                                                        <br />
                                                        <AntLink
                                                            onClick={async () => {
                                                                await handleLinkClick(link)
                                                            }}
                                                            href={link}
                                                            target="_blank"
                                                            rel="noopener noreferrer"
                                                            style={{
                                                                color: 'inherit',
                                                                textDecoration: 'underline',
                                                            }}
                                                        >
                                                            {link}
                                                        </AntLink>
                                                    </React.Fragment>
                                                ))}
                                            </div>
                                        )}
                                    </div>
                                </AntList.Item>
                            );
                        })}
                    </AntList>
                </Content>
                {/* Footer with TextArea & Circular Arrow Button */}
                <Footer style={{ padding: '16px' }}>
                    <div style={{ position: 'relative', width: '100%' }}>
                        <TextArea
                            placeholder="Type your message here..."
                            rows={5}
                            value={input}
                            onChange={(e) => setInput(e.target.value)}
                            onPressEnter={(e) => {
                                // Allow SHIFT+ENTER for multiline
                                if (!e.shiftKey) {
                                    e.preventDefault();
                                    sendMessage();
                                }
                            }}
                            style={{
                                width: '100%',
                                border: '1px solid #ccc',
                                borderRadius: 4,
                                resize: 'none',
                                paddingRight: 60, // Extra space so text won't overlap the button
                            }}
                        />
                        <Button
                            shape="circle"
                            icon={<ArrowRightOutlined />}
                            onClick={sendMessage}
                            style={{
                                position: 'absolute',
                                bottom: 8,
                                right: 8,
                                width: 40,
                                height: 40,
                                minWidth: 40,
                                borderRadius: '50%',
                                fontWeight: 'bold',
                                display: 'flex',
                                alignItems: 'center',
                                justifyContent: 'center',
                            }}
                        />
                    </div>
                </Footer>
            </Layout>
        </ConfigProvider>
    );
};
export default ChatApp;
```
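The component above assumes a token-streaming protocol on `ws://localhost:8003/llm/`: partial `answer` chunks from reporter `output_message` are appended to the latest bubble until a `full` message replaces it with the final text and links. To exercise the UI without the full stack, a throwaway stub like the one below (not part of this repository; it assumes Node 18+ with the `ws` package installed) can emulate that protocol:
```
// test-server.ts — hypothetical stand-in for the llm service, for UI testing only.
// Assumes Node 18+ and `npm install ws` (plus `@types/ws` for type checking).
import { WebSocketServer } from 'ws';

const wss = new WebSocketServer({ port: 8003, path: '/llm/' });

wss.on('connection', (socket) => {
    socket.on('message', (data) => {
        const question = data.toString();
        const reply = `You asked: "${question}"`;
        const words = reply.split(' ');
        // Stream the reply word by word as partial 'answer' chunks,
        // which the UI concatenates into a single bubble.
        words.forEach((word, i) => {
            setTimeout(() => {
                socket.send(JSON.stringify({
                    type: 'answer',
                    reporter: 'output_message',
                    message: (i === 0 ? '' : ' ') + word,
                    links: [],
                }));
            }, 100 * i);
        });
        // Finish with a 'full' message, which replaces the streamed text;
        // the link payload here is a made-up example.
        setTimeout(() => {
            socket.send(JSON.stringify({
                type: 'full',
                reporter: 'output_message',
                message: reply,
                links: ['file:///tmp/example.md'],
            }));
        }, 100 * words.length + 200);
    });
});
```
Running this stub next to the chat dev server should show each reply streaming in word by word before being swapped for the final `full` message.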