# Directory Structure ``` ├── .env.sample ├── .gitignore ├── .gitmodules ├── assets │ ├── logo-full-b.svg │ ├── logo-full-w.svg │ └── logo-full.svg ├── chat │ ├── .gitignore │ ├── Dockerfile │ ├── package-lock.json │ ├── package.json │ ├── public │ │ ├── favicon.ico │ │ ├── index.html │ │ ├── logo192.png │ │ ├── logo512.png │ │ ├── manifest.json │ │ └── robots.txt │ ├── README.md │ └── src │ ├── App.css │ ├── App.js │ ├── App.test.js │ ├── ChatApp.tsx │ ├── index.css │ ├── index.js │ ├── logo.svg │ ├── reportWebVitals.js │ └── setupTests.js ├── docker-compose-chatgpt.yml ├── docker-compose-mcp.yml ├── docker-compose-ollama.yml ├── electron │ ├── assets │ │ ├── css │ │ │ ├── no-topbar.css │ │ │ └── style.css │ │ ├── icons │ │ │ ├── mac │ │ │ │ └── favicon.icns │ │ │ ├── png │ │ │ │ └── favicon.png │ │ │ └── win │ │ │ └── favicon.ico │ │ └── js │ │ └── renderer.js │ ├── index.html │ ├── main.js │ ├── package-lock.json │ ├── package.json │ ├── preload.js │ ├── README.md │ └── src │ ├── menu.js │ ├── print.js │ ├── view.js │ └── window.js ├── indexer │ ├── app.py │ ├── async_loop.py │ ├── async_queue.py │ ├── Dockerfile │ ├── indexer.py │ ├── requirements.txt │ ├── singleton.py │ └── storage.py ├── LICENSE ├── linker │ ├── app.py │ ├── Dockerfile │ ├── requestor.py │ └── requirements.txt ├── llm │ ├── app.py │ ├── async_answer_to_socket.py │ ├── async_question_to_answer.py │ ├── async_queue.py │ ├── async_socket_to_chat.py │ ├── control_flow_commands.py │ ├── Dockerfile │ ├── llm_chain.py │ ├── minima_embed.py │ └── requirements.txt ├── mcp-server │ ├── pyproject.toml │ ├── README.md │ ├── src │ │ └── minima │ │ ├── __init__.py │ │ ├── requestor.py │ │ └── server.py │ └── uv.lock ├── README.md ├── run_in_copilot.sh └── run.sh ``` # Files -------------------------------------------------------------------------------- /.env.sample: -------------------------------------------------------------------------------- ``` 1 | LOCAL_FILES_PATH 2 | EMBEDDING_MODEL_ID 3 | 
EMBEDDING_SIZE 4 | OLLAMA_MODEL 5 | RERANKER_MODEL 6 | USER_ID 7 | PASSWORD ``` -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- ``` 1 | [submodule "minima-ui"] 2 | path = minima-ui 3 | url = git@github.com:pshenok/minima-ui.git 4 | [submodule "aws"] 5 | path = aws 6 | url = https://github.com/pshenok/minima-aws.git 7 | ``` -------------------------------------------------------------------------------- /chat/.gitignore: -------------------------------------------------------------------------------- ``` 1 | # See https://help.github.com/articles/ignoring-files/ for more about ignoring files. 2 | 3 | # dependencies 4 | /node_modules 5 | /.pnp 6 | .pnp.js 7 | 8 | # testing 9 | /coverage 10 | 11 | # production 12 | /build 13 | 14 | # misc 15 | .DS_Store 16 | .env.local 17 | .env.development.local 18 | .env.test.local 19 | .env.production.local 20 | 21 | npm-debug.log* 22 | yarn-debug.log* 23 | yarn-error.log* 24 | ``` -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- ``` 1 | __pycache__/ 2 | *.py[cod] 3 | *$py.class 4 | 5 | *.log 6 | 7 | .cache 8 | *.cover 9 | 10 | .python-version 11 | .env 12 | *.env 13 | .venv 14 | 15 | .vscode/ 16 | 17 | local_files/ 18 | qdrant_data/ 19 | indexer_data/ 20 | ollama/ 21 | 22 | *.db 23 | .envrc 24 | .DS_Store 25 | 26 | logs 27 | *.log 28 | npm-debug.log* 29 | yarn-debug.log* 30 | yarn-error.log* 31 | firebase-debug.log* 32 | firebase-debug.*.log* 33 | 34 | .firebase/ 35 | 36 | pids 37 | *.pid 38 | *.seed 39 | *.pid.lock 40 | 41 | lib-cov 42 | coverage 43 | .nyc_output 44 | .grunt 45 | bower_components 46 | .lock-wscript 47 | build/Release 48 | node_modules/ 49 | .npm 50 | .eslintcache 51 | .node_repl_history 52 | *.tgz 53 | .yarn-integrity 54 | .dataconnect 55 | 56 | node_modules/ 
57 | *.local ``` -------------------------------------------------------------------------------- /electron/README.md: -------------------------------------------------------------------------------- ```markdown 1 | ## Installation 2 | 3 | ```bash 4 | npm install 5 | ``` 6 | 7 | ## Run 8 | 9 | ```bash 10 | npm start 11 | ``` 12 | 13 | ## Build 14 | 15 | Binary files for Windows, Linux and Mac are available in the `release-builds/` folder. 16 | 17 | ### For Windows 18 | 19 | ```bash 20 | npm run package-win 21 | ``` 22 | 23 | ### For Linux 24 | 25 | ```bash 26 | npm run package-linux 27 | ``` 28 | 29 | ### For Mac 30 | 31 | ```bash 32 | npm run package-mac 33 | ``` ``` -------------------------------------------------------------------------------- /mcp-server/README.md: -------------------------------------------------------------------------------- ```markdown 1 | # minima MCP server 2 | 3 | RAG on local files with MCP 4 | 5 | Please go through all the steps in the main README first. 6 | 7 | Then add the following to **~/Library/Application\ Support/Claude/claude_desktop_config.json** 8 | 9 | ``` 10 | { 11 | "mcpServers": { 12 | "minima": { 13 | "command": "uv", 14 | "args": [ 15 | "--directory", 16 | "/path_to_cloned_minima_project/mcp-server", 17 | "run", 18 | "minima" 19 | ] 20 | } 21 | } 22 | } 23 | ``` 24 | Afterwards, open the Claude app and ask it to find context in your local files. ``` -------------------------------------------------------------------------------- /chat/README.md: -------------------------------------------------------------------------------- ```markdown 1 | # Getting Started with Create React App 2 | 3 | This project was bootstrapped with [Create React App](https://github.com/facebook/create-react-app). 4 | 5 | ## Available Scripts 6 | 7 | In the project directory, you can run: 8 | 9 | ### `npm start` 10 | 11 | Runs the app in the development mode.\ 12 | Open [http://localhost:3000](http://localhost:3000) to view it in your browser. 
13 | 14 | The page will reload when you make changes.\ 15 | You may also see any lint errors in the console. 16 | 17 | ### `npm test` 18 | 19 | Launches the test runner in the interactive watch mode.\ 20 | See the section about [running tests](https://facebook.github.io/create-react-app/docs/running-tests) for more information. 21 | 22 | ### `npm run build` 23 | 24 | Builds the app for production to the `build` folder.\ 25 | It correctly bundles React in production mode and optimizes the build for the best performance. 26 | 27 | The build is minified and the filenames include the hashes.\ 28 | Your app is ready to be deployed! 29 | 30 | See the section about [deployment](https://facebook.github.io/create-react-app/docs/deployment) for more information. 31 | 32 | ### `npm run eject` 33 | 34 | **Note: this is a one-way operation. Once you `eject`, you can't go back!** 35 | 36 | If you aren't satisfied with the build tool and configuration choices, you can `eject` at any time. This command will remove the single build dependency from your project. 37 | 38 | Instead, it will copy all the configuration files and the transitive dependencies (webpack, Babel, ESLint, etc) right into your project so you have full control over them. All of the commands except `eject` will still work, but they will point to the copied scripts so you can tweak them. At this point you're on your own. 39 | 40 | You don't have to ever use `eject`. The curated feature set is suitable for small and middle deployments, and you shouldn't feel obligated to use this feature. However we understand that this tool wouldn't be useful if you couldn't customize it when you are ready for it. 41 | 42 | ## Learn More 43 | 44 | You can learn more in the [Create React App documentation](https://facebook.github.io/create-react-app/docs/getting-started). 45 | 46 | To learn React, check out the [React documentation](https://reactjs.org/). 
47 | 48 | ### Code Splitting 49 | 50 | This section has moved here: [https://facebook.github.io/create-react-app/docs/code-splitting](https://facebook.github.io/create-react-app/docs/code-splitting) 51 | 52 | ### Analyzing the Bundle Size 53 | 54 | This section has moved here: [https://facebook.github.io/create-react-app/docs/analyzing-the-bundle-size](https://facebook.github.io/create-react-app/docs/analyzing-the-bundle-size) 55 | 56 | ### Making a Progressive Web App 57 | 58 | This section has moved here: [https://facebook.github.io/create-react-app/docs/making-a-progressive-web-app](https://facebook.github.io/create-react-app/docs/making-a-progressive-web-app) 59 | 60 | ### Advanced Configuration 61 | 62 | This section has moved here: [https://facebook.github.io/create-react-app/docs/advanced-configuration](https://facebook.github.io/create-react-app/docs/advanced-configuration) 63 | 64 | ### Deployment 65 | 66 | This section has moved here: [https://facebook.github.io/create-react-app/docs/deployment](https://facebook.github.io/create-react-app/docs/deployment) 67 | 68 | ### `npm run build` fails to minify 69 | 70 | This section has moved here: [https://facebook.github.io/create-react-app/docs/troubleshooting#npm-run-build-fails-to-minify](https://facebook.github.io/create-react-app/docs/troubleshooting#npm-run-build-fails-to-minify) 71 | ``` -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- ```markdown 1 | <p align="center"> 2 | <a href="https://mnma.ai/" target="blank"><img src="assets/logo-full.svg" width="300" alt="MNMA Logo" /></a> 3 | </p> 4 | 5 | **Minima** is an open-source, on-premises, containerized RAG system with the ability to integrate with ChatGPT and MCP. 6 | Minima can also be used as a fully local RAG. 7 | 8 | Minima currently supports three modes: 9 | 1. 
Isolated installation – Operate fully on-premises with containers, free from external dependencies such as ChatGPT or Claude. All neural networks (LLM, reranker, embedding) run on your cloud or PC, ensuring your data remains secure. 10 | 11 | 2. Custom GPT – Query your local documents using the ChatGPT app or web with custom GPTs. The indexer runs on your cloud or local PC, while the primary LLM remains ChatGPT. 12 | 13 | 3. Anthropic Claude – Use the Anthropic Claude app to query your local documents. The indexer operates on your local PC, while Anthropic Claude serves as the primary LLM. 14 | 15 | --- 16 | 17 | ## Running as Containers 18 | 19 | 1. Create a .env file in the project’s root directory (next to env.sample) and copy all environment variables from env.sample into it. 20 | 21 | 2. Ensure your .env file includes the following variables: 22 | <ul> 23 | <li> LOCAL_FILES_PATH </li> 24 | <li> EMBEDDING_MODEL_ID </li> 25 | <li> EMBEDDING_SIZE </li> 26 | <li> OLLAMA_MODEL </li> 27 | <li> RERANKER_MODEL </li> 28 | <li> USER_ID – required for ChatGPT integration; just use your email </li> 29 | <li> PASSWORD – required for ChatGPT integration; just use any password </li> 30 | </ul> 31 | 32 | 3. For a fully local installation, use: **docker compose -f docker-compose-ollama.yml --env-file .env up --build**. 33 | 34 | 4. For a ChatGPT-enabled installation, use: **docker compose -f docker-compose-chatgpt.yml --env-file .env up --build**. 35 | 36 | 5. For MCP integration (Anthropic Desktop app usage), use: **docker compose -f docker-compose-mcp.yml --env-file .env up --build**. 37 | 38 | 6. For a ChatGPT-enabled installation, copy the OTP from the terminal where you launched Docker and use [Minima GPT](https://chatgpt.com/g/g-r1MNTSb0Q-minima-local-computer-search) 39 | 40 | 7. 
If you use Anthropic Claude, add the following to **~/Library/Application\ Support/Claude/claude_desktop_config.json** 41 | 42 | ``` 43 | { 44 | "mcpServers": { 45 | "minima": { 46 | "command": "uv", 47 | "args": [ 48 | "--directory", 49 | "/path_to_cloned_minima_project/mcp-server", 50 | "run", 51 | "minima" 52 | ] 53 | } 54 | } 55 | } 56 | ``` 57 | 58 | 8. For the fully local installation, `cd electron`, then run `npm install` and `npm start` to launch the Minima Electron app. 59 | 60 | 9. Ask anything, and you'll get answers based on the local files in your {LOCAL_FILES_PATH} folder. 61 | --- 62 | 63 | ## Variables Explained 64 | 65 | **LOCAL_FILES_PATH**: Specify the root folder for indexing (on your cloud or local PC). Indexing is a recursive process, meaning all documents within subfolders of this root folder will also be indexed. Supported file types: .pdf, .xls, .docx, .txt, .md, .csv. 66 | 67 | **EMBEDDING_MODEL_ID**: Specify the embedding model to use. Currently, only Sentence Transformer models are supported. Testing has been done with sentence-transformers/all-mpnet-base-v2, but other Sentence Transformer models can be used. 68 | 69 | **EMBEDDING_SIZE**: Define the embedding dimension provided by the model, which is needed to configure Qdrant vector storage. Ensure this value matches the actual embedding size of the specified EMBEDDING_MODEL_ID. 70 | 71 | **OLLAMA_MODEL**: Set the Ollama model using an ID available on the Ollama [site](https://ollama.com/search). Please use an LLM model here, not an embedding model. 72 | 73 | **RERANKER_MODEL**: Specify the reranker model. Currently, we have tested with BAAI rerankers. You can explore all available rerankers using this [link](https://huggingface.co/collections/BAAI/). 74 | 75 | **USER_ID**: Just use your email here; it is needed to authenticate the custom GPT to search your data. 76 | 77 | **PASSWORD**: Put any password here; it is used to create a Firebase account for the email specified above. 
78 | 79 | --- 80 | 81 | ## Examples 82 | 83 | **Example of .env file for on-premises/local usage:** 84 | ``` 85 | LOCAL_FILES_PATH=/Users/davidmayboroda/Downloads/PDFs/ 86 | EMBEDDING_MODEL_ID=sentence-transformers/all-mpnet-base-v2 87 | EMBEDDING_SIZE=768 88 | OLLAMA_MODEL=qwen2:0.5b # must be an LLM model id from the Ollama models page 89 | RERANKER_MODEL=BAAI/bge-reranker-base # choose any BAAI reranker model 90 | ``` 91 | 92 | To use the chat UI, navigate to **http://localhost:3000** 93 | 94 | **Example of .env file for Claude app:** 95 | ``` 96 | LOCAL_FILES_PATH=/Users/davidmayboroda/Downloads/PDFs/ 97 | EMBEDDING_MODEL_ID=sentence-transformers/all-mpnet-base-v2 98 | EMBEDDING_SIZE=768 99 | ``` 100 | For the Claude app, please apply the changes to the claude_desktop_config.json file as outlined above. 101 | 102 | **To use MCP with GitHub Copilot:** 103 | 1. Create a .env file in the project’s root directory (next to env.sample) and copy all environment variables from env.sample into it. 104 | 105 | 2. Ensure your .env file includes the following variables: 106 | - LOCAL_FILES_PATH 107 | - EMBEDDING_MODEL_ID 108 | - EMBEDDING_SIZE 109 | 110 | 3. Create or update `.vscode/mcp.json` with the following configuration: 111 | 112 | ````json 113 | { 114 | "servers": { 115 | "minima": { 116 | "type": "stdio", 117 | "command": "path_to_cloned_minima_project/run_in_copilot.sh", 118 | "args": [ 119 | "path_to_cloned_minima_project" 120 | ] 121 | } 122 | } 123 | } 124 | ```` 125 | 126 | **Example of .env file for ChatGPT custom GPT usage:** 127 | ``` 128 | LOCAL_FILES_PATH=/Users/davidmayboroda/Downloads/PDFs/ 129 | EMBEDDING_MODEL_ID=sentence-transformers/all-mpnet-base-v2 130 | EMBEDDING_SIZE=768 131 | USER_ID=[email protected] # your real email 132 | PASSWORD=password # any password you want 133 | ``` 134 | 135 | Alternatively, you can run Minima using **run.sh**. 
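If you want to sanity-check the index without the chat UI or a Claude/ChatGPT client, you can also query the indexer's HTTP endpoint directly. The sketch below mirrors the request that `mcp-server/src/minima/requestor.py` sends (`POST http://localhost:8001/query` with a JSON body `{"query": ...}`); the port is the assumed default from that file and the response shape is not specified here, so treat this as a sketch rather than a stable API.

```python
import json
import urllib.request

# Endpoint mirrored from mcp-server/src/minima/requestor.py (assumed default port).
INDEXER_URL = "http://localhost:8001/query"

def build_query_request(query: str) -> urllib.request.Request:
    """Build the POST request the indexer expects: a JSON body {"query": ...}."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        INDEXER_URL,
        data=body,
        headers={"Accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    req = build_query_request("What do my local PDFs say about invoices?")
    # Requires the indexer container to be running (docker compose ... up):
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode())
    print(req.full_url)
```

The same request can be issued with `curl` or any HTTP client; only the JSON body and headers matter.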
136 | 137 | --- 138 | 139 | ## Installing via Smithery (MCP usage) 140 | 141 | To install Minima for Claude Desktop automatically via [Smithery](https://smithery.ai/protocol/minima): 142 | 143 | ```bash 144 | npx -y @smithery/cli install minima --client claude 145 | ``` 146 | 147 | **For MCP usage, please make sure that your local machine's Python is >=3.10 and 'uv' is installed.** 148 | 149 | Minima (https://github.com/dmayboroda/minima) is licensed under the Mozilla Public License v2.0 (MPLv2). 150 | ``` -------------------------------------------------------------------------------- /chat/public/robots.txt: -------------------------------------------------------------------------------- ``` 1 | # https://www.robotstxt.org/robotstxt.html 2 | User-agent: * 3 | Disallow: 4 | ``` -------------------------------------------------------------------------------- /electron/assets/css/style.css: -------------------------------------------------------------------------------- ```css 1 | /* Main */ 2 | 3 | body { 4 | margin: 0; 5 | padding: 0; 6 | -webkit-user-select: none; 7 | -webkit-app-region: drag; 8 | } 9 | ``` -------------------------------------------------------------------------------- /linker/requirements.txt: -------------------------------------------------------------------------------- ``` 1 | httpx 2 | google-cloud-firestore 3 | firebase_admin 4 | asyncio==3.4.3 5 | fastapi==0.111.0 6 | requests==2.32.3 7 | uvicorn[standard] ``` -------------------------------------------------------------------------------- /electron/assets/css/no-topbar.css: -------------------------------------------------------------------------------- ```css 1 | /* No topbar */ 2 | 3 | #webview { 4 | position: absolute; 5 | top: 0; 6 | left: 0; 7 | width: 100%; 8 | height: 100%; 9 | display: inline-flex !important; 10 | } 11 | ``` -------------------------------------------------------------------------------- /chat/Dockerfile: 
-------------------------------------------------------------------------------- ```dockerfile 1 | FROM node:20-alpine as build 2 | 3 | WORKDIR /app 4 | 5 | COPY package*.json ./ 6 | 7 | RUN npm install 8 | 9 | COPY . . 10 | 11 | RUN npm run build 12 | 13 | EXPOSE 3000 14 | 15 | CMD ["npm", "run", "start"] ``` -------------------------------------------------------------------------------- /electron/preload.js: -------------------------------------------------------------------------------- ```javascript 1 | const { contextBridge, ipcRenderer } = require("electron"); 2 | 3 | contextBridge.exposeInMainWorld("electron", { 4 | print: (arg) => ipcRenderer.invoke("print", arg), 5 | }); 6 | ``` -------------------------------------------------------------------------------- /llm/control_flow_commands.py: -------------------------------------------------------------------------------- ```python 1 | PREFIX = 'CONTROL FLOW COMMAND:' 2 | 3 | CFC_CLIENT_DISCONNECTED = PREFIX + 'CLIENT DISCONNECTED' 4 | CFC_CHAT_STARTED = PREFIX + 'CHAT STARTED' 5 | CFC_CHAT_STOPPED = PREFIX + 'CHAT STOPPED' ``` -------------------------------------------------------------------------------- /mcp-server/src/minima/__init__.py: -------------------------------------------------------------------------------- ```python 1 | from . 
import server 2 | import asyncio 3 | 4 | def main(): 5 | """Main entry point for the package.""" 6 | asyncio.run(server.main()) 7 | 8 | # Optionally expose other important items at package level 9 | __all__ = ['main', 'server'] ``` -------------------------------------------------------------------------------- /indexer/singleton.py: -------------------------------------------------------------------------------- ```python 1 | class Singleton(type): 2 | _instances = {} 3 | 4 | def __call__(cls, *args, **kwargs): 5 | if cls not in cls._instances: 6 | cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs) 7 | return cls._instances[cls] 8 | ``` -------------------------------------------------------------------------------- /chat/src/setupTests.js: -------------------------------------------------------------------------------- ```javascript 1 | // jest-dom adds custom jest matchers for asserting on DOM nodes. 2 | // allows you to do things like: 3 | // expect(element).toHaveTextContent(/react/i) 4 | // learn more: https://github.com/testing-library/jest-dom 5 | import '@testing-library/jest-dom'; 6 | ``` -------------------------------------------------------------------------------- /chat/src/App.test.js: -------------------------------------------------------------------------------- ```javascript 1 | import { render, screen } from '@testing-library/react'; 2 | import App from './App'; 3 | 4 | test('renders learn react link', () => { 5 | render(<App />); 6 | const linkElement = screen.getByText(/learn react/i); 7 | expect(linkElement).toBeInTheDocument(); 8 | }); 9 | ``` -------------------------------------------------------------------------------- /llm/requirements.txt: -------------------------------------------------------------------------------- ``` 1 | requests 2 | ollama 3 | langgraph 4 | langchain 5 | langchain-core 6 | langchain_qdrant 7 | langchain-ollama 8 | langchain_community==0.2.17 9 | langchain-huggingface 10 | 
sentence-transformers==2.6.0 11 | transformers 12 | asyncio==3.4.3 13 | fastapi==0.111.0 14 | qdrant-client 15 | uvicorn[standard] 16 | python-dotenv 17 | pydantic ``` -------------------------------------------------------------------------------- /run_in_copilot.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | # run_in_copilot.sh 3 | 4 | WORKDIR="${1:-$(pwd)}" # Default to current directory if no argument is provided 5 | 6 | echo "[run_in_copilot] Working directory to be used: $WORKDIR" 7 | 8 | docker compose -f "$WORKDIR/docker-compose-mcp.yml" up -d 9 | 10 | uv --directory "$WORKDIR/mcp-server" run minima 11 | ``` -------------------------------------------------------------------------------- /electron/src/view.js: -------------------------------------------------------------------------------- ```javascript 1 | const electron = require("electron"); 2 | const { BrowserView } = electron; 3 | 4 | exports.createBrowserView = (mainWindow) => { 5 | const view = new BrowserView(); 6 | mainWindow.setBrowserView(view); 7 | view.setBounds({ x: 0, y: 0, width: 1024, height: 768 }); 8 | view.webContents.loadURL("http://localhost:3000/"); 9 | }; 10 | ``` -------------------------------------------------------------------------------- /electron/src/print.js: -------------------------------------------------------------------------------- ```javascript 1 | const { ipcMain, BrowserWindow } = require("electron"); 2 | 3 | ipcMain.handle("print", async (event, arg) => { 4 | let printWindow = new BrowserWindow({ "auto-hide-menu-bar": true }); 5 | printWindow.loadURL(arg); 6 | 7 | printWindow.webContents.on("did-finish-load", () => { 8 | printWindow.webContents.print(); 9 | }); 10 | }); 11 | ``` -------------------------------------------------------------------------------- /linker/Dockerfile: -------------------------------------------------------------------------------- ```dockerfile 1 | FROM 
python:3.11-slim-buster 2 | 3 | WORKDIR /usr/src/app 4 | RUN pip install --upgrade pip 5 | COPY requirements.txt . 6 | RUN pip install --no-cache-dir -r requirements.txt 7 | COPY . . 8 | 9 | ENV PORT 8000 10 | ENV CURRENT_HOST 0.0.0.0 11 | ENV WORKERS 1 12 | 13 | CMD ["sh", "-c", "uvicorn app:app --loop asyncio --reload --workers ${WORKERS} --host $CURRENT_HOST --port $PORT --proxy-headers"] ``` -------------------------------------------------------------------------------- /chat/src/reportWebVitals.js: -------------------------------------------------------------------------------- ```javascript 1 | const reportWebVitals = onPerfEntry => { 2 | if (onPerfEntry && onPerfEntry instanceof Function) { 3 | import('web-vitals').then(({ getCLS, getFID, getFCP, getLCP, getTTFB }) => { 4 | getCLS(onPerfEntry); 5 | getFID(onPerfEntry); 6 | getFCP(onPerfEntry); 7 | getLCP(onPerfEntry); 8 | getTTFB(onPerfEntry); 9 | }); 10 | } 11 | }; 12 | 13 | export default reportWebVitals; 14 | ``` -------------------------------------------------------------------------------- /indexer/requirements.txt: -------------------------------------------------------------------------------- ``` 1 | langchain 2 | langchain-core 3 | langfuse 4 | langchain_qdrant 5 | langchain_community 6 | langchain-huggingface 7 | sentence-transformers==2.6.0 8 | transformers 9 | asyncio==3.4.3 10 | fastapi==0.111.0 11 | qdrant-client 12 | uvicorn[standard] 13 | unstructured[xlsx] 14 | unstructured[pptx] 15 | python-magic 16 | python-dotenv 17 | openpyxl 18 | docx2txt 19 | pymupdf==1.25.1 20 | pydantic 21 | fastapi-utilities 22 | sqlmodel 23 | nltk 24 | unstructured 25 | python-pptx ``` -------------------------------------------------------------------------------- /chat/src/index.css: -------------------------------------------------------------------------------- ```css 1 | body { 2 | margin: 0; 3 | font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Roboto', 'Oxygen', 4 | 'Ubuntu', 'Cantarell', 'Fira 
Sans', 'Droid Sans', 'Helvetica Neue', 5 | sans-serif; 6 | -webkit-font-smoothing: antialiased; 7 | -moz-osx-font-smoothing: grayscale; 8 | } 9 | 10 | code { 11 | font-family: source-code-pro, Menlo, Monaco, Consolas, 'Courier New', 12 | monospace; 13 | } 14 | ``` -------------------------------------------------------------------------------- /mcp-server/pyproject.toml: -------------------------------------------------------------------------------- ```toml 1 | [project] 2 | name = "minima" 3 | version = "0.0.1" 4 | description = "RAG on local files with MCP" 5 | readme = "README.md" 6 | requires-python = ">=3.10" 7 | dependencies = [ 8 | "httpx>=0.28.0", 9 | "mcp>=1.0.0", 10 | "pydantic>=2.10.2", 11 | ] 12 | [[project.authors]] 13 | name = "David Mayboroda" 14 | email = "[email protected]" 15 | 16 | [build-system] 17 | requires = [ "hatchling",] 18 | build-backend = "hatchling.build" 19 | 20 | [project.scripts] 21 | minima = "minima:main" 22 | ``` -------------------------------------------------------------------------------- /electron/index.html: -------------------------------------------------------------------------------- ```html 1 | <html> 2 | <head> 3 | <link rel="stylesheet" href="assets/css/style.css" /> 4 | <link rel="stylesheet" href="assets/css/no-topbar.css"> 5 | <script defer src="assets/js/renderer.js"></script> 6 | <meta 7 | http-equiv="Content-Security-Policy" 8 | content="default-src 'self'; style-src 'self' 'unsafe-inline';" 9 | /> 10 | </head> 11 | <body> 12 | <webview 13 | id="webview" 14 | autosize="on" 15 | src="http://localhost:3000/" 16 | ></webview> 17 | </body> 18 | </html> 19 | ``` -------------------------------------------------------------------------------- /llm/Dockerfile: -------------------------------------------------------------------------------- ```dockerfile 1 | FROM python:3.11-slim-buster 2 | 3 | WORKDIR /usr/src/app 4 | 5 | ARG RERANKER_MODEL 6 | 7 | RUN pip install --upgrade pip 8 | COPY requirements.txt . 
9 | RUN pip install huggingface_hub 10 | RUN huggingface-cli download $RERANKER_MODEL --repo-type model 11 | RUN pip install --no-cache-dir -r requirements.txt 12 | COPY . . 13 | 14 | ENV PORT 8000 15 | ENV CURRENT_HOST 0.0.0.0 16 | ENV WORKERS 1 17 | 18 | CMD ["sh", "-c", "uvicorn app:app --loop asyncio --reload --workers ${WORKERS} --host $CURRENT_HOST --port $PORT --proxy-headers"] ``` -------------------------------------------------------------------------------- /chat/src/App.js: -------------------------------------------------------------------------------- ```javascript 1 | import React from 'react'; 2 | import './App.css'; 3 | import ChatApp from './ChatApp.tsx'; 4 | import { ToastContainer, toast } from 'react-toastify'; 5 | 6 | function App() { 7 | return ( 8 | <div className="App" style={{ display: 'flex', flexDirection: 'column', height: '100%' }}> 9 | <header className="App-header" style={{height: "100%"}}> 10 | <ChatApp /> 11 | <ToastContainer /> 12 | </header> 13 | </div> 14 | ); 15 | } 16 | 17 | export default App; 18 | ``` -------------------------------------------------------------------------------- /electron/main.js: -------------------------------------------------------------------------------- ```javascript 1 | const { app } = require("electron"); 2 | 3 | app.allowRendererProcessReuse = true; 4 | app.on("ready", () => { 5 | const window = require("./src/window"); 6 | mainWindow = window.createBrowserWindow(app); 7 | 8 | mainWindow.loadURL(`file://${__dirname}/index.html`, { userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36' }); 9 | 10 | require("./src/print"); 11 | }); 12 | 13 | app.on("window-all-closed", () => { 14 | app.quit(); 15 | }); 16 | ``` -------------------------------------------------------------------------------- /indexer/Dockerfile: -------------------------------------------------------------------------------- ```dockerfile 1 | FROM 
python:3.11-slim-buster 2 | 3 | WORKDIR /usr/src/app 4 | RUN pip install --upgrade pip 5 | 6 | ARG EMBEDDING_MODEL_ID 7 | 8 | RUN pip install huggingface_hub 9 | RUN huggingface-cli download $EMBEDDING_MODEL_ID --repo-type model 10 | 11 | COPY requirements.txt . 12 | RUN pip install --no-cache-dir -r requirements.txt 13 | COPY . . 14 | 15 | ENV PORT 8000 16 | ENV CURRENT_HOST 0.0.0.0 17 | ENV WORKERS 1 18 | 19 | CMD ["sh", "-c", "uvicorn app:app --loop asyncio --reload --workers ${WORKERS} --host $CURRENT_HOST --port $PORT --proxy-headers"] ``` -------------------------------------------------------------------------------- /chat/public/manifest.json: -------------------------------------------------------------------------------- ```json 1 | { 2 | "short_name": "React App", 3 | "name": "Create React App Sample", 4 | "icons": [ 5 | { 6 | "src": "favicon.ico", 7 | "sizes": "64x64 32x32 24x24 16x16", 8 | "type": "image/x-icon" 9 | }, 10 | { 11 | "src": "logo192.png", 12 | "type": "image/png", 13 | "sizes": "192x192" 14 | }, 15 | { 16 | "src": "logo512.png", 17 | "type": "image/png", 18 | "sizes": "512x512" 19 | } 20 | ], 21 | "start_url": ".", 22 | "display": "standalone", 23 | "theme_color": "#000000", 24 | "background_color": "#ffffff" 25 | } 26 | ``` -------------------------------------------------------------------------------- /chat/src/index.js: -------------------------------------------------------------------------------- ```javascript 1 | import React from 'react'; 2 | import ReactDOM from 'react-dom/client'; 3 | import './index.css'; 4 | import App from './App'; 5 | import reportWebVitals from './reportWebVitals'; 6 | 7 | const root = ReactDOM.createRoot(document.getElementById('root')); 8 | root.render( 9 | <React.StrictMode> 10 | <App /> 11 | </React.StrictMode> 12 | ); 13 | 14 | // If you want to start measuring performance in your app, pass a function 15 | // to log results (for example: reportWebVitals(console.log)) 16 | // or send to an analytics 
endpoint. Learn more: https://bit.ly/CRA-vitals 17 | reportWebVitals(); 18 | ``` -------------------------------------------------------------------------------- /chat/src/App.css: -------------------------------------------------------------------------------- ```css 1 | .App { 2 | text-align: center; 3 | } 4 | 5 | .App-logo { 6 | height: 40vmin; 7 | pointer-events: none; 8 | } 9 | 10 | @media (prefers-reduced-motion: no-preference) { 11 | .App-logo { 12 | animation: App-logo-spin infinite 20s linear; 13 | } 14 | } 15 | 16 | .App-header { 17 | background-color: #282c34; 18 | min-height: 100vh; 19 | display: flex; 20 | flex-direction: column; 21 | align-items: center; 22 | justify-content: center; 23 | font-size: calc(10px + 2vmin); 24 | color: white; 25 | } 26 | 27 | .App-link { 28 | color: #61dafb; 29 | } 30 | 31 | @keyframes App-logo-spin { 32 | from { 33 | transform: rotate(0deg); 34 | } 35 | to { 36 | transform: rotate(360deg); 37 | } 38 | } 39 | ``` -------------------------------------------------------------------------------- /llm/async_answer_to_socket.py: -------------------------------------------------------------------------------- ```python 1 | import logging 2 | from fastapi import WebSocket 3 | from async_queue import AsyncQueue 4 | import control_flow_commands as cfc 5 | import starlette.websockets as ws 6 | 7 | logging.basicConfig(level=logging.INFO) 8 | logger = logging.getLogger("llm") 9 | 10 | async def loop(response_queue: AsyncQueue, websocket: WebSocket): 11 | while True: 12 | data = await response_queue.dequeue() 13 | 14 | if data == cfc.CFC_CLIENT_DISCONNECTED: 15 | break 16 | else: 17 | logger.info(f"Sending data: {data}") 18 | try: 19 | await websocket.send_text(data) 20 | except ws.WebSocketDisconnect: 21 | break ``` -------------------------------------------------------------------------------- /electron/src/window.js: -------------------------------------------------------------------------------- ```javascript 1 | const path = 
require("path"); 2 | const { BrowserWindow } = require("electron"); 3 | 4 | exports.createBrowserWindow = () => { 5 | return new BrowserWindow({ 6 | width: 1024, 7 | height: 768, 8 | minWidth: 400, 9 | minHeight: 600, 10 | icon: path.join(__dirname, "assets/icons/png/favicon.png"), 11 | backgroundColor: "#fff", 12 | autoHideMenuBar: true, 13 | webPreferences: { 14 | devTools: false, 15 | contextIsolation: true, 16 | webviewTag: true, 17 | preload: path.join(__dirname, "../preload.js"), 18 | enableRemoteModule: true, 19 | nodeIntegration: false, 20 | nativeWindowOpen: true, 21 | webSecurity: true, 22 | allowRunningInsecureContent: true 23 | }, 24 | }); 25 | }; 26 | ``` -------------------------------------------------------------------------------- /run.sh: -------------------------------------------------------------------------------- ```bash 1 | #!/bin/bash 2 | 3 | echo "Select an option:" 4 | echo "1) Fully Local Setup" 5 | echo "2) ChatGPT Integration" 6 | echo "3) MCP usage" 7 | echo "4) Quit" 8 | 9 | read -p "Enter your choice (1, 2, 3 or 4): " user_choice 10 | 11 | case "$user_choice" in 12 | 1) 13 | echo "Starting fully local setup..." 14 | docker compose -f docker-compose-ollama.yml --env-file .env up --build 15 | ;; 16 | 2) 17 | echo "Starting with ChatGPT integration..." 18 | docker compose -f docker-compose-chatgpt.yml --env-file .env up --build 19 | ;; 20 | 3) 21 | echo "Starting MCP usage..." 22 | docker compose -f docker-compose-mcp.yml --env-file .env up --build 23 | ;; 24 | 4) 25 | echo "Exiting the script. Goodbye!" 26 | exit 0 27 | ;; 28 | *) 29 | echo "Invalid input. Please enter 1, 2, 3, or 4."
30 | ;; 31 | esac ``` -------------------------------------------------------------------------------- /llm/app.py: -------------------------------------------------------------------------------- ```python 1 | import logging 2 | import asyncio 3 | from fastapi import FastAPI 4 | from fastapi import WebSocket 5 | from llm_chain import LLMChain 6 | from async_queue import AsyncQueue 7 | 8 | import async_socket_to_chat 9 | import async_question_to_answer 10 | import async_answer_to_socket 11 | 12 | app = FastAPI() 13 | 14 | logging.basicConfig(level=logging.INFO) 15 | logger = logging.getLogger("llm") 16 | 17 | @app.websocket("/llm/") 18 | async def chat_client(websocket: WebSocket): 19 | 20 | question_queue = AsyncQueue() 21 | response_queue = AsyncQueue() 22 | 23 | answer_to_socket_promise = async_answer_to_socket.loop(response_queue, websocket) 24 | question_to_answer_promise = async_question_to_answer.loop(question_queue, response_queue) 25 | socket_to_chat_promise = async_socket_to_chat.loop(websocket, question_queue, response_queue) 26 | 27 | await asyncio.gather( 28 | answer_to_socket_promise, 29 | question_to_answer_promise, 30 | socket_to_chat_promise, 31 | ) ``` -------------------------------------------------------------------------------- /mcp-server/src/minima/requestor.py: -------------------------------------------------------------------------------- ```python 1 | import httpx 2 | import logging 3 | 4 | 5 | logging.basicConfig(level=logging.INFO) 6 | logger = logging.getLogger(__name__) 7 | 8 | REQUEST_DATA_URL = "http://localhost:8001/query" 9 | REQUEST_HEADERS = { 10 | 'Accept': 'application/json', 11 | 'Content-Type': 'application/json' 12 | } 13 | 14 | async def request_data(query): 15 | payload = { 16 | "query": query 17 | } 18 | async with httpx.AsyncClient() as client: 19 | try: 20 | logger.info(f"Requesting data from indexer with query: {query}") 21 | response = await client.post(REQUEST_DATA_URL, 22 | headers=REQUEST_HEADERS, 23 | 
json=payload) 24 | response.raise_for_status() 25 | data = response.json() 26 | logger.info(f"Received data: {data}") 27 | return data 28 | 29 | except Exception as e: 30 | logger.error(f"HTTP error: {e}") 31 | return { "error": str(e) } ``` -------------------------------------------------------------------------------- /linker/requestor.py: -------------------------------------------------------------------------------- ```python 1 | import httpx 2 | import logging 3 | import asyncio 4 | 5 | logging.basicConfig(level=logging.INFO) 6 | logger = logging.getLogger(__name__) 7 | 8 | REQUEST_DATA_URL = "http://indexer:8000/query" 9 | REQUEST_HEADERS = { 10 | 'Accept': 'application/json', 11 | 'Content-Type': 'application/json' 12 | } 13 | 14 | async def request_data(query): 15 | payload = { 16 | "query": query 17 | } 18 | async with httpx.AsyncClient() as client: 19 | try: 20 | logger.info(f"Requesting data from indexer with query: {query}") 21 | response = await client.post(REQUEST_DATA_URL, 22 | headers=REQUEST_HEADERS, 23 | json=payload) 24 | response.raise_for_status() 25 | data = response.json() 26 | logger.info(f"Received data: {data}") 27 | return data 28 | 29 | except Exception as e: 30 | logger.error(f"HTTP error: {e}") 31 | return { "error": str(e) } ``` -------------------------------------------------------------------------------- /docker-compose-mcp.yml: -------------------------------------------------------------------------------- ```yaml 1 | version: '3.9' 2 | services: 3 | 4 | qdrant: 5 | image: qdrant/qdrant:latest 6 | container_name: qdrant 7 | ports: 8 | - 6333:6333 9 | - 6334:6334 10 | expose: 11 | - 6333 12 | - 6334 13 | - 6335 14 | volumes: 15 | - ./qdrant_data:/qdrant/storage 16 | environment: 17 | QDRANT__LOG_LEVEL: "INFO" 18 | 19 | indexer: 20 | build: 21 | context: ./indexer 22 | dockerfile: Dockerfile 23 | args: 24 | EMBEDDING_MODEL_ID: ${EMBEDDING_MODEL_ID} 25 | EMBEDDING_SIZE: ${EMBEDDING_SIZE} 26 | volumes: 27 | - 
${LOCAL_FILES_PATH}:/usr/src/app/local_files/ 28 | - ./indexer:/usr/src/app 29 | - ./indexer_data:/indexer/storage 30 | ports: 31 | - 8001:8000 32 | environment: 33 | - PYTHONPATH=/usr/src 34 | - PYTHONUNBUFFERED=TRUE 35 | - LOCAL_FILES_PATH=${LOCAL_FILES_PATH} 36 | - EMBEDDING_MODEL_ID=${EMBEDDING_MODEL_ID} 37 | - EMBEDDING_SIZE=${EMBEDDING_SIZE} 38 | - CONTAINER_PATH=/usr/src/app/local_files/ 39 | depends_on: 40 | - qdrant ``` -------------------------------------------------------------------------------- /indexer/async_queue.py: -------------------------------------------------------------------------------- ```python 1 | import asyncio 2 | 3 | from collections import deque 4 | 5 | class AsyncQueueDequeueInterrupted(Exception): 6 | 7 | def __init__(self, message="AsyncQueue dequeue was interrupted"): 8 | self.message = message 9 | super().__init__(self.message) 10 | 11 | class AsyncQueue: 12 | 13 | def __init__(self): 14 | self._data = deque([]) 15 | self._presence_of_data = asyncio.Event() 16 | 17 | def enqueue(self, value): 18 | self._data.append(value) 19 | 20 | if len(self._data) == 1: 21 | self._presence_of_data.set() 22 | 23 | async def dequeue(self): 24 | await self._presence_of_data.wait() 25 | 26 | if len(self._data) < 1: 27 | raise AsyncQueueDequeueInterrupted("AsyncQueue dequeue was interrupted") 28 | 29 | result = self._data.popleft() 30 | 31 | if not self._data: 32 | self._presence_of_data.clear() 33 | 34 | return result 35 | 36 | def size(self): 37 | result = len(self._data) 38 | return result 39 | 40 | def shutdown(self): 41 | self._presence_of_data.set() ``` -------------------------------------------------------------------------------- /llm/async_queue.py: -------------------------------------------------------------------------------- ```python 1 | import asyncio 2 | 3 | from collections import deque 4 | 5 | class AsyncQueueDequeueInterrupted(Exception): 6 | def __init__(self, message="AsyncQueue dequeue was interrupted"): 7 | 
self.message = message 8 | super().__init__(self.message) 9 | 10 | class AsyncQueue: 11 | def __init__(self) -> None: 12 | self._data = deque([]) 13 | self._presence_of_data = asyncio.Event() 14 | 15 | def enqueue(self, value): 16 | self._data.append(value) 17 | 18 | if len(self._data) == 1: 19 | self._presence_of_data.set() 20 | 21 | async def dequeue(self): 22 | await self._presence_of_data.wait() 23 | 24 | if len(self._data) < 1: 25 | raise AsyncQueueDequeueInterrupted("AsyncQueue dequeue was interrupted") 26 | 27 | result = self._data.popleft() 28 | 29 | if not self._data: 30 | self._presence_of_data.clear() 31 | 32 | return result 33 | 34 | def size(self): 35 | result = len(self._data) 36 | return result 37 | 38 | def shutdown(self): 39 | self._presence_of_data.set() 40 | ``` -------------------------------------------------------------------------------- /electron/package.json: -------------------------------------------------------------------------------- ```json 1 | { 2 | "name": "minima-app", 3 | "productName": "Minima", 4 | "version": "1.0.0", 5 | "description": "Minima is your local AI search app. Chat with your documents, photos and more.", 6 | "main": "main.js", 7 | 8 | "scripts": { 9 | "start": "electron .", 10 | "package-mac": "npx electron-packager . --overwrite --platform=darwin --arch=x64 --icon=assets/icons/mac/favicon.icns --prune=true --out=release-builds", 11 | "package-win": "npx electron-packager . --overwrite --asar=true --platform=win32 --arch=x64 --icon=assets/icons/win/favicon.ico --prune=true --out=release-builds --version-string.CompanyName=CE --version-string.FileDescription=CE --version-string.ProductName=\"Minima\"", 12 | "package-linux": "npx electron-packager . 
--overwrite --platform=linux --arch=x64 --icon=assets/icons/png/favicon.png --prune=true --out=release-builds" 13 | }, 14 | 15 | "repository": "https://github.com/dmayboroda/minima", 16 | "author": "Minima team", 17 | "license": "MPLv2", 18 | 19 | "devDependencies": { 20 | "electron": "^22.0.0" 21 | } 22 | } 23 | ``` -------------------------------------------------------------------------------- /chat/package.json: -------------------------------------------------------------------------------- ```json 1 | { 2 | "name": "chat", 3 | "version": "0.1.0", 4 | "private": true, 5 | "dependencies": { 6 | "@emotion/react": "^11.13.5", 7 | "@emotion/styled": "^11.13.5", 8 | "@mui/material": "^5.16.7", 9 | "@mui/system": "^6.1.8", 10 | "@testing-library/jest-dom": "^5.17.0", 11 | "@testing-library/react": "^13.4.0", 12 | "@testing-library/user-event": "^13.5.0", 13 | "antd": "^5.22.7", 14 | "react": "^18.3.1", 15 | "react-dom": "^18.3.1", 16 | "react-scripts": "5.0.1", 17 | "react-toastify": "^11.0.3", 18 | "typescript": "^4.9.5", 19 | "web-vitals": "^2.1.4" 20 | }, 21 | "scripts": { 22 | "start": "react-scripts start", 23 | "build": "react-scripts build", 24 | "test": "react-scripts test", 25 | "eject": "react-scripts eject" 26 | }, 27 | "eslintConfig": { 28 | "extends": [ 29 | "react-app", 30 | "react-app/jest" 31 | ] 32 | }, 33 | "browserslist": { 34 | "production": [ 35 | ">0.2%", 36 | "not dead", 37 | "not op_mini all" 38 | ], 39 | "development": [ 40 | "last 1 chrome version", 41 | "last 1 firefox version", 42 | "last 1 safari version" 43 | ] 44 | } 45 | } 46 | ``` -------------------------------------------------------------------------------- /docker-compose-chatgpt.yml: -------------------------------------------------------------------------------- ```yaml 1 | version: '3.9' 2 | services: 3 | 4 | qdrant: 5 | image: qdrant/qdrant:latest 6 | container_name: qdrant 7 | ports: 8 | - 6333:6333 9 | - 6334:6334 10 | expose: 11 | - 6333 12 | - 6334 13 | - 6335 14 | 
volumes: 15 | - ./qdrant_data:/qdrant/storage 16 | environment: 17 | QDRANT__LOG_LEVEL: "INFO" 18 | 19 | indexer: 20 | build: 21 | context: ./indexer 22 | dockerfile: Dockerfile 23 | args: 24 | EMBEDDING_MODEL_ID: ${EMBEDDING_MODEL_ID} 25 | EMBEDDING_SIZE: ${EMBEDDING_SIZE} 26 | volumes: 27 | - ${LOCAL_FILES_PATH}:/usr/src/app/local_files/ 28 | - ./indexer:/usr/src/app 29 | - ./indexer_data:/indexer/storage 30 | ports: 31 | - 8001:8000 32 | environment: 33 | - PYTHONPATH=/usr/src 34 | - PYTHONUNBUFFERED=TRUE 35 | - LOCAL_FILES_PATH=${LOCAL_FILES_PATH} 36 | - EMBEDDING_MODEL_ID=${EMBEDDING_MODEL_ID} 37 | - EMBEDDING_SIZE=${EMBEDDING_SIZE} 38 | - CONTAINER_PATH=/usr/src/app/local_files/ 39 | depends_on: 40 | - qdrant 41 | 42 | linker: 43 | build: ./linker 44 | volumes: 45 | - ./linker:/usr/src/app 46 | ports: 47 | - 8002:8000 48 | environment: 49 | - PYTHONPATH=/usr/src 50 | - PYTHONUNBUFFERED=TRUE 51 | - FIRESTORE_COLLECTION_NAME=userTasks 52 | - TASKS_COLLECTION=tasks 53 | - USER_ID=${USER_ID} 54 | - PASSWORD=${PASSWORD} 55 | - FB_PROJECT=localragex 56 | depends_on: 57 | - qdrant ``` -------------------------------------------------------------------------------- /electron/assets/js/renderer.js: -------------------------------------------------------------------------------- ```javascript 1 | const getControlsHeight = () => { 2 | const controls = document.querySelector("#controls"); 3 | if (controls) { 4 | return controls.offsetHeight; 5 | } 6 | return 0; 7 | }; 8 | 9 | const calculateLayoutSize = () => { 10 | const webview = document.querySelector("webview"); 11 | const windowWidth = document.documentElement.clientWidth; 12 | const windowHeight = document.documentElement.clientHeight; 13 | const controlsHeight = getControlsHeight(); 14 | const webviewHeight = windowHeight - controlsHeight; 15 | 16 | webview.style.width = windowWidth + "px"; 17 | webview.style.height = webviewHeight + "px"; 18 | }; 19 | 20 | window.addEventListener("DOMContentLoaded", () => { 21 | 
calculateLayoutSize(); 22 | 23 | // Dynamic resize function (responsive) 24 | window.onresize = calculateLayoutSize; 25 | 26 | // Home button exists 27 | if (document.querySelector("#home")) { 28 | document.querySelector("#home").onclick = () => { 29 | const home = document.getElementById("webview").getAttribute("data-home"); 30 | document.querySelector("webview").src = home; 31 | }; 32 | } 33 | 34 | // Print button exits 35 | if (document.querySelector("#print_button")) { 36 | document 37 | .querySelector("#print_button") 38 | .addEventListener("click", async () => { 39 | const url = document.querySelector("webview").getAttribute("src"); 40 | 41 | // Launch print window 42 | await window.electron.print(url); 43 | }); 44 | } 45 | }); 46 | ``` -------------------------------------------------------------------------------- /llm/async_question_to_answer.py: -------------------------------------------------------------------------------- ```python 1 | import json 2 | import logging 3 | from llm_chain import LLMChain 4 | from async_queue import AsyncQueue 5 | import control_flow_commands as cfc 6 | 7 | logging.basicConfig(level=logging.INFO) 8 | logger = logging.getLogger("chat") 9 | 10 | async def loop( 11 | questions_queue: AsyncQueue, 12 | response_queue: AsyncQueue, 13 | ): 14 | 15 | llm_chain = LLMChain() 16 | 17 | while True: 18 | data = await questions_queue.dequeue() 19 | data = data.replace("\n", "") 20 | 21 | if data == cfc.CFC_CLIENT_DISCONNECTED: 22 | response_queue.enqueue( 23 | json.dumps({ 24 | "reporter": "output_message", 25 | "type": "disconnect_message", 26 | }) 27 | ) 28 | break 29 | 30 | if data == cfc.CFC_CHAT_STARTED: 31 | response_queue.enqueue( 32 | json.dumps({ 33 | "reporter": "output_message", 34 | "type": "start_message", 35 | }) 36 | ) 37 | 38 | elif data == cfc.CFC_CHAT_STOPPED: 39 | response_queue.enqueue( 40 | json.dumps({ 41 | "reporter": "output_message", 42 | "type": "stop_message", 43 | }) 44 | ) 45 | 46 | elif data: 47 | result = 
llm_chain.invoke(data) 48 | response_queue.enqueue( 49 | json.dumps({ 50 | "reporter": "output_message", 51 | "type": "answer", 52 | "message": result["answer"], 53 | "links": list(result["links"]) 54 | }) 55 | ) ``` -------------------------------------------------------------------------------- /llm/async_socket_to_chat.py: -------------------------------------------------------------------------------- ```python 1 | import json 2 | import logging 3 | from fastapi import WebSocket 4 | from async_queue import AsyncQueue 5 | import starlette.websockets as ws 6 | import control_flow_commands as cfc 7 | 8 | logging.basicConfig(level=logging.INFO) 9 | logger = logging.getLogger("llm") 10 | 11 | async def loop( 12 | websocket: WebSocket, 13 | questions_queue: AsyncQueue, 14 | response_queue: AsyncQueue 15 | ): 16 | 17 | await websocket.accept() 18 | while True: 19 | try: 20 | message = await websocket.receive_text() 21 | 22 | if message == cfc.CFC_CHAT_STARTED: 23 | logger.info(f"Start message {message}") 24 | questions_queue.enqueue(message) 25 | 26 | elif message == cfc.CFC_CHAT_STOPPED: 27 | logger.info(f"Stop message {message}") 28 | questions_queue.enqueue(message) 29 | response_queue.enqueue(json.dumps({ 30 | "reporter": "input_message", 31 | "type": "stop_message", 32 | "message": message 33 | })) 34 | 35 | else: 36 | logger.info(f"Question: {message}") 37 | questions_queue.enqueue(message) 38 | response_queue.enqueue(json.dumps({ 39 | "reporter": "input_message", 40 | "type": "question", 41 | "message": message 42 | })) 43 | 44 | except ws.WebSocketDisconnect: 45 | logger.info("Client disconnected") 46 | questions_queue.enqueue(cfc.CFC_CLIENT_DISCONNECTED) 47 | response_queue.enqueue(cfc.CFC_CLIENT_DISCONNECTED) 48 | break ``` -------------------------------------------------------------------------------- /electron/src/menu.js: -------------------------------------------------------------------------------- ```javascript 1 | exports.createTemplate = (name) =
{ 2 | let template = [ 3 | { 4 | label: "Edit", 5 | submenu: [ 6 | { role: "undo" }, 7 | { role: "redo" }, 8 | { type: "separator" }, 9 | { role: "cut" }, 10 | { role: "copy" }, 11 | { role: "paste" }, 12 | { role: "pasteandmatchstyle" }, 13 | { role: "delete" }, 14 | { role: "selectall" }, 15 | ], 16 | }, 17 | { 18 | label: "View", 19 | submenu: [ 20 | { role: "reload" }, 21 | { role: "forcereload" }, 22 | { role: "toggledevtools" }, 23 | { type: "separator" }, 24 | { role: "resetzoom" }, 25 | { role: "zoomin" }, 26 | { role: "zoomout" }, 27 | { type: "separator" }, 28 | { role: "togglefullscreen" }, 29 | ], 30 | }, 31 | { 32 | role: "window", 33 | submenu: [{ role: "minimize" }, { role: "close" }], 34 | }, 35 | ]; 36 | 37 | if (process.platform === "darwin") { 38 | template.unshift({ 39 | label: name, 40 | submenu: [ 41 | { type: "separator" }, 42 | { role: "services", submenu: [] }, 43 | { type: "separator" }, 44 | { role: "hide" }, 45 | { role: "hideothers" }, 46 | { role: "unhide" }, 47 | { type: "separator" }, 48 | { role: "quit" }, 49 | ], 50 | }); 51 | 52 | template[1].submenu.push( 53 | { type: "separator" }, 54 | { 55 | label: "Speech", 56 | submenu: [{ role: "startspeaking" }, { role: "stopspeaking" }], 57 | } 58 | ); 59 | 60 | template[3].submenu = [ 61 | { role: "close" }, 62 | { role: "minimize" }, 63 | { role: "zoom" }, 64 | { type: "separator" }, 65 | { role: "front" }, 66 | ]; 67 | } 68 | 69 | return template; 70 | }; 71 | ``` -------------------------------------------------------------------------------- /llm/minima_embed.py: -------------------------------------------------------------------------------- ```python 1 | import requests 2 | import logging 3 | from typing import Any, List 4 | from pydantic import BaseModel 5 | from langchain_core.embeddings import Embeddings 6 | 7 | logging.basicConfig(level=logging.INFO) 8 | logger = logging.getLogger(__name__) 9 | 10 | REQUEST_DATA_URL = "http://indexer:8000/embedding" 11 | REQUEST_HEADERS = { 12 
| 'Accept': 'application/json', 13 | 'Content-Type': 'application/json' 14 | } 15 | 16 | class MinimaEmbeddings(BaseModel, Embeddings): 17 | 18 | def __init__(self, **kwargs: Any): 19 | super().__init__(**kwargs) 20 | 21 | def embed_documents(self, texts: list[str]) -> list[list[float]]: 22 | results = [] 23 | for text in texts: 24 | embedding = self.request_data(text) 25 | if "error" in embedding: 26 | logger.error(f"Error in embedding: {embedding['error']}") 27 | else: 28 | embedding = embedding["result"] 29 | results.append(embedding) 30 | return results 31 | 32 | def embed_query(self, text: str) -> list[float]: 33 | return self.embed_documents([text])[0] 34 | 35 | def request_data(self, query): 36 | payload = { 37 | "query": query 38 | } 39 | try: 40 | logger.info(f"Requesting data from indexer with query: {query}") 41 | response = requests.post(REQUEST_DATA_URL, headers=REQUEST_HEADERS, json=payload) 42 | response.raise_for_status() 43 | data = response.json() 44 | logger.info(f"Received data: {data}") 45 | return data 46 | 47 | except requests.exceptions.RequestException as e: 48 | logger.error(f"HTTP error: {e}") 49 | return {"error": str(e)} ``` -------------------------------------------------------------------------------- /chat/public/index.html: -------------------------------------------------------------------------------- ```html 1 | <!DOCTYPE html> 2 | <html lang="en"> 3 | <head> 4 | <meta charset="utf-8" /> 5 | <link rel="icon" href="%PUBLIC_URL%/favicon.ico" /> 6 | <meta name="viewport" content="width=device-width, initial-scale=1" /> 7 | <meta name="theme-color" content="#000000" /> 8 | <meta 9 | name="description" 10 | content="Web site created using create-react-app" 11 | /> 12 | <link rel="apple-touch-icon" href="%PUBLIC_URL%/logo192.png" /> 13 | <!-- 14 | manifest.json provides metadata used when your web app is installed on a 15 | user's mobile device or desktop. 
See https://developers.google.com/web/fundamentals/web-app-manifest/ 16 | --> 17 | <link rel="manifest" href="%PUBLIC_URL%/manifest.json" /> 18 | <!-- 19 | Notice the use of %PUBLIC_URL% in the tags above. 20 | It will be replaced with the URL of the `public` folder during the build. 21 | Only files inside the `public` folder can be referenced from the HTML. 22 | 23 | Unlike "/favicon.ico" or "favicon.ico", "%PUBLIC_URL%/favicon.ico" will 24 | work correctly both with client-side routing and a non-root public URL. 25 | Learn how to configure a non-root public URL by running `npm run build`. 26 | --> 27 | <title>React App</title> 28 | </head> 29 | <body> 30 | <noscript>You need to enable JavaScript to run this app.</noscript> 31 | <div id="root"></div> 32 | <!-- 33 | This HTML file is a template. 34 | If you open it directly in the browser, you will see an empty page. 35 | 36 | You can add webfonts, meta tags, or analytics to this file. 37 | The build step will place the bundled scripts into the <body> tag. 38 | 39 | To begin the development, run `npm start` or `yarn start`. 40 | To create a production bundle, use `npm run build` or `yarn build`. 
41 | --> 42 | </body> 43 | </html> 44 | ``` -------------------------------------------------------------------------------- /docker-compose-ollama.yml: -------------------------------------------------------------------------------- ```yaml 1 | version: '3.9' 2 | services: 3 | 4 | ollama: 5 | image: ollama/ollama:latest 6 | container_name: ollama 7 | ports: 8 | - 11434:11434 9 | expose: 10 | - 11434 11 | volumes: 12 | - ./ollama:/root/.ollama 13 | entrypoint: ["/bin/sh", "-c"] 14 | environment: 15 | - OLLAMA_MODEL=${OLLAMA_MODEL} 16 | command: > 17 | "ollama serve & 18 | sleep 10 && 19 | ollama pull ${OLLAMA_MODEL} && 20 | wait" 21 | 22 | qdrant: 23 | image: qdrant/qdrant:latest 24 | container_name: qdrant 25 | ports: 26 | - 6333:6333 27 | - 6334:6334 28 | expose: 29 | - 6333 30 | - 6334 31 | - 6335 32 | volumes: 33 | - ./qdrant_data:/qdrant/storage 34 | environment: 35 | QDRANT__LOG_LEVEL: "INFO" 36 | 37 | indexer: 38 | build: 39 | context: ./indexer 40 | dockerfile: Dockerfile 41 | args: 42 | EMBEDDING_MODEL_ID: ${EMBEDDING_MODEL_ID} 43 | EMBEDDING_SIZE: ${EMBEDDING_SIZE} 44 | volumes: 45 | - ${LOCAL_FILES_PATH}:/usr/src/app/local_files/ 46 | - ./indexer:/usr/src/app 47 | - ./indexer_data:/indexer/storage 48 | ports: 49 | - 8001:8000 50 | environment: 51 | - PYTHONPATH=/usr/src 52 | - PYTHONUNBUFFERED=TRUE 53 | - LOCAL_FILES_PATH=${LOCAL_FILES_PATH} 54 | - EMBEDDING_MODEL_ID=${EMBEDDING_MODEL_ID} 55 | - EMBEDDING_SIZE=${EMBEDDING_SIZE} 56 | - CONTAINER_PATH=/usr/src/app/local_files/ 57 | depends_on: 58 | - qdrant 59 | 60 | llm: 61 | build: 62 | context: ././llm 63 | dockerfile: Dockerfile 64 | args: 65 | RERANKER_MODEL: ${RERANKER_MODEL} 66 | volumes: 67 | - ./llm:/usr/src/app 68 | ports: 69 | - 8003:8000 70 | environment: 71 | - PYTHONPATH=/usr/src 72 | - PYTHONUNBUFFERED=TRUE 73 | - OLLAMA_MODEL=${OLLAMA_MODEL} 74 | - RERANKER_MODEL=${RERANKER_MODEL} 75 | - LOCAL_FILES_PATH=${LOCAL_FILES_PATH} 76 | - CONTAINER_PATH=/usr/src/app/local_files/ 77 | depends_on: 78 
| - ollama 79 | - qdrant 80 | - indexer 81 | 82 | chat: 83 | build: ./chat 84 | volumes: 85 | - ./chat:/usr/src/app 86 | ports: 87 | - 3000:3000 88 | depends_on: 89 | - ollama 90 | - qdrant 91 | - llm 92 | ``` -------------------------------------------------------------------------------- /indexer/async_loop.py: -------------------------------------------------------------------------------- ```python 1 | import os 2 | import uuid 3 | import asyncio 4 | import logging 5 | from indexer import Indexer 6 | from concurrent.futures import ThreadPoolExecutor 7 | 8 | logger = logging.getLogger(__name__) 9 | executor = ThreadPoolExecutor() 10 | 11 | CONTAINER_PATH = os.environ.get("CONTAINER_PATH") 12 | AVAILABLE_EXTENSIONS = [".pdf", ".xls", ".xlsx", ".doc", ".docx", ".txt", ".md", ".csv", ".ppt", ".pptx"] 13 | 14 | 15 | async def crawl_loop(async_queue): 16 | logger.info(f"Starting crawl loop with path: {CONTAINER_PATH}") 17 | existing_file_paths: list[str] = [] 18 | for root, _, files in os.walk(CONTAINER_PATH): 19 | logger.info(f"Processing folder: {root}") 20 | for file in files: 21 | if not any(file.endswith(ext) for ext in AVAILABLE_EXTENSIONS): 22 | logger.info(f"Skipping file: {file}") 23 | continue 24 | path = os.path.join(root, file) 25 | message = { 26 | "path": path, 27 | "file_id": str(uuid.uuid4()), 28 | "last_updated_seconds": round(os.path.getmtime(path)), 29 | "type": "file" 30 | } 31 | existing_file_paths.append(path) 32 | async_queue.enqueue(message) 33 | logger.info(f"File enqueued: {path}") 34 | aggregate_message = { 35 | "existing_file_paths": existing_file_paths, 36 | "type": "all_files" 37 | } 38 | async_queue.enqueue(aggregate_message) 39 | async_queue.enqueue({"type": "stop"}) 40 | 41 | 42 | async def index_loop(async_queue, indexer: Indexer): 43 | loop = asyncio.get_running_loop() 44 | logger.info("Starting index loop") 45 | while True: 46 | if async_queue.size() == 0: 47 | logger.info("No files to index. 
Indexing stopped, all files indexed.") 48 | await asyncio.sleep(1) 49 | continue 50 | message = await async_queue.dequeue() 51 | logger.info(f"Processing message: {message}") 52 | try: 53 | if message["type"] == "file": 54 | await loop.run_in_executor(executor, indexer.index, message) 55 | elif message["type"] == "all_files": 56 | await loop.run_in_executor(executor, indexer.purge, message) 57 | elif message["type"] == "stop": 58 | break 59 | except Exception as e: 60 | logger.error(f"Error in processing message: {e}") 61 | logger.error(f"Failed to process message: {message}") 62 | await asyncio.sleep(1) 63 | 64 | ``` -------------------------------------------------------------------------------- /chat/src/logo.svg: -------------------------------------------------------------------------------- ``` 1 | <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 841.9 595.3"><g fill="#61DAFB"><path d="M666.3 296.5c0-32.5-40.7-63.3-103.1-82.4 14.4-63.6 8-114.2-20.2-130.4-6.5-3.8-14.1-5.6-22.4-5.6v22.3c4.6 0 8.3.9 11.4 2.6 13.6 7.8 19.5 37.5 14.9 75.7-1.1 9.4-2.9 19.3-5.1 29.4-19.6-4.8-41-8.5-63.5-10.9-13.5-18.5-27.5-35.3-41.6-50 32.6-30.3 63.2-46.9 84-46.9V78c-27.5 0-63.5 19.6-99.9 53.6-36.4-33.8-72.4-53.2-99.9-53.2v22.3c20.7 0 51.4 16.5 84 46.6-14 14.7-28 31.4-41.3 49.9-22.6 2.4-44 6.1-63.6 11-2.3-10-4-19.7-5.2-29-4.7-38.2 1.1-67.9 14.6-75.8 3-1.8 6.9-2.6 11.5-2.6V78.5c-8.4 0-16 1.8-22.6 5.6-28.1 16.2-34.4 66.7-19.9 130.1-62.2 19.2-102.7 49.9-102.7 82.3 0 32.5 40.7 63.3 103.1 82.4-14.4 63.6-8 114.2 20.2 130.4 6.5 3.8 14.1 5.6 22.5 5.6 27.5 0 63.5-19.6 99.9-53.6 36.4 33.8 72.4 53.2 99.9 53.2 8.4 0 16-1.8 22.6-5.6 28.1-16.2 34.4-66.7 19.9-130.1 62-19.1 102.5-49.9 102.5-82.3zm-130.2-66.7c-3.7 12.9-8.3 26.2-13.5 39.5-4.1-8-8.4-16-13.1-24-4.6-8-9.5-15.8-14.4-23.4 14.2 2.1 27.9 4.7 41 7.9zm-45.8 106.5c-7.8 13.5-15.8 26.3-24.1 38.2-14.9 1.3-30 2-45.2 2-15.1 0-30.2-.7-45-1.9-8.3-11.9-16.4-24.6-24.2-38-7.6-13.1-14.5-26.4-20.8-39.8 6.2-13.4 13.2-26.8 20.7-39.9 7.8-13.5 15.8-26.3 
24.1-38.2 14.9-1.3 30-2 45.2-2 15.1 0 30.2.7 45 1.9 8.3 11.9 16.4 24.6 24.2 38 7.6 13.1 14.5 26.4 20.8 39.8-6.3 13.4-13.2 26.8-20.7 39.9zm32.3-13c5.4 13.4 10 26.8 13.8 39.8-13.1 3.2-26.9 5.9-41.2 8 4.9-7.7 9.8-15.6 14.4-23.7 4.6-8 8.9-16.1 13-24.1zM421.2 430c-9.3-9.6-18.6-20.3-27.8-32 9 .4 18.2.7 27.5.7 9.4 0 18.7-.2 27.8-.7-9 11.7-18.3 22.4-27.5 32zm-74.4-58.9c-14.2-2.1-27.9-4.7-41-7.9 3.7-12.9 8.3-26.2 13.5-39.5 4.1 8 8.4 16 13.1 24 4.7 8 9.5 15.8 14.4 23.4zM420.7 163c9.3 9.6 18.6 20.3 27.8 32-9-.4-18.2-.7-27.5-.7-9.4 0-18.7.2-27.8.7 9-11.7 18.3-22.4 27.5-32zm-74 58.9c-4.9 7.7-9.8 15.6-14.4 23.7-4.6 8-8.9 16-13 24-5.4-13.4-10-26.8-13.8-39.8 13.1-3.1 26.9-5.8 41.2-7.9zm-90.5 125.2c-35.4-15.1-58.3-34.9-58.3-50.6 0-15.7 22.9-35.6 58.3-50.6 8.6-3.7 18-7 27.7-10.1 5.7 19.6 13.2 40 22.5 60.9-9.2 20.8-16.6 41.1-22.2 60.6-9.9-3.1-19.3-6.5-28-10.2zM310 490c-13.6-7.8-19.5-37.5-14.9-75.7 1.1-9.4 2.9-19.3 5.1-29.4 19.6 4.8 41 8.5 63.5 10.9 13.5 18.5 27.5 35.3 41.6 50-32.6 30.3-63.2 46.9-84 46.9-4.5-.1-8.3-1-11.3-2.7zm237.2-76.2c4.7 38.2-1.1 67.9-14.6 75.8-3 1.8-6.9 2.6-11.5 2.6-20.7 0-51.4-16.5-84-46.6 14-14.7 28-31.4 41.3-49.9 22.6-2.4 44-6.1 63.6-11 2.3 10.1 4.1 19.8 5.2 29.1zm38.5-66.7c-8.6 3.7-18 7-27.7 10.1-5.7-19.6-13.2-40-22.5-60.9 9.2-20.8 16.6-41.1 22.2-60.6 9.9 3.1 19.3 6.5 28.1 10.2 35.4 15.1 58.3 34.9 58.3 50.6-.1 15.7-23 35.6-58.4 50.6zM320.8 78.4z"/><circle cx="420.9" cy="296.5" r="45.7"/><path d="M520.5 78.1z"/></g></svg> ``` -------------------------------------------------------------------------------- /indexer/app.py: -------------------------------------------------------------------------------- ```python 1 | import nltk 2 | import logging 3 | import asyncio 4 | from indexer import Indexer 5 | from pydantic import BaseModel 6 | from storage import MinimaStore 7 | from async_queue import AsyncQueue 8 | from fastapi import FastAPI, APIRouter 9 | from contextlib import asynccontextmanager 10 | from fastapi_utilities import repeat_every 11 | from async_loop 
import index_loop, crawl_loop 12 | 13 | logging.basicConfig(level=logging.INFO) 14 | logger = logging.getLogger(__name__) 15 | 16 | indexer = Indexer() 17 | router = APIRouter() 18 | async_queue = AsyncQueue() 19 | MinimaStore.create_db_and_tables() 20 | 21 | def init_loader_dependencies(): 22 | nltk.download('punkt') 23 | nltk.download('punkt_tab') 24 | nltk.download('wordnet') 25 | nltk.download('omw-1.4') 26 | nltk.download('punkt') 27 | nltk.download('averaged_perceptron_tagger_eng') 28 | 29 | init_loader_dependencies() 30 | 31 | class Query(BaseModel): 32 | query: str 33 | 34 | 35 | @router.post( 36 | "/query", 37 | response_description='Query local data storage', 38 | ) 39 | async def query(request: Query): 40 | logger.info(f"Received query: {request.query}") 41 | try: 42 | result = indexer.find(request.query) 43 | logger.info(f"Found {len(result)} results for query: {request.query}") 44 | logger.info(f"Results: {result}") 45 | return {"result": result} 46 | except Exception as e: 47 | logger.error(f"Error in processing query: {e}") 48 | return {"error": str(e)} 49 | 50 | 51 | @router.post( 52 | "/embedding", 53 | response_description='Get embedding for a query', 54 | ) 55 | async def embedding(request: Query): 56 | logger.info(f"Received embedding request: {request}") 57 | try: 58 | result = indexer.embed(request.query) 59 | logger.info(f"Found {len(result)} results for query: {request.query}") 60 | return {"result": result} 61 | except Exception as e: 62 | logger.error(f"Error in processing embedding: {e}") 63 | return {"error": str(e)} 64 | 65 | 66 | @asynccontextmanager 67 | async def lifespan(app: FastAPI): 68 | tasks = [ 69 | asyncio.create_task(crawl_loop(async_queue)), 70 | asyncio.create_task(index_loop(async_queue, indexer)) 71 | ] 72 | await schedule_reindexing() 73 | try: 74 | yield 75 | finally: 76 | for task in tasks: 77 | task.cancel() 78 | await asyncio.gather(*tasks, return_exceptions=True) 79 | 80 | 81 | def create_app() -> FastAPI: 82 | app = FastAPI( 
openapi_url="/indexer/openapi.json", 84 | docs_url="/indexer/docs", 85 | lifespan=lifespan 86 | ) 87 | app.include_router(router) 88 | return app 89 | 90 | async def trigger_re_indexer(): 91 | logger.info("Reindexing triggered") 92 | try: 93 | await asyncio.gather( 94 | crawl_loop(async_queue), 95 | index_loop(async_queue, indexer) 96 | ) 97 | logger.info("reindexing finished") 98 | except Exception as e: 99 | logger.error(f"error in scheduled reindexing {e}") 100 | 101 | 102 | @repeat_every(seconds=60*20) 103 | async def schedule_reindexing(): 104 | await trigger_re_indexer() 105 | 106 | app = create_app() ``` -------------------------------------------------------------------------------- /linker/app.py: -------------------------------------------------------------------------------- ```python 1 | import os 2 | import logging 3 | import asyncio 4 | import random 5 | import string 6 | from fastapi import FastAPI 7 | from requestor import request_data 8 | from contextlib import asynccontextmanager 9 | 10 | import json 11 | import requests 12 | 13 | from requests.exceptions import HTTPError 14 | from google.oauth2.credentials import Credentials 15 | from google.cloud.firestore import Client 16 | 17 | 18 | def sign_in_with_email_and_password(email, password): 19 | request_url = "https://signinaction-xl7gclbspq-uc.a.run.app" 20 | headers = {"content-type": "application/json; charset=UTF-8"} 21 | data = json.dumps({"login": email, "password": password}) 22 | req = requests.post(request_url, headers=headers, data=data) 23 | try: 24 | req.raise_for_status() 25 | except HTTPError as e: 26 | raise HTTPError(e, "error") 27 | return req.json() 28 | 29 | 30 | logging.basicConfig(level=logging.INFO) 31 | logger = logging.getLogger(__name__) 32 | 33 | USERS_COLLECTION_NAME = "users_otp" 34 | COLLECTION_NAME = os.environ.get("FIRESTORE_COLLECTION_NAME") 35 | TASKS_COLLECTION = os.environ.get("TASKS_COLLECTION") 36 | USER_ID = os.environ.get("USER_ID") 37 | PASSWORD = 
os.environ.get("PASSWORD") 38 | FB_PROJECT = os.environ.get("FB_PROJECT") 39 | 41 | response = sign_in_with_email_and_password(USER_ID, PASSWORD) 42 | creds = Credentials(response["idToken"], response["refreshToken"]) 43 | # noinspection PyTypeChecker 44 | db = Client(FB_PROJECT, creds) 45 | 46 | 47 | async def poll_firestore(): 48 | logger.info(f"Polling Firestore collection: {COLLECTION_NAME}") 49 | random_otp = ''.join(random.choices(string.ascii_uppercase + string.digits, k=16)) 50 | doc_ref = db.collection(USERS_COLLECTION_NAME).document(USER_ID) 51 | if doc_ref.get().exists: 52 | doc_ref.update({'otp': random_otp}) 53 | else: 54 | doc_ref.create({'otp': random_otp}) 55 | 56 | logger.info(f"OTP for this computer in Minima GPT: {random_otp}") 57 | 61 | while True: 63 | try: 64 | docs = db.collection(COLLECTION_NAME).document(USER_ID).collection(TASKS_COLLECTION).stream() 65 | for doc in docs: 66 | data = doc.to_dict() 67 | if data.get('status') == 'PENDING': 68 | response = await request_data(data['request']) 69 | if 'error' not in response: 70 | logger.info(f"Updating Firestore document: {doc.id}") 71 | doc_ref = db.collection(COLLECTION_NAME).document(USER_ID).collection(TASKS_COLLECTION).document(doc.id) 72 | doc_ref.update({ 73 | 'status': 'COMPLETED', 74 | 'links': response['result']['links'], 75 | 'result': response['result']['output'] 76 | }) 77 | else: 78 | logger.error(f"Error in processing request: {response['error']}") 79 | await asyncio.sleep(0.5) 80 | except Exception as e: 81 | logger.error(f"Error in polling Firestore collection: {e}") 82 | await asyncio.sleep(0.5) 83 | 84 | @asynccontextmanager 85 | async def lifespan(app: FastAPI): 86 | logger.info("Starting Firestore polling") 87 | poll_task = asyncio.create_task(poll_firestore()) 88 | yield 89 | poll_task.cancel() 90 | 91 | 92 | def create_app() -> FastAPI: 93 | app = 
FastAPI( 94 | openapi_url="/linker/openapi.json", 95 | docs_url="/linker/docs", 96 | lifespan=lifespan 97 | ) 98 | return app 99 | 100 | app = create_app() ``` -------------------------------------------------------------------------------- /mcp-server/src/minima/server.py: -------------------------------------------------------------------------------- ```python 1 | import logging 2 | import mcp.server.stdio 3 | from typing import Annotated 5 | from .requestor import request_data 6 | from pydantic import BaseModel, Field 8 | from mcp.shared.exceptions import McpError 9 | from mcp.server import NotificationOptions, Server 10 | from mcp.server.models import InitializationOptions 11 | from mcp.types import ( 12 | GetPromptResult, 13 | Prompt, 14 | PromptArgument, 15 | PromptMessage, 16 | TextContent, 17 | Tool, 18 | INVALID_PARAMS, 19 | INTERNAL_ERROR, 20 | ) 21 | 22 | 23 | logging.basicConfig( 24 | level=logging.DEBUG, 25 | format='%(asctime)s - %(levelname)s - %(message)s', 26 | handlers=[ 27 | logging.FileHandler("app.log"), 28 | logging.StreamHandler() 29 | ] 30 | ) 31 | 32 | server = Server("minima") 33 | 34 | class Query(BaseModel): 35 | text: Annotated[ 36 | str, 37 | Field(description="context to find") 38 | ] 39 | 40 | @server.list_tools() 41 | async def list_tools() -> list[Tool]: 42 | return [ 43 | Tool( 44 | name="minima-query", 45 | description="Find a context in local files (PDF, CSV, DOCX, MD, TXT)", 46 | inputSchema=Query.model_json_schema(), 47 | ) 48 | ] 49 | 50 | @server.list_prompts() 51 | async def list_prompts() -> list[Prompt]: 52 | logging.info("List of prompts") 53 | return [ 54 | Prompt( 55 | name="minima-query", 56 | description="Find a context in local files", 57 | arguments=[ 58 | PromptArgument( 59 | name="context", description="Context to search", required=True 60 | ) 61 | ] 62 | ) 63 | ] 64 | 65 | @server.call_tool() 66 | async def call_tool(name: str, 
arguments: dict) -> list[TextContent]: 67 | if name != "minima-query": 68 | logging.error(f"Unknown tool: {name}") 69 | raise ValueError(f"Unknown tool: {name}") 70 | 71 | logging.info(f"Calling tool: {name}") 72 | try: 73 | args = Query(**arguments) 74 | except ValueError as e: 75 | logging.error(str(e)) 76 | raise McpError(INVALID_PARAMS, str(e)) 77 | 78 | context = args.text 79 | logging.info(f"Context: {context}") 80 | if not context: 81 | logging.error("Context is required") 82 | raise McpError(INVALID_PARAMS, "Context is required") 83 | 84 | output = await request_data(context) 85 | if "error" in output: 86 | logging.error(output["error"]) 87 | raise McpError(INTERNAL_ERROR, output["error"]) 88 | 89 | logging.info(f"Get prompt: {output}") 90 | output = output['result']['output'] 91 | #links = output['result']['links'] 92 | result = [] 93 | result.append(TextContent(type="text", text=output)) 94 | return result 95 | 96 | @server.get_prompt() 97 | async def get_prompt(name: str, arguments: dict | None) -> GetPromptResult: 98 | if not arguments or "context" not in arguments: 99 | logging.error("Context is required") 100 | raise McpError(INVALID_PARAMS, "Context is required") 101 | 102 | context = arguments["context"] 103 | 104 | output = await request_data(context) 105 | if "error" in output: 106 | error = output["error"] 107 | logging.error(error) 108 | return GetPromptResult( 109 | description=f"Failed to find content for {context}", 110 | messages=[ 111 | PromptMessage( 112 | role="user", 113 | content=TextContent(type="text", text=error), 114 | ) 115 | ] 116 | ) 117 | 118 | logging.info(f"Get prompt: {output}") 119 | output = output['result']['output'] 120 | return GetPromptResult( 121 | description=f"Found content for {context}", 122 | messages=[ 123 | PromptMessage( 124 | role="user", 125 | content=TextContent(type="text", text=output) 126 | ) 127 | ] 128 | ) 129 | 130 | async def main(): 131 | async with mcp.server.stdio.stdio_server() as (read_stream, write_stream): 132 
| await server.run( 133 | read_stream, 134 | write_stream, 135 | InitializationOptions( 136 | server_name="minima", 137 | server_version="0.0.1", 138 | capabilities=server.get_capabilities( 139 | notification_options=NotificationOptions(), 140 | experimental_capabilities={}, 141 | ), 142 | ), 143 | ) 144 | ``` -------------------------------------------------------------------------------- /indexer/storage.py: -------------------------------------------------------------------------------- ```python 1 | import logging 2 | from sqlmodel import Field, Session, SQLModel, create_engine, select 3 | 4 | from singleton import Singleton 5 | from enum import Enum 6 | 7 | logger = logging.getLogger(__name__) 8 | 9 | 10 | class IndexingStatus(Enum): 11 | new_file = 1 12 | need_reindexing = 2 13 | no_need_reindexing = 3 14 | 15 | 16 | class MinimaDoc(SQLModel, table=True): 17 | fpath: str = Field(primary_key=True) 18 | last_updated_seconds: int | None = Field(default=None, index=True) 19 | 20 | 21 | class MinimaDocUpdate(SQLModel): 22 | fpath: str | None = None 23 | last_updated_seconds: int | None = None 24 | 25 | 26 | sqlite_file_name = "/indexer/storage/database.db" 27 | sqlite_url = f"sqlite:///{sqlite_file_name}" 28 | 29 | connect_args = {"check_same_thread": False} 30 | engine = create_engine(sqlite_url, connect_args=connect_args) 31 | 32 | 33 | class MinimaStore(metaclass=Singleton): 34 | 35 | @staticmethod 36 | def create_db_and_tables(): 37 | SQLModel.metadata.create_all(engine) 38 | 39 | @staticmethod 40 | def delete_m_doc(fpath: str) -> None: 41 | with Session(engine) as session: 42 | statement = select(MinimaDoc).where(MinimaDoc.fpath == fpath) 43 | results = session.exec(statement) 44 | doc = results.one() 45 | session.delete(doc) 46 | session.commit() 47 | logger.debug(f"doc deleted: {doc}") 48 | 49 | @staticmethod 50 | def select_m_doc(fpath: str) -> MinimaDoc: 51 | with Session(engine) as session: 52 | statement = select(MinimaDoc).where(MinimaDoc.fpath == fpath) 53 | 
results = session.exec(statement) 54 | doc = results.one() 55 | logger.debug(f"doc: {doc}") 56 | return doc 57 | 58 | @staticmethod 59 | def find_removed_files(existing_file_paths: set[str]): 60 | removed_files: list[str] = [] 61 | with Session(engine) as session: 62 | statement = select(MinimaDoc) 63 | results = session.exec(statement) 64 | logger.debug("find_removed_files scanning indexed documents") 65 | for doc in results: 66 | logger.debug(f"find_removed_files file {doc.fpath} checking to remove") 67 | if doc.fpath not in existing_file_paths: 68 | logger.debug(f"find_removed_files file {doc.fpath} does not exist anymore, removing") 69 | removed_files.append(doc.fpath) 70 | for fpath in removed_files: 71 | MinimaStore.delete_m_doc(fpath) 72 | return removed_files 73 | 74 | @staticmethod 75 | def check_needs_indexing(fpath: str, last_updated_seconds: int) -> IndexingStatus: 76 | indexing_status: IndexingStatus = IndexingStatus.no_need_reindexing 77 | try: 78 | with Session(engine) as session: 79 | statement = select(MinimaDoc).where(MinimaDoc.fpath == fpath) 80 | results = session.exec(statement) 81 | doc = results.first() 82 | if doc is not None: 83 | logger.debug( 84 | f"file {fpath} new last updated={last_updated_seconds} old last updated: {doc.last_updated_seconds}" 85 | ) 86 | if doc.last_updated_seconds is None or doc.last_updated_seconds < last_updated_seconds: 87 | indexing_status = IndexingStatus.need_reindexing 88 | logger.debug(f"file {fpath} needs indexing, timestamp changed") 89 | doc_update = MinimaDocUpdate(fpath=fpath, last_updated_seconds=last_updated_seconds) 90 | doc_data = doc_update.model_dump(exclude_unset=True) 91 | doc.sqlmodel_update(doc_data) 92 | session.add(doc) 93 | session.commit() 94 | else: 95 | logger.debug(f"file {fpath} doesn't need indexing, timestamp same") 96 | else: 97 | doc = MinimaDoc(fpath=fpath, last_updated_seconds=last_updated_seconds) 98 | session.add(doc) 99 | session.commit() 100 | logger.debug(f"file {fpath} needs indexing, new file") 101 | indexing_status = 
IndexingStatus.new_file 102 | return indexing_status 103 | except Exception as e: 104 | logger.error(f"error updating file in the store {e}, skipping indexing") 105 | return IndexingStatus.no_need_reindexing 106 | ``` -------------------------------------------------------------------------------- /assets/logo-full-w.svg: -------------------------------------------------------------------------------- ``` 1 | <svg width="280" height="354" viewBox="0 0 280 354" fill="none" xmlns="http://www.w3.org/2000/svg"> 2 | <path d="M277.218 94.6656C276.083 88.5047 274.217 82.4376 272.178 76.5164C270.667 72.1068 266.641 70.2512 262.189 71.0852C260.445 71.4084 258.64 71.669 257.017 72.3362C246.917 76.4956 236.674 80.3631 226.807 85.0541C203.645 96.0729 183.038 110.907 165.545 130.099C162.523 133.424 159.805 137.042 157.036 140.596C155.992 141.952 155.495 144.047 154.197 144.797C153.203 145.371 151.347 144.36 149.867 144.057C149.796 144.036 149.704 144.057 149.633 144.036C141.896 142.546 134.199 142.973 126.573 144.766C125.163 145.1 124.524 144.672 123.855 143.557C122.739 141.733 121.685 139.825 120.295 138.23C111.533 128.128 102.447 118.371 92.0928 109.917C70.3503 92.1846 45.7074 80.311 19.4825 71.6064C13.3675 69.5841 8.40846 72.3883 6.55264 78.8307C0.883763 98.5019 -1.5501 118.496 1.02574 138.991C2.95255 154.367 7.69858 168.691 15.923 181.7C25.9424 197.525 38.6289 210.337 55.6254 218.155C56.4773 218.551 57.2683 219.083 58.0998 219.552C55.311 221.95 52.5932 224.118 50.0579 226.495C47.0765 229.268 47.1982 231.353 50.6157 233.625C53.6479 235.648 56.8322 237.472 60.0672 239.15C72.8957 245.77 85.6126 252.661 98.6744 258.748C108.278 263.231 117.608 267.933 125.883 274.729C134.665 281.933 144.624 281.599 153.67 274.667C158.953 270.622 164.561 266.807 170.524 263.982C184.296 257.466 198.26 251.399 211.535 243.789C217.021 240.631 222.649 237.712 228.136 234.553C231.533 232.614 231.928 230.06 229.423 227.016C228.389 225.755 227.142 224.671 225.955 223.566C224.586 222.294 223.166 221.074 
221.453 219.552C222.183 219.156 222.507 218.947 222.862 218.791C231.675 214.997 239.25 209.336 246.481 202.967C260.881 190.249 270.698 174.466 275.778 155.847C281.275 135.686 281.021 115.129 277.218 94.6656ZM69.4376 199.328C51.4067 200.913 37.7162 193.094 26.6929 178.886C13.8137 162.29 9.34144 143.098 9.2096 122.416C9.13862 110.73 11.0249 99.315 13.479 87.9626C13.7325 86.7534 13.9151 85.5233 14.199 84.314C14.9495 81.1137 16.8763 79.8419 19.7563 81.3326C24.2691 83.6677 28.9138 86.0132 32.8586 89.2031C54.0738 106.299 69.3159 127.816 76.871 154.607C80.0553 165.897 79.8525 177.416 76.1307 188.643C75.005 192.062 73.1492 195.232 71.4759 198.432C71.2021 198.964 70.1576 199.266 69.4376 199.328ZM128.043 246.145C123.895 241.996 118.642 241.61 113.257 241.152C108.724 240.766 104.11 240.109 99.7595 238.785C91.4032 236.252 85.5416 230.623 82.3877 222.127C82.1241 221.418 81.11 220.73 80.3393 220.553C77.5099 219.927 75.1267 218.603 73.6664 216.049C73.0782 215.017 72.8247 213.475 73.1188 212.359C73.2608 211.796 74.9746 211.4 75.9786 211.379C82.0227 211.265 88.087 210.993 94.1109 211.306C113.004 212.276 125.254 226.891 129.047 243.247C129.3 244.342 129.422 245.457 129.666 246.896C128.875 246.541 128.357 246.458 128.043 246.145ZM142.981 268.495C138.62 271.195 132.485 267.891 131.968 262.637C131.724 260.26 133.235 259.103 134.898 258.311C136.46 257.57 138.255 257.341 139.938 256.893C144.188 257.195 146.713 258.644 147.372 261.407C147.889 263.565 145.972 266.65 142.981 268.495ZM205.876 214.986C204.487 217.769 202.692 220.146 199.152 220.271C198.402 220.303 197.307 221.23 197.013 222.002C192.834 233.135 181.872 240.287 170.453 240.766C165.241 240.985 159.906 240.683 155.14 243.508C153.568 244.446 152.128 245.603 150.404 246.802C150.688 241.319 152.716 236.617 154.836 232.009C160.616 219.5 170.849 213.433 183.728 211.41C186.101 211.035 188.545 210.983 190.958 210.941C195.065 210.879 199.173 210.921 203.29 210.941C205.947 210.941 207.083 212.578 205.876 214.986ZM265.972 153.21C261.561 
166.345 254.928 178.125 244.97 187.59C237.1 195.075 227.537 199.068 216.788 199.85C215.662 199.933 214.526 199.902 213.401 199.996C207.914 200.475 209.304 201.153 206.707 195.888C198.899 180.032 199.649 163.885 205.247 147.622C212.457 126.679 224.606 109.177 240.599 94.3737C245.964 89.4116 251.774 85.1271 258.366 82.0206C262.139 80.238 264.187 80.9886 265.353 85.1063C268.751 97.0216 270.535 109.197 270.282 123.125C270.515 132.403 269.41 142.984 265.972 153.21Z" fill="white"/> 3 | <path d="M198.179 85.6275C193.625 92.7787 187.794 99.1586 182.064 105.434C171.122 117.422 161.032 129.974 154.501 145.152C149.735 156.223 148.964 167.898 150.232 179.782C150.546 182.795 151.246 185.776 151.773 188.768C151.621 188.873 151.469 188.977 151.317 189.081C146.611 183.525 141.743 178.104 137.241 172.381C130.892 164.281 126.359 155.066 123.662 145.068C120.376 132.934 119.788 120.56 121.989 108.113C123.865 97.4698 126.664 87.1807 132.13 77.7778C139.401 65.2683 149.065 54.8124 158.588 44.1794C165.352 36.6215 172.075 28.9699 178.16 20.8283C182.561 14.9176 185.594 8.04777 185.502 0C187.977 4.38875 190.512 8.7358 192.895 13.1767C198.077 22.8819 202.651 32.8374 204.527 43.8771C207.012 58.5444 206.281 72.8991 198.179 85.6275Z" fill="white"/> 4 | <path d="M80.1795 320H94.5429V353.855H85.6376V327.835L72.0881 353.855H69.4548L55.9053 327.835V353.855H47V320H61.3634L70.7475 338.378L80.1795 320Z" fill="white"/> 5 | <path d="M99.2253 320H113.732L129.053 345.972V320H138.054V353.807H123.404L108.226 327.835V353.807H99.2253V320Z" fill="white"/> 6 | <path d="M175.841 320H190.204V353.807H181.299V327.835L167.75 353.807H165.116L151.567 327.835V353.807H142.662V320H157.025L166.409 338.378L175.841 320Z" fill="white"/> 7 | <path d="M222.85 320.193L233 354H223.903L221.557 346.165H203.555L201.209 354H192.112L202.262 320.193H222.85ZM216.003 327.593H209.109L205.949 338.33H219.163L216.003 327.593Z" fill="white"/> 8 | </svg> 9 | ``` -------------------------------------------------------------------------------- 
/assets/logo-full.svg: -------------------------------------------------------------------------------- ``` 1 | <svg width="280" height="354" viewBox="0 0 280 354" fill="none" xmlns="http://www.w3.org/2000/svg"> 2 | <path d="M277.218 94.6656C276.083 88.5047 274.217 82.4376 272.178 76.5164C270.667 72.1068 266.641 70.2512 262.189 71.0852C260.445 71.4084 258.64 71.669 257.017 72.3362C246.917 76.4956 236.674 80.3631 226.807 85.0541C203.645 96.0729 183.038 110.907 165.545 130.099C162.523 133.424 159.805 137.042 157.036 140.596C155.992 141.952 155.495 144.047 154.197 144.797C153.203 145.371 151.347 144.36 149.867 144.057C149.796 144.036 149.704 144.057 149.633 144.036C141.896 142.546 134.199 142.973 126.573 144.766C125.163 145.1 124.524 144.672 123.855 143.557C122.739 141.733 121.685 139.825 120.295 138.23C111.533 128.128 102.447 118.371 92.0928 109.917C70.3503 92.1846 45.7074 80.311 19.4825 71.6064C13.3675 69.5841 8.40846 72.3883 6.55264 78.8307C0.883763 98.5019 -1.5501 118.496 1.02574 138.991C2.95255 154.367 7.69858 168.691 15.923 181.7C25.9424 197.525 38.6289 210.337 55.6254 218.155C56.4773 218.551 57.2683 219.083 58.0998 219.552C55.311 221.95 52.5932 224.118 50.0579 226.495C47.0765 229.268 47.1982 231.353 50.6157 233.625C53.6479 235.648 56.8322 237.472 60.0672 239.15C72.8957 245.77 85.6126 252.661 98.6744 258.748C108.278 263.231 117.608 267.933 125.883 274.729C134.665 281.933 144.624 281.599 153.67 274.667C158.953 270.622 164.561 266.807 170.524 263.982C184.296 257.466 198.26 251.399 211.535 243.789C217.021 240.631 222.649 237.712 228.136 234.553C231.533 232.614 231.928 230.06 229.423 227.016C228.389 225.755 227.142 224.671 225.955 223.566C224.586 222.294 223.166 221.074 221.453 219.552C222.183 219.156 222.507 218.947 222.862 218.791C231.675 214.997 239.25 209.336 246.481 202.967C260.881 190.249 270.698 174.466 275.778 155.847C281.275 135.686 281.021 115.129 277.218 94.6656ZM69.4376 199.328C51.4067 200.913 37.7162 193.094 26.6929 178.886C13.8137 162.29 9.34144 
143.098 9.2096 122.416C9.13862 110.73 11.0249 99.315 13.479 87.9626C13.7325 86.7534 13.9151 85.5233 14.199 84.314C14.9495 81.1137 16.8763 79.8419 19.7563 81.3326C24.2691 83.6677 28.9138 86.0132 32.8586 89.2031C54.0738 106.299 69.3159 127.816 76.871 154.607C80.0553 165.897 79.8525 177.416 76.1307 188.643C75.005 192.062 73.1492 195.232 71.4759 198.432C71.2021 198.964 70.1576 199.266 69.4376 199.328ZM128.043 246.145C123.895 241.996 118.642 241.61 113.257 241.152C108.724 240.766 104.11 240.109 99.7595 238.785C91.4032 236.252 85.5416 230.623 82.3877 222.127C82.1241 221.418 81.11 220.73 80.3393 220.553C77.5099 219.927 75.1267 218.603 73.6664 216.049C73.0782 215.017 72.8247 213.475 73.1188 212.359C73.2608 211.796 74.9746 211.4 75.9786 211.379C82.0227 211.265 88.087 210.993 94.1109 211.306C113.004 212.276 125.254 226.891 129.047 243.247C129.3 244.342 129.422 245.457 129.666 246.896C128.875 246.541 128.357 246.458 128.043 246.145ZM142.981 268.495C138.62 271.195 132.485 267.891 131.968 262.637C131.724 260.26 133.235 259.103 134.898 258.311C136.46 257.57 138.255 257.341 139.938 256.893C144.188 257.195 146.713 258.644 147.372 261.407C147.889 263.565 145.972 266.65 142.981 268.495ZM205.876 214.986C204.487 217.769 202.692 220.146 199.152 220.271C198.402 220.303 197.307 221.23 197.013 222.002C192.834 233.135 181.872 240.287 170.453 240.766C165.241 240.985 159.906 240.683 155.14 243.508C153.568 244.446 152.128 245.603 150.404 246.802C150.688 241.319 152.716 236.617 154.836 232.009C160.616 219.5 170.849 213.433 183.728 211.41C186.101 211.035 188.545 210.983 190.958 210.941C195.065 210.879 199.173 210.921 203.29 210.941C205.947 210.941 207.083 212.578 205.876 214.986ZM265.972 153.21C261.561 166.345 254.928 178.125 244.97 187.59C237.1 195.075 227.537 199.068 216.788 199.85C215.662 199.933 214.526 199.902 213.401 199.996C207.914 200.475 209.304 201.153 206.707 195.888C198.899 180.032 199.649 163.885 205.247 147.622C212.457 126.679 224.606 109.177 240.599 94.3737C245.964 89.4116 
251.774 85.1271 258.366 82.0206C262.139 80.238 264.187 80.9886 265.353 85.1063C268.751 97.0216 270.535 109.197 270.282 123.125C270.515 132.403 269.41 142.984 265.972 153.21Z" fill="#D8FD87"/> 3 | <path d="M198.179 85.6275C193.625 92.7787 187.794 99.1586 182.064 105.434C171.122 117.422 161.032 129.974 154.501 145.152C149.735 156.223 148.964 167.898 150.232 179.782C150.546 182.795 151.246 185.776 151.773 188.768C151.621 188.873 151.469 188.977 151.317 189.081C146.611 183.525 141.743 178.104 137.241 172.381C130.892 164.281 126.359 155.066 123.662 145.068C120.376 132.934 119.788 120.56 121.989 108.113C123.865 97.4698 126.664 87.1807 132.13 77.7778C139.401 65.2683 149.065 54.8124 158.588 44.1794C165.352 36.6215 172.075 28.9699 178.16 20.8283C182.561 14.9176 185.594 8.04777 185.502 0C187.977 4.38875 190.512 8.7358 192.895 13.1767C198.077 22.8819 202.651 32.8374 204.527 43.8771C207.012 58.5444 206.281 72.8991 198.179 85.6275Z" fill="#D8FD87"/> 4 | <path d="M80.1795 320H94.5429V353.855H85.6376V327.835L72.0881 353.855H69.4548L55.9053 327.835V353.855H47V320H61.3634L70.7475 338.378L80.1795 320Z" fill="white"/> 5 | <path d="M99.2253 320H113.732L129.053 345.972V320H138.054V353.807H123.404L108.226 327.835V353.807H99.2253V320Z" fill="white"/> 6 | <path d="M175.841 320H190.204V353.807H181.299V327.835L167.75 353.807H165.116L151.567 327.835V353.807H142.662V320H157.025L166.409 338.378L175.841 320Z" fill="white"/> 7 | <path d="M222.85 320.193L233 354H223.903L221.557 346.165H203.555L201.209 354H192.112L202.262 320.193H222.85ZM216.003 327.593H209.109L205.949 338.33H219.163L216.003 327.593Z" fill="white"/> 8 | </svg> 9 | ``` -------------------------------------------------------------------------------- /assets/logo-full-b.svg: -------------------------------------------------------------------------------- ``` 1 | <svg width="280" height="354" viewBox="0 0 280 354" fill="none" xmlns="http://www.w3.org/2000/svg"> 2 | <path d="M277.218 94.6656C276.083 88.5047 274.217 82.4376 272.178 
76.5164C270.667 72.1068 266.641 70.2512 262.189 71.0852C260.445 71.4084 258.64 71.669 257.017 72.3362C246.917 76.4956 236.674 80.3631 226.807 85.0541C203.645 96.0729 183.038 110.907 165.545 130.099C162.523 133.424 159.805 137.042 157.036 140.596C155.992 141.952 155.495 144.047 154.197 144.797C153.203 145.371 151.347 144.36 149.867 144.057C149.796 144.036 149.704 144.057 149.633 144.036C141.896 142.546 134.199 142.973 126.573 144.766C125.163 145.1 124.524 144.672 123.855 143.557C122.739 141.733 121.685 139.825 120.295 138.23C111.533 128.128 102.447 118.371 92.0928 109.917C70.3503 92.1846 45.7074 80.311 19.4825 71.6064C13.3675 69.5841 8.40846 72.3883 6.55264 78.8307C0.883763 98.5019 -1.5501 118.496 1.02574 138.991C2.95255 154.367 7.69858 168.691 15.923 181.7C25.9424 197.525 38.6289 210.337 55.6254 218.155C56.4773 218.551 57.2683 219.083 58.0998 219.552C55.311 221.95 52.5932 224.118 50.0579 226.495C47.0765 229.268 47.1982 231.353 50.6157 233.625C53.6479 235.648 56.8322 237.472 60.0672 239.15C72.8957 245.77 85.6126 252.661 98.6744 258.748C108.278 263.231 117.608 267.933 125.883 274.729C134.665 281.933 144.624 281.599 153.67 274.667C158.953 270.622 164.561 266.807 170.524 263.982C184.296 257.466 198.26 251.399 211.535 243.789C217.021 240.631 222.649 237.712 228.136 234.553C231.533 232.614 231.928 230.06 229.423 227.016C228.389 225.755 227.142 224.671 225.955 223.566C224.586 222.294 223.166 221.074 221.453 219.552C222.183 219.156 222.507 218.947 222.862 218.791C231.675 214.997 239.25 209.336 246.481 202.967C260.881 190.249 270.698 174.466 275.778 155.847C281.275 135.686 281.021 115.129 277.218 94.6656ZM69.4376 199.328C51.4067 200.913 37.7162 193.094 26.6929 178.886C13.8137 162.29 9.34144 143.098 9.2096 122.416C9.13862 110.73 11.0249 99.315 13.479 87.9626C13.7325 86.7534 13.9151 85.5233 14.199 84.314C14.9495 81.1137 16.8763 79.8419 19.7563 81.3326C24.2691 83.6677 28.9138 86.0132 32.8586 89.2031C54.0738 106.299 69.3159 127.816 76.871 154.607C80.0553 165.897 79.8525 177.416 
76.1307 188.643C75.005 192.062 73.1492 195.232 71.4759 198.432C71.2021 198.964 70.1576 199.266 69.4376 199.328ZM128.043 246.145C123.895 241.996 118.642 241.61 113.257 241.152C108.724 240.766 104.11 240.109 99.7595 238.785C91.4032 236.252 85.5416 230.623 82.3877 222.127C82.1241 221.418 81.11 220.73 80.3393 220.553C77.5099 219.927 75.1267 218.603 73.6664 216.049C73.0782 215.017 72.8247 213.475 73.1188 212.359C73.2608 211.796 74.9746 211.4 75.9786 211.379C82.0227 211.265 88.087 210.993 94.1109 211.306C113.004 212.276 125.254 226.891 129.047 243.247C129.3 244.342 129.422 245.457 129.666 246.896C128.875 246.541 128.357 246.458 128.043 246.145ZM142.981 268.495C138.62 271.195 132.485 267.891 131.968 262.637C131.724 260.26 133.235 259.103 134.898 258.311C136.46 257.57 138.255 257.341 139.938 256.893C144.188 257.195 146.713 258.644 147.372 261.407C147.889 263.565 145.972 266.65 142.981 268.495ZM205.876 214.986C204.487 217.769 202.692 220.146 199.152 220.271C198.402 220.303 197.307 221.23 197.013 222.002C192.834 233.135 181.872 240.287 170.453 240.766C165.241 240.985 159.906 240.683 155.14 243.508C153.568 244.446 152.128 245.603 150.404 246.802C150.688 241.319 152.716 236.617 154.836 232.009C160.616 219.5 170.849 213.433 183.728 211.41C186.101 211.035 188.545 210.983 190.958 210.941C195.065 210.879 199.173 210.921 203.29 210.941C205.947 210.941 207.083 212.578 205.876 214.986ZM265.972 153.21C261.561 166.345 254.928 178.125 244.97 187.59C237.1 195.075 227.537 199.068 216.788 199.85C215.662 199.933 214.526 199.902 213.401 199.996C207.914 200.475 209.304 201.153 206.707 195.888C198.899 180.032 199.649 163.885 205.247 147.622C212.457 126.679 224.606 109.177 240.599 94.3737C245.964 89.4116 251.774 85.1271 258.366 82.0206C262.139 80.238 264.187 80.9886 265.353 85.1063C268.751 97.0216 270.535 109.197 270.282 123.125C270.515 132.403 269.41 142.984 265.972 153.21Z" fill="#1F1F1F"/> 3 | <path d="M198.179 85.6275C193.625 92.7787 187.794 99.1586 182.064 105.434C171.122 117.422 161.032 
129.974 154.501 145.152C149.735 156.223 148.964 167.898 150.232 179.782C150.546 182.795 151.246 185.776 151.773 188.768C151.621 188.873 151.469 188.977 151.317 189.081C146.611 183.525 141.743 178.104 137.241 172.381C130.892 164.281 126.359 155.066 123.662 145.068C120.376 132.934 119.788 120.56 121.989 108.113C123.865 97.4698 126.664 87.1807 132.13 77.7778C139.401 65.2683 149.065 54.8124 158.588 44.1794C165.352 36.6215 172.075 28.9699 178.16 20.8283C182.561 14.9176 185.594 8.04777 185.502 0C187.977 4.38875 190.512 8.7358 192.895 13.1767C198.077 22.8819 202.651 32.8374 204.527 43.8771C207.012 58.5444 206.281 72.8991 198.179 85.6275Z" fill="#1F1F1F"/> 4 | <path d="M80.1795 320H94.5429V353.855H85.6376V327.835L72.0881 353.855H69.4548L55.9053 327.835V353.855H47V320H61.3634L70.7475 338.378L80.1795 320Z" fill="#1F1F1F"/> 5 | <path d="M99.2253 320H113.732L129.053 345.972V320H138.054V353.807H123.404L108.226 327.835V353.807H99.2253V320Z" fill="#1F1F1F"/> 6 | <path d="M175.841 320H190.204V353.807H181.299V327.835L167.75 353.807H165.116L151.567 327.835V353.807H142.662V320H157.025L166.409 338.378L175.841 320Z" fill="#1F1F1F"/> 7 | <path d="M222.85 320.193L233 354H223.903L221.557 346.165H203.555L201.209 354H192.112L202.262 320.193H222.85ZM216.003 327.593H209.109L205.949 338.33H219.163L216.003 327.593Z" fill="#1F1F1F"/> 8 | </svg> 9 | ``` -------------------------------------------------------------------------------- /indexer/indexer.py: -------------------------------------------------------------------------------- ```python 1 | import os 2 | import uuid 3 | import torch 4 | import logging 5 | import time 6 | from dataclasses import dataclass 7 | from typing import List, Dict 8 | from pathlib import Path 9 | 10 | from qdrant_client import QdrantClient 11 | from langchain_qdrant import QdrantVectorStore 12 | from langchain_huggingface import HuggingFaceEmbeddings 13 | from qdrant_client.http.models import Distance, VectorParams, Filter, FieldCondition, MatchValue 14 | from 
langchain.text_splitter import RecursiveCharacterTextSplitter 15 | 16 | from langchain_community.document_loaders import ( 17 | TextLoader, 18 | CSVLoader, 19 | Docx2txtLoader, 20 | UnstructuredExcelLoader, 21 | PyMuPDFLoader, 22 | UnstructuredPowerPointLoader, 23 | ) 24 | 25 | from storage import MinimaStore, IndexingStatus 26 | 27 | logger = logging.getLogger(__name__) 28 | 29 | 30 | @dataclass 31 | class Config: 32 | EXTENSIONS_TO_LOADERS = { 33 | ".pdf": PyMuPDFLoader, 34 | ".pptx": UnstructuredPowerPointLoader, 35 | ".ppt": UnstructuredPowerPointLoader, 36 | ".xls": UnstructuredExcelLoader, 37 | ".xlsx": UnstructuredExcelLoader, 38 | ".docx": Docx2txtLoader, 39 | ".doc": Docx2txtLoader, 40 | ".txt": TextLoader, 41 | ".md": TextLoader, 42 | ".csv": CSVLoader, 43 | } 44 | 45 | DEVICE = torch.device( 46 | "mps" if torch.backends.mps.is_available() else 47 | "cuda" if torch.cuda.is_available() else 48 | "cpu" 49 | ) 50 | 51 | START_INDEXING = os.environ.get("START_INDEXING") 52 | LOCAL_FILES_PATH = os.environ.get("LOCAL_FILES_PATH") 53 | CONTAINER_PATH = os.environ.get("CONTAINER_PATH") 54 | QDRANT_COLLECTION = "mnm_storage" 55 | QDRANT_BOOTSTRAP = "qdrant" 56 | EMBEDDING_MODEL_ID = os.environ.get("EMBEDDING_MODEL_ID") 57 | EMBEDDING_SIZE = os.environ.get("EMBEDDING_SIZE") 58 | 59 | CHUNK_SIZE = 500 60 | CHUNK_OVERLAP = 200 61 | 62 | class Indexer: 63 | def __init__(self): 64 | self.config = Config() 65 | self.qdrant = self._initialize_qdrant() 66 | self.embed_model = self._initialize_embeddings() 67 | self.document_store = self._setup_collection() 68 | self.text_splitter = self._initialize_text_splitter() 69 | 70 | def _initialize_qdrant(self) -> QdrantClient: 71 | return QdrantClient(host=self.config.QDRANT_BOOTSTRAP) 72 | 73 | def _initialize_embeddings(self) -> HuggingFaceEmbeddings: 74 | return HuggingFaceEmbeddings( 75 | model_name=self.config.EMBEDDING_MODEL_ID, 76 | model_kwargs={'device': self.config.DEVICE}, 77 | encode_kwargs={'normalize_embeddings': 
False} 78 | ) 79 | 80 | def _initialize_text_splitter(self) -> RecursiveCharacterTextSplitter: 81 | return RecursiveCharacterTextSplitter( 82 | chunk_size=self.config.CHUNK_SIZE, 83 | chunk_overlap=self.config.CHUNK_OVERLAP 84 | ) 85 | 86 | def _setup_collection(self) -> QdrantVectorStore: 87 | if not self.qdrant.collection_exists(self.config.QDRANT_COLLECTION): 88 | self.qdrant.create_collection( 89 | collection_name=self.config.QDRANT_COLLECTION, 90 | vectors_config=VectorParams( 91 | size=int(self.config.EMBEDDING_SIZE),  # env var is a string; Qdrant expects an int 92 | distance=Distance.COSINE 93 | ), 94 | ) 95 | self.qdrant.create_payload_index( 96 | collection_name=self.config.QDRANT_COLLECTION, 97 | field_name="fpath", 98 | field_schema="keyword" 99 | ) 100 | return QdrantVectorStore( 101 | client=self.qdrant, 102 | collection_name=self.config.QDRANT_COLLECTION, 103 | embedding=self.embed_model, 104 | ) 105 | 106 | def _create_loader(self, file_path: str): 107 | file_extension = Path(file_path).suffix.lower() 108 | loader_class = self.config.EXTENSIONS_TO_LOADERS.get(file_extension) 109 | 110 | if not loader_class: 111 | raise ValueError(f"Unsupported file type: {file_extension}") 112 | 113 | return loader_class(file_path=file_path) 114 | 115 | def _process_file(self, loader) -> List[str]: 116 | try: 117 | documents = loader.load_and_split(self.text_splitter) 118 | if not documents: 119 | logger.warning(f"No documents loaded from {loader.file_path}") 120 | return [] 121 | 122 | for doc in documents: 123 | doc.metadata['file_path'] = loader.file_path 124 | 125 | uuids = [str(uuid.uuid4()) for _ in range(len(documents))] 126 | ids = self.document_store.add_documents(documents=documents, ids=uuids) 127 | 128 | logger.info(f"Successfully processed {len(ids)} documents from {loader.file_path}") 129 | return ids 130 | 131 | except Exception as e: 132 | logger.error(f"Error processing file {loader.file_path}: {str(e)}") 133 | return [] 134 | 135 | def index(self, message: Dict[str, any]) -> None: 136 | start = 
time.time() 137 | path, file_id, last_updated_seconds = message["path"], message["file_id"], message["last_updated_seconds"] 138 | logger.info(f"Processing file: {path} (ID: {file_id})") 139 | indexing_status: IndexingStatus = MinimaStore.check_needs_indexing(fpath=path, last_updated_seconds=last_updated_seconds) 140 | if indexing_status != IndexingStatus.no_need_reindexing: 141 | logger.info(f"Indexing needed for {path} with status: {indexing_status}") 142 | try: 143 | if indexing_status == IndexingStatus.need_reindexing: 144 | logger.info(f"Removing {path} from index storage for reindexing") 145 | self.remove_from_storage(files_to_remove=[path]) 146 | loader = self._create_loader(path) 147 | ids = self._process_file(loader) 148 | if ids: 149 | logger.info(f"Successfully indexed {path} with IDs: {ids}") 150 | except Exception as e: 151 | logger.error(f"Failed to index file {path}: {str(e)}") 152 | else: 153 | logger.info(f"Skipping {path}: no indexing required, timestamp unchanged") 154 | end = time.time() 155 | logger.info(f"Processing took {end - start} seconds for file {path}") 156 | 157 | def purge(self, message: Dict[str, any]) -> None: 158 | existing_file_paths: list[str] = message["existing_file_paths"] 159 | files_to_remove = MinimaStore.find_removed_files(existing_file_paths=set(existing_file_paths)) 160 | if len(files_to_remove) > 0: 161 | logger.info(f"Purge: removing stale files {files_to_remove}") 162 | self.remove_from_storage(files_to_remove) 163 | else: 164 | logger.info("Nothing to purge") 165 | 166 | def remove_from_storage(self, files_to_remove: list[str]): 167 | filter_conditions = Filter( 168 | must=[ 169 | FieldCondition( 170 | key="fpath", 171 | match=MatchValue(value=fpath) 172 | ) 173 | for fpath in files_to_remove 174 | ] 175 | ) 176 | response = self.qdrant.delete( 177 | collection_name=self.config.QDRANT_COLLECTION, 178 | points_selector=filter_conditions, 179 | wait=True 180 | ) 181 | logger.info(f"Delete response for
{len(files_to_remove)} for files: {files_to_remove} is: {response}") 182 | 183 | def find(self, query: str) -> Dict[str, any]: 184 | try: 185 | logger.info(f"Searching for: {query}") 186 | found = self.document_store.search(query, search_type="similarity") 187 | 188 | if not found: 189 | logger.info("No results found") 190 | return {"links": set(), "output": ""} 191 | 192 | links = set() 193 | results = [] 194 | 195 | for item in found: 196 | path = item.metadata["file_path"].replace( 197 | self.config.CONTAINER_PATH, 198 | self.config.LOCAL_FILES_PATH 199 | ) 200 | links.add(f"file://{path}") 201 | results.append(item.page_content) 202 | 203 | output = { 204 | "links": links, 205 | "output": ". ".join(results) 206 | } 207 | 208 | logger.info(f"Found {len(found)} results") 209 | return output 210 | 211 | except Exception as e: 212 | logger.error(f"Search failed: {str(e)}") 213 | return {"error": "Unable to find anything for the given query"} 214 | 215 | def embed(self, query: str): 216 | return self.embed_model.embed_query(query) ``` -------------------------------------------------------------------------------- /llm/llm_chain.py: -------------------------------------------------------------------------------- ```python 1 | import os 2 | import uuid 3 | import torch 4 | import datetime 5 | import logging 6 | from dataclasses import dataclass 7 | from typing import Sequence, Optional 8 | from langchain.schema import Document 9 | from qdrant_client import QdrantClient 10 | from langchain_ollama import ChatOllama 11 | from minima_embed import MinimaEmbeddings 12 | from langgraph.graph import START, StateGraph 13 | from langchain_qdrant import QdrantVectorStore 14 | from langchain_core.messages import BaseMessage 15 | from langgraph.graph.message import add_messages 16 | from typing_extensions import Annotated, TypedDict 17 | from langgraph.checkpoint.memory import MemorySaver 18 | from langchain_core.pydantic_v1 import BaseModel, Field 19 | from 
langchain_core.messages import AIMessage, HumanMessage 20 | from langchain.chains.retrieval import create_retrieval_chain 21 | from langchain.retrievers import ContextualCompressionRetriever 22 | from langchain.retrievers.document_compressors import CrossEncoderReranker 23 | from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder 24 | from langchain.chains.combine_documents import create_stuff_documents_chain 25 | from langchain_community.cross_encoders.huggingface import HuggingFaceCrossEncoder 26 | from langchain.chains.history_aware_retriever import create_history_aware_retriever 27 | 28 | logger = logging.getLogger(__name__) 29 | 30 | CONTEXTUALIZE_Q_SYSTEM_PROMPT = ( 31 | "Given a chat history and the latest user question " 32 | "which might reference context in the chat history, " 33 | "formulate a standalone question which can be understood " 34 | "without the chat history. Do NOT answer the question, " 35 | "just reformulate it if needed and otherwise return it as is." 36 | ) 37 | 38 | SYSTEM_PROMPT = ( 39 | "You are an assistant for question-answering tasks. " 40 | "Use the following pieces of retrieved context to answer " 41 | "the question. If you don't know the answer, say that you " 42 | "don't know. Use three sentences maximum and keep the " 43 | "answer concise." 44 | "\n\n" 45 | "{context}" 46 | ) 47 | 48 | QUERY_ENHANCEMENT_PROMPT = ( 49 | "You are an expert at converting user questions into queries. " 50 | "You have access to a user's files. " 51 | "Perform query expansion. " 52 | "Just return one expanded query, do not add any other text. " 53 | "If there are acronyms or words you are not familiar with, do not try to rephrase them. " 54 | "Do not change the original meaning of the question and do not add any additional information."
55 | ) 56 | 57 | class ParaphrasedQuery(BaseModel): 58 | paraphrased_query: str = Field( 59 | ..., 60 | description="A unique paraphrasing of the original question.", 61 | ) 62 | 63 | @dataclass 64 | class LLMConfig: 65 | """Configuration settings for the LLM Chain""" 66 | qdrant_collection: str = "mnm_storage" 67 | qdrant_host: str = "qdrant" 68 | ollama_url: str = "http://ollama:11434" 69 | ollama_model: str = os.environ.get("OLLAMA_MODEL") 70 | rerank_model: str = os.environ.get("RERANKER_MODEL") 71 | temperature: float = 0.5 72 | device: torch.device = torch.device( 73 | "mps" if torch.backends.mps.is_available() else 74 | "cuda" if torch.cuda.is_available() else 75 | "cpu" 76 | ) 77 | 78 | 79 | @dataclass 80 | class LocalConfig: 81 | LOCAL_FILES_PATH = os.environ.get("LOCAL_FILES_PATH") 82 | CONTAINER_PATH = os.environ.get("CONTAINER_PATH") 83 | 84 | 85 | class State(TypedDict): 86 | """State definition for the LLM Chain""" 87 | input: str 88 | chat_history: Annotated[Sequence[BaseMessage], add_messages] 89 | context: str 90 | answer: str 91 | init_query: str 92 | 93 | 94 | class LLMChain: 95 | """A chain for processing LLM queries with context awareness and retrieval capabilities""" 96 | 97 | def __init__(self, config: Optional[LLMConfig] = None): 98 | """Initialize the LLM Chain with optional custom configuration""" 99 | self.localConfig = LocalConfig() 100 | self.config = config or LLMConfig() 101 | self.llm = self._setup_llm() 102 | self.document_store = self._setup_document_store() 103 | self.chain = self._setup_chain() 104 | self.graph = self._create_graph() 105 | 106 | def _setup_llm(self) -> ChatOllama: 107 | """Initialize the LLM model""" 108 | return ChatOllama( 109 | base_url=self.config.ollama_url, 110 | model=self.config.ollama_model, 111 | temperature=self.config.temperature 112 | ) 113 | 114 | def _setup_document_store(self) -> QdrantVectorStore: 115 | """Initialize the document store with vector embeddings""" 116 | qdrant = 
QdrantClient(host=self.config.qdrant_host) 117 | embed_model = MinimaEmbeddings() 118 | return QdrantVectorStore( 119 | client=qdrant, 120 | collection_name=self.config.qdrant_collection, 121 | embedding=embed_model 122 | ) 123 | 124 | def _setup_chain(self): 125 | """Set up the retrieval and QA chain""" 126 | # Initialize retriever with reranking 127 | base_retriever = self.document_store.as_retriever() 128 | reranker = HuggingFaceCrossEncoder( 129 | model_name=self.config.rerank_model, 130 | model_kwargs={'device': self.config.device}, 131 | ) 132 | compression_retriever = ContextualCompressionRetriever( 133 | base_compressor=CrossEncoderReranker(model=reranker, top_n=3), 134 | base_retriever=base_retriever 135 | ) 136 | 137 | # Create history-aware retriever 138 | contextualize_prompt = ChatPromptTemplate.from_messages([ 139 | ("system", CONTEXTUALIZE_Q_SYSTEM_PROMPT), 140 | MessagesPlaceholder("chat_history"), 141 | ("human", "{input}"), 142 | ]) 143 | history_aware_retriever = create_history_aware_retriever( 144 | self.llm, compression_retriever, contextualize_prompt 145 | ) 146 | 147 | # Create QA chain 148 | qa_prompt = ChatPromptTemplate.from_messages([ 149 | ("system", SYSTEM_PROMPT), 150 | MessagesPlaceholder("chat_history"), 151 | ("human", "{input}"), 152 | ]) 153 | qa_chain = create_stuff_documents_chain(self.llm, qa_prompt) 154 | retrieval_chain = create_retrieval_chain(history_aware_retriever, qa_chain) 155 | 156 | return retrieval_chain 157 | 158 | def _create_graph(self) -> StateGraph: 159 | """Create the processing graph""" 160 | workflow = StateGraph(state_schema=State) 161 | workflow.add_node("enhance", self._enhance_query) 162 | workflow.add_node("retrieval", self._call_model) 163 | workflow.add_edge(START, "enhance") 164 | workflow.add_edge("enhance", "retrieval") 165 | return workflow.compile(checkpointer=MemorySaver()) 166 | 167 | def _enhance_query(self, state: State) -> State: 168 | """Enhance the query using the LLM""" 169 | 
prompt_enhancement = ChatPromptTemplate.from_messages([ 170 | ("system", QUERY_ENHANCEMENT_PROMPT), 171 | ("human", "{input}"), 172 | ]) 173 | query_enhancement = prompt_enhancement | self.llm 174 | enhanced_query = query_enhancement.invoke({ 175 | "input": state["input"] 176 | }) 177 | logger.info(f"Enhanced query: {enhanced_query.content}") 178 | state["init_query"] = state["input"] 179 | state["input"] = enhanced_query.content 180 | return state 181 | 182 | def _call_model(self, state: State) -> dict: 183 | """Process the query through the model""" 184 | logger.info(f"Processing query: {state['init_query']}") 185 | logger.info(f"Enhanced query: {state['input']}") 186 | response = self.chain.invoke(state) 187 | logger.info(f"Received response: {response['answer']}") 188 | return { 189 | "chat_history": [ 190 | HumanMessage(state["init_query"]), 191 | AIMessage(response["answer"]), 192 | ], 193 | "context": response["context"], 194 | "answer": response["answer"], 195 | } 196 | 197 | def invoke(self, message: str) -> dict: 198 | """ 199 | Process a user message and return the response 200 | 201 | Args: 202 | message: The user's input message 203 | 204 | Returns: 205 | dict: Contains the model's response or error information 206 | """ 207 | try: 208 | logger.info(f"Processing query: {message}") 209 | config = { 210 | "configurable": { 211 | "thread_id": uuid.uuid4(), 212 | "thread_ts": datetime.datetime.now().isoformat() 213 | } 214 | } 215 | result = self.graph.invoke( 216 | {"input": message}, 217 | config=config 218 | ) 219 | logger.info(f"OUTPUT: {result}") 220 | links = set() 221 | for ctx in result["context"]: 222 | doc: Document = ctx 223 | path = doc.metadata["file_path"].replace( 224 | self.localConfig.CONTAINER_PATH, 225 | self.localConfig.LOCAL_FILES_PATH 226 | ) 227 | links.add(f"file://{path}") 228 | return {"answer": result["answer"], "links": links} 229 | except Exception as e: 230 | logger.error("Error processing query", exc_info=True) 231 | return 
{"error": str(e), "status": "error"} ``` -------------------------------------------------------------------------------- /chat/src/ChatApp.tsx: -------------------------------------------------------------------------------- ```typescript 1 | import React, { useState, useEffect } from 'react'; 2 | import { 3 | Layout, 4 | Typography, 5 | List as AntList, 6 | Input, 7 | ConfigProvider, 8 | Switch, 9 | theme, 10 | Button, 11 | } from 'antd'; 12 | import { ArrowRightOutlined } from '@ant-design/icons'; 13 | import {ToastContainer, toast, Bounce} from 'react-toastify'; 14 | 15 | const { Header, Content, Footer } = Layout; 16 | const { TextArea } = Input; 17 | const { Link: AntLink, Paragraph, Title } = Typography; 18 | const { defaultAlgorithm, darkAlgorithm } = theme; 19 | 20 | interface Message { 21 | type: 'answer' | 'question' | 'full'; 22 | reporter: 'output_message' | 'user'; 23 | message: string; 24 | links: string[]; 25 | } 26 | 27 | const ChatApp: React.FC = () => { 28 | const [ws, setWs] = useState<WebSocket | null>(null); 29 | const [input, setInput] = useState<string>(''); 30 | const [messages, setMessages] = useState<Message[]>([]); 31 | const [isDarkMode, setIsDarkMode] = useState(false); 32 | 33 | // Toggle light/dark theme 34 | const toggleTheme = () => setIsDarkMode((prev) => !prev); 35 | 36 | // WebSocket Setup 37 | useEffect(() => { 38 | const webSocket = new WebSocket('ws://localhost:8003/llm/'); 39 | 40 | webSocket.onmessage = (event) => { 41 | const message_curr: Message = JSON.parse(event.data); 42 | 43 | if (message_curr.reporter === 'output_message') { 44 | setMessages((messages_prev) => { 45 | if (messages_prev.length === 0) return [message_curr]; 46 | const last = messages_prev[messages_prev.length - 1]; 47 | 48 | // If last message is question or 'full', append new 49 | if (last.type === 'question' || last.type === 'full') { 50 | return [...messages_prev, message_curr]; 51 | } 52 | 53 | // If incoming message is 'full', replace last message 
54 | if (message_curr.type === 'full') { 55 | return [...messages_prev.slice(0, -1), message_curr]; 56 | } 57 | 58 | // Otherwise, merge partial message 59 | return [ 60 | ...messages_prev.slice(0, -1), 61 | { 62 | ...last, 63 | message: last.message + message_curr.message, 64 | }, 65 | ]; 66 | }); 67 | } 68 | }; 69 | 70 | setWs(webSocket); 71 | return () => { 72 | webSocket.close(); 73 | }; 74 | }, []); 75 | 76 | // Send message 77 | const sendMessage = (): void => { 78 | try { 79 | if (ws && input.trim()) { 80 | ws.send(input); 81 | setMessages((prev) => [ 82 | ...prev, 83 | { 84 | type: 'question', 85 | reporter: 'user', 86 | message: input, 87 | links: [], 88 | }, 89 | ]); 90 | setInput(''); 91 | } 92 | } catch (e) { 93 | console.error(e); 94 | } 95 | }; 96 | 97 | async function handleLinkClick(link: string) { 98 | await navigator.clipboard.writeText(link); 99 | toast('Link copied!', { 100 | position: "top-right", 101 | autoClose: 1000, 102 | hideProgressBar: true, 103 | closeOnClick: true, 104 | pauseOnHover: true, 105 | draggable: false, 106 | progress: undefined, 107 | theme: "light", 108 | transition: Bounce, 109 | }); 110 | } 111 | 112 | return ( 113 | <ConfigProvider 114 | theme={{ 115 | algorithm: isDarkMode ? darkAlgorithm : defaultAlgorithm, 116 | token: { 117 | borderRadius: 2, 118 | }, 119 | }} 120 | > 121 | <Layout 122 | style={{ 123 | width: '100%', 124 | height: '100vh', 125 | margin: '0 auto', 126 | display: 'flex', 127 | flexDirection: 'column', 128 | overflow: 'hidden', 129 | }} 130 | > 131 | {/* Header with Theme Toggle */} 132 | <Header 133 | style={{ 134 | backgroundImage: isDarkMode 135 | ? 
'linear-gradient(45deg, #10161A, #394B59)' // Dark gradient 136 | : 'linear-gradient(45deg, #2f3f48, #586770)', // Light gradient 137 | borderBottomLeftRadius: 2, 138 | borderBottomRightRadius: 2, 139 | display: 'flex', 140 | alignItems: 'center', 141 | justifyContent: 'space-between', 142 | padding: '0 16px', 143 | }} 144 | > 145 | <Title level={4} style={{ margin: 0, color: 'white' }}> 146 | Minima 147 | </Title> 148 | <Switch 149 | checked={isDarkMode} 150 | onChange={toggleTheme} 151 | checkedChildren="Dark" 152 | unCheckedChildren="Light" 153 | /> 154 | </Header> 155 | 156 | {/* Messages */} 157 | <Content style={{ padding: '16px', display: 'flex', flexDirection: 'column' }}> 158 | <AntList 159 | style={{ 160 | flexGrow: 1, 161 | marginBottom: 16, 162 | border: '1px solid #ccc', 163 | borderRadius: 4, 164 | overflowY: 'auto', 165 | padding: '16px', 166 | }} 167 | > 168 | {messages.map((msg, index) => { 169 | const isUser = msg.reporter === 'user'; 170 | return ( 171 | <AntList.Item 172 | key={index} 173 | style={{ 174 | display: 'flex', 175 | flexDirection: 'column', 176 | alignItems: isUser ? 'flex-end' : 'flex-start', 177 | border: 'none', 178 | }} 179 | > 180 | <div 181 | style={{ 182 | maxWidth: '60%', 183 | borderRadius: 16, 184 | padding: '8px 16px', 185 | wordBreak: 'break-word', 186 | textAlign: isUser ? 'right' : 'left', 187 | backgroundImage: isUser 188 | ? 'linear-gradient(120deg, #1a62aa, #007bff)' 189 | : 'linear-gradient(120deg, #abcbe8, #7bade0)', 190 | color: isUser ? 
'white' : 'black', 191 | }} 192 | > 193 | <Paragraph 194 | style={{ 195 | margin: 0, 196 | color: 'inherit', 197 | fontSize: '1rem', 198 | fontWeight: 500, 199 | lineHeight: '1.4', 200 | }} 201 | > 202 | {msg.message} 203 | </Paragraph> 204 | 205 | {/* Links, if any */} 206 | {msg.links?.length > 0 && ( 207 | <div style={{ marginTop: 4 }}> 208 | {msg.links.map((link, linkIndex) => ( 209 | <React.Fragment key={linkIndex}> 210 | <br /> 211 | <AntLink 212 | onClick={async () => { 213 | await handleLinkClick(link) 214 | }} 215 | href={link} 216 | target="_blank" 217 | rel="noopener noreferrer" 218 | style={{ 219 | color: 'inherit', 220 | textDecoration: 'underline', 221 | }} 222 | > 223 | {link} 224 | </AntLink> 225 | </React.Fragment> 226 | ))} 227 | </div> 228 | )} 229 | </div> 230 | </AntList.Item> 231 | ); 232 | })} 233 | </AntList> 234 | </Content> 235 | 236 | {/* Footer with TextArea & Circular Arrow Button */} 237 | <Footer style={{ padding: '16px' }}> 238 | <div style={{ position: 'relative', width: '100%' }}> 239 | <TextArea 240 | placeholder="Type your message here..." 
241 | rows={5} 242 | value={input} 243 | onChange={(e) => setInput(e.target.value)} 244 | onPressEnter={(e) => { 245 | // Allow SHIFT+ENTER for multiline 246 | if (!e.shiftKey) { 247 | e.preventDefault(); 248 | sendMessage(); 249 | } 250 | }} 251 | style={{ 252 | width: '100%', 253 | border: '1px solid #ccc', 254 | borderRadius: 4, 255 | resize: 'none', 256 | paddingRight: 60, // Extra space so text won't overlap the button 257 | }} 258 | /> 259 | <Button 260 | shape="circle" 261 | icon={<ArrowRightOutlined />} 262 | onClick={sendMessage} 263 | style={{ 264 | position: 'absolute', 265 | bottom: 8, 266 | right: 8, 267 | width: 40, 268 | height: 40, 269 | minWidth: 40, 270 | borderRadius: '50%', 271 | fontWeight: 'bold', 272 | display: 'flex', 273 | alignItems: 'center', 274 | justifyContent: 'center', 275 | }} 276 | /> 277 | </div> 278 | </Footer> 279 | </Layout> 280 | </ConfigProvider> 281 | ); 282 | }; 283 | 284 | export default ChatApp; ```