# Directory Structure

```
├── .gitignore
├── .idea
│   ├── .gitignore
│   ├── inspectionProfiles
│   │   └── profiles_settings.xml
│   ├── misc.xml
│   ├── modules.xml
│   ├── vcs.xml
│   └── voice-recorder-mcp.iml
├── environment.yml
├── LICENSE
├── pyproject.toml
├── README.md
├── src
│   ├── voice_recorder
│   │   ├── __init__.py
│   │   ├── __pycache__
│   │   │   ├── __init__.cpython-311.pyc
│   │   │   ├── __init__.cpython-312.pyc
│   │   │   ├── audio_service.cpython-311.pyc
│   │   │   ├── audio_service.cpython-312.pyc
│   │   │   ├── config.cpython-311.pyc
│   │   │   ├── config.cpython-312.pyc
│   │   │   ├── server.cpython-311.pyc
│   │   │   └── server.cpython-312.pyc
│   │   ├── audio_service.py
│   │   ├── config.py
│   │   └── server.py
│   └── voice_recorder_mcp.egg-info
│       ├── dependency_links.txt
│       ├── entry_points.txt
│       ├── PKG-INFO
│       ├── SOURCES.txt
│       └── top_level.txt
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.idea/.gitignore:
--------------------------------------------------------------------------------

```
# Default ignored files
/shelf/
/workspace.xml
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
.venv
build
src/voice_recorder_mcp.egg-info
.idea
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

````markdown
# Voice Recorder MCP Server

An MCP server for recording audio and transcribing it using OpenAI's Whisper model. Designed to work as a Goose custom extension or standalone MCP server.

## Features

- Record audio from the default microphone
- Transcribe recordings using Whisper
- Integrates with Goose AI agent as a custom extension
- Includes prompts for common recording scenarios

## Installation

```bash
# Install from source
git clone https://github.com/DefiBax/voice-recorder-mcp.git
cd voice-recorder-mcp
pip install -e .
```

## Usage

### As a Standalone MCP Server

```bash
# Run with default settings (base.en model)
voice-recorder-mcp

# Use a specific Whisper model
voice-recorder-mcp --model medium.en

# Adjust sample rate
voice-recorder-mcp --sample-rate 44100
```

### Testing with MCP Inspector

The MCP Inspector provides an interactive interface to test your server:

```bash
# Install the MCP Inspector
npm install -g @modelcontextprotocol/inspector

# Run your server with the inspector
npx @modelcontextprotocol/inspector voice-recorder-mcp
```

### With Goose AI Agent

1. Open Goose and go to Settings > Extensions > Add > Command Line Extension
2. Set the name to `voice-recorder`
3. In the Command field, enter the full path to the voice-recorder-mcp executable:
   ```
   /full/path/to/voice-recorder-mcp
   ```

   Or for a specific model:
   ```
   /full/path/to/voice-recorder-mcp --model medium.en
   ```

   To find the path, run:
   ```bash
   which voice-recorder-mcp
   ```

4. No environment variables are needed for basic functionality
5. Start a conversation with Goose and introduce the recorder with:
   "I want you to take action from transcriptions returned by voice-recorder. For example, if I dictate a calculation like 1+1, please return the result."

## Available Tools

- `start_recording`: Start recording audio from the default microphone
- `stop_and_transcribe`: Stop recording and transcribe the audio to text
- `record_and_transcribe`: Record audio for a specified duration and transcribe it

## Whisper Models

This extension supports various Whisper model sizes:

| Model | Speed | Accuracy | Memory Usage | Use Case |
|-------|-------|----------|--------------|----------|
| `tiny.en` | Fastest | Lowest | Minimal | Testing, quick transcriptions |
| `base.en` | Fast | Good | Low | Everyday use (default) |
| `small.en` | Medium | Better | Moderate | Good balance |
| `medium.en` | Slow | High | High | Important recordings |
| `large` | Slowest | Highest | Very High | Critical transcriptions |

The `.en` suffix indicates models specialized for English, which are faster and more accurate for English content.

## Requirements

- Python 3.10, 3.11, or 3.12
- An audio input device (microphone)

## Configuration

You can configure the server using environment variables. When both are set, an environment variable takes precedence over the matching command-line flag:

```bash
# Set Whisper model
export WHISPER_MODEL=small.en

# Set audio sample rate
export SAMPLE_RATE=44100

# Set maximum recording duration (seconds, default: 60)
export MAX_DURATION=120

# Then run the server
voice-recorder-mcp
```

## Troubleshooting

### Common Issues

- **No audio being recorded**: Check your microphone permissions and settings
- **Model download errors**: Ensure you have a stable internet connection for the initial model download
- **Integration with Goose**: Make sure the command path is correct
- **Audio quality issues**: Try adjusting the sample rate (default: 16000)

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## License

This project is licensed under the MIT License - see the LICENSE file for details.
````

--------------------------------------------------------------------------------
/src/voice_recorder_mcp.egg-info/dependency_links.txt:
--------------------------------------------------------------------------------

```

```

--------------------------------------------------------------------------------
/src/voice_recorder_mcp.egg-info/top_level.txt:
--------------------------------------------------------------------------------

```
voice_recorder
```

--------------------------------------------------------------------------------
/src/voice_recorder_mcp.egg-info/entry_points.txt:
--------------------------------------------------------------------------------

```
[console_scripts]
voice-recorder-mcp = voice_recorder:main
```

--------------------------------------------------------------------------------
/.idea/vcs.xml:
--------------------------------------------------------------------------------

```
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
  <component name="VcsDirectoryMappings">
    <mapping directory="" vcs="Git" />
  </component>
</project>
```

--------------------------------------------------------------------------------
/.idea/inspectionProfiles/profiles_settings.xml:
--------------------------------------------------------------------------------

```
<component name="InspectionProjectProfileManager">
  <settings>
    <option name="USE_PROJECT_PROFILE" value="false" />
    <version value="1.0" />
  </settings>
</component>
```

--------------------------------------------------------------------------------
/environment.yml:
--------------------------------------------------------------------------------

```yaml
#name: voice-recorder-mcp
#channels:
#  - conda-forge
#  - defaults
#dependencies:
#  - python>=3.10,<3.12
#  - pip
#  - numpy>=1.26.0
#  - pip:
#    - "mcp[cli]>=1.2.0"
#    - "git+https://github.com/openai/whisper.git"
#    - "sounddevice>=0.4.6"
#    - "nltk>=3.8.1"
```

--------------------------------------------------------------------------------
/src/voice_recorder/__init__.py:
--------------------------------------------------------------------------------

```python
from .config import get_config
from .server import mcp, audio_service

def main():
    """Voice Recorder MCP: Record audio and transcribe using Whisper."""
    # Config is automatically loaded when server is imported
    mcp.run()

__all__ = ["mcp", "audio_service", "main"]
```

--------------------------------------------------------------------------------
/.idea/modules.xml:
--------------------------------------------------------------------------------

```
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
  <component name="ProjectModuleManager">
    <modules>
      <module fileurl="file://$PROJECT_DIR$/.idea/voice-recorder-mcp.iml" filepath="$PROJECT_DIR$/.idea/voice-recorder-mcp.iml" />
    </modules>
  </component>
</project>
```

--------------------------------------------------------------------------------
/.idea/misc.xml:
--------------------------------------------------------------------------------

```
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
  <component name="Black">
    <option name="sdkName" value="voice-recorder-mcp" />
  </component>
  <component name="ProjectRootManager" version="2" project-jdk-name="Python 3.12 virtualenv at ~/PycharmProjects/voice-recorder-mcp-pip/.venv" project-jdk-type="Python SDK" />
</project>
```

--------------------------------------------------------------------------------
/src/voice_recorder_mcp.egg-info/SOURCES.txt:
--------------------------------------------------------------------------------

```
LICENSE
README.md
pyproject.toml
src/voice_recorder/__init__.py
src/voice_recorder/audio_service.py
src/voice_recorder/config.py
src/voice_recorder/server.py
src/voice_recorder_mcp.egg-info/PKG-INFO
src/voice_recorder_mcp.egg-info/SOURCES.txt
src/voice_recorder_mcp.egg-info/dependency_links.txt
src/voice_recorder_mcp.egg-info/entry_points.txt
src/voice_recorder_mcp.egg-info/requires.txt
src/voice_recorder_mcp.egg-info/top_level.txt
```

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[project]
name = "voice-recorder-mcp"
version = "0.1.0"
description = "MCP server for voice recording and transcription"
readme = "README.md"
license = {text = "MIT"}
requires-python = ">=3.10,<3.13"  # Allow Python 3.10 through 3.12
dependencies = [
    "mcp[cli]>=1.2.0",
    "sounddevice>=0.4.6",
    "numpy>=1.20.0,<2.0.0",
    "openai-whisper @ git+https://github.com/openai/whisper.git",
]

[project.optional-dependencies]
dev = [
    "pytest>=7.0.0",
    "black>=23.0.0",
    "isort>=5.12.0",
]

[project.scripts]
voice-recorder-mcp = "voice_recorder:main"

[tool.setuptools]
package-dir = {"" = "src"}

[tool.setuptools.packages.find]
where = ["src"]

[tool.uv]
package = true
```

--------------------------------------------------------------------------------
/src/voice_recorder/config.py:
--------------------------------------------------------------------------------

```python
import os
import argparse
from dataclasses import dataclass


@dataclass
class Config:
    whisper_model: str = "base.en"
    sample_rate: int = 16000  # Hz
    max_duration: int = 60  # seconds


def parse_args():
    parser = argparse.ArgumentParser(
        description="MCP server for voice recording and transcription using Whisper."
    )
    parser.add_argument('--model', default='base.en', help='Whisper model to use')
    parser.add_argument('--sample-rate', type=int, default=16000, help='Audio sample rate')
    # get_config() runs when the server module is imported, so tolerate
    # arguments that belong to a host process (e.g. a test runner)
    # instead of exiting on them.
    args, _unknown = parser.parse_known_args()
    return args


def get_config():
    """Load configuration from environment variables or command line arguments"""
    args = parse_args()

    # Environment variables take precedence over command line arguments.
    # MAX_DURATION has no command line flag; it is environment-only.
    config = Config(
        whisper_model=os.environ.get("WHISPER_MODEL", args.model),
        sample_rate=int(os.environ.get("SAMPLE_RATE", args.sample_rate)),
        max_duration=int(os.environ.get("MAX_DURATION", 60))
    )

    return config
```
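
The precedence rule in `get_config` (environment variable first, then command-line flag, then built-in default) can be sketched in isolation. This is a minimal illustration rather than the module's own code: `resolve_config` is a hypothetical helper that takes `argv` and `environ` as explicit parameters, an assumption made here so the behaviour can be checked without touching the real process state.

```python
import argparse


def resolve_config(argv, environ):
    """Hypothetical helper: resolve settings as env var > CLI flag > default."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", default="base.en")
    parser.add_argument("--sample-rate", type=int, default=16000)
    args = parser.parse_args(argv)

    return {
        # An environment variable, when present, wins over the flag
        "whisper_model": environ.get("WHISPER_MODEL", args.model),
        "sample_rate": int(environ.get("SAMPLE_RATE", args.sample_rate)),
        # MAX_DURATION has no flag, so it falls straight back to the default
        "max_duration": int(environ.get("MAX_DURATION", 60)),
    }


if __name__ == "__main__":
    # The flag asks for medium.en, but the environment overrides it
    print(resolve_config(["--model", "medium.en"], {"WHISPER_MODEL": "small.en"}))
```

Many command-line tools resolve the other way round (flag over environment); the order shown here follows the comment in `config.py`.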

--------------------------------------------------------------------------------
/src/voice_recorder/server.py:
--------------------------------------------------------------------------------

```python
from mcp.server.fastmcp import FastMCP
from mcp.shared.exceptions import McpError
from mcp.types import ErrorData, INTERNAL_ERROR, INVALID_PARAMS
import logging
import time
from .audio_service import AudioService
from .config import get_config

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

# Create an MCP server
mcp = FastMCP("VoiceRecorder")

# Initialize the audio service.
# Note: config.sample_rate is not forwarded, so AudioService records at its
# 16 kHz default, which is the rate Whisper assumes for raw audio arrays.
config = get_config()
audio_service = AudioService(model_name=config.whisper_model)


@mcp.tool()
def start_recording() -> str:
    """Start recording audio from the default microphone"""
    try:
        return audio_service.start_recording()
    except Exception as e:
        logger.error(f"Error starting recording: {e}")
        # ErrorData is a pydantic model, so its fields are passed by keyword
        raise McpError(ErrorData(code=INTERNAL_ERROR, message=f"Recording error: {e}"))


@mcp.tool()
def stop_and_transcribe() -> str:
    """Stop recording and transcribe the audio to text"""
    try:
        audio_data, msg = audio_service.stop_recording()
        if audio_data is None:
            return msg

        return audio_service.transcribe(audio_data)
    except Exception as e:
        logger.error(f"Error during transcription: {e}")
        raise McpError(ErrorData(code=INTERNAL_ERROR, message=f"Transcription error: {e}"))


@mcp.tool()
def record_and_transcribe(duration_seconds: int) -> str:
    """
    Record audio for the specified duration and transcribe it

    Args:
        duration_seconds: Number of seconds to record (1 up to MAX_DURATION, 60 by default)
    """
    try:
        # Validate input against the configured maximum
        max_duration = config.max_duration
        if not isinstance(duration_seconds, int) or not 1 <= duration_seconds <= max_duration:
            raise McpError(
                ErrorData(
                    code=INVALID_PARAMS,
                    message=f"Duration must be between 1 and {max_duration} seconds",
                )
            )

        # Start recording
        start_recording()
        logger.info(f"Recording for {duration_seconds} seconds")

        # Wait for specified duration
        time.sleep(duration_seconds)

        # Stop and transcribe
        return stop_and_transcribe()
    except Exception as e:
        logger.error(f"Error in record_and_transcribe: {e}")
        # Make sure recording is stopped in case of error
        if audio_service.is_recording:
            try:
                audio_service.stop_recording()
            except Exception:
                pass
        if isinstance(e, McpError):
            raise  # keep the original error code (e.g. INVALID_PARAMS)
        raise McpError(ErrorData(code=INTERNAL_ERROR, message=f"Error: {e}"))
```
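
The control flow of `record_and_transcribe` (validate the duration, record, and always leave the recorder stopped on failure) can be exercised without a microphone or the `mcp` package. The sketch below is illustrative only: `ToolError`, `FakeRecorder`, and `timed_record` are hypothetical stand-ins for `McpError`, `AudioService`, and the tool function, and the `sleep` parameter is an assumption added so the wait can be replaced in a test.

```python
class ToolError(Exception):
    """Hypothetical stand-in for McpError, carrying a symbolic error code."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


class FakeRecorder:
    """Hypothetical stand-in for AudioService; it records nothing."""

    def __init__(self):
        self.is_recording = False

    def start(self):
        self.is_recording = True

    def stop(self):
        self.is_recording = False
        return "transcript"


def timed_record(recorder, duration_seconds, max_duration=60, sleep=lambda s: None):
    # Reject bad input before the recorder is touched
    if not isinstance(duration_seconds, int) or not 1 <= duration_seconds <= max_duration:
        raise ToolError("invalid_params", f"Duration must be between 1 and {max_duration} seconds")

    recorder.start()
    try:
        sleep(duration_seconds)
        return recorder.stop()
    except ToolError:
        raise  # already well-formed: keep its code
    except Exception as exc:
        raise ToolError("internal_error", str(exc)) from exc
    finally:
        # Runs on every exit path, so a failure never leaves the recorder running
        if recorder.is_recording:
            recorder.stop()


if __name__ == "__main__":
    print(timed_record(FakeRecorder(), 5))
```

Validating before `start()` keeps parameter errors distinct from runtime failures, and the `finally` clause gives one cleanup path for both.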

--------------------------------------------------------------------------------
/src/voice_recorder/audio_service.py:
--------------------------------------------------------------------------------

```python
import time
import threading
import numpy as np
import sounddevice as sd
from queue import Queue
import whisper
import logging

logger = logging.getLogger(__name__)


class AudioService:
    def __init__(self, model_name="base.en", sample_rate=16000):
        """Initialize the audio service with recording and transcription capabilities"""
        self.is_recording = False
        self.sample_rate = sample_rate
        self.stop_event = None
        self.data_queue = None
        self.recording_thread = None

        # Initialize transcriber
        logger.info(f"Loading Whisper model: {model_name}")
        try:
            self.transcriber = whisper.load_model(model_name)
            logger.info(f"Whisper model '{model_name}' loaded successfully")
        except Exception as e:
            logger.error(f"Failed to load Whisper model: {e}")
            raise

    def start_recording(self):
        """Start recording audio from the default microphone"""
        if self.is_recording:
            return "Already recording"

        self.data_queue = Queue()
        self.stop_event = threading.Event()

        # Named time_info so the parameter does not shadow the time module
        def callback(indata, frames, time_info, status):
            if status:
                logger.warning(f"Recording status: {status}")
            # Copy the buffer; sounddevice reuses it after the callback returns
            self.data_queue.put(bytes(indata))

        def record_thread():
            try:
                with sd.RawInputStream(
                        samplerate=self.sample_rate,
                        dtype="int16",
                        channels=1,
                        callback=callback
                ):
                    logger.info("Recording started")
                    # Keep the stream open until stop_recording() sets the event
                    while not self.stop_event.is_set():
                        time.sleep(0.1)
            except Exception as e:
                logger.error(f"Error in recording thread: {e}")

        # Daemon thread: exits when the main program exits
        self.recording_thread = threading.Thread(target=record_thread, daemon=True)
        self.recording_thread.start()
        self.is_recording = True

        return "Recording started"

    def stop_recording(self):
        """Stop recording and return the audio data"""
        if not self.is_recording:
            return None, "Not recording"

        self.stop_event.set()
        self.recording_thread.join()
        self.is_recording = False

        logger.info("Processing audio data...")
        # The recording thread has exited, so the queue is no longer written to
        audio_data = b"".join(list(self.data_queue.queue))
        # Scale int16 samples to float32 in [-1.0, 1.0), the range Whisper expects
        audio_np = np.frombuffer(audio_data, dtype=np.int16).astype(np.float32) / 32768.0

        return audio_np, "Recording stopped"

    def transcribe(self, audio_np):
        """Transcribe audio data to text"""
        if audio_np is None or audio_np.size == 0:
            return "No audio recorded"

        logger.info("Transcribing audio...")
        try:
            # fp16=False forces 32-bit inference, which is what CPUs support
            result = self.transcriber.transcribe(audio_np, fp16=False)
            transcription = result["text"].strip()
            logger.info(f"Transcription completed: {transcription[:30]}...")
            return transcription
        except Exception as e:
            logger.error(f"Error during transcription: {e}")
            raise
```
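
The sample conversion in `stop_recording` (join the raw int16 chunks, then divide by 32768 to get floats in [-1.0, 1.0)) can be checked by hand on a few values. The sketch below uses the standard-library `array` module rather than NumPy so it runs without extra dependencies; `pcm16_to_float` is a hypothetical name, and it returns a plain list instead of the `float32` array the service produces.

```python
from array import array


def pcm16_to_float(chunks):
    """Hypothetical helper: join int16 PCM chunks and scale into [-1.0, 1.0)."""
    samples = array("h")  # signed 16-bit integers, native byte order
    samples.frombytes(b"".join(chunks))
    # Dividing by 32768 maps -32768 to -1.0 and 32767 to just under 1.0
    return [s / 32768.0 for s in samples]


if __name__ == "__main__":
    # Two chunks, as the stream callback would have queued them
    chunks = [
        array("h", [0, 16384]).tobytes(),
        array("h", [-32768, 32767]).tobytes(),
    ]
    print(pcm16_to_float(chunks))
```

Dividing by 32768 rather than 32767 keeps the most negative sample at exactly -1.0, so the largest positive sample lands slightly below 1.0.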