# Directory Structure

```
├── .gitignore
├── CHANGELOG.md
├── Dockerfile
├── LICENSE
├── mcp_config.json
├── package-lock.json
├── package.json
├── README.md
├── smithery.yaml
├── src
│   └── index.ts
└── tsconfig.json
```

# Files

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
node_modules/
build/
*.log
.env*
howler.md
memory.db
vector_store.db
output/
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
# <span style="color: #FF69B4;">📢 Blabber-MCP</span> <span style="color: #ADD8E6;">🗣️</span>

[![smithery badge](https://smithery.ai/badge/@pinkpixel-dev/blabber-mcp)](https://smithery.ai/server/@pinkpixel-dev/blabber-mcp)

<span style="color: #90EE90;">An MCP server that gives your LLMs a voice using OpenAI's Text-to-Speech API!</span> 🔊

---

## <span style="color: #FFD700;">✨ Features</span>

*   **Text-to-Speech:** Converts input text into high-quality spoken audio.
*   **Voice Selection:** Choose from various OpenAI voices (`alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`).
*   **Model Selection:** Use standard (`tts-1`) or high-definition (`tts-1-hd`) models.
*   **Format Options:** Get audio output in `mp3`, `opus`, `aac`, or `flac`.
*   **File Saving:** Saves the generated audio to a local file.
*   **Optional Playback:** Automatically play the generated audio using a configurable system command.
*   **Configurable Defaults:** Set a default voice via configuration.
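
Under the hood, the server wraps OpenAI's speech endpoint (see `src/index.ts` for the full handler). A minimal TypeScript sketch of the equivalent direct SDK call, assuming the official `openai` Node package:

```typescript
import OpenAI from "openai";
import fs from "fs";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Request spoken audio for a piece of text (the same call the server makes).
const response = await openai.audio.speech.create({
  model: "tts-1",           // or "tts-1-hd"
  voice: "nova",            // alloy | echo | fable | onyx | nova | shimmer
  input: "Hello from Blabber MCP!",
  response_format: "mp3",   // mp3 | opus | aac | flac
});

// The SDK returns a fetch-style Response; buffer the audio and write it out.
fs.writeFileSync("speech.mp3", Buffer.from(await response.arrayBuffer()));
```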

---

## <span style="color: #FFA07A;">🔧 Configuration</span>

To use this server, you need to add its configuration to your MCP client's settings file (e.g., `mcp_settings.json`).

1.  **Get OpenAI API Key:** You need an API key from [OpenAI](https://platform.openai.com/api-keys).
2.  **Add to MCP Settings:** Add the following block to the `mcpServers` object in your settings file, replacing `"YOUR_OPENAI_API_KEY"` with your actual key.

```json
{
  "mcpServers": {
    "blabber-mcp": {
      "command": "node",
      "args": ["/full/path/to/blabber-mcp/build/index.js"], (IMPORTANT: Use the full, absolute path to the built index.js file)
      "env": {
        "OPENAI_API_KEY": "YOUR_OPENAI_API_KEY",
        "AUDIO_PLAYER_COMMAND": "xdg-open", (Optional: Command to play audio (e.g., "cvlc", "vlc", "mpv", "ffplay", "afplay", "xdg-open"; defaults to "cvlc")
        "DEFAULT_TTS_VOICE": "nova" (Optional: Set default voice (alloy, echo, fable, onyx, nova, shimmer); defaults to nova)
      },
      "disabled": false,
      "alwaysAllow": []
    }
  }
}
```

<span style="color: #FF6347;">**Important:**</span> Make sure the `args` path points to the correct location of the `build/index.js` file within your `blabber-mcp` project directory. Use the full absolute path.

---

## <span style="color: #87CEEB;">🚀 Usage</span>

Once configured and running, you can use the `text_to_speech` tool via your MCP client.

**Tool:** `text_to_speech`
**Server:** `blabber-mcp` (or the key you used in the config)

**Arguments:**

*   `input` (string, **required**): The text to synthesize.
*   `voice` (string, optional): The voice to use (`alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`). Defaults to the `DEFAULT_TTS_VOICE` set in config, or `nova`.
*   `model` (string, optional): The model (`tts-1`, `tts-1-hd`). Defaults to `tts-1`.
*   `response_format` (string, optional): Audio format (`mp3`, `opus`, `aac`, `flac`). Defaults to `mp3`.
*   `play` (boolean, optional): Set to `true` to automatically play the audio after saving. Defaults to `false`.

**Example Tool Call (with playback):**

```xml
<use_mcp_tool>
  <server_name>blabber-mcp</server_name>
  <tool_name>text_to_speech</tool_name>
  <arguments>
  {
    "input": "Hello from Blabber MCP!",
    "voice": "shimmer",
    "play": true
  }
  </arguments>
</use_mcp_tool>
```

**Output:**

The tool saves the audio file to the `output/` directory within the `blabber-mcp` project folder and returns a JSON response like this:

```json
{
  "message": "Audio saved successfully. Playback initiated using command: cvlc",
  "filePath": "path/to/speech_1743908694848.mp3", 
  "format": "mp3",
  "voiceUsed": "shimmer"
}
```
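
Since the payload arrives as a JSON string inside the tool result's text content, a client can parse it directly. A short sketch (the `result` variable and its `content` shape follow the standard MCP `CallToolResult`; the names here are illustrative):

```typescript
// `result` is the CallToolResult returned by your MCP client (illustrative).
const payload = JSON.parse(result.content[0].text) as {
  message: string;
  filePath: string;
  format: string;
  voiceUsed: string;
};
console.log(`Saved ${payload.format} audio to ${payload.filePath} (voice: ${payload.voiceUsed})`);
```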

---

## <span style="color: #98FB98;">📜 License</span>

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

## <span style="color: #BA55D3;">🕒 Changelog</span>

See the [CHANGELOG.md](CHANGELOG.md) file for details on version history.

---

<p align="center">Made with ❤️ by Pink Pixel</p>

```

--------------------------------------------------------------------------------
/mcp_config.json:
--------------------------------------------------------------------------------

```json
{
  "mcpServers": {
    "openai-tts": {
      "command": "node",
      "args": ["/path/to/blabber-mcp/build/index.js"],
      "env": {
        "OPENAI_API_KEY": "YOUR_OPENAI_API_KEY",
        "AUDIO_PLAYER_COMMAND": "cvlc",
        "DEFAULT_TTS_VOICE": "nova" 
      },
      "disabled": false,
      "alwaysAllow": []
    }
  }
}
```

--------------------------------------------------------------------------------
/tsconfig.json:
--------------------------------------------------------------------------------

```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "outDir": "./build",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules"]
}

```

--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------

```json
{
  "name": "@pinkpixel/blabber-mcp",
  "version": "0.1.2",
  "description": "An MCP server that gives a voice to LLMs",
  "private": false,
  "type": "module",
  "bin": {
    "blabber-mcp": "./build/index.js"
  },
  "files": [
    "build"
  ],
  "scripts": {
    "build": "tsc && node -e \"require('fs').chmodSync('build/index.js', '755')\"",
    "prepare": "npm run build",
    "watch": "tsc --watch",
    "inspector": "npx @modelcontextprotocol/inspector build/index.js"
  },
  "dependencies": {
    "@modelcontextprotocol/sdk": "0.6.0",
    "openai": "^4.91.1"
  },
  "devDependencies": {
    "@types/node": "^20.11.24",
    "typescript": "^5.3.3"
  }
}

```

--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------

```dockerfile
# Generated by https://smithery.ai. See: https://smithery.ai/docs/build/project-config
# syntax=docker/dockerfile:1

# Builder stage
FROM node:lts-alpine AS builder
WORKDIR /app
# Install build dependencies and source
COPY package.json package-lock.json tsconfig.json ./
COPY src ./src
RUN apk add --no-cache git python3 make g++ \
 && npm install \
 && npm run build

# Final stage
FROM node:lts-alpine
WORKDIR /app
# Copy built code and production dependencies
COPY --from=builder /app/build ./build
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./package.json

# Ensure output directory exists
RUN mkdir -p /app/output

ENV NODE_ENV=production
ENTRYPOINT ["node", "build/index.js"]

```

--------------------------------------------------------------------------------
/smithery.yaml:
--------------------------------------------------------------------------------

```yaml
# Smithery configuration file: https://smithery.ai/docs/build/project-config

startCommand:
  type: stdio
  commandFunction:
    # A JS function that produces the CLI command based on the given config to start the MCP on stdio.
    |-
    (config) => ({ command: 'node', args: ['build/index.js'], env: { OPENAI_API_KEY: config.openaiApiKey, AUDIO_PLAYER_COMMAND: config.audioPlayerCommand, DEFAULT_TTS_VOICE: config.defaultTtsVoice } })
  configSchema:
    # JSON Schema defining the configuration options for the MCP.
    type: object
    required:
      - openaiApiKey
    properties:
      openaiApiKey:
        type: string
        description: OpenAI API key for authentication
      audioPlayerCommand:
        type: string
        default: xdg-open
        description: Command to play audio
      defaultTtsVoice:
        type: string
        default: nova
        description: Default TTS voice
  exampleConfig:
    openaiApiKey: YOUR_OPENAI_API_KEY
    audioPlayerCommand: xdg-open
    defaultTtsVoice: nova

```

--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------

```markdown
# <span style="color: #FF69B4;">🕒 Changelog</span>

All notable changes to the **Blabber-MCP** project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

---

## <span style="color: #ADD8E6;">[0.1.2] - 2025-04-05</span>

### <span style="color: #90EE90;">✨ Added</span>

*   Configurable default voice via `DEFAULT_TTS_VOICE` environment variable.

## <span style="color: #FFD700;">[0.1.1] - 2025-04-05</span>

### <span style="color: #90EE90;">✨ Added</span>

*   Optional automatic playback of generated audio via `play: true` parameter.
*   Configurable audio player command via `AUDIO_PLAYER_COMMAND` environment variable (defaults to `xdg-open`).
*   Server now saves audio to `output/` directory and returns file path instead of base64 data.

## <span style="color: #FFA07A;">[0.1.0] - 2025-04-05</span>

### <span style="color: #90EE90;">✨ Added</span>

*   Initial Blabber-MCP server setup.
*   `text_to_speech` tool using OpenAI TTS API.
*   Support for selecting voice, model, and response format.
*   Requires `OPENAI_API_KEY` environment variable.
*   Basic project structure (`README.md`, `LICENSE`, `CHANGELOG.md`).
```

--------------------------------------------------------------------------------
/src/index.ts:
--------------------------------------------------------------------------------

```typescript
#!/usr/bin/env node

import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
  McpError,
  ErrorCode,
} from "@modelcontextprotocol/sdk/types.js";
import OpenAI from "openai";
import { APIError } from "openai/error.js";
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
import { exec } from 'child_process';

// --- Configuration ---
const API_KEY = process.env.OPENAI_API_KEY;
if (!API_KEY) {
  console.error("Error: OPENAI_API_KEY environment variable is not set.");
  process.exit(1);
}

const AUDIO_PLAYER_COMMAND = process.env.AUDIO_PLAYER_COMMAND || 'xdg-open';

// Define allowed voices
const ALLOWED_VOICES = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"] as const;
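// Derive a string-literal union ("alloy" | "echo" | ...) from the tuple so the voice list is defined in one place.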
type AllowedVoice = typeof ALLOWED_VOICES[number];

// Read default voice from env var, validate, and set default
let DEFAULT_VOICE: AllowedVoice = "nova"; 
const configuredDefaultVoice = process.env.DEFAULT_TTS_VOICE;
if (configuredDefaultVoice && (ALLOWED_VOICES as readonly string[]).includes(configuredDefaultVoice)) {
    DEFAULT_VOICE = configuredDefaultVoice as AllowedVoice;
    console.error(`Using configured default voice: ${DEFAULT_VOICE}`);
} else if (configuredDefaultVoice) {
    console.error(`Warning: Invalid DEFAULT_TTS_VOICE "${configuredDefaultVoice}" provided. Falling back to "${DEFAULT_VOICE}".`);
} else {
    console.error(`Using default voice: ${DEFAULT_VOICE}`);
}


const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
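// Resolve output/ relative to the compiled file (build/index.js), so audio
// lands in <project-root>/output regardless of the process's working directory.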
const OUTPUT_DIR = path.resolve(__dirname, '..', 'output');

const openai = new OpenAI({
  apiKey: API_KEY,
});

// --- MCP Server Setup ---
const server = new Server(
  {
    name: "@pink/pixel/blabber-mcp",
    version: "0.1.2", 
  },
  {
    capabilities: {
      tools: {},
    },
  }
);

// --- Tool Definition ---
const TEXT_TO_SPEECH_TOOL_NAME = "text_to_speech";

server.setRequestHandler(ListToolsRequestSchema, async () => {
  return {
    tools: [
      {
        name: TEXT_TO_SPEECH_TOOL_NAME,
        description: `Converts text into spoken audio using OpenAI TTS (default voice: ${DEFAULT_VOICE}), saves it to a file, and optionally plays it.`,
        inputSchema: {
          type: "object",
          properties: {
            input: {
              type: "string",
              description: "The text to synthesize into speech.",
            },
            voice: {
              type: "string",
              description: `Optional: The voice to use. Overrides the configured default (${DEFAULT_VOICE}).`,
              enum: [...ALLOWED_VOICES], // Use the defined constant
            },
            model: {
              type: "string",
              description: "The TTS model to use.",
              enum: ["tts-1", "tts-1-hd"],
              default: "tts-1",
            },
            response_format: {
              type: "string",
              description: "The format of the audio response.",
              enum: ["mp3", "opus", "aac", "flac"],
              default: "mp3",
            },
            play: {
              type: "boolean",
              description: "Whether to automatically play the generated audio file.",
              default: false,
            }
          },
          required: ["input"],
        },
      },
    ],
  };
});

// --- Tool Implementation ---

type TextToSpeechArgs = {
  input: string;
  voice?: AllowedVoice; // Use the specific type
  model?: "tts-1" | "tts-1-hd";
  response_format?: "mp3" | "opus" | "aac" | "flac";
  play?: boolean;
};

// Runtime type guard validating tool arguments at the MCP boundary
function isValidTextToSpeechArgs(args: any): args is TextToSpeechArgs {
  return (
    typeof args === "object" &&
    args !== null &&
    typeof args.input === "string" &&
    (args.voice === undefined || (ALLOWED_VOICES as readonly string[]).includes(args.voice)) && // Validate against allowed voices
    (args.model === undefined || ["tts-1", "tts-1-hd"].includes(args.model)) &&
    (args.response_format === undefined || ["mp3", "opus", "aac", "flac"].includes(args.response_format)) &&
    (args.play === undefined || typeof args.play === 'boolean')
  );
}

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name !== TEXT_TO_SPEECH_TOOL_NAME) {
    throw new McpError(ErrorCode.MethodNotFound, `Unknown tool: ${request.params.name}`);
  }

  if (!isValidTextToSpeechArgs(request.params.arguments)) {
    throw new McpError(ErrorCode.InvalidParams, "Invalid arguments for text_to_speech tool.");
  }

  const {
    input,
    // Use voice from args if provided, otherwise use the configured DEFAULT_VOICE
    voice = DEFAULT_VOICE,
    model = "tts-1",
    response_format = "mp3",
    play = false,
  } = request.params.arguments;

  // Ensure the final voice is valid (handles case where default might somehow be invalid, though unlikely with validation above)
  const finalVoice: AllowedVoice = (ALLOWED_VOICES as readonly string[]).includes(voice) ? voice : DEFAULT_VOICE;


  let playbackMessage = "";

  try {
    if (!fs.existsSync(OUTPUT_DIR)) {
      fs.mkdirSync(OUTPUT_DIR, { recursive: true });
      console.error(`Created output directory: ${OUTPUT_DIR}`);
    }

    console.error(`Generating speech with voice: ${finalVoice}`); // Log the voice being used

    const speechResponse = await openai.audio.speech.create({
      model: model,
      voice: finalVoice, // Use the validated final voice
      input: input,
      response_format: response_format,
    });

    const audioBuffer = Buffer.from(await speechResponse.arrayBuffer());
    const timestamp = Date.now();
    const filename = `speech_${timestamp}.${response_format}`;
    const filePath = path.join(OUTPUT_DIR, filename);
    const relativeFilePath = path.relative(process.cwd(), filePath);

    fs.writeFileSync(filePath, audioBuffer);
    console.error(`Audio saved to: ${filePath}`);

    if (play) {
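      // Fire-and-forget: exec() returns immediately; playback errors are only
      // logged to stderr and never surfaced in the tool response.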
      const command = `${AUDIO_PLAYER_COMMAND} "${filePath}"`;
      console.error(`Attempting to play audio with command: ${command}`);
      exec(command, (error, stdout, stderr) => {
        if (error) console.error(`Playback Error: ${error.message}`);
        if (stderr) console.error(`Playback Stderr: ${stderr}`);
        if (stdout) console.error(`Playback stdout: ${stdout}`);
      });
      playbackMessage = ` Playback initiated using command: ${AUDIO_PLAYER_COMMAND}.`;
    }

    return {
      content: [
        {
          type: "text",
          text: JSON.stringify({
            message: `Audio saved successfully.${playbackMessage}`,
            filePath: relativeFilePath,
            format: response_format,
            voiceUsed: finalVoice, // Inform client which voice was actually used
          }),
          mimeType: "application/json",
        },
      ],
    };
  } catch (error) {
    let errorMessage = "Failed to generate speech.";
    if (error instanceof APIError) {
      errorMessage = `OpenAI API Error (${error.status}): ${error.message}`;
    } else if (error instanceof Error) {
      errorMessage = error.message;
    }
    console.error(`[${TEXT_TO_SPEECH_TOOL_NAME} Error]`, errorMessage, error);
    return {
        content: [{ type: "text", text: errorMessage }],
        isError: true
    }
  }
});

// --- Server Start ---
async function main() {
  const transport = new StdioServerTransport();
  server.onerror = (error) => console.error("[MCP Error]", error);
  process.on('SIGINT', async () => {
      console.error("Received SIGINT, shutting down server...");
      await server.close();
      process.exit(0);
  });
  await server.connect(transport);
  console.error("OpenAI TTS MCP server running on stdio");
}

main().catch((error) => {
  console.error("Server failed to start:", error);
  process.exit(1);
});

```