# Directory Structure
```
├── .github
│   └── workflows
│       └── npm-publish.yml
├── .gitignore
├── assets
│   └── banner.png
├── CHANGELOG.md
├── LICENSE
├── package-lock.json
├── package.json
├── pnpm-lock.yaml
├── README.md
├── README.zh-CN.md
├── src
│   ├── config
│   │   ├── config-manager.ts
│   │   ├── constants.ts
│   │   └── schemas.ts
│   ├── core
│   │   ├── base-client.ts
│   │   ├── rate-limiter.ts
│   │   └── task-manager.ts
│   ├── index.ts
│   ├── services
│   │   ├── image-service.ts
│   │   └── tts-service.ts
│   └── utils
│       ├── error-handler.ts
│       └── file-handler.ts
└── tsconfig.json
```
# Files
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
```
# Dependency directories
node_modules/
npm-debug.log
yarn-debug.log
yarn-error.log
# Environment variables
.env
.env.local
.env.development.local
.env.test.local
.env.production.local
# Generated files
generated-images/
generated-audio/
# Build output
dist/
build/
# IDE and editor files
.idea/
.vscode/
*.swp
*.swo
.DS_Store
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
# Coverage directory used by tools like istanbul
coverage/
# nyc test coverage
.nyc_output/
# Optional npm cache directory
.npm
# Optional eslint cache
.eslintcache
.mcp.json
mcp.json
```
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
```markdown
# Minimax MCP Tools

A Model Context Protocol (MCP) server for Minimax AI integration, providing asynchronous image generation and text-to-speech with adaptive rate limiting and structured error handling.

English | [简体中文](README.zh-CN.md)

## MCP Configuration

Add to your MCP settings:
```json
{
  "mcpServers": {
    "minimax-mcp-tools": {
      "command": "npx",
      "args": ["minimax-mcp-tools"],
      "env": {
        "MINIMAX_API_KEY": "your_api_key_here"
      }
    }
  }
}
```
## Async Design - Perfect for Content Production at Scale
This MCP server uses an **asynchronous submit-and-barrier pattern** designed for **batch content creation**:
🎬 **Narrated Slideshow Production** - Generate dozens of slide images and corresponding narration in parallel
📚 **AI-Driven Audiobook Creation** - Produce chapters with multiple voice characters simultaneously
🖼️ **Website Asset Generation** - Create consistent visual content and audio elements for web projects
🎯 **Multimedia Content Pipelines** - Perfect for LLM-driven content workflows requiring both visuals and audio
### Architecture Benefits
1. **Submit Phase**: Tools return immediately with task IDs, tasks execute in background
2. **Smart Rate Limiting**: Adaptive rate limiting (10 RPM images, 20 RPM speech) with burst capacity
3. **Barrier Synchronization**: `task_barrier` waits for all tasks and returns comprehensive results
4. **Batch Optimization**: Submit multiple tasks to saturate rate limits, then barrier once for maximum throughput
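The submit-and-barrier flow can be sketched in plain TypeScript. This is a minimal illustration of the pattern, not the server's actual `TaskManager`; the class and method names here are assumptions for the sketch:

```typescript
// Minimal submit-and-barrier sketch: tasks start running as soon as they
// are submitted, and barrier() resolves only after every task has settled.
type TaskResult = { id: string; status: 'completed' | 'failed' };

class MiniTaskManager {
  private pending: Promise<TaskResult>[] = [];
  private nextId = 0;

  // Submit returns a task ID immediately; the work runs in the background.
  submit(work: () => Promise<void>): string {
    const id = `task-${String(++this.nextId).padStart(3, '0')}`;
    this.pending.push(
      work()
        .then(() => ({ id, status: 'completed' as const }))
        .catch(() => ({ id, status: 'failed' as const }))
    );
    return id;
  }

  // Barrier: wait for ALL submitted tasks, return results, clear the queue.
  async barrier(): Promise<TaskResult[]> {
    const results = await Promise.all(this.pending);
    this.pending = [];
    return results;
  }
}
```

In the real server, `submit_image_generation` and `submit_speech_generation` play the role of `submit`, and `task_barrier` plays the role of `barrier`.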
## Tools
### `submit_image_generation`
**Submit Image Generation Task** - Generate images asynchronously.
**Required:** `prompt`, `outputFile`
**Optional:** `aspectRatio`, `customSize`, `seed`, `subjectReference`, `style`
### `submit_speech_generation`
**Submit Speech Generation Task** - Convert text to speech asynchronously.
**Required:** `text`, `outputFile`
**Optional:** `highQuality`, `voiceId`, `speed`, `volume`, `pitch`, `emotion`, `format`, `sampleRate`, `bitrate`, `languageBoost`, `intensity`, `timbre`, `sound_effects`
### `task_barrier`
**Wait for Task Completion** - Wait for ALL submitted tasks to complete and retrieve results. Essential for batch processing.
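As an illustration, a batch session might submit several tasks and then call `task_barrier` once. The argument values below are hypothetical; the parameter names come from the tool descriptions above:

```json
[
  {
    "tool": "submit_image_generation",
    "arguments": {
      "prompt": "Minimalist lighthouse at dawn, flat illustration",
      "outputFile": "/work/slides/slide-01.png",
      "aspectRatio": "16:9"
    }
  },
  {
    "tool": "submit_speech_generation",
    "arguments": {
      "text": "Welcome to chapter one.",
      "outputFile": "/work/audio/chapter-01.mp3",
      "speed": 1.0
    }
  },
  { "tool": "task_barrier", "arguments": {} }
]
```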
## Architecture
```mermaid
sequenceDiagram
  participant User
  participant MCP as MCP Server
  participant TM as Task Manager
  participant API as Minimax API
  Note over User, API: Async Submit-and-Barrier Pattern
  User->>MCP: submit_image_generation(prompt1)
  MCP->>TM: submitImageTask()
  TM-->>MCP: taskId: img-001
  MCP-->>User: "Task img-001 submitted"
  par Background Execution (Rate Limited)
    TM->>API: POST /image/generate
    API-->>TM: image data + save file
  end
  User->>MCP: submit_speech_generation(text1)
  MCP->>TM: submitTTSTask()
  TM-->>MCP: taskId: tts-002
  MCP-->>User: "Task tts-002 submitted"
  par Background Execution (Rate Limited)
    TM->>API: POST /speech/generate
    API-->>TM: audio data + save file
  end
  User->>MCP: submit_image_generation(prompt2)
  MCP->>TM: submitImageTask()
  TM-->>MCP: taskId: img-003
  MCP-->>User: "Task img-003 submitted"
  par Background Execution (Rate Limited)
    TM->>API: POST /image/generate (queued)
    API-->>TM: image data + save file
  end
  User->>MCP: task_barrier()
  MCP->>TM: barrier()
  TM->>TM: wait for all tasks
  TM-->>MCP: results summary
  MCP-->>User: ✅ All tasks completed<br/>Files available at specified paths
  Note over User, API: Immediate Task Submission + Background Rate-Limited Execution
```
## License
MIT
```
--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
```markdown
# Changelog
## [2.2.0] - 2025-08-14
### Added
- Speech 2.5 series models (`speech-2.5-hd-preview`, `speech-2.5-turbo-preview`)
- 13 additional language boost options
### Changed
- **BREAKING**: Removed Speech 2.0 series models
- Default model: `speech-02-hd` → `speech-2.5-hd-preview`
### Fixed
- Task barrier bug returning 0 completed tasks
```
--------------------------------------------------------------------------------
/.github/workflows/npm-publish.yml:
--------------------------------------------------------------------------------
```yaml
# This workflow will build TypeScript and publish a package to npm when a release is created
# For more information see: https://docs.github.com/en/actions/publishing-packages/publishing-nodejs-packages
name: Build and Publish

on:
  release:
    types: [created]

jobs:
  publish-npm:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          registry-url: https://registry.npmjs.org/
      - name: Install pnpm
        uses: pnpm/action-setup@v3
        with:
          version: 8
      - name: Install dependencies
        run: pnpm install --no-frozen-lockfile
      - name: Build TypeScript
        run: pnpm run build
      - name: Publish to npm
        run: pnpm publish --no-git-checks
        env:
          NODE_AUTH_TOKEN: ${{secrets.npm_token}}
```
--------------------------------------------------------------------------------
/tsconfig.json:
--------------------------------------------------------------------------------
```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "ESNext",
    "moduleResolution": "Node",
    "lib": ["ES2022", "DOM"],
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true,
    "declaration": true,
    "declarationMap": true,
    "sourceMap": true,
    "removeComments": true,
    "allowSyntheticDefaultImports": true,
    "resolveJsonModule": true,
    "experimentalDecorators": true,
    "emitDecoratorMetadata": true,
    "noImplicitAny": true,
    "noImplicitReturns": true,
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "exactOptionalPropertyTypes": true,
    "noImplicitOverride": true,
    "noPropertyAccessFromIndexSignature": true,
    "noUncheckedIndexedAccess": true
  },
  "include": [
    "src/**/*"
  ],
  "exclude": [
    "node_modules",
    "dist",
    "**/*.test.ts",
    "**/*.spec.ts"
  ],
  "ts-node": {
    "esm": true,
    "experimentalSpecifierResolution": "node"
  }
}
```
--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------
```json
{
  "name": "minimax-mcp-tools",
  "version": "2.2.1",
  "description": "Async MCP server with Minimax API integration for image generation and text-to-speech",
  "type": "module",
  "main": "dist/index.js",
  "bin": {
    "minimax-mcp-tools": "dist/index.js"
  },
  "scripts": {
    "build": "tsc",
    "start": "node dist/index.js",
    "dev": "ts-node src/index.ts",
    "dev:watch": "nodemon --exec ts-node src/index.ts",
    "test": "echo \"Error: no test specified\" && exit 1",
    "prepublishOnly": "npm run build"
  },
  "keywords": [
    "mcp",
    "minimax",
    "ai",
    "image-generation",
    "text-to-speech",
    "tts"
  ],
  "author": "PsychArch (https://github.com/PsychArch)",
  "license": "MIT",
  "repository": {
    "type": "git",
    "url": "git+https://github.com/PsychArch/minimax-mcp-tools.git"
  },
  "bugs": {
    "url": "https://github.com/PsychArch/minimax-mcp-tools/issues"
  },
  "homepage": "https://github.com/PsychArch/minimax-mcp-tools#readme",
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.17.0",
    "node-fetch": "^3.3.2",
    "zod": "^3.25.76"
  },
  "devDependencies": {
    "@types/node": "^20.19.9",
    "nodemon": "^3.0.0",
    "ts-node": "^10.9.0",
    "typescript": "^5.3.0"
  },
  "engines": {
    "node": ">=16.0.0"
  },
  "files": [
    "dist/",
    "src/",
    "README.md",
    "README.zh-CN.md",
    "LICENSE",
    "assets/"
  ],
  "publishConfig": {
    "access": "public"
  }
}
```
--------------------------------------------------------------------------------
/src/config/config-manager.ts:
--------------------------------------------------------------------------------
```typescript
import { MinimaxConfigError } from '../utils/error-handler.js';
interface Config {
apiKey: string;
apiHost: string;
logLevel: 'error' | 'debug';
tempDir: string;
maxConcurrency: number;
retryAttempts: number;
retryDelay: number;
}
interface RetryConfig {
attempts: number;
delay: number;
}
export class ConfigManager {
private static instance: ConfigManager | null = null;
private config!: Config;
constructor() {
if (ConfigManager.instance) {
return ConfigManager.instance;
}
this.config = this.loadConfig();
ConfigManager.instance = this;
}
static getInstance(): ConfigManager {
if (!ConfigManager.instance) {
ConfigManager.instance = new ConfigManager();
}
return ConfigManager.instance;
}
private loadConfig(): Config {
return {
apiKey: this.getRequiredEnv('MINIMAX_API_KEY'),
apiHost: 'https://api.minimaxi.com',
logLevel: 'error',
tempDir: '/tmp',
maxConcurrency: 5,
retryAttempts: 3,
retryDelay: 1000
};
}
private getRequiredEnv(key: string): string {
const value = process.env[key];
if (!value) {
throw new MinimaxConfigError(`Required environment variable ${key} is not set`);
}
return value;
}
get<K extends keyof Config>(key: K): Config[K] {
return this.config[key];
}
getApiKey(): string {
return this.config.apiKey;
}
getApiHost(): string {
return this.config.apiHost;
}
getTempDir(): string {
return this.config.tempDir;
}
getMaxConcurrency(): number {
return this.config.maxConcurrency;
}
getRetryConfig(): RetryConfig {
return {
attempts: this.config.retryAttempts,
delay: this.config.retryDelay
};
}
isDebugMode(): boolean {
return this.config.logLevel === 'debug';
}
// Validate configuration
validate(): boolean {
const required: Array<keyof Config> = ['apiKey'];
const missing = required.filter(key => !this.config[key]);
if (missing.length > 0) {
throw new MinimaxConfigError(`Missing required configuration: ${missing.join(', ')}`);
}
return true;
}
}
```
--------------------------------------------------------------------------------
/src/core/rate-limiter.ts:
--------------------------------------------------------------------------------
```typescript
import { MinimaxRateLimitError } from '../utils/error-handler.js';
interface RateLimiterConfig {
rpm: number;
burst?: number;
window?: number;
}
interface AdaptiveRateLimiterConfig extends RateLimiterConfig {
backoffFactor?: number;
recoveryFactor?: number;
maxBackoff?: number;
}
interface QueueRequest {
resolve: () => void;
reject: (error: Error) => void;
timestamp: number;
}
interface RateLimiterStatus {
tokens: number;
queueLength: number;
rpm: number;
burst: number;
}
export interface AdaptiveStatus extends RateLimiterStatus {
consecutiveErrors: number;
adaptedRpm: number;
originalRpm: number;
}
export class RateLimiter {
protected rpm: number;
protected burst: number;
protected window: number;
protected interval: number;
protected tokens: number;
protected lastRefill: number;
protected queue: QueueRequest[];
constructor({ rpm, burst = 1, window = 60000 }: RateLimiterConfig) {
this.rpm = rpm;
this.burst = burst;
this.window = window;
this.interval = window / rpm;
// Token bucket algorithm
this.tokens = burst;
this.lastRefill = Date.now();
this.queue = [];
}
async acquire(): Promise<void> {
return new Promise<void>((resolve, reject) => {
this.queue.push({ resolve, reject, timestamp: Date.now() });
this.processQueue();
});
}
private processQueue(): void {
if (this.queue.length === 0) return;
this.refillTokens();
while (this.queue.length > 0 && this.tokens > 0) {
const request = this.queue.shift();
if (!request) break;
this.tokens--;
// Schedule the next refill
const delay = Math.max(0, this.interval - (Date.now() - this.lastRefill));
setTimeout(() => this.processQueue(), delay);
request.resolve();
}
}
private refillTokens(): void {
const now = Date.now();
const timePassed = now - this.lastRefill;
const tokensToAdd = Math.floor(timePassed / this.interval);
if (tokensToAdd > 0) {
this.tokens = Math.min(this.burst, this.tokens + tokensToAdd);
this.lastRefill = now;
}
}
getStatus(): RateLimiterStatus {
this.refillTokens();
return {
tokens: this.tokens,
queueLength: this.queue.length,
rpm: this.rpm,
burst: this.burst
};
}
reset(): void {
this.tokens = this.burst;
this.lastRefill = Date.now();
this.queue = [];
}
}
export class AdaptiveRateLimiter extends RateLimiter {
private consecutiveErrors: number;
private originalRpm: number;
private backoffFactor: number;
private recoveryFactor: number;
private maxBackoff: number;
constructor(config: AdaptiveRateLimiterConfig) {
super(config);
this.consecutiveErrors = 0;
this.originalRpm = this.rpm;
this.backoffFactor = config.backoffFactor || 0.5;
this.recoveryFactor = config.recoveryFactor || 1.1;
this.maxBackoff = config.maxBackoff || 5;
}
onSuccess(): void {
// Gradually recover rate limit after success
if (this.consecutiveErrors > 0) {
this.consecutiveErrors = Math.max(0, this.consecutiveErrors - 1);
if (this.consecutiveErrors === 0) {
this.rpm = Math.min(this.originalRpm, this.rpm * this.recoveryFactor);
this.interval = this.window / this.rpm;
}
}
}
onError(error: Error): void {
if (error instanceof MinimaxRateLimitError) {
this.consecutiveErrors++;
// Reduce rate limit on consecutive errors
const backoffMultiplier = Math.pow(this.backoffFactor, Math.min(this.consecutiveErrors, this.maxBackoff));
this.rpm = Math.max(1, this.originalRpm * backoffMultiplier);
this.interval = this.window / this.rpm;
// Clear some tokens to enforce the new limit
this.tokens = Math.min(this.tokens, Math.floor(this.burst * backoffMultiplier));
}
}
getAdaptiveStatus(): AdaptiveStatus {
return {
...this.getStatus(),
consecutiveErrors: this.consecutiveErrors,
adaptedRpm: this.rpm,
originalRpm: this.originalRpm
};
}
}
```
--------------------------------------------------------------------------------
/src/core/base-client.ts:
--------------------------------------------------------------------------------
```typescript
import fetch from 'node-fetch';
import type { RequestInit, Response } from 'node-fetch';
import { ConfigManager } from '../config/config-manager.js';
import { API_CONFIG } from '../config/constants.js';
import { ErrorHandler, MinimaxError } from '../utils/error-handler.js';
interface BaseClientOptions {
baseURL?: string;
timeout?: number;
}
interface RequestOptions extends Omit<RequestInit, 'body'> {
body?: any;
headers?: Record<string, string>;
}
interface HealthCheckResult {
status: 'healthy' | 'unhealthy';
timestamp: string;
error?: string;
}
interface APIResponse {
base_resp?: {
status_code: number;
status_msg?: string;
};
[key: string]: any;
}
export class MinimaxBaseClient {
protected config: ConfigManager;
protected baseURL: string;
protected timeout: number;
protected retryConfig: { attempts: number; delay: number };
constructor(options: BaseClientOptions = {}) {
this.config = ConfigManager.getInstance();
this.baseURL = options.baseURL || API_CONFIG.BASE_URL;
this.timeout = options.timeout || API_CONFIG.TIMEOUT;
this.retryConfig = this.config.getRetryConfig();
}
async makeRequest(endpoint: string, options: RequestOptions = {}): Promise<APIResponse> {
const url = `${this.baseURL}${endpoint}`;
const headers: Record<string, string> = {
'Authorization': `Bearer ${this.config.getApiKey()}`,
'Content-Type': 'application/json',
...API_CONFIG.HEADERS,
...options.headers
};
// Spread options first so the merged headers (including Authorization) win
const requestOptions: RequestInit & { timeout?: number } = {
...options,
method: options.method || 'POST',
headers,
timeout: this.timeout
};
if (options.body && requestOptions.method !== 'GET') {
requestOptions.body = JSON.stringify(options.body);
}
return this.executeWithRetry(url, requestOptions);
}
private async executeWithRetry(url: string, requestOptions: RequestInit & { timeout?: number }, attempt: number = 1): Promise<APIResponse> {
try {
const controller = new AbortController();
const timeoutId = setTimeout(() => controller.abort(), this.timeout);
const response: Response = await fetch(url, {
...requestOptions,
signal: controller.signal
});
clearTimeout(timeoutId);
if (!response.ok) {
const errorText = await response.text();
throw new Error(`HTTP ${response.status}: ${errorText}`);
}
const data = await response.json() as APIResponse;
return this.processResponse(data);
} catch (error: any) {
const processedError = ErrorHandler.handleAPIError(error);
// Retry logic for certain errors
if (this.shouldRetry(processedError, attempt)) {
await this.delay(this.retryConfig.delay * attempt);
return this.executeWithRetry(url, requestOptions, attempt + 1);
}
throw processedError;
}
}
private processResponse(data: APIResponse): APIResponse {
// Check for API-level errors in response
if (data.base_resp && data.base_resp.status_code !== 0) {
throw ErrorHandler.handleAPIError(new Error('API Error'), data);
}
return data;
}
private shouldRetry(error: MinimaxError, attempt: number): boolean {
if (attempt >= this.retryConfig.attempts) {
return false;
}
// Retry on network errors, timeouts, and 5xx errors
return (
error.code === 'NETWORK_ERROR' ||
error.code === 'TIMEOUT_ERROR' ||
('statusCode' in error && typeof error.statusCode === 'number' && error.statusCode >= 500 && error.statusCode < 600)
);
}
private async delay(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms));
}
async get(endpoint: string, options: RequestOptions = {}): Promise<APIResponse> {
return this.makeRequest(endpoint, { ...options, method: 'GET' });
}
async post(endpoint: string, body?: any, options: RequestOptions = {}): Promise<APIResponse> {
return this.makeRequest(endpoint, { ...options, method: 'POST', body });
}
// Health check method
async healthCheck(): Promise<HealthCheckResult> {
try {
// Make a simple request to verify connectivity
await this.get('/health');
return { status: 'healthy', timestamp: new Date().toISOString() };
} catch (error: any) {
return {
status: 'unhealthy',
error: ErrorHandler.formatErrorForUser(error),
timestamp: new Date().toISOString()
};
}
}
}
```
--------------------------------------------------------------------------------
/src/utils/file-handler.ts:
--------------------------------------------------------------------------------
```typescript
import fs from 'fs/promises';
import path from 'path';
import fetch from 'node-fetch';
import type { RequestInit } from 'node-fetch';
import { MinimaxError } from './error-handler.js';
interface DownloadOptions {
timeout?: number;
fetchOptions?: RequestInit;
}
interface FileStats {
size: number;
isFile(): boolean;
isDirectory(): boolean;
mtime: Date;
ctime: Date;
}
export class FileHandler {
static async ensureDirectoryExists(filePath: string): Promise<void> {
const dir = path.dirname(filePath);
try {
await fs.mkdir(dir, { recursive: true });
} catch (error: any) {
throw new MinimaxError(`Failed to create directory: ${error.message}`);
}
}
static async writeFile(filePath: string, data: string | Buffer, options: any = {}): Promise<void> {
try {
await this.ensureDirectoryExists(filePath);
await fs.writeFile(filePath, data, options);
} catch (error: any) {
throw new MinimaxError(`Failed to write file ${filePath}: ${error.message}`);
}
}
static async readFile(filePath: string, options: any = {}): Promise<Buffer | string> {
try {
return await fs.readFile(filePath, options);
} catch (error: any) {
throw new MinimaxError(`Failed to read file ${filePath}: ${error.message}`);
}
}
static async downloadFile(url: string, outputPath: string, options: DownloadOptions = {}): Promise<string> {
try {
await this.ensureDirectoryExists(outputPath);
const response = await fetch(url, {
...options.fetchOptions
});
if (!response.ok) {
throw new Error(`HTTP ${response.status}: ${response.statusText}`);
}
// node-fetch v3 deprecates response.buffer(); use arrayBuffer() instead
const buffer = Buffer.from(await response.arrayBuffer());
await fs.writeFile(outputPath, buffer);
return outputPath;
} catch (error: any) {
throw new MinimaxError(`Failed to download file from ${url}: ${error.message}`);
}
}
static async convertToBase64(input: string): Promise<string> {
try {
let buffer: Buffer;
if (input.startsWith('http://') || input.startsWith('https://')) {
// Download URL and convert to base64
const response = await fetch(input);
if (!response.ok) {
throw new Error(`HTTP ${response.status}: ${response.statusText}`);
}
// node-fetch v3 deprecates response.buffer(); use arrayBuffer() instead
buffer = Buffer.from(await response.arrayBuffer());
} else {
// Read local file
const fileData = await this.readFile(input);
buffer = Buffer.isBuffer(fileData) ? fileData : Buffer.from(fileData as string);
}
return `data:image/jpeg;base64,${buffer.toString('base64')}`;
} catch (error: any) {
throw new MinimaxError(`Failed to convert to base64: ${error.message}`);
}
}
static generateUniqueFilename(basePath: string, index: number, total: number): string {
if (total === 1) {
return basePath;
}
const dir = path.dirname(basePath);
const ext = path.extname(basePath);
const name = path.basename(basePath, ext);
return path.join(dir, `${name}_${String(index + 1).padStart(2, '0')}${ext}`);
}
static validateFilePath(filePath: string): boolean {
if (!filePath || typeof filePath !== 'string') {
throw new MinimaxError('File path must be a non-empty string');
}
if (!path.isAbsolute(filePath)) {
throw new MinimaxError('File path must be absolute');
}
return true;
}
static getFileExtension(format: string): string {
const extensions: Record<string, string> = {
mp3: '.mp3',
wav: '.wav',
flac: '.flac',
pcm: '.pcm',
jpg: '.jpg',
jpeg: '.jpeg',
png: '.png',
webp: '.webp'
};
return extensions[format.toLowerCase()] || `.${format}`;
}
static async fileExists(filePath: string): Promise<boolean> {
try {
await fs.access(filePath);
return true;
} catch {
return false;
}
}
static async getFileStats(filePath: string): Promise<FileStats> {
try {
const stats = await fs.stat(filePath);
return {
size: stats.size,
isFile: () => stats.isFile(),
isDirectory: () => stats.isDirectory(),
mtime: stats.mtime,
ctime: stats.ctime
};
} catch (error: any) {
throw new MinimaxError(`Failed to get file stats: ${error.message}`);
}
}
static async saveBase64Image(base64Data: string, outputPath: string): Promise<void> {
try {
await this.ensureDirectoryExists(outputPath);
// Remove data URL prefix if present
const cleanBase64 = base64Data.replace(/^data:image\/\w+;base64,/, '');
const buffer = Buffer.from(cleanBase64, 'base64');
await fs.writeFile(outputPath, buffer);
} catch (error: any) {
throw new MinimaxError(`Failed to save base64 image: ${error.message}`);
}
}
}
```
--------------------------------------------------------------------------------
/src/services/image-service.ts:
--------------------------------------------------------------------------------
```typescript
import { MinimaxBaseClient } from '../core/base-client.js';
import { API_CONFIG, DEFAULTS, MODELS, CONSTRAINTS, type ImageModel } from '../config/constants.js';
import { FileHandler } from '../utils/file-handler.js';
import { ErrorHandler } from '../utils/error-handler.js';
import { type ImageGenerationParams } from '../config/schemas.js';
interface ImageGenerationPayload {
model: string;
prompt: string;
n: number;
prompt_optimizer: boolean;
response_format: string;
width?: number;
height?: number;
aspect_ratio?: string;
seed?: number;
subject_reference?: Array<{
type: string;
image_file: string;
}>;
style?: {
style_type: string;
style_weight: number;
};
}
interface ImageGenerationResponse {
data?: {
image_urls?: string[];
image_base64?: string[];
};
}
interface ImageGenerationResult {
files: string[];
count: number;
model: string;
prompt: string;
warnings?: string[];
}
export class ImageGenerationService extends MinimaxBaseClient {
constructor(options: { baseURL?: string; timeout?: number } = {}) {
super(options);
}
async generateImage(params: ImageGenerationParams): Promise<ImageGenerationResult> {
try {
// Build API payload (MCP handles validation)
const payload = this.buildPayload(params);
// Make API request
const response = await this.post(API_CONFIG.ENDPOINTS.IMAGE_GENERATION, payload) as ImageGenerationResponse;
// Process response
return await this.processImageResponse(response, params);
} catch (error: any) {
const processedError = ErrorHandler.handleAPIError(error);
ErrorHandler.logError(processedError, { service: 'image', params });
// Throw the error so task manager can properly mark it as failed
throw processedError;
}
}
private buildPayload(params: ImageGenerationParams): ImageGenerationPayload {
const imageDefaults = DEFAULTS.IMAGE as any;
// Choose model based on whether style is provided
const model = params.style ? 'image-01-live' : 'image-01';
const payload: ImageGenerationPayload = {
model: model,
prompt: params.prompt,
n: 1,
prompt_optimizer: true, // Always optimize prompts
response_format: 'url' // Always use URL since we save to file
};
// Handle sizing parameters (conflict-free approach)
if (params.customSize) {
payload.width = params.customSize.width;
payload.height = params.customSize.height;
} else {
payload.aspect_ratio = params.aspectRatio || imageDefaults.aspectRatio;
}
// Add optional parameters
if (params.seed !== undefined) {
payload.seed = params.seed;
}
// Model-specific parameter handling
if (model === 'image-01') {
// Add subject reference for image-01 model
// MCP Server Bridge: Convert user-friendly file path to API format
if (params.subjectReference) {
// TODO: Convert file path/URL to base64 or ensure URL is accessible
// For now, pass through assuming it's already in correct format
payload.subject_reference = [{
type: 'character',
image_file: params.subjectReference
}];
}
} else if (model === 'image-01-live') {
// Add style settings for image-01-live model
if (params.style) {
payload.style = {
style_type: params.style.style_type,
style_weight: params.style.style_weight || 0.8
};
}
}
return payload;
}
private async processImageResponse(response: ImageGenerationResponse, params: ImageGenerationParams): Promise<ImageGenerationResult> {
// Handle both URL and base64 responses
const imageUrls = response.data?.image_urls || [];
const imageBase64 = response.data?.image_base64 || [];
if (!imageUrls.length && !imageBase64.length) {
throw new Error('No images generated in API response');
}
// Download and save images
const savedFiles: string[] = [];
const errors: string[] = [];
const imageSources = imageUrls.length ? imageUrls : imageBase64;
for (let i = 0; i < imageSources.length; i++) {
try {
const filename = FileHandler.generateUniqueFilename(params.outputFile, i, imageSources.length);
if (imageBase64.length && !imageUrls.length) {
// Save base64 image
await FileHandler.saveBase64Image(imageBase64[i]!, filename);
} else {
// Download from URL
await FileHandler.downloadFile(imageSources[i]!, filename);
}
savedFiles.push(filename);
} catch (error: any) {
errors.push(`Image ${i + 1}: ${error.message}`);
}
}
if (savedFiles.length === 0) {
throw new Error(`Failed to save any images: ${errors.join('; ')}`);
}
// Use the actual model that was used
const modelUsed = params.style ? 'image-01-live' : 'image-01';
const result: ImageGenerationResult = {
files: savedFiles,
count: savedFiles.length,
model: modelUsed,
prompt: params.prompt
};
if (errors.length > 0) {
result.warnings = errors;
}
return result;
}
// Utility methods
async validateSubjectReference(reference: string): Promise<string | null> {
if (!reference) return null;
try {
return await FileHandler.convertToBase64(reference);
} catch (error: any) {
throw new Error(`Invalid subject reference: ${error.message}`);
}
}
getSupportedModels(): string[] {
return Object.keys(MODELS.IMAGE);
}
getSupportedAspectRatios(): readonly string[] {
return CONSTRAINTS.IMAGE.ASPECT_RATIOS;
}
getModelInfo(modelName: string): { name: string; description: string } | null {
return MODELS.IMAGE[modelName as ImageModel] || null;
}
}
```
--------------------------------------------------------------------------------
/src/utils/error-handler.ts:
--------------------------------------------------------------------------------
```typescript
// Custom error classes for better error handling
export class MinimaxError extends Error {
public readonly code: string;
public readonly details: any;
public readonly timestamp: string;
constructor(message: string, code: string = 'MINIMAX_ERROR', details: any = null) {
super(message);
this.name = this.constructor.name;
this.code = code;
this.details = details;
this.timestamp = new Date().toISOString();
}
toJSON(): {
name: string;
message: string;
code: string;
details: any;
timestamp: string;
} {
return {
name: this.name,
message: this.message,
code: this.code,
details: this.details,
timestamp: this.timestamp
};
}
}
export class MinimaxConfigError extends MinimaxError {
constructor(message: string, details: any = null) {
super(message, 'CONFIG_ERROR', details);
}
}
export class MinimaxAPIError extends MinimaxError {
public readonly statusCode: number | null;
public readonly response: any;
constructor(message: string, statusCode: number | null = null, response: any = null) {
super(message, 'API_ERROR', { statusCode, response });
this.statusCode = statusCode;
this.response = response;
}
}
export class MinimaxValidationError extends MinimaxError {
public readonly field: string | null;
public readonly value: any;
constructor(message: string, field: string | null = null, value: any = null) {
super(message, 'VALIDATION_ERROR', { field, value });
this.field = field;
this.value = value;
}
}
export class MinimaxNetworkError extends MinimaxError {
public readonly originalError: Error | null;
constructor(message: string, originalError: Error | null = null) {
super(message, 'NETWORK_ERROR', { originalError: originalError?.message });
this.originalError = originalError;
}
}
export class MinimaxTimeoutError extends MinimaxError {
public readonly timeout: number | null;
constructor(message: string, timeout: number | null = null) {
super(message, 'TIMEOUT_ERROR', { timeout });
this.timeout = timeout;
}
}
export class MinimaxRateLimitError extends MinimaxError {
public readonly retryAfter: number | null;
constructor(message: string, retryAfter: number | null = null) {
super(message, 'RATE_LIMIT_ERROR', { retryAfter });
this.retryAfter = retryAfter;
}
}
// API Response interface for better typing
interface APIResponse {
base_resp?: {
status_code: number;
status_msg?: string;
retry_after?: number;
};
}
// Error with common Node.js error properties
interface NodeError extends Error {
code?: string;
timeout?: number;
}
// Error handler utility functions
export class ErrorHandler {
static handleAPIError(error: NodeError, response?: APIResponse): MinimaxError {
// Handle different types of API errors
if (response?.base_resp && response.base_resp.status_code !== 0) {
const statusCode = response.base_resp.status_code;
const message = response.base_resp.status_msg || 'API request failed';
switch (statusCode) {
case 1004:
return new MinimaxAPIError(`Authentication failed: ${message}`, statusCode, response);
case 1013:
return new MinimaxRateLimitError(`Rate limit exceeded: ${message}`, response.base_resp.retry_after ?? null);
default:
return new MinimaxAPIError(message, statusCode, response);
}
}
// Handle HTTP errors
if (error.message && error.message.includes('HTTP')) {
const match = error.message.match(/HTTP (\d+):/);
const statusCode = match ? parseInt(match[1]!, 10) : null;
switch (statusCode) {
case 401:
return new MinimaxAPIError('Unauthorized: Invalid API key', statusCode!);
case 403:
return new MinimaxAPIError('Forbidden: Access denied', statusCode!);
case 404:
return new MinimaxAPIError('Not found: Invalid endpoint', statusCode!);
case 429:
return new MinimaxRateLimitError('Rate limit exceeded', null);
case 500:
return new MinimaxAPIError('Internal server error', statusCode!);
default:
return new MinimaxAPIError(error.message, statusCode!);
}
}
// Handle network errors
if (error.code === 'ECONNREFUSED' || error.code === 'ENOTFOUND') {
return new MinimaxNetworkError('Network connection failed', error);
}
// Handle timeout errors
if (error.name === 'AbortError' || (error.message && error.message.includes('timeout'))) {
return new MinimaxTimeoutError('Request timeout', error.timeout);
}
// Default to generic error
return new MinimaxError(error.message || 'Unknown error occurred');
}
static formatErrorForUser(error: Error): string {
if (error instanceof MinimaxConfigError) {
return `Configuration Error: ${error.message}`;
}
if (error instanceof MinimaxValidationError) {
return `Validation Error: ${error.message}`;
}
if (error instanceof MinimaxAPIError) {
return `API Error: ${error.message}`;
}
if (error instanceof MinimaxNetworkError) {
return `Network Error: ${error.message}`;
}
if (error instanceof MinimaxTimeoutError) {
return `Timeout Error: ${error.message}`;
}
if (error instanceof MinimaxRateLimitError) {
return `Rate Limit Error: ${error.message}`;
}
return `Error: ${error.message}`;
}
static logError(error: Error, context: Record<string, any> = {}): void {
const logEntry = {
timestamp: new Date().toISOString(),
error: error instanceof MinimaxError ? error.toJSON() : {
name: error.name,
message: error.message,
stack: error.stack
},
context
};
if (typeof console !== 'undefined') {
console.error('[MINIMAX-ERROR]', JSON.stringify(logEntry, null, 2));
}
}
}
```
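The HTTP branch in `handleAPIError` keys off error messages shaped like `HTTP 429: ...`. A minimal standalone sketch of that parse (a plain function, not the module's exports, with the error classes omitted):

```typescript
// Sketch of the status extraction used by ErrorHandler.handleAPIError:
// returns the parsed status, or null when the message has no "HTTP <n>:" prefix.
function parseHttpStatus(message: string): number | null {
  const match = message.match(/HTTP (\d+):/);
  return match ? parseInt(match[1]!, 10) : null;
}

console.log(parseHttpStatus('HTTP 429: Too Many Requests')); // 429
console.log(parseHttpStatus('socket hang up'));              // null
```

With the status in hand, the switch maps well-known codes (401, 429, 500) to typed error classes and falls through to a generic `MinimaxAPIError` otherwise.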
--------------------------------------------------------------------------------
/src/index.ts:
--------------------------------------------------------------------------------
```typescript
#!/usr/bin/env node
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
// Import refactored components
import { ConfigManager } from './config/config-manager.js';
import {
imageGenerationSchema,
textToSpeechSchema,
taskBarrierSchema,
validateImageParams,
validateTTSParams,
validateTaskBarrierParams
} from './config/schemas.js';
import { ImageGenerationService } from './services/image-service.js';
import { TextToSpeechService } from './services/tts-service.js';
import { RateLimitedTaskManager } from './core/task-manager.js';
import { ErrorHandler } from './utils/error-handler.js';
// MCP Tool Response interface
interface ToolResponse {
[x: string]: unknown;
content: Array<{
type: "text";
text: string;
}>;
}
// Initialize configuration and services
let config: ConfigManager;
let imageService: ImageGenerationService;
let ttsService: TextToSpeechService;
let taskManager: RateLimitedTaskManager;
try {
config = ConfigManager.getInstance();
config.validate();
imageService = new ImageGenerationService();
ttsService = new TextToSpeechService();
taskManager = new RateLimitedTaskManager();
} catch (error: any) {
console.error("❌ Failed to initialize:", ErrorHandler.formatErrorForUser(error));
process.exit(1);
}
// Create MCP server
const server = new McpServer({
name: "minimax-mcp-tools",
version: "2.2.0",
description: "Async Minimax AI integration for image generation and text-to-speech"
});
// Image generation tool
server.registerTool(
"submit_image_generation",
{
title: "Submit Image Generation Task",
description: "Generate images asynchronously. RECOMMENDED: Submit multiple tasks in batch to saturate rate limits, then call task_barrier once to wait for all completions. Returns task ID only - actual files available after task_barrier.",
inputSchema: imageGenerationSchema.shape
},
async (params: unknown): Promise<ToolResponse> => {
try {
const validatedParams = validateImageParams(params);
const { taskId } = await taskManager.submitImageTask(async () => {
return await imageService.generateImage(validatedParams);
});
return {
content: [{
type: "text",
text: `Task ${taskId} submitted`
}]
};
} catch (error: any) {
ErrorHandler.logError(error, { tool: 'submit_image_generation', params });
return {
content: [{
type: "text",
text: `❌ Failed to submit image generation task: ${ErrorHandler.formatErrorForUser(error)}`
}]
};
}
}
);
// Text-to-speech tool
server.registerTool(
"submit_speech_generation",
{
title: "Submit Speech Generation Task",
description: "Convert text to speech asynchronously. RECOMMENDED: Submit multiple tasks in batch to saturate rate limits, then call task_barrier once to wait for all completions. Returns task ID only - actual files available after task_barrier.",
inputSchema: textToSpeechSchema.shape
},
async (params: unknown): Promise<ToolResponse> => {
try {
const validatedParams = validateTTSParams(params);
const { taskId } = await taskManager.submitTTSTask(async () => {
return await ttsService.generateSpeech(validatedParams);
});
return {
content: [{
type: "text",
text: `Task ${taskId} submitted`
}]
};
} catch (error: any) {
ErrorHandler.logError(error, { tool: 'submit_speech_generation', params });
return {
content: [{
type: "text",
text: `❌ Failed to submit TTS task: ${ErrorHandler.formatErrorForUser(error)}`
}]
};
}
}
);
// Task barrier tool
server.registerTool(
"task_barrier",
{
title: "Wait for Task Completion",
description: "Wait for ALL submitted tasks to complete and retrieve results. Essential for batch processing - submit multiple tasks first, then call task_barrier once to collect all results efficiently. Clears completed tasks.",
inputSchema: taskBarrierSchema.shape
},
async (params: unknown): Promise<ToolResponse> => {
try {
validateTaskBarrierParams(params);
const { completed, results } = await taskManager.barrier();
if (completed === 0) {
return {
content: [{
type: "text",
text: "ℹ️ No tasks were submitted before this barrier."
}]
};
}
// Format results
const resultSummaries = results.map(({ taskId, success, result, error }) => {
if (!success) {
return `❌ Task ${taskId}: FAILED - ${error?.message || 'Unknown error'}`;
}
// Format success results based on task type
if (result?.files) {
// Image generation result
const warnings = result.warnings?.length ? ` (${result.warnings.length} warnings)` : '';
return `✅ Task ${taskId}: Generated ${result.count} image(s)${warnings}`;
} else if (result?.audioFile) {
// TTS generation result
const subtitles = result.subtitleFile ? ` + subtitles` : '';
const warnings = result.warnings?.length ? ` (${result.warnings.length} warnings)` : '';
return `✅ Task ${taskId}: Generated speech${subtitles}${warnings}`;
} else {
// Generic success
return `✅ Task ${taskId}: Completed successfully`;
}
});
const summary = resultSummaries.join('\n');
// Clear completed tasks to prevent memory leaks
taskManager.clearCompletedTasks();
return {
content: [{
type: "text",
text: summary
}]
};
} catch (error: any) {
ErrorHandler.logError(error, { tool: 'task_barrier' });
return {
content: [{
type: "text",
text: `❌ Task barrier failed: ${ErrorHandler.formatErrorForUser(error)}`
}]
};
}
}
);
// Graceful shutdown
process.on('SIGINT', () => {
console.error("🛑 Shutting down gracefully...");
taskManager.clearCompletedTasks();
process.exit(0);
});
process.on('SIGTERM', () => {
console.error("🛑 Received SIGTERM, shutting down...");
taskManager.clearCompletedTasks();
process.exit(0);
});
// Start server
const transport = new StdioServerTransport();
await server.connect(transport);
```
--------------------------------------------------------------------------------
/src/core/task-manager.ts:
--------------------------------------------------------------------------------
```typescript
import { AdaptiveRateLimiter } from './rate-limiter.js';
import { RATE_LIMITS } from '../config/constants.js';
import { MinimaxError, ErrorHandler } from '../utils/error-handler.js';
// Type definitions
interface TaskResult {
success: boolean;
result?: any;
error?: MinimaxError;
completedAt: number;
}
interface TaskSubmissionResult {
taskId: string;
promise: Promise<any>;
}
interface BarrierResult {
completed: number;
results: (TaskResult & { taskId: string })[];
}
interface TaskStatus {
status: 'running' | 'completed' | 'not_found';
taskId: string;
success?: boolean;
result?: any;
error?: MinimaxError;
completedAt?: number;
}
interface AllTasksStatus {
running: Array<{ taskId: string; status: 'running' }>;
completed: Array<{ taskId: string; status: 'completed' } & TaskResult>;
total: number;
}
interface TaskStats {
activeTasks: number;
completedTasks: number;
totalProcessed: number;
}
interface TaskMetrics {
requests: number;
successes: number;
errors: number;
}
interface RateLimitedTaskManagerOptions {
backoffFactor?: number;
recoveryFactor?: number;
}
export class TaskManager {
protected tasks: Map<string, Promise<any>>;
protected completedTasks: Map<string, TaskResult>;
protected taskCounter: number;
constructor() {
this.tasks = new Map();
this.completedTasks = new Map();
this.taskCounter = 0;
}
protected generateTaskId(): string {
return `task_${++this.taskCounter}`;
}
async submit(fn: () => Promise<any>, taskId: string | null = null): Promise<TaskSubmissionResult> {
taskId = taskId || this.generateTaskId();
const taskPromise = Promise.resolve()
.then(fn)
.then(result => {
this.completedTasks.set(taskId!, { success: true, result, completedAt: Date.now() });
return result;
})
.catch(error => {
const processedError = ErrorHandler.handleAPIError(error);
this.completedTasks.set(taskId!, { success: false, error: processedError, completedAt: Date.now() });
throw processedError;
})
.finally(() => {
this.tasks.delete(taskId!);
});
this.tasks.set(taskId, taskPromise);
return { taskId, promise: taskPromise };
}
async barrier(): Promise<BarrierResult> {
const activeTasks = Array.from(this.tasks.values());
// Wait for any active tasks to complete
if (activeTasks.length > 0) {
await Promise.allSettled(activeTasks);
}
// Return all completed tasks (including those completed before this barrier call)
const results = Array.from(this.completedTasks.entries()).map(([taskId, taskResult]) => ({
taskId,
...taskResult
}));
return { completed: results.length, results };
}
getTaskStatus(taskId: string): TaskStatus {
if (this.tasks.has(taskId)) {
return { status: 'running', taskId };
}
if (this.completedTasks.has(taskId)) {
return { status: 'completed', taskId, ...this.completedTasks.get(taskId)! };
}
return { status: 'not_found', taskId };
}
getAllTasksStatus(): AllTasksStatus {
const running = Array.from(this.tasks.keys()).map(taskId => ({ taskId, status: 'running' as const }));
const completed = Array.from(this.completedTasks.entries()).map(([taskId, result]) => ({
taskId,
status: 'completed' as const,
...result
}));
return { running, completed, total: running.length + completed.length };
}
clearCompletedTasks(): number {
const count = this.completedTasks.size;
this.completedTasks.clear();
return count;
}
getStats(): TaskStats {
return {
activeTasks: this.tasks.size,
completedTasks: this.completedTasks.size,
totalProcessed: this.taskCounter
};
}
}
export class RateLimitedTaskManager extends TaskManager {
private rateLimiters: {
image: AdaptiveRateLimiter;
tts: AdaptiveRateLimiter;
};
private metrics: {
image: TaskMetrics;
tts: TaskMetrics;
};
private taskCounters: {
image: number;
tts: number;
};
constructor(options: RateLimitedTaskManagerOptions = {}) {
super();
this.rateLimiters = {
image: new AdaptiveRateLimiter({
...RATE_LIMITS.IMAGE,
backoffFactor: options.backoffFactor || 0.7,
recoveryFactor: options.recoveryFactor || 1.05
}),
tts: new AdaptiveRateLimiter({
...RATE_LIMITS.TTS,
backoffFactor: options.backoffFactor || 0.7,
recoveryFactor: options.recoveryFactor || 1.05
})
};
this.metrics = {
image: { requests: 0, successes: 0, errors: 0 },
tts: { requests: 0, successes: 0, errors: 0 }
};
this.taskCounters = {
image: 0,
tts: 0
};
}
async submitImageTask(fn: () => Promise<any>, taskId: string | null = null): Promise<TaskSubmissionResult> {
if (!taskId) {
taskId = `img-${++this.taskCounters.image}`;
}
return this.submitRateLimitedTask('image', fn, taskId);
}
async submitTTSTask(fn: () => Promise<any>, taskId: string | null = null): Promise<TaskSubmissionResult> {
if (!taskId) {
taskId = `tts-${++this.taskCounters.tts}`;
}
return this.submitRateLimitedTask('tts', fn, taskId);
}
private async submitRateLimitedTask(type: 'image' | 'tts', fn: () => Promise<any>, taskId: string | null = null): Promise<TaskSubmissionResult> {
const rateLimiter = this.rateLimiters[type];
if (!rateLimiter) {
throw new MinimaxError(`Unknown task type: ${type}`);
}
const wrappedFn = async () => {
await rateLimiter.acquire();
this.metrics[type].requests++;
try {
const result = await fn();
this.metrics[type].successes++;
rateLimiter.onSuccess();
return result;
} catch (error: any) {
this.metrics[type].errors++;
rateLimiter.onError(error);
throw error;
}
};
return this.submit(wrappedFn, taskId);
}
getRateLimiterStatus() {
return {
image: this.rateLimiters.image.getAdaptiveStatus(),
tts: this.rateLimiters.tts.getAdaptiveStatus()
};
}
getMetrics() {
return {
...this.metrics,
rateLimiters: this.getRateLimiterStatus()
};
}
resetMetrics(): void {
this.metrics = {
image: { requests: 0, successes: 0, errors: 0 },
tts: { requests: 0, successes: 0, errors: 0 }
};
this.taskCounters = {
image: 0,
tts: 0
};
Object.values(this.rateLimiters).forEach(limiter => limiter.reset());
}
}
```
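The submit/barrier contract above boils down to tracking in-flight promises in a map and draining them with `Promise.allSettled`. A self-contained sketch of that core (names here are illustrative, not the module's exports):

```typescript
// Minimal submit/barrier sketch mirroring TaskManager's core behavior:
// tasks are tracked while running, and their outcomes are recorded on settlement.
const tasks = new Map<string, Promise<unknown>>();
const completed = new Map<string, { success: boolean; result?: unknown }>();
let counter = 0;

function submit(fn: () => Promise<unknown>): string {
  const id = `task_${++counter}`;
  const p = Promise.resolve()
    .then(fn)
    .then(result => { completed.set(id, { success: true, result }); })
    .catch(() => { completed.set(id, { success: false }); })
    .finally(() => { tasks.delete(id); });
  tasks.set(id, p);
  return id;
}

async function barrier(): Promise<Array<[string, { success: boolean }]>> {
  await Promise.allSettled([...tasks.values()]);
  return [...completed.entries()];
}

// Submit a batch first, then wait once for everything.
submit(async () => 'ok');
submit(async () => { throw new Error('boom'); });
barrier().then(results => {
  console.log(results.length);                              // 2
  console.log(results.filter(([, r]) => r.success).length); // 1
});
```

Because failures are captured into the completed map rather than rethrown to the caller, one rejected task never tears down the barrier; the real `TaskManager` layers task IDs, timestamps, and typed errors on top of this skeleton.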
--------------------------------------------------------------------------------
/src/services/tts-service.ts:
--------------------------------------------------------------------------------
```typescript
import { MinimaxBaseClient } from '../core/base-client.js';
import { API_CONFIG, DEFAULTS, MODELS, VOICES, type TTSModel, type VoiceId } from '../config/constants.js';
import { FileHandler } from '../utils/file-handler.js';
import { ErrorHandler } from '../utils/error-handler.js';
import { type TextToSpeechParams } from '../config/schemas.js';
interface TTSPayload {
model: string;
text: string;
voice_setting: {
voice_id: string;
speed: number;
vol: number;
pitch: number;
emotion: string;
};
audio_setting: {
sample_rate: number;
bitrate: number;
format: string;
channel: number;
};
language_boost?: string;
voice_modify?: {
pitch?: number;
intensity?: number;
timbre?: number;
sound_effects?: string;
};
}
interface TTSResponse {
data?: {
audio?: string;
duration?: number;
subtitle_url?: string;
};
}
interface TTSResult {
audioFile: string;
voiceUsed: string;
model: string;
duration: number | null;
format: string;
sampleRate: number;
bitrate: number;
subtitleFile?: string;
warnings?: string[];
}
export class TextToSpeechService extends MinimaxBaseClient {
constructor(options: { baseURL?: string; timeout?: number } = {}) {
super(options);
}
async generateSpeech(params: TextToSpeechParams): Promise<TTSResult> {
try {
// Build API payload (MCP handles validation)
const payload = this.buildPayload(params);
// Make API request
const response = await this.post(API_CONFIG.ENDPOINTS.TEXT_TO_SPEECH, payload) as TTSResponse;
// Process response
return await this.processTTSResponse(response, params);
} catch (error: any) {
const processedError = ErrorHandler.handleAPIError(error);
ErrorHandler.logError(processedError, { service: 'tts', params });
// Throw the error so task manager can properly mark it as failed
throw processedError;
}
}
private buildPayload(params: TextToSpeechParams): TTSPayload {
const ttsDefaults = DEFAULTS.TTS as any;
// Map highQuality parameter to appropriate Speech 2.6 model
const model = (params as any).highQuality ? 'speech-2.6-hd' : 'speech-2.6-turbo';
const payload: TTSPayload = {
model: model,
text: params.text,
voice_setting: {
voice_id: params.voiceId || ttsDefaults.voiceId,
speed: params.speed || ttsDefaults.speed,
vol: params.volume || ttsDefaults.volume,
pitch: params.pitch || ttsDefaults.pitch,
emotion: params.emotion || ttsDefaults.emotion
},
audio_setting: {
sample_rate: parseInt(params.sampleRate || ttsDefaults.sampleRate, 10),
bitrate: parseInt(params.bitrate || ttsDefaults.bitrate, 10),
format: params.format || ttsDefaults.format,
channel: ttsDefaults.channel
}
};
// Add optional parameters
if (params.languageBoost) {
payload.language_boost = params.languageBoost;
}
// Add voice modify parameters if present
if (params.intensity !== undefined || params.timbre !== undefined || params.sound_effects !== undefined) {
payload.voice_modify = {};
if (params.intensity !== undefined) {
payload.voice_modify.intensity = params.intensity;
}
if (params.timbre !== undefined) {
payload.voice_modify.timbre = params.timbre;
}
if (params.sound_effects !== undefined) {
payload.voice_modify.sound_effects = params.sound_effects;
}
}
// Voice mixing feature removed for simplicity
// Filter out undefined values
return this.cleanPayload(payload) as TTSPayload;
}
private cleanPayload(obj: any): any {
if (typeof obj !== 'object' || obj === null) {
return obj;
}
if (Array.isArray(obj)) {
return obj.map(item => this.cleanPayload(item)).filter(item => item !== undefined);
}
const result: any = {};
for (const [key, value] of Object.entries(obj)) {
if (value === undefined) continue;
if (typeof value === 'object' && value !== null) {
const cleanedValue = this.cleanPayload(value);
if (typeof cleanedValue === 'object' && !Array.isArray(cleanedValue) && Object.keys(cleanedValue).length === 0) {
continue;
}
result[key] = cleanedValue;
} else {
result[key] = value;
}
}
return result;
}
private async processTTSResponse(response: TTSResponse, params: TextToSpeechParams): Promise<TTSResult> {
const audioHex = response.data?.audio;
if (!audioHex) {
throw new Error('No audio data received from API');
}
// Convert hex to bytes and save
const audioBytes = Buffer.from(audioHex, 'hex');
await FileHandler.writeFile(params.outputFile, audioBytes);
const ttsDefaults = DEFAULTS.TTS as any;
const result: TTSResult = {
audioFile: params.outputFile,
voiceUsed: params.voiceId || ttsDefaults.voiceId,
model: (params as any).highQuality ? 'speech-2.6-hd' : 'speech-2.6-turbo',
duration: response.data?.duration || null,
format: params.format || ttsDefaults.format,
sampleRate: parseInt(params.sampleRate || ttsDefaults.sampleRate, 10),
bitrate: parseInt(params.bitrate || ttsDefaults.bitrate, 10)
};
// Subtitles feature removed for simplicity
return result;
}
// Utility methods
getSupportedModels(): string[] {
return Object.keys(MODELS.TTS);
}
getSupportedVoices(): string[] {
return Object.keys(VOICES);
}
getVoiceInfo(voiceId: string): { name: string; gender: 'male' | 'female' | 'other'; language: 'zh' | 'en' } | null {
return VOICES[voiceId as VoiceId] || null;
}
getModelInfo(modelName: string): { name: string; description: string } | null {
return MODELS.TTS[modelName as TTSModel] || null;
}
validateVoiceParameters(params: TextToSpeechParams): string[] {
const ttsDefaults = DEFAULTS.TTS as any;
const voice = this.getVoiceInfo(params.voiceId || ttsDefaults.voiceId);
const model = (params as any).highQuality ? 'speech-2.6-hd' : 'speech-2.6-turbo';
const issues: string[] = [];
if (!voice && params.voiceId) {
issues.push(`Unknown voice ID: ${params.voiceId}`);
}
// Check emotion compatibility (Speech 2.6 models support emotions)
if (params.emotion && params.emotion !== 'neutral') {
const emotionSupportedModels = ['speech-2.6-hd', 'speech-2.6-turbo'];
if (!emotionSupportedModels.includes(model)) {
issues.push(`Emotion parameter not supported by model ${model}`);
}
}
return issues;
}
}
```
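`processTTSResponse` receives the audio payload as a hex string and decodes it with Node's `Buffer` before writing it to disk. The conversion in isolation (the hex value here is illustrative, not real API output):

```typescript
// Decode a hex-encoded audio payload into raw bytes, as processTTSResponse does.
// '494433' is the ASCII "ID3" tag that begins many MP3 files.
const audioHex = '49443303';
const audioBytes = Buffer.from(audioHex, 'hex');
console.log(audioBytes.length);                  // 4
console.log(audioBytes.toString('ascii', 0, 3)); // "ID3"
```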
--------------------------------------------------------------------------------
/src/config/constants.ts:
--------------------------------------------------------------------------------
```typescript
// Type definitions
export interface ModelConfig {
name: string;
description: string;
}
export interface VoiceConfig {
name: string;
gender: 'male' | 'female' | 'other';
language: 'zh' | 'en';
}
export interface RateLimit {
rpm: number;
burst: number;
}
export interface ApiConfig {
BASE_URL: string;
ENDPOINTS: {
IMAGE_GENERATION: string;
TEXT_TO_SPEECH: string;
};
HEADERS: {
'MM-API-Source': string;
};
TIMEOUT: number;
}
export interface ImageConstraints {
PROMPT_MAX_LENGTH: number;
MAX_IMAGES: number;
MIN_DIMENSION: number;
MAX_DIMENSION: number;
DIMENSION_STEP: number;
ASPECT_RATIOS: readonly string[];
STYLE_TYPES: readonly string[];
RESPONSE_FORMATS: readonly string[];
SUBJECT_TYPES: readonly string[];
STYLE_WEIGHT_MIN: number;
STYLE_WEIGHT_MAX: number;
}
export interface TTSConstraints {
TEXT_MAX_LENGTH: number;
SPEED_MIN: number;
SPEED_MAX: number;
VOLUME_MIN: number;
VOLUME_MAX: number;
PITCH_MIN: number;
PITCH_MAX: number;
EMOTIONS: readonly string[];
FORMATS: readonly string[];
SAMPLE_RATES: readonly string[];
BITRATES: readonly string[];
VOICE_MODIFY_PITCH_MIN: number;
VOICE_MODIFY_PITCH_MAX: number;
VOICE_MODIFY_INTENSITY_MIN: number;
VOICE_MODIFY_INTENSITY_MAX: number;
VOICE_MODIFY_TIMBRE_MIN: number;
VOICE_MODIFY_TIMBRE_MAX: number;
SOUND_EFFECTS: readonly string[];
}
export interface ImageDefaults {
model: string;
aspectRatio: string;
n: number;
promptOptimizer: boolean;
responseFormat: string;
styleWeight: number;
}
export interface TTSDefaults {
model: string;
voiceId: string;
speed: number;
volume: number;
pitch: number;
emotion: string;
format: string;
sampleRate: string;
bitrate: string;
channel: number;
}
// API Configuration
export const API_CONFIG: ApiConfig = {
BASE_URL: 'https://api.minimaxi.com/v1',
ENDPOINTS: {
IMAGE_GENERATION: '/image_generation',
TEXT_TO_SPEECH: '/t2a_v2'
},
HEADERS: {
'MM-API-Source': 'mcp-tools'
},
TIMEOUT: 30000
} as const;
// Rate Limiting Configuration
export const RATE_LIMITS: Record<'IMAGE' | 'TTS', RateLimit> = {
IMAGE: { rpm: 10, burst: 3 },
TTS: { rpm: 20, burst: 5 }
} as const;
// Model Configurations
export const MODELS: Record<'IMAGE' | 'TTS', Record<string, ModelConfig>> = {
IMAGE: {
'image-01': { name: 'image-01', description: 'Standard image generation' },
'image-01-live': { name: 'image-01-live', description: 'Live image generation' }
},
TTS: {
'speech-2.6-hd': { name: 'speech-2.6-hd', description: 'Ultra-low latency, intelligent parsing, and enhanced naturalness' },
'speech-2.6-turbo': { name: 'speech-2.6-turbo', description: 'Faster, more affordable, ideal for voice agents with 40 languages support' }
}
} as const;
// Voice Configurations
export const VOICES: Record<string, VoiceConfig> = {
// Basic Chinese voices
'male-qn-qingse': { name: '青涩青年音色', gender: 'male', language: 'zh' },
'male-qn-jingying': { name: '精英青年音色', gender: 'male', language: 'zh' },
'male-qn-badao': { name: '霸道青年音色', gender: 'male', language: 'zh' },
'male-qn-daxuesheng': { name: '青年大学生音色', gender: 'male', language: 'zh' },
'female-shaonv': { name: '少女音色', gender: 'female', language: 'zh' },
'female-yujie': { name: '御姐音色', gender: 'female', language: 'zh' },
'female-chengshu': { name: '成熟女性音色', gender: 'female', language: 'zh' },
'female-tianmei': { name: '甜美女性音色', gender: 'female', language: 'zh' },
// Professional voices
'presenter_male': { name: '男性主持人', gender: 'male', language: 'zh' },
'presenter_female': { name: '女性主持人', gender: 'female', language: 'zh' },
'audiobook_male_1': { name: '男性有声书1', gender: 'male', language: 'zh' },
'audiobook_male_2': { name: '男性有声书2', gender: 'male', language: 'zh' },
'audiobook_female_1': { name: '女性有声书1', gender: 'female', language: 'zh' },
'audiobook_female_2': { name: '女性有声书2', gender: 'female', language: 'zh' },
// Beta voices
'male-qn-qingse-jingpin': { name: '青涩青年音色-beta', gender: 'male', language: 'zh' },
'male-qn-jingying-jingpin': { name: '精英青年音色-beta', gender: 'male', language: 'zh' },
'male-qn-badao-jingpin': { name: '霸道青年音色-beta', gender: 'male', language: 'zh' },
'male-qn-daxuesheng-jingpin': { name: '青年大学生音色-beta', gender: 'male', language: 'zh' },
'female-shaonv-jingpin': { name: '少女音色-beta', gender: 'female', language: 'zh' },
'female-yujie-jingpin': { name: '御姐音色-beta', gender: 'female', language: 'zh' },
'female-chengshu-jingpin': { name: '成熟女性音色-beta', gender: 'female', language: 'zh' },
'female-tianmei-jingpin': { name: '甜美女性音色-beta', gender: 'female', language: 'zh' },
// Children voices
'clever_boy': { name: '聪明男童', gender: 'male', language: 'zh' },
'cute_boy': { name: '可爱男童', gender: 'male', language: 'zh' },
'lovely_girl': { name: '萌萌女童', gender: 'female', language: 'zh' },
'cartoon_pig': { name: '卡通猪小琪', gender: 'other', language: 'zh' },
// Character voices
'bingjiao_didi': { name: '病娇弟弟', gender: 'male', language: 'zh' },
'junlang_nanyou': { name: '俊朗男友', gender: 'male', language: 'zh' },
'chunzhen_xuedi': { name: '纯真学弟', gender: 'male', language: 'zh' },
'lengdan_xiongzhang': { name: '冷淡学长', gender: 'male', language: 'zh' },
'badao_shaoye': { name: '霸道少爷', gender: 'male', language: 'zh' },
'tianxin_xiaoling': { name: '甜心小玲', gender: 'female', language: 'zh' },
'qiaopi_mengmei': { name: '俏皮萌妹', gender: 'female', language: 'zh' },
'wumei_yujie': { name: '妩媚御姐', gender: 'female', language: 'zh' },
'diadia_xuemei': { name: '嗲嗲学妹', gender: 'female', language: 'zh' },
'danya_xuejie': { name: '淡雅学姐', gender: 'female', language: 'zh' },
// English voices
'Santa_Claus': { name: 'Santa Claus', gender: 'male', language: 'en' },
'Grinch': { name: 'Grinch', gender: 'male', language: 'en' },
'Rudolph': { name: 'Rudolph', gender: 'other', language: 'en' },
'Arnold': { name: 'Arnold', gender: 'male', language: 'en' },
'Charming_Santa': { name: 'Charming Santa', gender: 'male', language: 'en' },
'Charming_Lady': { name: 'Charming Lady', gender: 'female', language: 'en' },
'Sweet_Girl': { name: 'Sweet Girl', gender: 'female', language: 'en' },
'Cute_Elf': { name: 'Cute Elf', gender: 'other', language: 'en' },
'Attractive_Girl': { name: 'Attractive Girl', gender: 'female', language: 'en' },
'Serene_Woman': { name: 'Serene Woman', gender: 'female', language: 'en' }
} as const;
// Parameter Constraints
export const CONSTRAINTS = {
IMAGE: {
PROMPT_MAX_LENGTH: 1500,
MAX_IMAGES: 9,
MIN_DIMENSION: 512,
MAX_DIMENSION: 2048,
DIMENSION_STEP: 8,
ASPECT_RATIOS: ["1:1", "16:9", "4:3", "3:2", "2:3", "3:4", "9:16", "21:9"] as const,
STYLE_TYPES: ["漫画", "元气", "中世纪", "水彩"] as const,
RESPONSE_FORMATS: ["url", "base64"] as const,
SUBJECT_TYPES: ["character"] as const,
STYLE_WEIGHT_MIN: 0.01,
STYLE_WEIGHT_MAX: 1
},
TTS: {
TEXT_MAX_LENGTH: 10000,
SPEED_MIN: 0.5,
SPEED_MAX: 2.0,
VOLUME_MIN: 0.1,
VOLUME_MAX: 10.0,
PITCH_MIN: -12,
PITCH_MAX: 12,
EMOTIONS: ["neutral", "happy", "sad", "angry", "fearful", "disgusted", "surprised"] as const,
FORMATS: ["mp3", "wav", "flac", "pcm"] as const,
SAMPLE_RATES: ["8000", "16000", "22050", "24000", "32000", "44100"] as const,
BITRATES: ["64000", "96000", "128000", "160000", "192000", "224000", "256000", "320000"] as const,
VOICE_MODIFY_PITCH_MIN: -100,
VOICE_MODIFY_PITCH_MAX: 100,
VOICE_MODIFY_INTENSITY_MIN: -100,
VOICE_MODIFY_INTENSITY_MAX: 100,
VOICE_MODIFY_TIMBRE_MIN: -100,
VOICE_MODIFY_TIMBRE_MAX: 100,
SOUND_EFFECTS: ["spacious_echo", "auditorium_echo", "lofi_telephone", "robotic"] as const
}
} as const;
// Default Values
export const DEFAULTS = {
IMAGE: {
model: 'image-01',
aspectRatio: '1:1',
n: 1,
promptOptimizer: true,
responseFormat: 'url',
styleWeight: 0.8
},
TTS: {
model: 'speech-2.6-hd',
voiceId: 'female-shaonv',
speed: 1.0,
volume: 1.0,
pitch: 0,
emotion: 'neutral',
format: 'mp3',
sampleRate: "32000",
bitrate: "128000",
channel: 1
}
} as const;
// Type exports for use in other modules
export type ImageModel = keyof typeof MODELS.IMAGE;
export type TTSModel = keyof typeof MODELS.TTS;
export type VoiceId = keyof typeof VOICES;
export type AspectRatio = typeof CONSTRAINTS.IMAGE.ASPECT_RATIOS[number];
export type StyleType = typeof CONSTRAINTS.IMAGE.STYLE_TYPES[number];
export type ResponseFormat = typeof CONSTRAINTS.IMAGE.RESPONSE_FORMATS[number];
export type SubjectType = typeof CONSTRAINTS.IMAGE.SUBJECT_TYPES[number];
export type Emotion = typeof CONSTRAINTS.TTS.EMOTIONS[number];
export type AudioFormat = typeof CONSTRAINTS.TTS.FORMATS[number];
export type SampleRate = typeof CONSTRAINTS.TTS.SAMPLE_RATES[number];
export type Bitrate = typeof CONSTRAINTS.TTS.BITRATES[number];
export type SoundEffect = typeof CONSTRAINTS.TTS.SOUND_EFFECTS[number];
```
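The type exports above use the `typeof X[number]` indexing pattern to derive string-literal unions from `as const` arrays. A tiny standalone demonstration of the same pattern (values are illustrative):

```typescript
// Deriving a string-literal union from an `as const` tuple,
// as AspectRatio, Emotion, SampleRate, etc. are derived above.
const RATIOS = ['1:1', '16:9', '9:16'] as const;
type Ratio = typeof RATIOS[number]; // '1:1' | '16:9' | '9:16'

// A type guard narrows arbitrary strings to the union at runtime.
function isRatio(value: string): value is Ratio {
  return (RATIOS as readonly string[]).includes(value);
}

console.log(isRatio('16:9')); // true
console.log(isRatio('5:4'));  // false
```

Keeping the runtime array as the single source of truth means adding a new value updates both the validation and the type.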
--------------------------------------------------------------------------------
/src/config/schemas.ts:
--------------------------------------------------------------------------------
```typescript
import { z } from 'zod';
import {
CONSTRAINTS,
VOICES,
type VoiceId,
type AspectRatio,
type StyleType,
type Emotion,
type AudioFormat,
type SampleRate,
type Bitrate,
type SoundEffect
} from './constants.js';
// Base schemas
const filePathSchema = z.string().min(1, 'File path is required');
const positiveIntSchema = z.number().int().positive();
// Helper functions for generating descriptions
const getSoundEffectsDescription = () => {
const descriptions = {
'spacious_echo': 'spacious_echo (空旷回音)',
'auditorium_echo': 'auditorium_echo (礼堂广播)',
'lofi_telephone': 'lofi_telephone (电话失真)',
'robotic': 'robotic (机械音)'
};
return `Sound effects. Options: ${CONSTRAINTS.TTS.SOUND_EFFECTS.map(effect => descriptions[effect] || effect).join(', ')}. Only one sound effect can be used per request`;
};
// Image generation schema
export const imageGenerationSchema = z.object({
prompt: z.string()
.min(1, 'Prompt is required')
.max(CONSTRAINTS.IMAGE.PROMPT_MAX_LENGTH, `Prompt must not exceed ${CONSTRAINTS.IMAGE.PROMPT_MAX_LENGTH} characters`),
outputFile: filePathSchema.describe('Absolute path for generated image'),
aspectRatio: z.enum(CONSTRAINTS.IMAGE.ASPECT_RATIOS as readonly [AspectRatio, ...AspectRatio[]])
.default('1:1' as AspectRatio)
.describe(`Aspect ratio for the image. Options: ${CONSTRAINTS.IMAGE.ASPECT_RATIOS.join(', ')}`),
customSize: z.object({
width: z.number()
.min(CONSTRAINTS.IMAGE.MIN_DIMENSION)
.max(CONSTRAINTS.IMAGE.MAX_DIMENSION)
.multipleOf(CONSTRAINTS.IMAGE.DIMENSION_STEP),
height: z.number()
.min(CONSTRAINTS.IMAGE.MIN_DIMENSION)
.max(CONSTRAINTS.IMAGE.MAX_DIMENSION)
.multipleOf(CONSTRAINTS.IMAGE.DIMENSION_STEP)
}).optional().describe('Custom image dimensions (width x height in pixels). Range: 512-2048, must be multiples of 8. Total resolution should stay under 2M pixels. Only supported with image-01 model (cannot be used with style parameter). When both customSize and aspectRatio are set, aspectRatio takes precedence'),
seed: positiveIntSchema.optional().describe('Random seed for reproducible results'),
subjectReference: z.string().optional().describe('File path to a portrait image for maintaining facial characteristics in generated images. Only supported with image-01 model (cannot be used with style parameter). Provide a clear frontal face photo for best results. Supports local file paths and URLs. Max 10MB, formats: jpg, jpeg, png'),
style: z.object({
style_type: z.enum(CONSTRAINTS.IMAGE.STYLE_TYPES as readonly [StyleType, ...StyleType[]])
.describe(`Art style type. Options: ${CONSTRAINTS.IMAGE.STYLE_TYPES.join(', ')}`),
style_weight: z.number()
.min(CONSTRAINTS.IMAGE.STYLE_WEIGHT_MIN, 'Style weight must be greater than 0')
.max(CONSTRAINTS.IMAGE.STYLE_WEIGHT_MAX, 'Style weight must not exceed 1')
.default(0.8)
.describe('Style control weight (0-1]. Higher values apply stronger style effects. Default: 0.8')
}).optional().describe('Art style control settings. Uses image-01-live model which does not support customSize or subjectReference parameters. Cannot be combined with customSize or subjectReference'),
});
// Text-to-speech schema
export const textToSpeechSchema = z.object({
text: z.string()
.min(1, 'Text is required')
.max(CONSTRAINTS.TTS.TEXT_MAX_LENGTH, `Text must not exceed ${CONSTRAINTS.TTS.TEXT_MAX_LENGTH} characters`)
.describe(`Text to convert to speech. Max ${CONSTRAINTS.TTS.TEXT_MAX_LENGTH} characters. Use newlines for paragraph breaks. For custom pauses, insert <#x#> where x is seconds (0.01-99.99, max 2 decimals). Pause markers must be between pronounceable text and cannot be consecutive`),
outputFile: filePathSchema.describe('Absolute path for audio file'),
highQuality: z.boolean()
.default(false)
.describe('Use high-quality model (speech-02-hd) for audiobooks/premium content. Default: false (uses faster speech-02-turbo)'),
voiceId: z.enum(Object.keys(VOICES) as [VoiceId, ...VoiceId[]])
.default('female-shaonv' as VoiceId)
.describe(`Voice ID for speech generation. Available voices: ${Object.keys(VOICES).map(id => `${id} (${VOICES[id as VoiceId]?.name || id})`).join(', ')}`),
speed: z.number()
.min(CONSTRAINTS.TTS.SPEED_MIN)
.max(CONSTRAINTS.TTS.SPEED_MAX)
.default(1.0)
.describe(`Speech speed multiplier (${CONSTRAINTS.TTS.SPEED_MIN}-${CONSTRAINTS.TTS.SPEED_MAX}). Higher values = faster speech`),
volume: z.number()
.min(CONSTRAINTS.TTS.VOLUME_MIN)
.max(CONSTRAINTS.TTS.VOLUME_MAX)
.default(1.0)
.describe(`Audio volume level (${CONSTRAINTS.TTS.VOLUME_MIN}-${CONSTRAINTS.TTS.VOLUME_MAX}). Higher values = louder audio`),
pitch: z.number()
.min(CONSTRAINTS.TTS.PITCH_MIN)
.max(CONSTRAINTS.TTS.PITCH_MAX)
.default(0)
.describe(`Pitch adjustment in semitones (${CONSTRAINTS.TTS.PITCH_MIN} to ${CONSTRAINTS.TTS.PITCH_MAX}). Negative = lower pitch, Positive = higher pitch`),
emotion: z.enum(CONSTRAINTS.TTS.EMOTIONS as readonly [Emotion, ...Emotion[]])
.default('neutral' as Emotion)
.describe(`Emotional tone of the speech. Options: ${CONSTRAINTS.TTS.EMOTIONS.join(', ')}`),
format: z.enum(CONSTRAINTS.TTS.FORMATS as readonly [AudioFormat, ...AudioFormat[]])
.default('mp3' as AudioFormat)
.describe(`Output audio format. Options: ${CONSTRAINTS.TTS.FORMATS.join(', ')}`),
sampleRate: z.enum(CONSTRAINTS.TTS.SAMPLE_RATES as readonly [SampleRate, ...SampleRate[]])
.default("32000" as SampleRate)
.describe(`Audio sample rate in Hz. Options: ${CONSTRAINTS.TTS.SAMPLE_RATES.join(', ')}`),
bitrate: z.enum(CONSTRAINTS.TTS.BITRATES as readonly [Bitrate, ...Bitrate[]])
.default("128000" as Bitrate)
.describe(`Audio bitrate in bps. Options: ${CONSTRAINTS.TTS.BITRATES.join(', ')}`),
languageBoost: z.string().default('auto').describe('Enhance recognition for specific languages/dialects. Options: Chinese, Chinese,Yue, English, Arabic, Russian, Spanish, French, Portuguese, German, Turkish, Dutch, Ukrainian, Vietnamese, Indonesian, Japanese, Italian, Korean, Thai, Polish, Romanian, Greek, Czech, Finnish, Hindi, Bulgarian, Danish, Hebrew, Malay, Persian, Slovak, Swedish, Croatian, Filipino, Hungarian, Norwegian, Slovenian, Catalan, Nynorsk, Tamil, Afrikaans, auto. Use "auto" for automatic detection'),
intensity: z.number()
.int()
.min(CONSTRAINTS.TTS.VOICE_MODIFY_INTENSITY_MIN)
.max(CONSTRAINTS.TTS.VOICE_MODIFY_INTENSITY_MAX)
.optional()
.describe('Voice intensity adjustment (-100 to 100). Values closer to -100 make voice more robust, closer to 100 make voice softer'),
timbre: z.number()
.int()
.min(CONSTRAINTS.TTS.VOICE_MODIFY_TIMBRE_MIN)
.max(CONSTRAINTS.TTS.VOICE_MODIFY_TIMBRE_MAX)
.optional()
.describe('Voice timbre adjustment (-100 to 100). Values closer to -100 make voice more mellow, closer to 100 make voice more crisp'),
sound_effects: z.enum(CONSTRAINTS.TTS.SOUND_EFFECTS as readonly [SoundEffect, ...SoundEffect[]])
.optional()
.describe(getSoundEffectsDescription())
});
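// Illustrative only (hypothetical values): the pause-marker syntax from the
// `text` description above. <#1.5#> inserts a 1.5-second pause between two
// pronounceable segments; markers cannot be consecutive and must not open or
// close the text:
//
//   textToSpeechSchema.parse({
//     text: 'Welcome back.<#1.5#>Let us begin.',
//     outputFile: '/tmp/intro.mp3'   // hypothetical path
//   });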
// Task barrier schema
export const taskBarrierSchema = z.object({});
// Type definitions for parsed schemas
export type ImageGenerationParams = z.infer<typeof imageGenerationSchema>;
export type TextToSpeechParams = z.infer<typeof textToSpeechSchema>;
export type TaskBarrierParams = z.infer<typeof taskBarrierSchema>;
// MCP Tool Schemas (for registerTool API)
export const imageGenerationToolSchema = {
type: "object",
properties: {
prompt: {
type: "string",
description: `Image generation prompt (max ${CONSTRAINTS.IMAGE.PROMPT_MAX_LENGTH} characters)`,
maxLength: CONSTRAINTS.IMAGE.PROMPT_MAX_LENGTH
},
outputFile: {
type: "string",
description: "Absolute path for generated image file"
},
aspectRatio: {
type: "string",
enum: [...CONSTRAINTS.IMAGE.ASPECT_RATIOS],
default: "1:1",
description: `Aspect ratio for the image. Options: ${CONSTRAINTS.IMAGE.ASPECT_RATIOS.join(', ')}`
},
customSize: {
type: "object",
properties: {
width: { type: "number", minimum: CONSTRAINTS.IMAGE.MIN_DIMENSION, maximum: CONSTRAINTS.IMAGE.MAX_DIMENSION, multipleOf: CONSTRAINTS.IMAGE.DIMENSION_STEP },
height: { type: "number", minimum: CONSTRAINTS.IMAGE.MIN_DIMENSION, maximum: CONSTRAINTS.IMAGE.MAX_DIMENSION, multipleOf: CONSTRAINTS.IMAGE.DIMENSION_STEP }
},
required: ["width", "height"],
description: "Custom image dimensions (width x height in pixels). Range: 512-2048, must be multiples of 8. Total resolution should stay under 2M pixels. Only supported with image-01 model (cannot be used with style parameter). When both customSize and aspectRatio are set, aspectRatio takes precedence"
},
seed: {
type: "number",
description: "Random seed for reproducible results"
},
subjectReference: {
type: "string",
description: "File path to a portrait image for maintaining facial characteristics in generated images. Only supported with image-01 model (cannot be used with style parameter). Provide a clear frontal face photo for best results. Supports local file paths and URLs. Max 10MB, formats: jpg, jpeg, png"
},
style: {
type: "object",
properties: {
style_type: {
type: "string",
enum: [...CONSTRAINTS.IMAGE.STYLE_TYPES],
description: `Art style type. Options: ${CONSTRAINTS.IMAGE.STYLE_TYPES.join(', ')}`
},
style_weight: {
type: "number",
exclusiveMinimum: 0,
maximum: CONSTRAINTS.IMAGE.STYLE_WEIGHT_MAX,
default: 0.8,
description: "Style control weight (0-1]. Higher values apply stronger style effects. Default: 0.8"
}
},
required: ["style_type"],
description: "Art style control settings. Uses image-01-live model which does not support customSize or subjectReference parameters. Cannot be combined with customSize or subjectReference"
}
},
required: ["prompt", "outputFile"]
} as const;
export const textToSpeechToolSchema = {
type: "object",
properties: {
text: {
type: "string",
description: `Text to convert to speech. Max ${CONSTRAINTS.TTS.TEXT_MAX_LENGTH} characters. Use newlines for paragraph breaks. For custom pauses, insert <#x#> where x is seconds (0.01-99.99, max 2 decimals). Pause markers must be between pronounceable text and cannot be consecutive`,
maxLength: CONSTRAINTS.TTS.TEXT_MAX_LENGTH,
minLength: 1
},
outputFile: {
type: "string",
description: "Absolute path for audio file"
},
highQuality: {
type: "boolean",
default: false,
description: "Use high-quality model (speech-02-hd) for audiobooks/premium content. Default: false (uses faster speech-02-turbo)"
},
voiceId: {
type: "string",
enum: Object.keys(VOICES),
default: "female-shaonv",
description: `Voice ID for speech generation. Available voices: ${Object.keys(VOICES).map(id => `${id} (${VOICES[id as VoiceId]?.name || id})`).join(', ')}`
},
speed: {
type: "number",
minimum: CONSTRAINTS.TTS.SPEED_MIN,
maximum: CONSTRAINTS.TTS.SPEED_MAX,
default: 1.0,
description: `Speech speed multiplier (${CONSTRAINTS.TTS.SPEED_MIN}-${CONSTRAINTS.TTS.SPEED_MAX}). Higher values = faster speech`
},
volume: {
type: "number",
minimum: CONSTRAINTS.TTS.VOLUME_MIN,
maximum: CONSTRAINTS.TTS.VOLUME_MAX,
default: 1.0,
description: `Audio volume level (${CONSTRAINTS.TTS.VOLUME_MIN}-${CONSTRAINTS.TTS.VOLUME_MAX}). Higher values = louder audio`
},
pitch: {
type: "number",
minimum: CONSTRAINTS.TTS.PITCH_MIN,
maximum: CONSTRAINTS.TTS.PITCH_MAX,
default: 0,
description: `Pitch adjustment in semitones (${CONSTRAINTS.TTS.PITCH_MIN} to ${CONSTRAINTS.TTS.PITCH_MAX}). Negative = lower pitch, Positive = higher pitch`
},
emotion: {
type: "string",
enum: [...CONSTRAINTS.TTS.EMOTIONS],
default: "neutral",
description: `Emotional tone of the speech. Options: ${CONSTRAINTS.TTS.EMOTIONS.join(', ')}`
},
format: {
type: "string",
enum: [...CONSTRAINTS.TTS.FORMATS],
default: "mp3",
description: `Output audio format. Options: ${CONSTRAINTS.TTS.FORMATS.join(', ')}`
},
sampleRate: {
type: "string",
enum: [...CONSTRAINTS.TTS.SAMPLE_RATES],
default: "32000",
description: `Audio sample rate in Hz. Options: ${CONSTRAINTS.TTS.SAMPLE_RATES.join(', ')}`
},
bitrate: {
type: "string",
enum: [...CONSTRAINTS.TTS.BITRATES],
default: "128000",
description: `Audio bitrate in bps. Options: ${CONSTRAINTS.TTS.BITRATES.join(', ')}`
},
languageBoost: {
type: "string",
default: "auto",
description: "Enhance recognition for specific languages/dialects. Options: Chinese, Chinese,Yue, English, Arabic, Russian, Spanish, French, Portuguese, German, Turkish, Dutch, Ukrainian, Vietnamese, Indonesian, Japanese, Italian, Korean, Thai, Polish, Romanian, Greek, Czech, Finnish, Hindi, Bulgarian, Danish, Hebrew, Malay, Persian, Slovak, Swedish, Croatian, Filipino, Hungarian, Norwegian, Slovenian, Catalan, Nynorsk, Tamil, Afrikaans, auto. Use 'auto' for automatic detection"
},
intensity: {
type: "integer",
minimum: CONSTRAINTS.TTS.VOICE_MODIFY_INTENSITY_MIN,
maximum: CONSTRAINTS.TTS.VOICE_MODIFY_INTENSITY_MAX,
description: "Voice intensity adjustment (-100 to 100). Values closer to -100 make voice more robust, closer to 100 make voice softer"
},
timbre: {
type: "integer",
minimum: CONSTRAINTS.TTS.VOICE_MODIFY_TIMBRE_MIN,
maximum: CONSTRAINTS.TTS.VOICE_MODIFY_TIMBRE_MAX,
description: "Voice timbre adjustment (-100 to 100). Values closer to -100 make voice more mellow, closer to 100 make voice more crisp"
},
sound_effects: {
type: "string",
enum: [...CONSTRAINTS.TTS.SOUND_EFFECTS],
description: getSoundEffectsDescription()
}
},
required: ["text", "outputFile"]
} as const;
export const taskBarrierToolSchema = {
type: "object",
properties: {}
} as const;
// Validation helper functions
function parseOrThrow<S extends z.ZodTypeAny>(schema: S, params: unknown): z.infer<S> {
try {
return schema.parse(params);
} catch (error) {
if (error instanceof z.ZodError) {
const messages = error.errors.map(e => `${e.path.join('.')}: ${e.message}`);
throw new Error(`Validation failed: ${messages.join(', ')}`);
}
throw error;
}
}
export function validateImageParams(params: unknown): ImageGenerationParams {
const parsed = parseOrThrow(imageGenerationSchema, params);
// Reject combinations that mix image-01-live (style) with image-01-only features
if (parsed.style && parsed.customSize) {
throw new Error('Style parameter (image-01-live model) cannot be combined with customSize (image-01 model feature)');
}
if (parsed.style && parsed.subjectReference) {
throw new Error('Style parameter (image-01-live model) cannot be combined with subjectReference (image-01 model feature)');
}
return parsed;
}
export function validateTTSParams(params: unknown): TextToSpeechParams {
return parseOrThrow(textToSpeechSchema, params);
}
export function validateTaskBarrierParams(params: unknown): TaskBarrierParams {
return parseOrThrow(taskBarrierSchema, params);
}
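// Illustrative only (hypothetical values): the cross-field checks in
// validateImageParams fire even when every field individually satisfies the
// schema, because style selects image-01-live while customSize and
// subjectReference are image-01 features:
//
//   validateImageParams({
//     prompt: 'a lighthouse at dusk',
//     outputFile: '/tmp/lighthouse.png',          // hypothetical path
//     customSize: { width: 1024, height: 768 },
//     style: { style_type: CONSTRAINTS.IMAGE.STYLE_TYPES[0] }
//   });
//   // throws: 'Style parameter (image-01-live model) cannot be combined with customSize ...'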
```