# Directory Structure ``` ├── .docs │   ├── llms-full.txt │   ├── openai images 1.txt │   └── typescript-sdk mcp README.md ├── .gitignore ├── CHANGELOG.md ├── CONTEXT.md ├── LICENSE ├── logo.png ├── package-lock.json ├── package.json ├── README.md └── src ├── index.ts └── tsconfig.json ``` # Files -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- ``` # Dependency directories node_modules/ jspm_packages/ # Build outputs build/ dist/ lib/ out/ *.tsbuildinfo # Generated images generated-images/ # Environment variables .env .env.local .env.development.local .env.test.local .env.production.local # Logs logs *.log npm-debug.log* yarn-debug.log* yarn-error.log* lerna-debug.log* .pnpm-debug.log* # Coverage directory used by tools like istanbul coverage/ *.lcov # TypeScript cache *.tsbuildinfo # Optional npm cache directory .npm # Optional eslint cache .eslintcache # Optional stylelint cache .stylelintcache # Microbundle cache .rpt2_cache/ .rts2_cache_cjs/ .rts2_cache_es/ .rts2_cache_umd/ # Optional REPL history .node_repl_history # Output of 'npm pack' *.tgz # Yarn Integrity file .yarn-integrity # dotenv environment variable files .env .env.development.local .env.test.local .env.production.local .env.local # parcel-bundler cache (https://parceljs.org/) .cache .parcel-cache # Next.js build output .next out # Nuxt.js build / generate output .nuxt dist # Gatsby files .cache/ # Comment in the public line in if your project uses Gatsby and not Next.js # https://nextjs.org/blog/next-9-1#public-directory-support # public # vuepress build output .vuepress/dist # vuepress v2.x temp and cache directory .temp .cache # Docusaurus cache and generated files .docusaurus # Serverless directories .serverless/ # FuseBox cache .fusebox/ # DynamoDB Local files .dynamodb/ # TernJS port file .tern-port # Stores VSCode versions used for testing VSCode extensions .vscode-test # yarn v2 .yarn/cache .yarn/unplugged .yarn/build-state.yml .yarn/install-state.gz .pnp.* # IDE specific files .idea/ .vscode/ *.swp *.swo .DS_Store .docs/ ``` -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- ```markdown <p align="center"> <img src="logo.png" alt="GPT Image 1 MCP Logo" width="200"/> </p> <h1 align="center">@cloudwerxlab/gpt-image-1-mcp</h1> <p align="center"> <a href="https://www.npmjs.com/package/@cloudwerxlab/gpt-image-1-mcp"><img src="https://img.shields.io/npm/v/@cloudwerxlab/gpt-image-1-mcp.svg" alt="npm version"></a> <a href="https://www.npmjs.com/package/@cloudwerxlab/gpt-image-1-mcp"><img src="https://img.shields.io/npm/dm/@cloudwerxlab/gpt-image-1-mcp.svg" alt="npm downloads"></a> <a href="https://github.com/CLOUDWERX-DEV/gpt-image-1-mcp/blob/main/LICENSE"><img src="https://img.shields.io/github/license/CLOUDWERX-DEV/gpt-image-1-mcp.svg" alt="license"></a> <a href="https://nodejs.org/"><img src="https://img.shields.io/node/v/@cloudwerxlab/gpt-image-1-mcp.svg" alt="node version"></a> <a href="https://cloudwerx.dev"><img src="https://img.shields.io/badge/website-cloudwerx.dev-blue" alt="Website"></a> </p> <p align="center"> A Model Context Protocol (MCP) server for generating and editing images using the OpenAI <code>gpt-image-1</code> model.
</p> <p align="center"> <img src="https://img.shields.io/badge/OpenAI-GPT--Image--1-6E46AE" alt="OpenAI GPT-Image-1"> <img src="https://img.shields.io/badge/MCP-Compatible-00A3E0" alt="MCP Compatible"> </p> ## 🚀 Quick Start <div align="center"> <a href="https://www.npmjs.com/package/@cloudwerxlab/gpt-image-1-mcp"><img src="https://img.shields.io/badge/NPX-Ready-red.svg" alt="NPX Ready"></a> </div> <p align="center">Run this MCP server directly using NPX without installing it. <a href="https://www.npmjs.com/package/@cloudwerxlab/gpt-image-1-mcp">View on npm</a>.</p> ```bash npx -y @cloudwerxlab/gpt-image-1-mcp ``` <p align="center">The <code>-y</code> flag automatically answers "yes" to any prompts that might appear during the installation process.</p> ### 📋 Prerequisites <table> <tr> <td width="50%" align="center"> <img src="https://img.shields.io/badge/Node.js-v14+-339933?logo=node.js&logoColor=white" alt="Node.js v14+"> <p>Node.js (v14 or higher)</p> </td> <td width="50%" align="center"> <img src="https://img.shields.io/badge/OpenAI-API_Key-412991?logo=openai&logoColor=white" alt="OpenAI API Key"> <p>OpenAI API key with access to gpt-image-1</p> </td> </tr> </table> ### 🔑 Environment Variables <table> <tr> <th>Variable</th> <th>Required</th> <th>Description</th> </tr> <tr> <td><code>OPENAI_API_KEY</code></td> <td>✅ Yes</td> <td>Your OpenAI API key with access to the gpt-image-1 model</td> </tr> <tr> <td><code>GPT_IMAGE_OUTPUT_DIR</code></td> <td>❌ No</td> <td>Custom directory for saving generated images (defaults to a <code>gpt-image-1</code> subfolder of the user's Pictures folder)</td> </tr> </table> ### 💻 Example Usage with NPX <table> <tr> <th>Operating System</th> <th>Command Line Example</th> </tr> <tr> <td><strong>Linux/macOS</strong></td> <td> ```bash # Set your OpenAI API key export OPENAI_API_KEY=sk-your-openai-api-key # Optional: Set custom output directory export GPT_IMAGE_OUTPUT_DIR=/home/username/Pictures/ai-generated-images # Run the server with NPX npx -y @cloudwerxlab/gpt-image-1-mcp ``` </td> </tr> <tr> <td><strong>Windows (PowerShell)</strong></td> <td> ```powershell # Set your OpenAI API key $env:OPENAI_API_KEY = "sk-your-openai-api-key" # Optional: Set custom output directory $env:GPT_IMAGE_OUTPUT_DIR = "C:\Users\username\Pictures\ai-generated-images" # Run the server with NPX npx -y @cloudwerxlab/gpt-image-1-mcp ``` </td> </tr> <tr> <td><strong>Windows (Command Prompt)</strong></td> <td> ```cmd :: Set your OpenAI API key set OPENAI_API_KEY=sk-your-openai-api-key :: Optional: Set custom output directory set GPT_IMAGE_OUTPUT_DIR=C:\Users\username\Pictures\ai-generated-images :: Run the server with NPX npx -y @cloudwerxlab/gpt-image-1-mcp ``` </td> </tr> </table> ## 🔌 Integration with MCP Clients <div align="center"> <img src="https://img.shields.io/badge/VS_Code-MCP_Extension-007ACC?logo=visual-studio-code&logoColor=white" alt="VS Code MCP Extension"> <img src="https://img.shields.io/badge/Roo-Compatible-FF6B6B" alt="Roo Compatible"> <img src="https://img.shields.io/badge/Cursor-Compatible-4C2889" alt="Cursor Compatible"> <img src="https://img.shields.io/badge/Augment-Compatible-6464FF" alt="Augment Compatible"> <img src="https://img.shields.io/badge/Windsurf-Compatible-00B4D8" alt="Windsurf Compatible"> </div> ### 🛠️ Setting Up in an MCP Client <table> <tr> <td> <h4>Step 1: Locate Settings File</h4> <ul> <li>For <strong>Roo</strong>: <code>c:\Users\&lt;username&gt;\AppData\Roaming\Code\User\globalStorage\rooveterinaryinc.roo-cline\settings\mcp_settings.json</code></li> <li>For <strong>VS Code
MCP Extension</strong>: Check your extension documentation for the settings file location</li> <li>For <strong>Cursor</strong>: <code>~/.config/cursor/mcp_settings.json</code> (Linux/macOS) or <code>%APPDATA%\Cursor\mcp_settings.json</code> (Windows)</li> <li>For <strong>Augment</strong>: <code>~/.config/augment/mcp_settings.json</code> (Linux/macOS) or <code>%APPDATA%\Augment\mcp_settings.json</code> (Windows)</li> <li>For <strong>Windsurf</strong>: <code>~/.config/windsurf/mcp_settings.json</code> (Linux/macOS) or <code>%APPDATA%\Windsurf\mcp_settings.json</code> (Windows)</li> </ul> </td> </tr> <tr> <td> <h4>Step 2: Add Configuration</h4> <p>Add the following configuration to the <code>mcpServers</code> object:</p> </td> </tr> </table> ```json { "mcpServers": { "gpt-image-1": { "command": "npx", "args": [ "-y", "@cloudwerxlab/gpt-image-1-mcp" ], "env": { "OPENAI_API_KEY": "PASTE YOUR OPEN-AI KEY HERE", "GPT_IMAGE_OUTPUT_DIR": "OPTIONAL: PATH TO SAVE GENERATED IMAGES" } } } } ``` #### Example Configurations for Different Operating Systems <table> <tr> <th>Operating System</th> <th>Example Configuration</th> </tr> <tr> <td><strong>Windows</strong></td> <td> ```json { "mcpServers": { "gpt-image-1": { "command": "npx", "args": ["-y", "@cloudwerxlab/gpt-image-1-mcp"], "env": { "OPENAI_API_KEY": "sk-your-openai-api-key", "GPT_IMAGE_OUTPUT_DIR": "C:\\Users\\username\\Pictures\\ai-generated-images" } } } } ``` </td> </tr> <tr> <td><strong>Linux/macOS</strong></td> <td> ```json { "mcpServers": { "gpt-image-1": { "command": "npx", "args": ["-y", "@cloudwerxlab/gpt-image-1-mcp"], "env": { "OPENAI_API_KEY": "sk-your-openai-api-key", "GPT_IMAGE_OUTPUT_DIR": "/home/username/Pictures/ai-generated-images" } } } } ``` </td> </tr> </table> > **Note**: For Windows paths, use double backslashes (`\\`) to escape the backslash character in JSON. For Linux/macOS, use forward slashes (`/`).
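#### How the Server Resolves These Variables

For reference, the sketch below shows roughly how the server reads the two environment variables at startup. It is a simplified, illustrative excerpt — the real `src/index.ts` adds full cross-platform detection of the Pictures folder — so treat the fallback path shown here as an assumption rather than the exact implementation:

```typescript
import path from "path";
import os from "os";

// OPENAI_API_KEY is required: the server exits immediately if it is missing.
const apiKey = process.env.OPENAI_API_KEY;
if (!apiKey) {
  console.error("OPENAI_API_KEY environment variable is required.");
  process.exit(1);
}

// GPT_IMAGE_OUTPUT_DIR is optional: when unset, images are saved to a
// gpt-image-1 subfolder of the user's Pictures folder (simplified here).
const outputDir =
  process.env.GPT_IMAGE_OUTPUT_DIR ??
  path.join(os.homedir(), "Pictures", "gpt-image-1");
```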
## ✨ Features <div align="center"> <table> <tr> <td align="center"> <h3>🎨 Core Tools</h3> <ul> <li><code>create_image</code>: Generate new images from text prompts</li> <li><code>create_image_edit</code>: Edit existing images with text prompts and masks</li> </ul> </td> <td align="center"> <h3>🚀 Key Benefits</h3> <ul> <li>Simple integration with MCP clients</li> <li>Full access to OpenAI's gpt-image-1 capabilities</li> <li>Streamlined workflow for AI image generation</li> </ul> </td> </tr> </table> </div> ### 💡 Enhanced Capabilities <table> <tr> <td> <h4>📊 Output & Formatting</h4> <ul> <li>✅ <strong>Beautifully Formatted Output</strong>: Responses include emojis and detailed information</li> <li>✅ <strong>Automatic Image Saving</strong>: All generated images saved to disk for easy access</li> <li>✅ <strong>Detailed Token Usage</strong>: View token consumption for each request</li> </ul> </td> <td> <h4>⚙️ Configuration & Handling</h4> <ul> <li>✅ <strong>Configurable Output Directory</strong>: Customize where images are saved</li> <li>✅ <strong>File Path Support</strong>: Edit images using file paths instead of base64 encoding</li> <li>✅ <strong>Comprehensive Error Handling</strong>: Detailed error reporting with specific error codes, descriptions, and troubleshooting suggestions</li> </ul> </td> </tr> </table> ## 🔄 How It Works <div align="center"> <table> <tr> <th align="center">🖼️ Image Generation</th> <th align="center">✏️ Image Editing</th> </tr> <tr> <td> <ol> <li>Server receives prompt and parameters</li> <li>Calls OpenAI API using gpt-image-1 model</li> <li>API returns base64-encoded images</li> <li>Server saves images to configured directory</li> <li>Returns formatted response with paths and metadata</li> </ol> </td> <td> <ol> <li>Server receives image, prompt, and optional mask</li> <li>For file paths, reads and prepares files for API</li> <li>Uses direct curl command for proper MIME handling</li> <li>API returns base64-encoded edited images</li> <li>Server saves images to configured directory</li> <li>Returns formatted response with paths and metadata</li> </ol> </td> </tr> </table> </div> ### 📁 Output Directory Behavior <table> <tr> <td width="50%"> <h4>📂 Storage Location</h4> <ul> <li>🔹 <strong>Default Location</strong>: User's Pictures folder under <code>gpt-image-1</code> subfolder (e.g., <code>C:\Users\username\Pictures\gpt-image-1</code> on Windows)</li> <li>🔹 <strong>Custom Location</strong>: Set via <code>GPT_IMAGE_OUTPUT_DIR</code> environment variable</li> <li>🔹 <strong>Fallback Location</strong>: <code>./generated-images</code> (if Pictures folder can't be determined)</li> </ul> </td> <td width="50%"> <h4>🗂️ File Management</h4> <ul> <li>🔹 <strong>Directory Creation</strong>: Automatically creates output directory if it doesn't exist</li> <li>🔹 <strong>File Naming</strong>: Images saved with timestamped filenames (e.g., <code>image-2023-05-05T12-34-56-789Z.png</code>)</li> <li>🔹 <strong>Cross-Platform</strong>: Works on Windows, macOS, and Linux with appropriate Pictures folder detection</li> </ul> </td> </tr> </table> ## Installation & Usage ### NPM Package This package is available on npm: [@cloudwerxlab/gpt-image-1-mcp](https://www.npmjs.com/package/@cloudwerxlab/gpt-image-1-mcp) You can install it globally: ```bash npm install -g @cloudwerxlab/gpt-image-1-mcp ``` Or run it directly with npx as shown in the Quick Start section. ### Tool: `create_image` Generates a new image based on a text prompt. 
#### Parameters | Parameter | Type | Required | Description | |-----------|------|----------|-------------| | `prompt` | string | Yes | The text description of the image to generate (max 32,000 chars) | | `size` | string | No | Image size: "1024x1024" (default), "1536x1024", or "1024x1536" | | `quality` | string | No | Image quality: "high" (default), "medium", or "low" | | `n` | integer | No | Number of images to generate (1-10, default: 1) | | `background` | string | No | Background style: "transparent", "opaque", or "auto" (default) | | `output_format` | string | No | Output format: "png" (default), "jpeg", or "webp" | | `output_compression` | integer | No | Compression level (0-100, default: 0) | | `user` | string | No | User identifier for OpenAI usage tracking | | `moderation` | string | No | Moderation level: "low" or "auto" (default) | #### Example ```xml <use_mcp_tool> <server_name>gpt-image-1</server_name> <tool_name>create_image</tool_name> <arguments> { "prompt": "A futuristic city skyline at sunset, digital art", "size": "1024x1024", "quality": "high", "n": 1, "background": "auto" } </arguments> </use_mcp_tool> ``` #### Response The tool returns: - A formatted text message with details about the generated image(s) - The image(s) as base64-encoded data - Metadata including token usage and file paths ### Tool: `create_image_edit` Edits an existing image based on a text prompt and optional mask. #### Parameters | Parameter | Type | Required | Description | |-----------|------|----------|-------------| | `image` | string, object, or array | Yes | The image(s) to edit (base64 string or file path object) | | `prompt` | string | Yes | The text description of the desired edit (max 32,000 chars) | | `mask` | string or object | No | The mask that defines areas to edit (base64 string or file path object) | | `size` | string | No | Image size: "1024x1024" (default), "1536x1024", or "1024x1536" | | `quality` | string | No | Image quality: "high" (default), "medium", or "low" | | `n` | integer | No | Number of images to generate (1-10, default: 1) | | `background` | string | No | Background style: "transparent", "opaque", or "auto" (default) | | `user` | string | No | User identifier for OpenAI usage tracking | #### Example with Base64 Encoded Image ```xml <use_mcp_tool> <server_name>gpt-image-1</server_name> <tool_name>create_image_edit</tool_name> <arguments> { "image": "BASE64_ENCODED_IMAGE_STRING", "prompt": "Add a small robot in the corner", "mask": "BASE64_ENCODED_MASK_STRING", "quality": "high" } </arguments> </use_mcp_tool> ``` #### Example with File Path ```xml <use_mcp_tool> <server_name>gpt-image-1</server_name> <tool_name>create_image_edit</tool_name> <arguments> { "image": { "filePath": "C:/path/to/your/image.png" }, "prompt": "Add a small robot in the corner", "mask": { "filePath": "C:/path/to/your/mask.png" }, "quality": "high" } </arguments> </use_mcp_tool> ``` #### Response The tool returns: - A formatted text message with details about the edited image(s) - The edited image(s) as base64-encoded data - Metadata including token usage and file paths ## 🔧 Troubleshooting <div align="center"> <img src="https://img.shields.io/badge/Support-Available-brightgreen" alt="Support Available"> </div> ### 🚨 Common Issues <table> <tr> <th align="center">Issue</th> <th align="center">Solution</th> </tr> <tr> <td> <h4>🖼️ MIME Type Errors</h4> <p>Errors related to image format or MIME type handling</p> </td> <td> <p>Ensure image files have the correct extension (.png, .jpg, etc.) 
that matches their actual format. The server uses file extensions to determine MIME types.</p> </td> </tr> <tr> <td> <h4>🔑 API Key Issues</h4> <p>Authentication errors with OpenAI API</p> </td> <td> <p>Verify your OpenAI API key is correct and has access to the gpt-image-1 model. Check for any spaces or special characters that might have been accidentally included.</p> </td> </tr> <tr> <td> <h4>🛠️ Build Errors</h4> <p>Issues when building from source</p> </td> <td> <p>Ensure you have the correct TypeScript version installed (v5.3.3 or compatible) and that your <code>tsconfig.json</code> is properly configured. Run <code>npm install</code> to ensure all dependencies are installed.</p> </td> </tr> <tr> <td> <h4>📁 Output Directory Issues</h4> <p>Problems with saving generated images</p> </td> <td> <p>Check if the process has write permissions to the configured output directory. Try using an absolute path for <code>GPT_IMAGE_OUTPUT_DIR</code> if relative paths aren't working.</p> </td> </tr> </table> ### 🔍 Error Handling and Reporting The MCP server includes comprehensive error handling that provides detailed information when something goes wrong. When an error occurs: 1. **Error Format**: All errors are returned with: - A clear error message describing what went wrong - The specific error code or type - Additional context about the error when available 2. **AI Assistant Behavior**: When using this MCP server with AI assistants: - The AI will always report the full error message to help with troubleshooting - The AI will explain the likely cause of the error in plain language - The AI will suggest specific steps to resolve the issue ## 📄 License <div align="center"> <a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="MIT License"></a> </div> <p align="center"> This project is licensed under the MIT License - see the <a href="LICENSE">LICENSE</a> file for details. </p> <details> <summary>License Summary</summary> <p>The MIT License is a permissive license that is short and to the point. 
It lets people do anything with your code with proper attribution and without warranty.</p> <p><strong>You are free to:</strong></p> <ul> <li>Use the software commercially</li> <li>Modify the software</li> <li>Distribute the software</li> <li>Use and modify the software privately</li> </ul> <p><strong>Under the following terms:</strong></p> <ul> <li>Include the original copyright notice and the license notice in all copies or substantial uses of the work</li> </ul> <p><strong>Limitations:</strong></p> <ul> <li>The authors provide no warranty with the software and are not liable for any damages</li> </ul> </details> ## 🙏 Acknowledgments <div align="center"> <table> <tr> <td align="center"> <a href="https://openai.com/"> <img src="https://img.shields.io/badge/OpenAI-412991?logo=openai&logoColor=white" alt="OpenAI"> <p>For providing the gpt-image-1 model</p> </a> </td> <td align="center"> <a href="https://github.com/model-context-protocol/mcp"> <img src="https://img.shields.io/badge/MCP-Protocol-00A3E0" alt="MCP Protocol"> <p>For the protocol specification</p> </a> </td> </tr> </table> </div> <div align="center"> <p> <a href="https://github.com/CLOUDWERX-DEV/gpt-image-1-mcp/issues">Report Bug</a> • <a href="https://github.com/CLOUDWERX-DEV/gpt-image-1-mcp/issues">Request Feature</a> • <a href="https://cloudwerx.dev">Visit Our Website</a> </p> </div> <div align="center"> <p> Developed with ❤️ by <a href="https://cloudwerx.dev">CLOUDWERX</a> </p> </div> ``` -------------------------------------------------------------------------------- /src/tsconfig.json: -------------------------------------------------------------------------------- ```json { "compilerOptions": { "target": "ES2022", "module": "Node16", "moduleResolution": "Node16", "outDir": "../build", "rootDir": ".", "strict": true, "esModuleInterop": true, "skipLibCheck": true, "forceConsistentCasingInFileNames": true, "paths": { "@modelcontextprotocol/sdk/*": ["./node_modules/@modelcontextprotocol/sdk/*"] } }, "include": ["./**/*.ts"], "exclude": ["node_modules", "build"] } ``` -------------------------------------------------------------------------------- /package.json: -------------------------------------------------------------------------------- ```json { "name": "@cloudwerxlab/gpt-image-1-mcp", "version": "1.1.7", "description": "A Model Context Protocol server for OpenAI's gpt-image-1 model", "type": "module", "bin": { "@cloudwerxlab/gpt-image-1-mcp": "build/index.js" }, "files": [ "build", "README.md", "CHANGELOG.md", "LICENSE", "package.json", "tsconfig.json", "logo.png" ], "scripts": { "build": "cd src && tsc && node -e \"require('fs').chmodSync('../build/index.js', '755')\"", "watch": "cd src && tsc --watch", "test": "node test-mcp-server.js", "test:npx": "node test-npx.js", "prepare": "npm run build", "inspector": "npx @modelcontextprotocol/inspector ./build/index.js" }, "dependencies": { "@modelcontextprotocol/sdk": "^1.11.0", "node-fetch": "^3.3.2", "openai": "^4.97.0", "zod": "^3.24.4", "form-data": "^4.0.0" }, "devDependencies": { "@types/node": "^20.11.24", "typescript": "^5.3.3" }, "keywords": [ "mcp", "openai", "gpt-image-1", "image-generation", "model-context-protocol" ], "author": "", "license": "MIT", "engines": { "node": ">=14.0.0" } } ``` -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- ```markdown # Changelog All notable changes to this project will be documented in this file. 
## 1.1.7 - 2025-05-07 ### Fixed - **Documentation**: Fixed formatting issues in README.md - **Documentation**: Restored enhanced README with centered logo and improved layout ## 1.1.6 - 2025-05-07 ### Changed - **Default Output Directory**: Changed default image save location to user's Pictures folder under `gpt-image-1` subfolder - **Cross-Platform Support**: Added detection of Pictures folder location on Windows, macOS, and Linux - **Documentation**: Updated README with new default output directory information ## 1.1.0 - 2025-05-05 ### Added - **File Path Support**: Added ability to use file paths for images and masks in the `create_image_edit` tool - **Configurable Output Directory**: Added support for customizing the output directory via the `GPT_IMAGE_OUTPUT_DIR` environment variable - **Enhanced Output Formatting**: Improved response formatting with emojis and detailed information - **Detailed Token Usage**: Added token usage information to the response metadata - **Comprehensive Documentation**: Completely rewrote the README.md with detailed usage examples and configuration options - **Proper .gitignore**: Added a comprehensive .gitignore file to exclude build artifacts and generated images ### Fixed - **Build Structure**: Fixed the build process to output to the root build directory instead of inside the src folder - **MIME Type Handling**: Improved MIME type handling for image uploads in the `create_image_edit` tool - **Error Handling**: Enhanced error handling with more informative error messages - **Cleanup Process**: Improved the cleanup process for temporary files ### Changed - **API Implementation**: Changed the image editing implementation to use a direct curl command for better MIME type handling - **Response Structure**: Updated the response structure to include more detailed information about generated images - **File Naming**: Improved the file naming convention for saved images with timestamps - **Dependencies**: Added node-fetch and form-data dependencies for improved HTTP requests ## 1.0.0 - 2025-05-04 ### Added - Initial release of the GPT-Image-1 MCP Server. - Implemented `create_image` tool for generating images using OpenAI `gpt-image-1`. - Implemented `create_image_edit` tool for editing images using OpenAI `gpt-image-1`. - Added support for all `gpt-image-1` specific parameters in both tools (`background`, `output_compression`, `output_format`, `quality`, `size`). - Included basic error handling for OpenAI API calls. - Created `README.md` with installation and configuration instructions. - Created `gpt-image-1-mcp.md` with a detailed architecture and tool overview. ``` -------------------------------------------------------------------------------- /.docs/openai images 1.txt: -------------------------------------------------------------------------------- ``` Create image post https://api.openai.com/v1/images/generations Creates an image given a prompt. Learn more. Request body prompt string Required A text description of the desired image(s). The maximum length is 32000 characters for gpt-image-1, 1000 characters for dall-e-2 and 4000 characters for dall-e-3. background string or null Optional Defaults to auto Allows to set transparency for the background of the generated image(s). This parameter is only supported for gpt-image-1. Must be one of transparent, opaque or auto (default value). When auto is used, the model will automatically determine the best background for the image. 
If transparent, the output format needs to support transparency, so it should be set to either png (default value) or webp. model string Optional Defaults to dall-e-2 The model to use for image generation. One of dall-e-2, dall-e-3, or gpt-image-1. Defaults to dall-e-2 unless a parameter specific to gpt-image-1 is used. moderation string or null Optional Defaults to auto Control the content-moderation level for images generated by gpt-image-1. Must be either low for less restrictive filtering or auto (default value). n integer or null Optional Defaults to 1 The number of images to generate. Must be between 1 and 10. For dall-e-3, only n=1 is supported. output_compression integer or null Optional Defaults to 100 The compression level (0-100%) for the generated images. This parameter is only supported for gpt-image-1 with the webp or jpeg output formats, and defaults to 100. output_format string or null Optional Defaults to png The format in which the generated images are returned. This parameter is only supported for gpt-image-1. Must be one of png, jpeg, or webp. quality string or null Optional Defaults to auto The quality of the image that will be generated. auto (default value) will automatically select the best quality for the given model. high, medium and low are supported for gpt-image-1. hd and standard are supported for dall-e-3. standard is the only option for dall-e-2. response_format string or null Optional Defaults to url The format in which generated images with dall-e-2 and dall-e-3 are returned. Must be one of url or b64_json. URLs are only valid for 60 minutes after the image has been generated. This parameter isn't supported for gpt-image-1 which will always return base64-encoded images. size string or null Optional Defaults to auto The size of the generated images. Must be one of 1024x1024, 1536x1024 (landscape), 1024x1536 (portrait), or auto (default value) for gpt-image-1, one of 256x256, 512x512, or 1024x1024 for dall-e-2, and one of 1024x1024, 1792x1024, or 1024x1792 for dall-e-3. style string or null Optional Defaults to vivid The style of the generated images. This parameter is only supported for dall-e-3. Must be one of vivid or natural. Vivid causes the model to lean towards generating hyper-real and dramatic images. Natural causes the model to produce more natural, less hyper-real looking images. user string Optional A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. Learn more. Returns Returns a list of image objects. Example request curl https://api.openai.com/v1/images/generations \ -H "Content-Type: application/json" \ -H "Authorization: Bearer $OPENAI_API_KEY" \ -d '{ "model": "gpt-image-1", "prompt": "A cute baby sea otter", "n": 1, "size": "1024x1024" }' Response { "created": 1713833628, "data": [ { "b64_json": "..." } ], "usage": { "total_tokens": 100, "input_tokens": 50, "output_tokens": 50, "input_tokens_details": { "text_tokens": 10, "image_tokens": 40 } } } Create image edit post https://api.openai.com/v1/images/edits Creates an edited or extended image given one or more source images and a prompt. This endpoint only supports gpt-image-1 and dall-e-2. Request body image string or array Required The image(s) to edit. Must be a supported image file or an array of images. For gpt-image-1, each image should be a png, webp, or jpg file less than 25MB. You can provide up to 16 images. For dall-e-2, you can only provide one image, and it should be a square png file less than 4MB. 
prompt string Required A text description of the desired image(s). The maximum length is 1000 characters for dall-e-2, and 32000 characters for gpt-image-1. background string or null Optional Defaults to auto Allows to set transparency for the background of the generated image(s). This parameter is only supported for gpt-image-1. Must be one of transparent, opaque or auto (default value). When auto is used, the model will automatically determine the best background for the image. If transparent, the output format needs to support transparency, so it should be set to either png (default value) or webp. mask file Optional An additional image whose fully transparent areas (e.g. where alpha is zero) indicate where image should be edited. If there are multiple images provided, the mask will be applied on the first image. Must be a valid PNG file, less than 4MB, and have the same dimensions as image. model string Optional Defaults to dall-e-2 The model to use for image generation. Only dall-e-2 and gpt-image-1 are supported. Defaults to dall-e-2 unless a parameter specific to gpt-image-1 is used. n integer or null Optional Defaults to 1 The number of images to generate. Must be between 1 and 10. quality string or null Optional Defaults to auto The quality of the image that will be generated. high, medium and low are only supported for gpt-image-1. dall-e-2 only supports standard quality. Defaults to auto. response_format string or null Optional Defaults to url The format in which the generated images are returned. Must be one of url or b64_json. URLs are only valid for 60 minutes after the image has been generated. This parameter is only supported for dall-e-2, as gpt-image-1 will always return base64-encoded images. size string or null Optional Defaults to 1024x1024 The size of the generated images. Must be one of 1024x1024, 1536x1024 (landscape), 1024x1536 (portrait), or auto (default value) for gpt-image-1, and one of 256x256, 512x512, or 1024x1024 for dall-e-2. user string Optional A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. Learn more. Returns Returns a list of image objects. Example request curl -s -D >(grep -i x-request-id >&2) \ -o >(jq -r '.data[0].b64_json' | base64 --decode > gift-basket.png) \ -X POST "https://api.openai.com/v1/images/edits" \ -H "Authorization: Bearer $OPENAI_API_KEY" \ -F "model=gpt-image-1" \ -F "image[][email protected]" \ -F "image[][email protected]" \ -F "image[][email protected]" \ -F "image[][email protected]" \ -F 'prompt=Create a lovely gift basket with these four items in it' Response { "created": 1713833628, "data": [ { "b64_json": "..." } ], "usage": { "total_tokens": 100, "input_tokens": 50, "output_tokens": 50, "input_tokens_details": { "text_tokens": 10, "image_tokens": 40 } } } Create image variation post https://api.openai.com/v1/images/variations Creates a variation of a given image. This endpoint only supports dall-e-2. Request body image file Required The image to use as the basis for the variation(s). Must be a valid PNG file, less than 4MB, and square. model string or "dall-e-2" Optional Defaults to dall-e-2 The model to use for image generation. Only dall-e-2 is supported at this time. n integer or null Optional Defaults to 1 The number of images to generate. Must be between 1 and 10. response_format string or null Optional Defaults to url The format in which the generated images are returned. Must be one of url or b64_json. 
URLs are only valid for 60 minutes after the image has been generated. size string or null Optional Defaults to 1024x1024 The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024. user string Optional A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. Learn more. Returns Returns a list of image objects. Example request curl https://api.openai.com/v1/images/variations \ -H "Authorization: Bearer $OPENAI_API_KEY" \ -F image="@otter.png" \ -F n=2 \ -F size="1024x1024" Response { "created": 1589478378, "data": [ { "url": "https://..." }, { "url": "https://..." } ] } The image generation response The response from the image generation endpoint. created integer The Unix timestamp (in seconds) of when the image was created. data array The list of generated images. Show properties usage object For gpt-image-1 only, the token usage information for the image generation. Show properties OBJECT The image generation response { "created": 1713833628, "data": [ { "b64_json": "..." } ], "usage": { "total_tokens": 100, "input_tokens": 50, "output_tokens": 50, "input_tokens_details": { "text_tokens": 10, "image_tokens": 40 } } } ``` -------------------------------------------------------------------------------- /CONTEXT.md: -------------------------------------------------------------------------------- ```markdown # GPT-Image-1 MCP Server: Project Context This document provides a comprehensive overview of the GPT-Image-1 MCP Server project, including its architecture, functionality, implementation details, and development history. It's designed to quickly bring developers and AI assistants up to speed on all aspects of the project. ## Project Overview The GPT-Image-1 MCP Server is a Node.js application that implements the Model Context Protocol (MCP) to provide image generation and editing capabilities using OpenAI's gpt-image-1 model. It serves as a bridge between MCP clients (like Roo or VS Code extensions) and the OpenAI API, allowing users to generate and edit images using natural language prompts. ## Core Functionality ### Image Generation The server provides the `create_image` tool, which: 1. Accepts a text prompt and optional parameters 2. Validates the input using Zod schemas 3. Calls the OpenAI API's images.generate endpoint 4. Saves the generated images to a configurable output directory 5. Returns a formatted response with image paths, base64 data, and metadata ### Image Editing The server provides the `create_image_edit` tool, which: 1. Accepts an image (as base64 or file path), a text prompt, and an optional mask 2. Supports both base64-encoded images and file paths 3. Uses a direct curl command to ensure proper MIME type handling 4. Calls the OpenAI API's images.edit endpoint 5. Saves the edited images to the configured output directory 6. 
Returns a formatted response with image paths, base64 data, and metadata ## Technical Architecture ### Project Structure ``` gpt-image-1-server/ ├── src/ # TypeScript source code │ └── index.ts # Main server implementation ├── build/ # Compiled JavaScript (output of build process) ├── generated-images/ # Default location for saved images (created at runtime) ├── node_modules/ # Dependencies (not in version control) ├── .gitignore # Git ignore configuration ├── package.json # Project configuration and dependencies ├── tsconfig.json # TypeScript compiler configuration ├── README.md # User documentation ├── CHANGELOG.md # Version history and changes └── CONTEXT.md # This comprehensive project overview ``` ### Dependencies The server relies on several key dependencies: - `@modelcontextprotocol/sdk`: For implementing the MCP protocol - `openai`: The official OpenAI SDK for API access - `zod`: For input validation and type safety - `node-fetch`: For making HTTP requests - `form-data`: For handling multipart/form-data requests - `child_process`: For executing curl commands ### Implementation Details #### MCP Server Setup The server is implemented using the MCP SDK's `McpServer` class. It registers two tools: 1. `create_image`: For generating images 2. `create_image_edit`: For editing images Each tool has a defined schema for its parameters and a handler function that processes requests. #### Image Generation Implementation The image generation functionality uses the OpenAI SDK directly: ```typescript const response = await openai.images.generate({ model: "gpt-image-1", prompt: args.prompt, n: args.n || 1, size: args.size || "1024x1024", quality: args.quality || "high", // ... other parameters }); ``` The server then processes the response, saves the images to disk, and returns a formatted response. #### Image Editing Implementation The image editing functionality uses a direct curl command for better MIME type handling: ```typescript // Build the curl command let curlCommand = `curl -s -X POST "https://api.openai.com/v1/images/edits" -H "Authorization: Bearer ${process.env.OPENAI_API_KEY}"`; // Add parameters curlCommand += ` -F "model=gpt-image-1"`; curlCommand += ` -F "prompt=${args.prompt}"`; curlCommand += ` -F "image[]=@${imageFile}"`; // ... other parameters // Execute the command execSync(curlCommand, { stdio: ['pipe', 'pipe', 'inherit'] }); ``` This approach ensures proper handling of file uploads with correct MIME types. #### Image Saving Images are saved to a configurable output directory: ```typescript function saveImageToDisk(base64Data: string, format: string = 'png'): string { // Determine the output directory const outputDir = process.env.GPT_IMAGE_OUTPUT_DIR || path.join(process.cwd(), 'generated-images'); // Create the directory if it doesn't exist if (!fs.existsSync(outputDir)) { fs.mkdirSync(outputDir, { recursive: true }); } // Generate a filename with timestamp const timestamp = new Date().toISOString().replace(/[:.]/g, '-'); const filename = `image-${timestamp}.${format}`; const outputPath = path.join(outputDir, filename); // Save the image fs.writeFileSync(outputPath, Buffer.from(base64Data, 'base64')); return outputPath; } ``` #### Response Formatting The server provides beautifully formatted responses with emojis and detailed information: ``` 🎨 **Image Generated Successfully!** 📝 **Prompt**: A futuristic city skyline at sunset, digital art 📁 **Saved 1 Image**: 1. 
C:\Users\username\project\generated-images\image-2025-05-05T12-34-56-789Z.png ⚡ **Token Usage**: • Total Tokens: 123 • Input Tokens: 45 • Output Tokens: 78 ``` ## Configuration ### Environment Variables The server uses the following environment variables: | Variable | Required | Description | |----------|----------|-------------| | `OPENAI_API_KEY` | Yes | OpenAI API key with access to the gpt-image-1 model | | `GPT_IMAGE_OUTPUT_DIR` | No | Custom directory for saving generated images (defaults to `./generated-images`) | ### MCP Client Configuration To use the server with an MCP client, the following configuration is needed: ```json { "mcpServers": { "gpt-image-1": { "command": "node", "args": ["<path-to-project>/build/index.js"], "env": { "OPENAI_API_KEY": "sk-your-openai-api-key", "GPT_IMAGE_OUTPUT_DIR": "C:/path/to/output/directory" // Optional }, "disabled": false, "alwaysAllow": [] } } } ``` ## Development History ### Version 1.0.0 (May 4, 2025) The initial release included: - Basic implementation of the `create_image` and `create_image_edit` tools - Support for all gpt-image-1 specific parameters - Basic error handling - Initial documentation ### Version 1.1.0 (May 5, 2025) Major improvements included: - Added file path support for the `create_image_edit` tool - Fixed the build structure to output to the root build directory - Enhanced output formatting with emojis and detailed information - Added configurable output directory via environment variable - Improved MIME type handling for image uploads - Enhanced error handling and cleanup processes - Added comprehensive documentation - Added proper .gitignore file ## Key Challenges and Solutions ### MIME Type Handling **Challenge**: The OpenAI SDK didn't properly handle MIME types for file uploads in the image edit endpoint. **Solution**: Implemented a direct curl command approach that ensures proper MIME type handling: ```typescript curlCommand += ` -F "image[]=@${imageFile}"`; ``` ### File Path Support **Challenge**: The original implementation only supported base64-encoded images. **Solution**: Added support for file paths by: 1. Detecting if the input is a file path object 2. Reading the file from disk 3. Handling the file appropriately based on whether using the SDK or curl approach ### Build Structure **Challenge**: The build process was outputting to a directory inside the src folder. **Solution**: Updated the tsconfig.json to output to the root build directory: ```json { "compilerOptions": { "outDir": "./build", // other options... } } ``` ## Usage Examples ### Generating an Image ```xml <use_mcp_tool> <server_name>gpt-image-1</server_name> <tool_name>create_image</tool_name> <arguments> { "prompt": "A futuristic city skyline at sunset, digital art", "size": "1024x1024", "quality": "high" } </arguments> </use_mcp_tool> ``` ### Editing an Image with File Path ```xml <use_mcp_tool> <server_name>gpt-image-1</server_name> <tool_name>create_image_edit</tool_name> <arguments> { "image": { "filePath": "C:/path/to/your/image.png" }, "prompt": "Add a small robot in the corner", "quality": "high" } </arguments> </use_mcp_tool> ``` ## Future Improvements Potential areas for future development: 1. Add support for the DALL-E 3 model 2. Implement image variation functionality 3. Add batch processing capabilities 4. Create a web interface for easier testing 5. Add support for more image formats 6. Implement caching to reduce API calls 7. Add unit and integration tests ## Troubleshooting Guide ### Common Issues 1. 
**MIME Type Errors**: Ensure image files have the correct extension (.png, .jpg, etc.) that matches their actual format. 2. **API Key Issues**: Verify your OpenAI API key is correct and has access to the gpt-image-1 model. 3. **Build Errors**: Ensure you have the correct TypeScript version installed and that your tsconfig.json is properly configured. 4. **File Path Issues**: Make sure file paths are absolute or correctly relative to the current working directory. 5. **Output Directory Issues**: Check if the process has write permissions to the configured output directory. ## Conclusion The GPT-Image-1 MCP Server provides a robust and user-friendly interface to OpenAI's image generation capabilities. With features like file path support, configurable output directories, and detailed response formatting, it enhances the image generation experience for users of MCP-compatible clients. This document should provide a comprehensive understanding of the project's architecture, functionality, and development history, enabling developers and AI assistants to quickly get up to speed and contribute effectively. ``` -------------------------------------------------------------------------------- /.docs/typescript-sdk mcp README.md: -------------------------------------------------------------------------------- ```markdown # MCP TypeScript SDK   ## Table of Contents - [Overview](#overview) - [Installation](#installation) - [Quickstart](#quickstart) - [What is MCP?](#what-is-mcp) - [Core Concepts](#core-concepts) - [Server](#server) - [Resources](#resources) - [Tools](#tools) - [Prompts](#prompts) - [Running Your Server](#running-your-server) - [stdio](#stdio) - [Streamable HTTP](#streamable-http) - [Testing and Debugging](#testing-and-debugging) - [Examples](#examples) - [Echo Server](#echo-server) - [SQLite Explorer](#sqlite-explorer) - [Advanced Usage](#advanced-usage) - [Low-Level Server](#low-level-server) - [Writing MCP Clients](#writing-mcp-clients) - [Server Capabilities](#server-capabilities) - [Proxy OAuth Server](#proxy-authorization-requests-upstream) - [Backwards Compatibility](#backwards-compatibility) ## Overview The Model Context Protocol allows applications to provide context for LLMs in a standardized way, separating the concerns of providing context from the actual LLM interaction. 
This TypeScript SDK implements the full MCP specification, making it easy to: - Build MCP clients that can connect to any MCP server - Create MCP servers that expose resources, prompts and tools - Use standard transports like stdio and Streamable HTTP - Handle all MCP protocol messages and lifecycle events ## Installation ```bash npm install @modelcontextprotocol/sdk ``` ## Quickstart Let's create a simple MCP server that exposes a calculator tool and some data: ```typescript import { McpServer, ResourceTemplate } from "@modelcontextprotocol/sdk/server/mcp.js"; import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"; import { z } from "zod"; // Create an MCP server const server = new McpServer({ name: "Demo", version: "1.0.0" }); // Add an addition tool server.tool("add", { a: z.number(), b: z.number() }, async ({ a, b }) => ({ content: [{ type: "text", text: String(a + b) }] }) ); // Add a dynamic greeting resource server.resource( "greeting", new ResourceTemplate("greeting://{name}", { list: undefined }), async (uri, { name }) => ({ contents: [{ uri: uri.href, text: `Hello, ${name}!` }] }) ); // Start receiving messages on stdin and sending messages on stdout const transport = new StdioServerTransport(); await server.connect(transport); ``` ## What is MCP? The [Model Context Protocol (MCP)](https://modelcontextprotocol.io) lets you build servers that expose data and functionality to LLM applications in a secure, standardized way. Think of it like a web API, but specifically designed for LLM interactions. MCP servers can: - Expose data through **Resources** (think of these sort of like GET endpoints; they are used to load information into the LLM's context) - Provide functionality through **Tools** (sort of like POST endpoints; they are used to execute code or otherwise produce a side effect) - Define interaction patterns through **Prompts** (reusable templates for LLM interactions) - And more! ## Core Concepts ### Server The McpServer is your core interface to the MCP protocol. It handles connection management, protocol compliance, and message routing: ```typescript const server = new McpServer({ name: "My App", version: "1.0.0" }); ``` ### Resources Resources are how you expose data to LLMs. They're similar to GET endpoints in a REST API - they provide data but shouldn't perform significant computation or have side effects: ```typescript // Static resource server.resource( "config", "config://app", async (uri) => ({ contents: [{ uri: uri.href, text: "App configuration here" }] }) ); // Dynamic resource with parameters server.resource( "user-profile", new ResourceTemplate("users://{userId}/profile", { list: undefined }), async (uri, { userId }) => ({ contents: [{ uri: uri.href, text: `Profile data for user ${userId}` }] }) ); ``` ### Tools Tools let LLMs take actions through your server.
Unlike resources, tools are expected to perform computation and have side effects: ```typescript // Simple tool with parameters server.tool( "calculate-bmi", { weightKg: z.number(), heightM: z.number() }, async ({ weightKg, heightM }) => ({ content: [{ type: "text", text: String(weightKg / (heightM * heightM)) }] }) ); // Async tool with external API call server.tool( "fetch-weather", { city: z.string() }, async ({ city }) => { const response = await fetch(`https://api.weather.com/${city}`); const data = await response.text(); return { content: [{ type: "text", text: data }] }; } ); ``` ### Prompts Prompts are reusable templates that help LLMs interact with your server effectively: ```typescript server.prompt( "review-code", { code: z.string() }, ({ code }) => ({ messages: [{ role: "user", content: { type: "text", text: `Please review this code:\n\n${code}` } }] }) ); ``` ## Running Your Server MCP servers in TypeScript need to be connected to a transport to communicate with clients. How you start the server depends on the choice of transport: ### stdio For command-line tools and direct integrations: ```typescript import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"; const server = new McpServer({ name: "example-server", version: "1.0.0" }); // ... set up server resources, tools, and prompts ... const transport = new StdioServerTransport(); await server.connect(transport); ``` ### Streamable HTTP For remote servers, set up a Streamable HTTP transport that handles both client requests and server-to-client notifications. #### With Session Management In some cases, servers need to be stateful. This is achieved by [session management](https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#session-management). ```typescript import express from "express"; import { randomUUID } from "node:crypto"; import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js"; import { isInitializeRequest } from "@modelcontextprotocol/sdk/types.js" const app = express(); app.use(express.json()); // Map to store transports by session ID const transports: { [sessionId: string]: StreamableHTTPServerTransport } = {}; // Handle POST requests for client-to-server communication app.post('/mcp', async (req, res) => { // Check for existing session ID const sessionId = req.headers['mcp-session-id'] as string | undefined; let transport: StreamableHTTPServerTransport; if (sessionId && transports[sessionId]) { // Reuse existing transport transport = transports[sessionId]; } else if (!sessionId && isInitializeRequest(req.body)) { // New initialization request transport = new StreamableHTTPServerTransport({ sessionIdGenerator: () => randomUUID(), onsessioninitialized: (sessionId) => { // Store the transport by session ID transports[sessionId] = transport; } }); // Clean up transport when closed transport.onclose = () => { if (transport.sessionId) { delete transports[transport.sessionId]; } }; const server = new McpServer({ name: "example-server", version: "1.0.0" }); // ... set up server resources, tools, and prompts ... 
// Connect to the MCP server await server.connect(transport); } else { // Invalid request res.status(400).json({ jsonrpc: '2.0', error: { code: -32000, message: 'Bad Request: No valid session ID provided', }, id: null, }); return; } // Handle the request await transport.handleRequest(req, res, req.body); }); // Reusable handler for GET and DELETE requests const handleSessionRequest = async (req: express.Request, res: express.Response) => { const sessionId = req.headers['mcp-session-id'] as string | undefined; if (!sessionId || !transports[sessionId]) { res.status(400).send('Invalid or missing session ID'); return; } const transport = transports[sessionId]; await transport.handleRequest(req, res); }; // Handle GET requests for server-to-client notifications via SSE app.get('/mcp', handleSessionRequest); // Handle DELETE requests for session termination app.delete('/mcp', handleSessionRequest); app.listen(3000); ``` #### Without Session Management (Stateless) For simpler use cases where session management isn't needed: ```typescript const app = express(); app.use(express.json()); app.post('/mcp', async (req: Request, res: Response) => { // In stateless mode, create a new instance of transport and server for each request // to ensure complete isolation. A single instance would cause request ID collisions // when multiple clients connect concurrently. try { const server = getServer(); const transport: StreamableHTTPServerTransport = new StreamableHTTPServerTransport({ sessionIdGenerator: undefined, }); res.on('close', () => { console.log('Request closed'); transport.close(); server.close(); }); await server.connect(transport); await transport.handleRequest(req, res, req.body); } catch (error) { console.error('Error handling MCP request:', error); if (!res.headersSent) { res.status(500).json({ jsonrpc: '2.0', error: { code: -32603, message: 'Internal server error', }, id: null, }); } } }); app.get('/mcp', async (req: Request, res: Response) => { console.log('Received GET MCP request'); res.writeHead(405).end(JSON.stringify({ jsonrpc: "2.0", error: { code: -32000, message: "Method not allowed." }, id: null })); }); app.delete('/mcp', async (req: Request, res: Response) => { console.log('Received DELETE MCP request'); res.writeHead(405).end(JSON.stringify({ jsonrpc: "2.0", error: { code: -32000, message: "Method not allowed." }, id: null })); }); // Start the server const PORT = 3000; app.listen(PORT, () => { console.log(`MCP Stateless Streamable HTTP Server listening on port ${PORT}`); }); ``` This stateless approach is useful for: - Simple API wrappers - RESTful scenarios where each request is independent - Horizontally scaled deployments without shared session state ### Testing and Debugging To test your server, you can use the [MCP Inspector](https://github.com/modelcontextprotocol/inspector). See its README for more information. 
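For a stdio server such as the one built from this repository, a typical invocation is the `inspector` script defined in this project's `package.json`:

```bash
# Launch the MCP Inspector against the compiled stdio server
npx @modelcontextprotocol/inspector ./build/index.js
```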
## Examples ### Echo Server A simple server demonstrating resources, tools, and prompts: ```typescript import { McpServer, ResourceTemplate } from "@modelcontextprotocol/sdk/server/mcp.js"; import { z } from "zod"; const server = new McpServer({ name: "Echo", version: "1.0.0" }); server.resource( "echo", new ResourceTemplate("echo://{message}", { list: undefined }), async (uri, { message }) => ({ contents: [{ uri: uri.href, text: `Resource echo: ${message}` }] }) ); server.tool( "echo", { message: z.string() }, async ({ message }) => ({ content: [{ type: "text", text: `Tool echo: ${message}` }] }) ); server.prompt( "echo", { message: z.string() }, ({ message }) => ({ messages: [{ role: "user", content: { type: "text", text: `Please process this message: ${message}` } }] }) ); ``` ### SQLite Explorer A more complex example showing database integration: ```typescript import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; import sqlite3 from "sqlite3"; import { promisify } from "util"; import { z } from "zod"; const server = new McpServer({ name: "SQLite Explorer", version: "1.0.0" }); // Helper to create DB connection const getDb = () => { const db = new sqlite3.Database("database.db"); return { all: promisify<string, any[]>(db.all.bind(db)), close: promisify(db.close.bind(db)) }; }; server.resource( "schema", "schema://main", async (uri) => { const db = getDb(); try { const tables = await db.all( "SELECT sql FROM sqlite_master WHERE type='table'" ); return { contents: [{ uri: uri.href, text: tables.map((t: {sql: string}) => t.sql).join("\n") }] }; } finally { await db.close(); } } ); server.tool( "query", { sql: z.string() }, async ({ sql }) => { const db = getDb(); try { const results = await db.all(sql); return { content: [{ type: "text", text: JSON.stringify(results, null, 2) }] }; } catch (err: unknown) { const error = err as Error; return { content: [{ type: "text", text: `Error: ${error.message}` }], isError: true }; } finally { await db.close(); } } ); ``` ## Advanced Usage ### Dynamic Servers If you want to offer an initial set of tools/prompts/resources, but later add additional ones based on user action or external state change, you can add/update/remove them _after_ the Server is connected. 
This will automatically emit the corresponding `listChanged` notifications: ```ts import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"; import { z } from "zod"; const server = new McpServer({ name: "Dynamic Example", version: "1.0.0" }); const listMessageTool = server.tool( "listMessages", { channel: z.string() }, async ({ channel }) => ({ content: [{ type: "text", text: await listMessages(channel) }] }) ); const putMessageTool = server.tool( "putMessage", { channel: z.string(), message: z.string() }, async ({ channel, message }) => ({ content: [{ type: "text", text: await putMessage(channel, message) }] }) ); // Until we upgrade auth, `putMessage` is disabled (won't show up in listTools) putMessageTool.disable() const upgradeAuthTool = server.tool( "upgradeAuth", { permission: z.enum(["write", "admin"]) }, // Any mutations here will automatically emit `listChanged` notifications async ({ permission }) => { const { ok, err, previous } = await upgradeAuthAndStoreToken(permission) if (!ok) return {content: [{ type: "text", text: `Error: ${err}` }]} // If we previously had read-only access, 'putMessage' is now available if (previous === "read") { putMessageTool.enable() } if (permission === 'write') { // If we've just upgraded to 'write' permissions, we can still call 'upgradeAuth' // but can only upgrade to 'admin'. upgradeAuthTool.update({ paramSchema: { permission: z.enum(["admin"]) }, // change validation rules }) } else { // If we're now an admin, we no longer have anywhere to upgrade to, so fully remove that tool upgradeAuthTool.remove() } } ) // Connect as normal const transport = new StdioServerTransport(); await server.connect(transport); ``` ### Low-Level Server For more control, you can use the low-level Server class directly: ```typescript import { Server } from "@modelcontextprotocol/sdk/server/index.js"; import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"; import { ListPromptsRequestSchema, GetPromptRequestSchema } from "@modelcontextprotocol/sdk/types.js"; const server = new Server( { name: "example-server", version: "1.0.0" }, { capabilities: { prompts: {} } } ); server.setRequestHandler(ListPromptsRequestSchema, async () => { return { prompts: [{ name: "example-prompt", description: "An example prompt template", arguments: [{ name: "arg1", description: "Example argument", required: true }] }] }; }); server.setRequestHandler(GetPromptRequestSchema, async (request) => { if (request.params.name !== "example-prompt") { throw new Error("Unknown prompt"); } return { description: "Example prompt", messages: [{ role: "user", content: { type: "text", text: "Example prompt text" } }] }; }); const transport = new StdioServerTransport(); await server.connect(transport); ``` ### Writing MCP Clients The SDK provides a high-level client interface: ```typescript import { Client } from "@modelcontextprotocol/sdk/client/index.js"; import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js"; const transport = new StdioClientTransport({ command: "node", args: ["server.js"] }); const client = new Client( { name: "example-client", version: "1.0.0" } ); await client.connect(transport); // List prompts const prompts = await client.listPrompts(); // Get a prompt const prompt = await client.getPrompt({ name: "example-prompt", arguments: { arg1: "value" } }); // List resources const resources = await client.listResources(); // Read a resource const resource = await client.readResource({ uri: "file:///example.txt" }); // Call a tool const
### Proxy Authorization Requests Upstream

You can proxy OAuth requests to an external authorization provider:

```typescript
import express from 'express';
import { ProxyOAuthServerProvider } from '@modelcontextprotocol/sdk/server/auth/providers/proxyProvider.js';
import { mcpAuthRouter } from '@modelcontextprotocol/sdk/server/auth/router.js';

const app = express();

const proxyProvider = new ProxyOAuthServerProvider({
    endpoints: {
        authorizationUrl: "https://auth.external.com/oauth2/v1/authorize",
        tokenUrl: "https://auth.external.com/oauth2/v1/token",
        revocationUrl: "https://auth.external.com/oauth2/v1/revoke",
    },
    verifyAccessToken: async (token) => {
        return {
            token,
            clientId: "123",
            scopes: ["openid", "email", "profile"],
        }
    },
    getClient: async (client_id) => {
        return {
            client_id,
            redirect_uris: ["http://localhost:3000/callback"],
        }
    }
})

app.use(mcpAuthRouter({
    provider: proxyProvider,
    issuerUrl: new URL("https://auth.external.com"),
    baseUrl: new URL("https://mcp.example.com"),
    serviceDocumentationUrl: new URL("https://docs.example.com/"),
}))
```

This setup allows you to:

- Forward OAuth requests to an external provider
- Add custom token validation logic
- Manage client registrations
- Provide custom documentation URLs
- Maintain control over the OAuth flow while delegating to an external provider

### Backwards Compatibility

Clients and servers using the Streamable HTTP transport can maintain [backwards compatibility](https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#backwards-compatibility) with the deprecated HTTP+SSE transport (from protocol version 2024-11-05) as follows:

#### Client-Side Compatibility

For clients that need to work with both Streamable HTTP and older SSE servers:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";

let client: Client|undefined = undefined
const baseUrl = new URL(url); // `url` is the MCP server endpoint to connect to
try {
  client = new Client({
    name: 'streamable-http-client',
    version: '1.0.0'
  });
  const transport = new StreamableHTTPClientTransport(
    new URL(baseUrl)
  );
  await client.connect(transport);
  console.log("Connected using Streamable HTTP transport");
} catch (error) {
  // If that fails with a 4xx error, try the older SSE transport
  console.log("Streamable HTTP connection failed, falling back to SSE transport");
  client = new Client({
    name: 'sse-client',
    version: '1.0.0'
  });
  const sseTransport = new SSEClientTransport(baseUrl);
  await client.connect(sseTransport);
  console.log("Connected using SSE transport");
}
```

#### Server-Side Compatibility

For servers that need to support both Streamable HTTP and older clients:

```typescript
import express from "express";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import { SSEServerTransport } from "@modelcontextprotocol/sdk/server/sse.js";

const server = new McpServer({
  name: "backwards-compatible-server",
  version: "1.0.0"
});

// ... set up server resources, tools, and prompts ...
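// For illustration only: one minimal, hypothetical tool registration, so the
// compatibility wiring below has something to serve (any real setup works here).
server.tool("ping", async () => ({
  content: [{ type: "text", text: "pong" }]
}));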
const app = express(); app.use(express.json()); // Store transports for each session type const transports = { streamable: {} as Record<string, StreamableHTTPServerTransport>, sse: {} as Record<string, SSEServerTransport> }; // Modern Streamable HTTP endpoint app.all('/mcp', async (req, res) => { // Handle Streamable HTTP transport for modern clients // Implementation as shown in the "With Session Management" example // ... }); // Legacy SSE endpoint for older clients app.get('/sse', async (req, res) => { // Create SSE transport for legacy clients const transport = new SSEServerTransport('/messages', res); transports.sse[transport.sessionId] = transport; res.on("close", () => { delete transports.sse[transport.sessionId]; }); await server.connect(transport); }); // Legacy message endpoint for older clients app.post('/messages', async (req, res) => { const sessionId = req.query.sessionId as string; const transport = transports.sse[sessionId]; if (transport) { await transport.handlePostMessage(req, res, req.body); } else { res.status(400).send('No transport found for sessionId'); } }); app.listen(3000); ``` **Note**: The SSE transport is now deprecated in favor of Streamable HTTP. New implementations should use Streamable HTTP, and existing SSE implementations should plan to migrate. ## Documentation - [Model Context Protocol documentation](https://modelcontextprotocol.io) - [MCP Specification](https://spec.modelcontextprotocol.io) - [Example Servers](https://github.com/modelcontextprotocol/servers) ## Contributing Issues and pull requests are welcome on GitHub at https://github.com/modelcontextprotocol/typescript-sdk. ## License This project is licensed under the MIT License—see the [LICENSE](LICENSE) file for details. ``` -------------------------------------------------------------------------------- /src/index.ts: -------------------------------------------------------------------------------- ```typescript #!/usr/bin/env node import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"; import { z } from "zod"; import OpenAI from "openai"; import type { ImageGenerateParams, ImageEditParams } from "openai/resources"; import { Readable } from "stream"; import { toFile } from "openai/uploads"; import fs from 'fs'; import path from 'path'; import os from 'os'; import fetch from 'node-fetch'; import FormData from 'form-data'; import { execSync } from 'child_process'; // Get the API key from the environment variable const OPENAI_API_KEY = process.env.OPENAI_API_KEY; if (!OPENAI_API_KEY) { console.error("OPENAI_API_KEY environment variable is required."); process.exit(1); } // Configure OpenAI client with strict defaults for gpt-image-1 const openai = new OpenAI({ apiKey: OPENAI_API_KEY, defaultQuery: {}, // Ensure no default query parameters defaultHeaders: {} // Ensure no default headers that might affect the request }); // Determine the output directory for saving images // Priority: // 1. Environment variable GPT_IMAGE_OUTPUT_DIR if set // 2. User's Pictures folder with a gpt-image-1 subfolder // 3. 
Fallback to a 'generated-images' folder in the current directory if Pictures folder can't be determined
const OUTPUT_DIR_ENV = process.env.GPT_IMAGE_OUTPUT_DIR;
let outputDir: string;

if (OUTPUT_DIR_ENV) {
  // Use the directory specified in the environment variable
  outputDir = OUTPUT_DIR_ENV;
  console.error(`Using output directory from environment variable: ${outputDir}`);
} else {
  // Try to use the user's Pictures folder
  try {
    // Determine the user's home directory
    const homeDir = os.homedir();

    // Determine the Pictures folder based on the OS
    let picturesDir: string;

    if (process.platform === 'win32') {
      // Windows: Use the standard Pictures folder
      picturesDir = path.join(homeDir, 'Pictures');
    } else if (process.platform === 'darwin') {
      // macOS: Use the standard Pictures folder
      picturesDir = path.join(homeDir, 'Pictures');
    } else {
      // Linux and other Unix-like systems: Use the XDG standard if possible
      const xdgPicturesDir = process.env.XDG_PICTURES_DIR;
      if (xdgPicturesDir) {
        picturesDir = xdgPicturesDir;
      } else {
        // Fallback to a standard location
        picturesDir = path.join(homeDir, 'Pictures');
      }
    }

    // Create a gpt-image-1 subfolder in the Pictures directory
    outputDir = path.join(picturesDir, 'gpt-image-1');
    console.error(`Using user's Pictures folder for output: ${outputDir}`);
  } catch (error) {
    // If there's any error determining the Pictures folder, fall back to the current directory
    outputDir = path.join(process.cwd(), 'generated-images');
    console.error(`Could not determine Pictures folder, using fallback directory: ${outputDir}`);
  }
}

// Create the output directory if it doesn't exist
if (!fs.existsSync(outputDir)) {
  fs.mkdirSync(outputDir, { recursive: true });
  console.error(`Created output directory: ${outputDir}`);
} else {
  console.error(`Using existing output directory: ${outputDir}`);
}

// Function to save base64 image to disk and return the file path
function saveImageToDisk(base64Data: string, format: string = 'png'): string {
  // Save images in a dedicated 'gpt-images' subfolder of the output directory
  // so generated files stay organized in one place
  const imagesFolder = path.join(outputDir, 'gpt-images');

  // Create the images folder if it doesn't exist
  if (!fs.existsSync(imagesFolder)) {
    fs.mkdirSync(imagesFolder, { recursive: true });
    console.error(`Created images folder: ${imagesFolder}`);
  }

  const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
  const filename = `image-${timestamp}.${format}`;
  const outputPath = path.join(imagesFolder, filename);

  // Remove the data URL prefix if present
  const base64Image = base64Data.replace(/^data:image\/\w+;base64,/, '');

  // Write the image to disk
  fs.writeFileSync(outputPath, Buffer.from(base64Image, 'base64'));
  console.error(`Image saved to: ${outputPath}`);

  return outputPath;
}

// Function to read an image file and convert it to base64
function readImageAsBase64(imagePath: string): string {
  try {
    // Check if the file exists
    if (!fs.existsSync(imagePath)) {
      throw new Error(`Image file not found: ${imagePath}`);
    }

    // Read the file as a buffer
    const imageBuffer = fs.readFileSync(imagePath);

    // Determine the MIME type based on file extension
    const fileExtension = path.extname(imagePath).toLowerCase();
    let mimeType = 'image/png'; // Default to PNG

    if (fileExtension === '.jpg' || fileExtension === '.jpeg') {
      mimeType = 'image/jpeg';
    } else if (fileExtension === '.webp') {
      mimeType = 'image/webp';
    } else if (fileExtension === '.gif') {
      mimeType = 'image/gif';
    }

    // Convert the buffer to a base64 string with a data URL prefix
    const base64Data = imageBuffer.toString('base64');
    const dataUrl = `data:${mimeType};base64,${base64Data}`;

    console.error(`Read image from: ${imagePath} (${mimeType})`);
    return dataUrl;
  } catch (error: any) {
    console.error(`Error reading image: ${error.message}`);
    throw error;
  }
}
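// Usage note (for reference): these two helpers round-trip cleanly.
// `readImageAsBase64` produces a data URL, and `saveImageToDisk` strips the
// data URL prefix before writing, so output from one can feed the other:
//
//   const dataUrl = readImageAsBase64('/path/to/input.png'); // hypothetical path
//   const savedPath = saveImageToDisk(dataUrl, 'png');
//   // -> <outputDir>/gpt-images/image-<timestamp>.png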
const server = new McpServer({
  name: "@cloudwerxlab/gpt-image-1-mcp",
  version: "1.1.7",
  description: "An MCP server for generating and editing images using the OpenAI gpt-image-1 model.",
});

// Define the create_image tool
const createImageSchema = z.object({
  prompt: z.string().max(32000, "Prompt exceeds maximum length for gpt-image-1."),
  background: z.enum(["transparent", "opaque", "auto"]).optional(),
  n: z.number().int().min(1).max(10).optional(),
  output_compression: z.number().int().min(0).max(100).optional(),
  output_format: z.enum(["png", "jpeg", "webp"]).optional(),
  quality: z.enum(["high", "medium", "low", "auto"]).optional(),
  size: z.enum(["1024x1024", "1536x1024", "1024x1536", "auto"]).optional(),
  user: z.string().optional(),
  moderation: z.enum(["low", "auto"]).optional()
});

type CreateImageArgs = z.infer<typeof createImageSchema>;

server.tool(
  "create_image",
  createImageSchema.shape,
  { title: "Generate new images using OpenAI's gpt-image-1 model" },
  async (args: CreateImageArgs, extra: any) => {
    try {
      // Call the OpenAI images API with detailed error handling
      let apiResponse;
      try {
        apiResponse = await openai.images.generate({
          model: "gpt-image-1",
          prompt: args.prompt,
          size: args.size || "1024x1024",
          quality: args.quality || "high",
          n: args.n || 1,
          // Forward the remaining optional parameters only when provided,
          // so every option advertised in the schema actually reaches the API
          ...(args.background && { background: args.background }),
          ...(args.output_format && { output_format: args.output_format }),
          ...(args.output_compression !== undefined && { output_compression: args.output_compression }),
          ...(args.moderation && { moderation: args.moderation }),
          ...(args.user && { user: args.user })
        });

        // Check if the response contains an error field (shouldn't happen with SDK but just in case)
        if (apiResponse && 'error' in apiResponse) {
          const error = (apiResponse as any).error;
          throw {
            message: error.message || 'Unknown API error',
            type: error.type || 'api_error',
            code: error.code || 'unknown',
            response: { data: { error } }
          };
        }
      } catch (apiError: any) {
        // Enhance the error with more details if possible
        console.error("OpenAI API Error:", apiError);
        // Rethrow with enhanced information
        throw apiError;
      }

      const responseData = apiResponse;
      const format = args.output_format || "png";

      // Save images to disk and create response with file paths
      const savedImages = [];
      const imageContents = [];

      if (responseData.data && responseData.data.length > 0) {
        for (const item of responseData.data) {
          if (item.b64_json) {
            // Save the image to disk
            const imagePath = saveImageToDisk(item.b64_json, format);

            // Add the saved image info to our response
            savedImages.push({
              path: imagePath,
              format: format
            });

            // Also include the image content for compatibility
            imageContents.push({
              type: "image" as const,
              data: item.b64_json,
              mimeType: `image/${format}`
            });
          } else if (item.url) {
            console.error(`Image URL: ${item.url}`);
            console.error("The gpt-image-1 model returned a URL instead of base64 data.");
            console.error("To view the image, open the URL in your browser.");

            // Add the URL info to our response
            savedImages.push({
              url: item.url,
              format: format
            });

            // Include a text message about the URL in the content
            imageContents.push({
              type: "text" as const,
              text: `Image available at URL: ${item.url}`
            });
          }
        }
      }

      // Create a beautifully formatted response with emojis and details
      const formatSize = (size: string | undefined) => size || "1024x1024";
      const formatQuality = (quality: string | undefined) => quality || "high";

      // Create a beautiful formatted message
      const formattedMessage = `
🎨 **Image Generation Complete!** 🎨

✨ **Prompt**: "${args.prompt}"

📊 **Generation Parameters**:
• Size: ${formatSize(args.size)}
• Quality: ${formatQuality(args.quality)}
• Number of Images: ${args.n || 1}
${args.background ? `• Background: ${args.background}` : ''}
${args.output_format ? `• Format: ${args.output_format}` : ''}
${args.output_compression !== undefined ? `• Compression: ${args.output_compression}%` : ''}
${args.moderation ? `• Moderation: ${args.moderation}` : ''}

📁 **Generated ${savedImages.length} Image${savedImages.length > 1 ? 's' : ''}**:
${savedImages.map((img, index) => `   ${index + 1}. ${img.path || img.url}`).join('\n')}

${responseData.usage ? `⚡ **Token Usage**:
• Total Tokens: ${responseData.usage.total_tokens}
• Input Tokens: ${responseData.usage.input_tokens}
• Output Tokens: ${responseData.usage.output_tokens}` : ''}

🔍 You can find your image${savedImages.length > 1 ? 's' : ''} at the path${savedImages.length > 1 ? 's' : ''} above!
`;

      // Return both the image content and the saved file paths with the beautiful message
      return {
        content: [
          { type: "text" as const, text: formattedMessage },
          ...imageContents
        ],
        ...(responseData.usage && {
          _meta: {
            usage: responseData.usage,
            savedImages: savedImages
          }
        })
      };
    } catch (error: any) {
      // Log the full error for debugging
      console.error("Error generating image:", error);

      // Extract detailed error information
      const errorCode = error.status || error.code || 'Unknown';
      const errorType = error.type || 'Error';
      const errorMessage = error.message || 'An unknown error occurred';

      // Check for specific OpenAI API errors
      let detailedError = '';
      if (error.response) {
        // If we have a response object from OpenAI, extract more details
        try {
          const responseData = error.response.data || {};
          if (responseData.error) {
            detailedError = `\n📋 **Details**: ${responseData.error.message || 'No additional details available'}`;

            // Add parameter errors if available
            if (responseData.error.param) {
              detailedError += `\n🔍 **Parameter**: ${responseData.error.param}`;
            }

            // Add code if available
            if (responseData.error.code) {
              detailedError += `\n🔢 **Error Code**: ${responseData.error.code}`;
            }

            // Add type if available
            if (responseData.error.type) {
              detailedError += `\n📝 **Error Type**: ${responseData.error.type}`;
            }
          }
        } catch (parseError) {
          // If we can't parse the response, just use what we have
          detailedError = '\n📋 **Details**: Could not parse error details from API response';
        }
      }

      // Construct a comprehensive error message
      const fullErrorMessage = `❌ **Image Generation Failed**\n\n⚠️ **Error ${errorCode}**: ${errorType} - ${errorMessage}${detailedError}\n\n🔄 Please try again with a different prompt or parameters.`;

      // Return the detailed error to the client
      return {
        content: [{ type: "text", text: fullErrorMessage }],
        isError: true,
        _meta: {
          error: {
            code: errorCode,
            type: errorType,
            message: errorMessage,
            raw: JSON.stringify(error, Object.getOwnPropertyNames(error))
          }
        }
      };
    }
  }
);
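// Example invocation from a connected MCP client (for reference only;
// `client` is a hypothetical client instance, not part of this server):
//
//   await client.callTool({
//     name: "create_image",
//     arguments: {
//       prompt: "A watercolor fox in a snowy forest",
//       size: "1024x1024",
//       quality: "high"
//     }
//   });
//
// The handler above saves each returned image to disk and reports the paths.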
// Define the create_image_edit tool
const createImageEditSchema = z.object({
  image: z.union([
    z.string(), // Can be base64 encoded image string
    z.array(z.string()), // Can be array of base64 encoded image strings
    z.object({ // Can be an object with a file path
      filePath: z.string(),
      isBase64: z.boolean().optional().default(false)
    }),
    z.array(z.object({ // Can be an array of objects with file paths
      filePath: z.string(),
      isBase64: z.boolean().optional().default(false)
    }))
  ]),
  prompt: z.string().max(32000,
"Prompt exceeds maximum length for gpt-image-1."), background: z.enum(["transparent", "opaque", "auto"]).optional(), mask: z.union([ z.string(), // Can be base64 encoded mask string z.object({ // Can be an object with a file path filePath: z.string(), isBase64: z.boolean().optional().default(false) }) ]).optional(), n: z.number().int().min(1).max(10).optional(), quality: z.enum(["high", "medium", "low", "auto"]).optional(), size: z.enum(["1024x1024", "1536x1024", "1024x1536", "auto"]).optional(), user: z.string().optional() }); type CreateImageEditArgs = z.infer<typeof createImageEditSchema>; server.tool( "create_image_edit", createImageEditSchema.shape, { title: "Edit existing images using OpenAI's gpt-image-1 model" }, async (args: CreateImageEditArgs, extra: any) => { try { // The OpenAI SDK expects 'image' and 'mask' to be Node.js ReadStream or Blob. // Since we are receiving base64 strings from the client, we need to convert them. // This is a simplified approach. A robust solution might involve handling file uploads // or different data formats depending on the client's capabilities. // For this implementation, we'll assume base64 and convert to Buffer, which the SDK might accept // or require further processing depending on its exact requirements for file-like objects. // NOTE: The OpenAI SDK's `images.edit` method specifically expects `File` or `Blob` in browser // environments and `ReadableStream` or `Buffer` in Node.js. Converting base64 to Buffer is // the most straightforward approach for a Node.js server receiving base64. // Process image input which can be file paths or base64 strings const imageFiles = []; // Handle different image input formats if (Array.isArray(args.image)) { // Handle array of strings or objects for (const img of args.image) { if (typeof img === 'string') { // Base64 string - create a temporary file const tempFile = path.join(os.tmpdir(), `image-${Date.now()}-${Math.random().toString(36).substring(2, 15)}.png`); const base64Data = img.replace(/^data:image\/\w+;base64,/, ''); fs.writeFileSync(tempFile, Buffer.from(base64Data, 'base64')); imageFiles.push(tempFile); } else { // Object with filePath - use the file directly imageFiles.push(img.filePath); } } } else if (typeof args.image === 'string') { // Single base64 string - create a temporary file const tempFile = path.join(os.tmpdir(), `image-${Date.now()}-${Math.random().toString(36).substring(2, 15)}.png`); const base64Data = args.image.replace(/^data:image\/\w+;base64,/, ''); fs.writeFileSync(tempFile, Buffer.from(base64Data, 'base64')); imageFiles.push(tempFile); } else { // Single object with filePath - use the file directly imageFiles.push(args.image.filePath); } // Process mask input which can be a file path or base64 string let maskFile = undefined; if (args.mask) { if (typeof args.mask === 'string') { // Mask is a base64 string - create a temporary file const tempFile = path.join(os.tmpdir(), `mask-${Date.now()}-${Math.random().toString(36).substring(2, 15)}.png`); const base64Data = args.mask.replace(/^data:image\/\w+;base64,/, ''); fs.writeFileSync(tempFile, Buffer.from(base64Data, 'base64')); maskFile = tempFile; } else { // Mask is an object with filePath - use the file directly maskFile = args.mask.filePath; } } // Use a direct curl command to call the OpenAI API // This is more reliable than using the SDK for file uploads // Create a temporary file to store the response const tempResponseFile = path.join(os.tmpdir(), `response-${Date.now()}.json`); // Build the curl command let curlCommand 
= `curl -s -X POST "https://api.openai.com/v1/images/edits" -H "Authorization: Bearer ${process.env.OPENAI_API_KEY}"`;

      // Add the model
      curlCommand += ` -F "model=gpt-image-1"`;

      // Add the prompt, escaping characters that are special inside a
      // double-quoted shell string (the prompt is user-supplied and may
      // contain quotes, $, backticks or backslashes)
      const shellSafePrompt = args.prompt.replace(/(["\\$`])/g, '\\$1');
      curlCommand += ` -F "prompt=${shellSafePrompt}"`;

      // Add the images
      for (const imageFile of imageFiles) {
        curlCommand += ` -F "image[]=@${imageFile}"`;
      }

      // Add the mask if it exists
      if (maskFile) {
        curlCommand += ` -F "mask=@${maskFile}"`;
      }

      // Add other parameters
      if (args.n) curlCommand += ` -F "n=${args.n}"`;
      if (args.size) curlCommand += ` -F "size=${args.size}"`;
      if (args.quality) curlCommand += ` -F "quality=${args.quality}"`;
      if (args.background) curlCommand += ` -F "background=${args.background}"`;
      if (args.user) curlCommand += ` -F "user=${args.user}"`;

      // Add output redirection
      curlCommand += ` > "${tempResponseFile}"`;

      // Aside: the same upload could plausibly be done through the OpenAI SDK
      // using `toFile` (imported above) rather than shelling out, e.g.:
      //
      //   const uploads = await Promise.all(
      //     imageFiles.map((p) => toFile(fs.createReadStream(p), path.basename(p)))
      //   );
      //   const result = await openai.images.edit({
      //     model: "gpt-image-1",
      //     image: uploads.length === 1 ? uploads[0] : uploads,
      //     prompt: args.prompt,
      //     ...(maskFile && { mask: await toFile(fs.createReadStream(maskFile), "mask.png") })
      //   });
      //
      // The curl path is kept because it has proven more reliable for multipart
      // uploads from this server; treat the SDK variant as an unverified sketch
      // against your installed `openai` package version.

      // Execute the curl command with execSync
      try {
        console.error(`Executing curl command to edit image...`);
        execSync(curlCommand, { stdio: ['pipe', 'pipe', 'inherit'] });
        console.error(`Curl command executed successfully.`);
      } catch (error: any) {
        console.error(`Error executing curl command: ${error.message}`);
        throw new Error(`Failed to edit image: ${error.message}`);
      }

      // Read the response from the temporary file
      let responseJson;
      try {
        responseJson = fs.readFileSync(tempResponseFile, 'utf8');
        console.error(`Response file read successfully.`);
      } catch (error: any) {
        console.error(`Error reading response file: ${error.message}`);
        throw new Error(`Failed to read response file: ${error.message}`);
      }

      // Parse the response
      let responseData;
      try {
        responseData = JSON.parse(responseJson);
        console.error(`Response parsed successfully.`);

        // Check if the response contains an error
        if (responseData.error) {
          console.error(`OpenAI API returned an error:`, responseData.error);
          const errorMessage = responseData.error.message || 'Unknown API error';
          const errorType = responseData.error.type || 'api_error';
          const errorCode = responseData.error.code || responseData.error.status || 'unknown';

          throw {
            message: errorMessage,
            type: errorType,
            code: errorCode,
            response: { data: responseData }
          };
        }
      } catch (error: any) {
        // If the error is from our API error check, rethrow it
        if (error.response && error.response.data) {
          throw error;
        }
        console.error(`Error parsing response: ${error.message}`);
        throw new Error(`Failed to parse response: ${error.message}`);
      }

      // Delete the temporary response file
      try {
        fs.unlinkSync(tempResponseFile);
        console.error(`Temporary response file deleted.`);
      } catch (error: any) {
        console.error(`Error deleting temporary file: ${error.message}`);
        // Don't throw an error here, just log it
      }

      // Clean up temporary files
      try {
        // Delete temporary image files
        for (const imageFile of imageFiles) {
          // Only delete files we created (temporary files in the os.tmpdir directory)
          if (imageFile.startsWith(os.tmpdir())) {
            try { fs.unlinkSync(imageFile); } catch (e) { /* ignore errors */ }
          }
        }
        // Delete temporary mask file
        if (maskFile && maskFile.startsWith(os.tmpdir())) {
          try { fs.unlinkSync(maskFile); } catch (e) { /* ignore errors */ }
        }
      } catch (cleanupError) {
        console.error("Error cleaning up temporary files:", cleanupError);
      }

      // Save images to disk and create response with file paths
      const savedImages = [];
      const imageContents = [];
      const format = "png"; // Assuming png for edits based on common practice

      if (responseData.data && responseData.data.length > 0) {
        for (const item of
responseData.data) { if (item.b64_json) { // Save the image to disk const imagePath = saveImageToDisk(item.b64_json, format); // Add the saved image info to our response savedImages.push({ path: imagePath, format: format }); // Also include the image content for compatibility imageContents.push({ type: "image" as const, data: item.b64_json, mimeType: `image/${format}` }); } else if (item.url) { console.error(`Image URL: ${item.url}`); console.error("The gpt-image-1 model returned a URL instead of base64 data."); console.error("To view the image, open the URL in your browser."); // Add the URL info to our response savedImages.push({ url: item.url, format: format }); // Include a text message about the URL in the content imageContents.push({ type: "text" as const, text: `Image available at URL: ${item.url}` }); } } } // Create a beautifully formatted response with emojis and details const formatSize = (size: string | undefined) => size || "1024x1024"; const formatQuality = (quality: string | undefined) => quality || "high"; // Get source image information let sourceImageInfo = ""; if (Array.isArray(args.image)) { // Handle array of strings or objects sourceImageInfo = args.image.map((img, index) => { if (typeof img === 'string') { return ` ${index + 1}. Base64 encoded image`; } else { return ` ${index + 1}. ${img.filePath}`; } }).join('\n'); } else if (typeof args.image === 'string') { sourceImageInfo = " Base64 encoded image"; } else { sourceImageInfo = ` ${args.image.filePath}`; } // Get mask information let maskInfo = ""; if (args.mask) { if (typeof args.mask === 'string') { maskInfo = "🎭 **Mask**: Base64 encoded mask applied"; } else { maskInfo = `🎭 **Mask**: Mask from ${args.mask.filePath} applied`; } } // Create a beautiful formatted message const formattedMessage = ` ✏️ **Image Edit Complete!** 🖌️ ✨ **Edit Prompt**: "${args.prompt}" 🖼️ **Source Image${imageFiles.length > 1 ? 's' : ''}**: ${sourceImageInfo} ${maskInfo} 📊 **Edit Parameters**: • Size: ${formatSize(args.size)} • Quality: ${formatQuality(args.quality)} • Number of Results: ${args.n || 1} ${args.background ? `• Background: ${args.background}` : ''} 📁 **Edited ${savedImages.length} Image${savedImages.length > 1 ? 's' : ''}**: ${savedImages.map((img, index) => ` ${index + 1}. ${img.path || img.url}`).join('\n')} ${responseData.usage ? `⚡ **Token Usage**: • Total Tokens: ${responseData.usage.total_tokens} • Input Tokens: ${responseData.usage.input_tokens} • Output Tokens: ${responseData.usage.output_tokens}` : ''} 🔍 You can find your edited image${savedImages.length > 1 ? 's' : ''} at the path${savedImages.length > 1 ? 's' : ''} above! 
`; // Return both the image content and the saved file paths with the beautiful message return { content: [ { type: "text" as const, text: formattedMessage }, ...imageContents ], ...(responseData.usage && { _meta: { usage: { totalTokens: responseData.usage.total_tokens, inputTokens: responseData.usage.input_tokens, outputTokens: responseData.usage.output_tokens, }, savedImages: savedImages } }) }; } catch (error: any) { // Log the full error for debugging console.error("Error creating image edit:", error); // Extract detailed error information const errorCode = error.status || error.code || 'Unknown'; const errorType = error.type || 'Error'; const errorMessage = error.message || 'An unknown error occurred'; // Check for specific error types and provide more helpful messages let detailedError = ''; let suggestedFix = ''; // Handle file-related errors if (errorMessage.includes('ENOENT') || errorMessage.includes('no such file')) { detailedError = '\n📋 **Details**: The specified image or mask file could not be found'; suggestedFix = '\n💡 **Suggestion**: Verify that the file path is correct and the file exists'; } // Handle permission errors else if (errorMessage.includes('EACCES') || errorMessage.includes('permission denied')) { detailedError = '\n📋 **Details**: Permission denied when trying to access the file'; suggestedFix = '\n💡 **Suggestion**: Check file permissions or try running with elevated privileges'; } // Handle curl errors else if (errorMessage.includes('curl')) { detailedError = '\n📋 **Details**: Error occurred while sending the request to OpenAI API'; suggestedFix = '\n💡 **Suggestion**: Check your internet connection and API key'; } // Handle OpenAI API errors else if (error.response) { try { const responseData = error.response.data || {}; if (responseData.error) { detailedError = `\n📋 **Details**: ${responseData.error.message || 'No additional details available'}`; // Add parameter errors if available if (responseData.error.param) { detailedError += `\n🔍 **Parameter**: ${responseData.error.param}`; } // Add code if available if (responseData.error.code) { detailedError += `\n🔢 **Error Code**: ${responseData.error.code}`; } // Add type if available if (responseData.error.type) { detailedError += `\n📝 **Error Type**: ${responseData.error.type}`; } // Provide suggestions based on error type if (responseData.error.type === 'invalid_request_error') { suggestedFix = '\n💡 **Suggestion**: Check that your image format is supported (PNG, JPEG) and the prompt is valid'; } else if (responseData.error.type === 'authentication_error') { suggestedFix = '\n💡 **Suggestion**: Verify your OpenAI API key is correct and has access to the gpt-image-1 model'; } } } catch (parseError) { detailedError = '\n📋 **Details**: Could not parse error details from API response'; } } // If we have a JSON response with an error, try to extract it if (errorMessage.includes('{') && errorMessage.includes('}')) { try { const jsonStartIndex = errorMessage.indexOf('{'); const jsonEndIndex = errorMessage.lastIndexOf('}') + 1; const jsonStr = errorMessage.substring(jsonStartIndex, jsonEndIndex); const jsonError = JSON.parse(jsonStr); if (jsonError.error) { detailedError = `\n📋 **Details**: ${jsonError.error.message || 'No additional details available'}`; if (jsonError.error.code) { detailedError += `\n🔢 **Error Code**: ${jsonError.error.code}`; } if (jsonError.error.type) { detailedError += `\n📝 **Error Type**: ${jsonError.error.type}`; } } } catch (e) { // If we can't parse JSON from the error message, just continue } } 
// Construct a comprehensive error message const fullErrorMessage = `❌ **Image Edit Failed**\n\n⚠️ **Error ${errorCode}**: ${errorType} - ${errorMessage}${detailedError}${suggestedFix}\n\n🔄 Please try again with a different prompt, image, or parameters.`; // Return the detailed error to the client return { content: [{ type: "text", text: fullErrorMessage }], isError: true, _meta: { error: { code: errorCode, type: errorType, message: errorMessage, details: detailedError.replace(/\n📋 \*\*Details\*\*: /, ''), suggestion: suggestedFix.replace(/\n💡 \*\*Suggestion\*\*: /, ''), raw: JSON.stringify(error, Object.getOwnPropertyNames(error)) } } }; } } ); // Start the server const transport = new StdioServerTransport(); server.connect(transport).then(() => { console.error("✅ GPT-Image-1 MCP server running on stdio"); console.error("🎨 Ready to generate and edit images!"); }).catch(console.error); // Handle graceful shutdown process.on('SIGINT', async () => { console.error("🛑 Shutting down GPT-Image-1 MCP server..."); await server.close(); console.error("👋 Server shutdown complete. Goodbye!"); process.exit(0); }); process.on('SIGTERM', async () => { console.error("🛑 Shutting down GPT-Image-1 MCP server..."); await server.close(); console.error("👋 Server shutdown complete. Goodbye!"); process.exit(0); }); ```