# Directory Structure

```
├── .gitignore
├── .python-version
├── images
│   ├── magnet-claude.png
│   ├── magnet-webcam.jpg
│   ├── orange-claude.png
│   └── orange-webcam.jpg
├── pyproject.toml
├── README.md
├── uv.lock
└── videocapture_mcp.py
```

# Files

--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------

```
3.10
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Python-generated files
__pycache__/
*.py[oc]
build/
dist/
wheels/
*.egg-info

# Virtual environments
.venv
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
# Video Still Capture MCP

**A Model Context Protocol server for accessing and controlling webcams via OpenCV**

## Overview

Video Still Capture MCP is a Python implementation of the Model Context Protocol (MCP) that gives AI assistants the ability to access and control webcams and video sources through OpenCV. The server exposes a set of tools that allow language models to capture images, adjust camera settings, and manage video connections. It captures still images only; it does not record video.
## Examples

Here are some examples of the Video Still Capture MCP server in action:

### Orange Example

Left: Claude's view of the image | Right: Actual webcam capture
:-------------------------:|:-------------------------:
![](images/orange-claude.png) | ![](images/orange-webcam.jpg)

### Magnet Example

Left: Claude's view of the image | Right: Actual webcam capture
:-------------------------:|:-------------------------:
![](images/magnet-claude.png) | ![](images/magnet-webcam.jpg)

## Installation

### Prerequisites

- Python 3.10+
- [OpenCV](https://opencv.org/) (`opencv-python`)
- [MCP Python SDK](https://modelcontextprotocol.io/docs/)
- [uv](https://astral.sh/uv/) (optional for a source install; the Claude Desktop configurations below use it)

### Installation from source

```bash
git clone https://github.com/13rac1/videocapture-mcp.git
cd videocapture-mcp
pip install -e .
```

Run the MCP server:

```bash
mcp dev videocapture_mcp.py
```

## Integrating with Claude for Desktop

### macOS/Linux

Edit your Claude Desktop configuration:

```bash
# Mac
nano ~/Library/Application\ Support/Claude/claude_desktop_config.json
# Linux
nano ~/.config/Claude/claude_desktop_config.json
```

Add this MCP server configuration:

```json
{
  "mcpServers": {
    "VideoCapture": {
      "command": "uv",
      "args": [
        "run",
        "--with",
        "mcp[cli]",
        "--with",
        "numpy",
        "--with",
        "opencv-python",
        "mcp",
        "run",
        "/ABSOLUTE_PATH/videocapture-mcp/videocapture_mcp.py"
      ]
    }
  }
}
```

Ensure you replace `/ABSOLUTE_PATH/videocapture-mcp` with the project's absolute path.

### Windows

Edit your Claude Desktop configuration:

```powershell
notepad $env:AppData\Claude\claude_desktop_config.json
```

Add this MCP server configuration:

```json
{
  "mcpServers": {
    "VideoCapture": {
      "command": "uv",
      "args": [
        "run",
        "--with",
        "mcp[cli]",
        "--with",
        "numpy",
        "--with",
        "opencv-python",
        "mcp",
        "run",
        "C:\\ABSOLUTE_PATH\\videocapture-mcp\\videocapture_mcp.py"
      ]
    }
  }
}
```

Ensure you replace `C:\ABSOLUTE_PATH\videocapture-mcp` with the project's absolute path, keeping each backslash doubled (`\\`) as JSON requires.
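
The configuration entries above can also be generated rather than hand-edited, which avoids quoting mistakes in paths. The following is a minimal sketch, not part of this project: the helper names `build_server_entry` and `add_server` are hypothetical, and the path is the same placeholder used above. It relies only on Python's standard `json` module, which escapes backslashes in Windows paths when serializing.

```python
import json


def build_server_entry(script_path: str) -> dict:
    """Return the Claude Desktop entry that launches the server through uv."""
    return {
        "command": "uv",
        "args": [
            "run",
            "--with", "mcp[cli]",
            "--with", "numpy",
            "--with", "opencv-python",
            "mcp", "run", script_path,
        ],
    }


def add_server(config: dict, script_path: str, name: str = "VideoCapture") -> dict:
    """Return a copy of `config` with the server registered under `name`."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = build_server_entry(script_path)
    return {**config, "mcpServers": servers}


if __name__ == "__main__":
    # Placeholder path: replace it with the real absolute path on your machine.
    script = r"C:\ABSOLUTE_PATH\videocapture-mcp\videocapture_mcp.py"
    print(json.dumps(add_server({}, script), indent=2))
```

The printed JSON can be pasted into `claude_desktop_config.json`; existing keys in a loaded configuration are preserved because `add_server` copies the input dictionary instead of replacing it.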
### Using the Installation Command

Alternatively, you can use the `mcp` CLI to install the server:

```bash
mcp install videocapture_mcp.py
```

This will automatically configure Claude Desktop to use your videocapture MCP server.

Once integrated, Claude will be able to access your webcam or video source when requested. Simply ask Claude to take a photo or perform any webcam-related task.

## Features

- **Quick Image Capture**: Capture a single image from a webcam without managing connections
- **Connection Management**: Open, manage, and close camera connections
- **Video Properties**: Read and adjust camera settings like brightness, contrast, and resolution
- **Image Processing**: Basic image transformations like horizontal flipping

## Tools Reference

### `quick_capture`

Quickly open a camera, capture a single frame, and close it.

```python
quick_capture(device_index: int = 0, flip: bool = False) -> Image
```

- **device_index**: Camera index (0 is usually the default webcam)
- **flip**: Whether to horizontally flip the image
- **Returns**: The captured frame as an Image object

### `open_camera`

Open a connection to a camera device.

```python
open_camera(device_index: int = 0, name: Optional[str] = None) -> str
```

- **device_index**: Camera index (0 is usually the default webcam)
- **name**: Optional name to identify this camera connection
- **Returns**: Connection ID for the opened camera

### `capture_frame`

Capture a single frame from the specified video source.

```python
capture_frame(connection_id: str, flip: bool = False) -> Image
```

- **connection_id**: ID of the previously opened video connection
- **flip**: Whether to horizontally flip the image
- **Returns**: The captured frame as an Image object

### `get_video_properties`

Get properties of the video source.

```python
get_video_properties(connection_id: str) -> dict
```

- **connection_id**: ID of the previously opened video connection
- **Returns**: Dictionary of video properties (width, height, fps, etc.)
### `set_video_property`

Set a property of the video source.

```python
set_video_property(connection_id: str, property_name: str, value: float) -> bool
```

- **connection_id**: ID of the previously opened video connection
- **property_name**: Name of the property to set (width, height, brightness, etc.)
- **value**: Value to set
- **Returns**: True if successful, False otherwise

### `close_connection`

Close a video connection and release resources.

```python
close_connection(connection_id: str) -> bool
```

- **connection_id**: ID of the connection to close
- **Returns**: True if successful

### `list_active_connections`

List all active video connections.

```python
list_active_connections() -> list
```

- **Returns**: List of active connection IDs

## Example Usage

Here's how an AI assistant might use the Video Still Capture MCP server:

1. **Take a quick photo**:
   ```
   I'll take a photo using your webcam.
   ```
   (The AI would call `quick_capture()` behind the scenes)

2. **Open a persistent connection**:
   ```
   I'll open a connection to your webcam so we can take multiple photos.
   ```
   (The AI would call `open_camera()` and store the connection ID)

3. **Adjust camera settings**:
   ```
   Let me increase the brightness of the webcam feed.
   ```
   (The AI would call `set_video_property()` with the appropriate parameters)

## Advanced Usage

### Resource Management

The server releases every open camera connection when it shuts down. For long-running applications, it's still good practice to close connections explicitly with `close_connection` as soon as they're no longer needed.
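
One way to make explicit cleanup hard to forget is to pair every open with a close in a context manager. The sketch below is an illustration under stated assumptions, not part of this project: `camera_session` is a hypothetical helper, and it takes the open and close callables as arguments so it could wrap `open_camera` and `close_connection` (or client-side equivalents) regardless of how they are invoked. The demo uses stand-in functions so it runs without a camera.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def camera_session(
    open_camera: Callable[[int], str],
    close_connection: Callable[[str], bool],
    device_index: int = 0,
) -> Iterator[str]:
    """Open a connection, yield its ID, and always close it afterwards."""
    connection_id = open_camera(device_index)
    try:
        yield connection_id
    finally:
        # Runs even if the body raises, so the device is always released.
        close_connection(connection_id)


if __name__ == "__main__":
    # Stand-in callables (hypothetical) that only track which IDs are open.
    opened: list[str] = []

    def fake_open(device_index: int) -> str:
        connection_id = f"camera_{device_index}_demo"
        opened.append(connection_id)
        return connection_id

    def fake_close(connection_id: str) -> bool:
        opened.remove(connection_id)
        return True

    with camera_session(fake_open, fake_close, device_index=1) as cid:
        print("using", cid)
    print("still open:", opened)
```

With the real tools, calls such as `capture_frame(cid)` would go inside the `with` block, and the connection would be closed when the block exits.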
### Multiple Cameras

If your system has multiple cameras, you can specify the device index when opening a connection:

```python
# Open the second webcam (index 1)
connection_id = open_camera(device_index=1)
```

## Troubleshooting

- **Camera Not Found**: Ensure your webcam is properly connected and not in use by another application
- **Permission Issues**: Some systems require explicit permission to access the camera
- **OpenCV Installation**: If you encounter issues with OpenCV, refer to the [official installation guide](https://docs.opencv.org/master/d5/de5/tutorial_py_setup_in_windows.html)

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
```

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[project]
name = "videocapture-mcp"
version = "0.1.0"
description = "Model Context Protocol (MCP) server to capture from an OpenCV-compatible webcam"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "httpx>=0.28.1",
    "mcp[cli]>=1.4.1",
    "opencv-python>=4.11.0.86",
]

[[project.authors]]
name = "13rac1"

[project.scripts]
# Entry points use the "module:function" form
videocapture_mcp = "videocapture_mcp:main"
```

--------------------------------------------------------------------------------
/videocapture_mcp.py:
--------------------------------------------------------------------------------

```python
import cv2
from contextlib import asynccontextmanager
from collections.abc import AsyncIterator
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Dict

from mcp.server.fastmcp import FastMCP, Image

# Store active video capture objects
active_captures: Dict[str, cv2.VideoCapture] = {}

# Define our application context
@dataclass
class AppContext:
    active_captures: Dict[str, cv2.VideoCapture]

@asynccontextmanager
async def app_lifespan(server: FastMCP) -> AsyncIterator[AppContext]:
    """Manage application lifecycle with camera resource cleanup"""
    # Initialize on startup
    #print("Starting VideoCapture MCP Server")
    try:
        # Pass the active_captures dictionary in the context
        yield AppContext(active_captures=active_captures)
    finally:
        # Cleanup on shutdown
        #print("Shutting down VideoCapture MCP Server")
        for cap in active_captures.values():
            cap.release()
        active_captures.clear()

# Initialize the FastMCP server with lifespan
mcp = FastMCP("VideoCapture",
              description="Provides access to camera and video streams via OpenCV",
              dependencies=["opencv-python", "numpy"],
              lifespan=app_lifespan)

def main():
    """Main entry point for the VideoCapture Server"""
    mcp.run()

@mcp.tool()
def quick_capture(device_index: int = 0, flip: bool = False) -> Image:
    """
    Quickly open a camera, capture a single frame, and close it.
    If the camera is already open, use the existing connection.

    Args:
        device_index: Camera index (0 is usually the default webcam)
        flip: Whether to horizontally flip the image

    Returns:
        The captured frame as an Image object
    """
    # Check if this device is already open
    # (only auto-generated "camera_<index>_<timestamp>" IDs are matched)
    device_key = None
    for key in active_captures:
        if key.startswith(f"camera_{device_index}_"):
            device_key = key
            break

    # If device is not already open, open it temporarily
    temp_connection = False
    if device_key is None:
        device_key = open_camera(device_index)
        temp_connection = True

    try:
        # Capture the frame
        frame = capture_frame(device_key, flip)
        return frame
    finally:
        # Close the connection if we opened it temporarily
        if temp_connection:
            close_connection(device_key)

@mcp.tool()
def open_camera(device_index: int = 0, name: Optional[str] = None) -> str:
    """
    Open a connection to a camera device.

    Args:
        device_index: Camera index (0 is usually the default webcam)
        name: Optional name to identify this camera connection

    Returns:
        Connection ID for the opened camera
    """
    if name is None:
        name = f"camera_{device_index}_{datetime.now().strftime('%Y%m%d%H%M%S')}"

    if name in active_captures:
        # Reusing an ID would overwrite the existing capture without releasing it
        raise ValueError(f"Connection ID already in use: {name}")

    cap = cv2.VideoCapture(device_index)
    if not cap.isOpened():
        raise ValueError(f"Failed to open camera at index {device_index}")

    active_captures[name] = cap
    return name

@mcp.tool()
def capture_frame(connection_id: str, flip: bool = False) -> Image:
    """
    Capture a single frame from the specified video source.

    Args:
        connection_id: ID of the previously opened video connection
        flip: Whether to horizontally flip the image

    Returns:
        The captured frame as an Image object
    """
    if connection_id not in active_captures:
        raise ValueError(f"No active connection with ID: {connection_id}")

    cap = active_captures[connection_id]
    ret, frame = cap.read()

    if not ret:
        raise RuntimeError(f"Failed to capture frame from {connection_id}")

    if flip:
        frame = cv2.flip(frame, 1)  # 1 for horizontal flip

    # Encode the image as PNG; imencode returns (success, buffer)
    ok, png_data = cv2.imencode('.png', frame)
    if not ok:
        raise RuntimeError(f"Failed to encode frame from {connection_id}")

    # Return as MCP Image object
    return Image(data=png_data.tobytes(), format="png")

@mcp.tool()
def get_video_properties(connection_id: str) -> dict:
    """
    Get properties of the video source.

    Args:
        connection_id: ID of the previously opened video connection

    Returns:
        Dictionary of video properties
    """
    if connection_id not in active_captures:
        raise ValueError(f"No active connection with ID: {connection_id}")

    cap = active_captures[connection_id]

    properties = {
        "width": int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        "height": int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)),
        "fps": cap.get(cv2.CAP_PROP_FPS),
        "frame_count": int(cap.get(cv2.CAP_PROP_FRAME_COUNT)),
        "brightness": cap.get(cv2.CAP_PROP_BRIGHTNESS),
        "contrast": cap.get(cv2.CAP_PROP_CONTRAST),
        "saturation": cap.get(cv2.CAP_PROP_SATURATION),
        "format": int(cap.get(cv2.CAP_PROP_FORMAT))
    }

    return properties

@mcp.tool()
def set_video_property(connection_id: str, property_name: str, value: float) -> bool:
    """
    Set a property of the video source.

    Args:
        connection_id: ID of the previously opened video connection
        property_name: Name of the property to set (width, height, brightness, etc.)
        value: Value to set

    Returns:
        True if successful, False otherwise
    """
    if connection_id not in active_captures:
        raise ValueError(f"No active connection with ID: {connection_id}")

    cap = active_captures[connection_id]

    property_map = {
        "width": cv2.CAP_PROP_FRAME_WIDTH,
        "height": cv2.CAP_PROP_FRAME_HEIGHT,
        "fps": cv2.CAP_PROP_FPS,
        "brightness": cv2.CAP_PROP_BRIGHTNESS,
        "contrast": cv2.CAP_PROP_CONTRAST,
        "saturation": cv2.CAP_PROP_SATURATION,
        "auto_exposure": cv2.CAP_PROP_AUTO_EXPOSURE,
        "auto_focus": cv2.CAP_PROP_AUTOFOCUS
    }

    if property_name not in property_map:
        raise ValueError(f"Unknown property: {property_name}")

    return cap.set(property_map[property_name], value)

@mcp.tool()
def close_connection(connection_id: str) -> bool:
    """
    Close a video connection and release resources.

    Args:
        connection_id: ID of the connection to close

    Returns:
        True if successful
    """
    if connection_id not in active_captures:
        raise ValueError(f"No active connection with ID: {connection_id}")

    active_captures[connection_id].release()
    del active_captures[connection_id]
    return True

@mcp.tool()
def list_active_connections() -> list:
    """
    List all active video connections.

    Returns:
        List of active connection IDs
    """
    return list(active_captures.keys())

# For: $ mcp run videocapture_mcp.py
# The mcp CLI imports this module and runs the `mcp` object itself, so the
# server is not started at import time; main() starts it (stdio by default).
def run():
    main()

if __name__ == "__main__":
    main()
```