# Directory Structure
```
├── .gitignore
├── .python-version
├── pyproject.toml
├── README.md
├── screenshot.py
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------
```
3.10
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
```
# Python-generated files
__pycache__/
*.py[oc]
build/
dist/
wheels/
*.egg-info

# Virtual environments
.venv
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
```markdown
# Screenshot Server (File Path Focused)

This project provides an MCP (Model Context Protocol) server designed to capture screenshots and facilitate their use by other processes or AI assistants, primarily by **saving the screenshot to a file path specified by the client (Host)**.

## Core Problem & Solution

Directly interpreting screenshot image data sent via MCP by AI assistants proved unreliable in testing. This server adopts more robust workflows focused on file paths:

**Recommended Workflow (WSL Host -> Windows Server):**

1. An MCP Host (like an AI assistant running in WSL) calls the `save_screenshot_to_host_workspace` tool, providing its **WSL workspace path** as an argument.
2. This server (running on Windows) captures the screen.
3. The server converts the received WSL path to a Windows-accessible UNC path (e.g., `\\wsl$\Distro\path`).
4. The server saves the screenshot to the specified location within the Host's WSL filesystem via the UNC path.
5. The server returns `"success"` or `"failed:..."`.
6. The MCP Host knows the file is saved in its workspace (or a sub-directory if specified in the path argument).
7. The MCP Host can then pass the **WSL path** to another specialized MCP server (running in WSL) for image analysis.

**Alternative Workflow (General):**

1. MCP Host calls `take_screenshot_and_return_path`, optionally specifying a filename.
2. Server saves the screenshot to its local `images/` directory.
3. Server returns the **absolute path** (e.g., Windows path) to the saved file.
4. MCP Host receives the path and passes it (with potential conversion) to an analysis server.

## Available Tools

This server provides the following tools, ordered by recommended usage:

* **`save_screenshot_to_host_workspace(host_workspace_path: str, name: str = "workspace_screenshot.jpg")`**
    * **Recommended Use:** Saves a screenshot directly into the AI Assistant's (Host's) current WSL workspace. This is the preferred method for seamless integration.
    * **Action:** Takes a screenshot, converts the provided WSL path to a UNC path, and saves the file to the Host's workspace. Automatically detects the WSL distribution name.
    * **Args:**
        * `host_workspace_path` (str): The absolute WSL path of the Host's workspace (e.g., `/home/user/project`).
        * `name` (str, optional): Filename. Defaults to `workspace_screenshot.jpg`.
    * **Returns:** `str` - `"success"` or `"failed: [error message]"`.

* **`take_screenshot_and_return_path(name: str = "latest_screenshot.jpg")`**
    * **Use Case:** Saves a screenshot to a fixed `images/` directory relative to the server's location and returns the absolute path (typically a Windows path). Useful if the caller needs the path for external processing.
    * **Args:**
        * `name` (str, optional): Filename. Defaults to `latest_screenshot.jpg`.
    * **Returns:** `str` - Absolute path or `"failed: [error message]"`.

* **`take_screenshot_path(path: str = "./", name: str = "screenshot.jpg")`**
    * **Use Case:** Saves a screenshot to an arbitrary location specified by a Windows path or a UNC path (e.g., for saving outside the Host's workspace). Requires careful path specification by the caller.
    * **Args:**
        * `path` (str, optional): Target directory (Windows or UNC path). Defaults to the server's working directory.
        * `name` (str, optional): Filename. Defaults to `screenshot.jpg`.
    * **Returns:** `str` - `"success"` or `"failed: [error message]"`.

## Setup and Usage

### 1. Prerequisites
* **Python 3.10 or newer:** Required on the machine where the server will run (`pyproject.toml` declares `requires-python = ">=3.10"`).
* **Dependencies:** Install using `uv`:
    ```bash
    uv sync
    ```
    Required libraries include `mcp[cli]>=1.4.1`, `pyautogui`, and `Pillow`.

### 2. Running the Server
This server is typically launched *by* an MCP Host based on its configuration.

### 3. Environment Considerations (Especially WSL2)

**Crucial Point:** To capture the **Windows screen**, this `screenshot.py` server **must run directly on Windows**.

**Recommended WSL2 Host -> Windows Server Setup:**

1. **Project Location:** Place this `screenshot-server` project folder on your **Windows filesystem** (e.g., `C:\Users\YourUser\projects\screenshot-server`).
2. **Windows Dependencies:** Install Python, `uv`, and project dependencies (`uv sync ...`) directly on **Windows** within the project folder.
3. **MCP Host Configuration (in WSL):** Configure your MCP Host (running in WSL) to launch the server on Windows using PowerShell.
    Update `mcp_settings.json` (or equivalent):

    ```json
    {
      "mcpServers": {
        "Screenshot-server": {
          "command": "powershell.exe",
          "args": [
            "-Command",
            "Invoke-Command -ScriptBlock { cd '<YOUR_WINDOWS_PROJECT_PATH>'; & '<YOUR_WINDOWS_UV_PATH>' run screenshot.py }"
          ]
        }
        // ... other servers ...
      }
    }
    ```
    * Replace paths with your actual Windows paths.

### 4. Workflow Example (AI Assistant in WSL)
1. AI Assistant identifies its current workspace path (e.g., `/home/user/current_project`).
2. AI Assistant uses `use_mcp_tool` to call `save_screenshot_to_host_workspace` on `Screenshot-server`, passing `host_workspace_path="/home/user/current_project"` and optionally a `name`.
3. Receives `"success"`.
4. AI Assistant knows the screenshot is now at `/home/user/current_project/workspace_screenshot.jpg` (or the specified name).
5. AI Assistant uses `use_mcp_tool` to call an *image analysis* server/tool (also running in WSL), passing the WSL path `/home/user/current_project/workspace_screenshot.jpg`.
6. The image analysis server reads the file and performs its task.

## File Structure

* `screenshot.py`: The core MCP server script.
* `README.md`: This documentation file.
* `pyproject.toml`: Project definition and dependencies for `uv`.
* `uv.lock`: Dependency lock file.
* `.gitignore`: Git ignore configuration.
* `.python-version`: (Optional) Python version specifier.
* `server.log`: Log file generated by the server.
* `images/`: Default directory for `take_screenshot_and_return_path`.
```

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
```toml
[project]
name = "screenshot-server"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "mcp[cli]>=1.4.1",
    "pyautogui>=0.9.54",
]
```

--------------------------------------------------------------------------------
/screenshot.py:
--------------------------------------------------------------------------------
```python
from mcp.server.fastmcp import FastMCP, Image  # Image might still be needed internally
import io
import os
from pathlib import Path
import pyautogui
from mcp.types import ImageContent  # Keep ImageContent for potential future use or internal conversion
import base64  # Import base64 module
import logging
import sys
import datetime

# --- Logger Setup ---
log_file = "server.log"
logging.basicConfig(
    level=logging.INFO,  # Set to INFO for general use, DEBUG for detailed troubleshooting
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler(log_file, mode='a'),  # Append to log file
        # Log to stderr, not stdout: under the stdio transport, stdout carries
        # the MCP protocol messages and must not contain log lines.
        logging.StreamHandler(sys.stderr)
    ]
)
logger = logging.getLogger(__name__)
logger.info("--- Screenshot Server Starting ---")
# --- End Logger Setup ---

# Create server instance
mcp = FastMCP("screenshot server")

# Note: Tools returning raw image data (like the original take_screenshot/take_screenshot_image)
# were removed because AI interpretation via MCP showed inconsistencies.
# The current approach focuses on saving the image to a file and returning a status string or the file path.

@mcp.tool()
def take_screenshot_path(path: str = "./", name: str = "screenshot.jpg") -> str:
    """Takes a screenshot and saves it to a specified path and filename on the server machine.

    Provides flexibility for saving to specific Windows locations or WSL locations via UNC paths.
    For saving directly to the Host's WSL workspace, prefer using 'save_screenshot_to_host_workspace'.

    Args:
        path (str, optional): The target directory path (Windows path or UNC path like \\\\wsl$\\Distro\\...).
                              Defaults to the server's current working directory (`./`).
        name (str, optional): The desired filename for the screenshot.
                              Defaults to "screenshot.jpg".

    Returns:
        str: "success" if saved successfully, otherwise "failed: [error message]".
    """
    logger.info(f"take_screenshot_path called with path='{path}', name='{name}'")
    buffer = io.BytesIO()
    try:
        # Capture the screenshot
        screenshot = pyautogui.screenshot()
        # Convert and save to buffer as JPEG
        screenshot.convert("RGB").save(buffer, format="JPEG", quality=60, optimize=True)
        image_data = buffer.getvalue()
        logger.debug(f"Image data length: {len(image_data)}")

        # Process file saving
        try:
            # Resolve the path - this works for both Windows and UNC paths
            save_path_obj = Path(path) / name
            # Ensure the directory exists
            save_path_obj.parent.mkdir(parents=True, exist_ok=True)
            # Resolve after ensuring directory exists, especially for UNC
            save_path = save_path_obj.resolve()

            # Security check (more robust check might be needed for UNC paths if strict confinement is required)
            # For simple cases, checking if the resolved path is valid might suffice here.
            # A basic check could involve ensuring it's not trying to write to system dirs, but UNC makes it tricky.
            # For now, we rely on the OS permissions and the user providing a valid target.
            # Consider adding checks based on expected base paths if needed.

            # Write the image data to the file
            with open(save_path, "wb") as f:
                f.write(image_data)
            logger.info(f"Successfully saved screenshot to {save_path}")
            return "success"
        except Exception as e:
            # Log the specific path that failed if possible
            logger.error(f"Error writing screenshot to file '{path}/{name}': {e}", exc_info=True)
            return "failed: file write error"
    except Exception as e:
        # Handle errors during screenshot capture itself
        logger.error(f"Error capturing screenshot: {e}", exc_info=True)
        return "failed: screenshot capture error"

@mcp.tool()
def take_screenshot_and_return_path(name: str = "latest_screenshot.jpg") -> str:
    """Takes a screenshot, saves it to images/ directory, and returns the absolute path.

    Saves the screenshot with the specified filename within the 'images' subdirectory
    relative to the server's execution directory. This is the primary tool for
    workflows requiring the file path for subsequent processing.

    Args:
        name (str, optional): The filename for the screenshot (e.g., "current_view.jpg").
                              Defaults to "latest_screenshot.jpg".

    Returns:
        str: The absolute path (e.g., Windows path like C:\\...) to the saved screenshot file,
             or "failed: [error message]" if an error occurs.
    """
    logger.info(f"take_screenshot_and_return_path called with name='{name}'")
    buffer = io.BytesIO()
    try:
        # Capture the screenshot
        screenshot = pyautogui.screenshot()
        # Convert and save to buffer as JPEG
        screenshot.convert("RGB").save(buffer, format="JPEG", quality=60, optimize=True)
        image_data = buffer.getvalue()
        logger.debug(f"Image data length: {len(image_data)}")

        # Define the fixed save location relative to the script's execution directory
        save_dir = Path("images")
        # Use the provided 'name' argument for the filename
        save_path = (save_dir / name).resolve()  # Use 'name' argument

        # Create the 'images' directory if it doesn't exist
        save_dir.mkdir(parents=True, exist_ok=True)

        # Save the file
        with open(save_path, "wb") as f:
            f.write(image_data)
        logger.info(f"Screenshot saved to: {save_path}")

        # Return the absolute path as a string
        return str(save_path)

    except Exception as e:
        # Handle errors during screenshot capture or file saving
        logger.error(f"Error in take_screenshot_and_return_path: {e}", exc_info=True)
        return f"failed: {e}"  # Return a failure indicator with the error

# --- New Tool to Save to Host Workspace ---
@mcp.tool()
def save_screenshot_to_host_workspace(host_workspace_path: str, name: str = "workspace_screenshot.jpg") -> str:
    """Takes a screenshot and saves it to the specified Host's WSL workspace path.

    The server (running on Windows) converts the provided WSL path
    (e.g., /home/user/project) to a UNC path (e.g., \\\\wsl$\\Distro\\home\\user\\project)
    before saving.

    Args:
        host_workspace_path (str): The absolute WSL path of the Host's workspace.
        name (str, optional): The desired filename for the screenshot.
                              Defaults to "workspace_screenshot.jpg".

    Returns:
        str: "success" if saved successfully, otherwise "failed: [error message]".
    """
    logger.info(f"save_screenshot_to_host_workspace called with host_path='{host_workspace_path}', name='{name}'")
    buffer = io.BytesIO()
    try:
        # --- Convert WSL path to UNC path (with auto-detection attempt) ---
        if host_workspace_path.startswith('/'):
            distro_name = None
            try:
                import subprocess
                # Try to get the default WSL distribution name quietly
                result = subprocess.run(['wsl', '-l', '-q'], capture_output=True, text=True, check=True, encoding='utf-16le')  # Use utf-16le for wsl output on Windows
                # Get the first line of the output, remove potential trailing "(Default)" and strip whitespace
                lines = result.stdout.strip().splitlines()
                if lines:
                    distro_name = lines[0].replace('(Default)', '').strip()
                    logger.info(f"Auto-detected WSL distribution: {distro_name}")
                else:
                    logger.warning("Could not auto-detect WSL distribution name from 'wsl -l -q'. Falling back to default.")
                    # Fallback to a common default if detection fails
                    distro_name = "Ubuntu-22.04"

            except FileNotFoundError:
                logger.error("'wsl.exe' command not found. Cannot auto-detect distribution. Falling back.")
                distro_name = "Ubuntu-22.04"  # Fallback
            except subprocess.CalledProcessError as e:
                logger.error(f"Error running 'wsl -l -q': {e}. Falling back.")
                distro_name = "Ubuntu-22.04"  # Fallback
            except Exception as e:
                logger.error(f"Unexpected error during WSL distro detection: {e}. Falling back.")
                distro_name = "Ubuntu-22.04"  # Fallback

            if distro_name:
                unc_path_base = f"\\\\wsl$\\{distro_name}"
                windows_compatible_wsl_path = host_workspace_path.lstrip('/').replace('/', '\\')
                unc_save_dir = os.path.join(unc_path_base, windows_compatible_wsl_path)
                save_path_obj = Path(unc_save_dir) / name
                logger.info(f"Attempting to save to UNC path: {save_path_obj}")
            else:
                logger.error("Failed to determine WSL distribution name.")
                return "failed: could not determine WSL distribution"
        else:
            logger.error(f"Invalid WSL path provided: '{host_workspace_path}'. Path must start with '/'.")
            return "failed: invalid WSL path format"
        # --- End Path Conversion ---

        # Capture the screenshot
        screenshot = pyautogui.screenshot()
        # Convert and save to buffer as JPEG
        screenshot.convert("RGB").save(buffer, format="JPEG", quality=60, optimize=True)
        image_data = buffer.getvalue()
        logger.debug(f"Image data length: {len(image_data)}")

        # Process file saving using the UNC path
        try:
            # Create directory if it doesn't exist (using Path object)
            save_path_obj.parent.mkdir(parents=True, exist_ok=True)

            # Write the image data to the file
            with open(save_path_obj, "wb") as f:
                f.write(image_data)
            logger.info(f"Successfully saved screenshot to WSL path via UNC: {save_path_obj}")
            return "success"
        except Exception as e:
            logger.error(f"Error writing screenshot to UNC path '{save_path_obj}': {e}", exc_info=True)
            # Provide more specific error if possible (e.g., permission denied, path not found)
            return f"failed: file write error to WSL path ({e})"

    except Exception as e:
        # Handle errors during screenshot capture itself
        logger.error(f"Error capturing screenshot: {e}", exc_info=True)
        return "failed: screenshot capture error"
# --- End New Tool ---

# --- Tool take_screenshot_and_return_base64 removed as direct interpretation was problematic ---

# Removed take_screenshot_and_create_resource as resource handling in mcp library was unclear

def run():
    """Starts the MCP server."""
    logger.info("Starting MCP server...")
    try:
        # Run the server, listening via stdio
        mcp.run(transport="stdio")
    except Exception as e:
        # Log critical errors if the server fails to start or run
        logger.critical(f"MCP server failed to run: {e}", exc_info=True)
    finally:
        # Log when the server stops
        logger.info("--- Screenshot Server Stopping ---")

# Removed test_run function

if __name__ == "__main__":
    # Entry point when the script is executed directly
    run()
```
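
The WSL-to-UNC conversion performed inside `save_screenshot_to_host_workspace` can also be sketched as two small standalone helpers. This is a minimal sketch, not code from the repository: `first_distro` and `wsl_to_unc` are hypothetical names introduced here for illustration. It uses `PureWindowsPath`, so it only manipulates path strings and should run on any OS without touching the filesystem. The `"Ubuntu-22.04"` fallback mirrors the value hard-coded in `screenshot.py`.

```python
from pathlib import PureWindowsPath


def first_distro(wsl_list_output: str, fallback: str = "Ubuntu-22.04") -> str:
    """Return the first non-empty line of `wsl -l -q` output, or the fallback.

    Hypothetical helper: it takes the already-decoded command output as a
    string, so it can be exercised without WSL installed.
    """
    for line in wsl_list_output.splitlines():
        # Drop stray NUL characters that appear if UTF-16 output was mis-decoded.
        name = line.replace("\x00", "").strip()
        if name:
            return name
    return fallback


def wsl_to_unc(wsl_path: str, distro: str) -> PureWindowsPath:
    """Map an absolute WSL path to its UNC equivalent under the wsl$ share."""
    if not wsl_path.startswith("/"):
        raise ValueError(f"WSL path must be absolute: {wsl_path!r}")
    # Split on '/' and drop empty segments (leading slash, doubled slashes).
    parts = [p for p in wsl_path.split("/") if p]
    # The UNC root is \\wsl$\<distro>\ ; the WSL segments are appended to it.
    return PureWindowsPath(f"\\\\wsl$\\{distro}\\", *parts)


if __name__ == "__main__":
    distro = first_distro("Ubuntu-22.04\nDebian\n")
    target = wsl_to_unc("/home/user/project", distro) / "workspace_screenshot.jpg"
    print(target)
```

Keeping the conversion free of I/O makes the path logic easy to check separately from screenshot capture and file writing, which both need a real Windows desktop.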