kunihiros/screenshot-server # codebase.md

# Directory Structure

```
├── .gitignore
├── .python-version
├── pyproject.toml
├── README.md
├── screenshot.py
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------

```
3.10
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Python-generated files
__pycache__/
*.py[oc]
build/
dist/
wheels/
*.egg-info

# Virtual environments
.venv
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

````markdown
# Screenshot Server (File Path Focused)

This project provides an MCP (Model Context Protocol) server designed to capture screenshots and facilitate their use by other processes or AI assistants, primarily by **saving the screenshot to a file path specified by the client (Host)**.

## Core Problem & Solution

Directly interpreting screenshot image data sent via MCP by AI assistants proved unreliable in testing. This server adopts more robust workflows focused on file paths:

**Recommended Workflow (WSL Host -> Windows Server):**

1.  An MCP Host (like an AI assistant running in WSL) calls the `save_screenshot_to_host_workspace` tool, providing its **WSL workspace path** as an argument.
2.  This server (running on Windows) captures the screen.
3.  The server converts the received WSL path to a Windows-accessible UNC path (e.g., `\\wsl$\Distro\path`).
4.  The server saves the screenshot to the specified location within the Host's WSL filesystem via the UNC path.
5.  The server returns `"success"` or `"failed:..."`.
6.  The MCP Host knows the file is saved in its workspace (or a sub-directory if specified in the path argument).
7.  The MCP Host can then pass the **WSL path** to another specialized MCP server (running in WSL) for image analysis.

**Alternative Workflow (General):**

1.  MCP Host calls `take_screenshot_and_return_path`, optionally specifying a filename.
2.  Server saves the screenshot to its local `images/` directory.
3.  Server returns the **absolute path** (e.g., Windows path) to the saved file.
4.  MCP Host receives the path and passes it (with potential conversion) to an analysis server.

## Available Tools

This server provides the following tools, ordered by recommended usage:

*   **`save_screenshot_to_host_workspace(host_workspace_path: str, name: str = "workspace_screenshot.jpg")`**
    *   **Recommended Use:** Saves a screenshot directly into the AI Assistant's (Host's) current WSL workspace. This is the preferred method for seamless integration.
    *   **Action:** Takes a screenshot, converts the provided WSL path to a UNC path, and saves the file to the Host's workspace. Automatically detects the WSL distribution name.
    *   **Args:**
        *   `host_workspace_path` (str): The absolute WSL path of the Host's workspace (e.g., `/home/user/project`).
        *   `name` (str, optional): Filename. Defaults to `workspace_screenshot.jpg`.
    *   **Returns:** `str` - `"success"` or `"failed: [error message]"`.

*   **`take_screenshot_and_return_path(name: str = "latest_screenshot.jpg")`**
    *   **Use Case:** Saves a screenshot to a fixed `images/` directory relative to the server's location and returns the absolute path (typically a Windows path). Useful if the caller needs the path for external processing.
    *   **Args:**
        *   `name` (str, optional): Filename. Defaults to `latest_screenshot.jpg`.
    *   **Returns:** `str` - Absolute path or `"failed: [error message]"`.

*   **`take_screenshot_path(path: str = "./", name: str = "screenshot.jpg")`**
    *   **Use Case:** Saves a screenshot to an arbitrary location specified by a Windows path or a UNC path (e.g., for saving outside the Host's workspace). Requires careful path specification by the caller.
    *   **Args:**
        *   `path` (str, optional): Target directory (Windows or UNC path). Defaults to server's working directory.
        *   `name` (str, optional): Filename. Defaults to `screenshot.jpg`.
    *   **Returns:** `str` - `"success"` or `"failed: [error message]"`.

## Setup and Usage

### 1. Prerequisites
*   **Python 3.10+:** Required on the machine where the server will run (`pyproject.toml` declares `requires-python = ">=3.10"`).
*   **Dependencies:** Install using `uv`:
    ```bash
    uv sync
    ```
    Required libraries include `mcp[cli]>=1.4.1`, `pyautogui`, and `Pillow`.

### 2. Running the Server
This server is typically launched *by* an MCP Host based on its configuration.

### 3. Environment Considerations (Especially WSL2)

**Crucial Point:** To capture the **Windows screen**, this `screenshot.py` server **must run directly on Windows**.

**Recommended WSL2 Host -> Windows Server Setup:**

1.  **Project Location:** Place this `screenshot-server` project folder on your **Windows filesystem** (e.g., `C:\Users\YourUser\projects\screenshot-server`).
2.  **Windows Dependencies:** Install Python, `uv`, and project dependencies (`uv sync ...`) directly on **Windows** within the project folder.
3.  **MCP Host Configuration (in WSL):** Configure your MCP Host (running in WSL) to launch the server on Windows using PowerShell. Update `mcp_settings.json` (or equivalent):

    ```json
    {
      "mcpServers": {
        "Screenshot-server": {
          "command": "powershell.exe",
          "args": [
            "-Command",
            "Invoke-Command -ScriptBlock { cd '<YOUR_WINDOWS_PROJECT_PATH>'; & '<YOUR_WINDOWS_UV_PATH>' run screenshot.py }"
          ]
        }
      }
    }
    ```
    *   Replace the placeholder paths with your actual Windows paths. Any other servers are listed as additional entries under `mcpServers`.

### 4. Workflow Example (AI Assistant in WSL)
1.  AI Assistant identifies its current workspace path (e.g., `/home/user/current_project`).
2.  AI Assistant uses `use_mcp_tool` to call `save_screenshot_to_host_workspace` on `Screenshot-server`, passing `host_workspace_path="/home/user/current_project"` and optionally a `name`.
3.  Receives `"success"`.
4.  AI Assistant knows the screenshot is now at `/home/user/current_project/workspace_screenshot.jpg` (or the specified name).
5.  AI Assistant uses `use_mcp_tool` to call an *image analysis* server/tool (also running in WSL), passing the WSL path `/home/user/current_project/workspace_screenshot.jpg`.
6.  The image analysis server reads the file and performs its task.

## File Structure

*   `screenshot.py`: The core MCP server script.
*   `README.md`: This documentation file.
*   `pyproject.toml`: Project definition and dependencies for `uv`.
*   `uv.lock`: Dependency lock file.
*   `.gitignore`: Git ignore configuration.
*   `.python-version`: (Optional) Python version specifier.
*   `server.log`: Log file generated by the server.
*   `images/`: Default directory for `take_screenshot_and_return_path`.
````
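
The WSL-to-UNC conversion described in the README can be sketched in isolation as follows. This is a minimal sketch and not the server's own code: `parse_first_distro` and `wsl_to_unc` are hypothetical helper names, and it assumes `wsl -l -q` emits UTF-16LE text with the default distribution on the first line, the same assumption `screenshot.py` makes. The `Ubuntu-22.04` fallback mirrors the one used in the script.

```python
from pathlib import PureWindowsPath


def parse_first_distro(raw: bytes, fallback: str = "Ubuntu-22.04") -> str:
    """Return the first non-empty line of raw `wsl -l -q` output (hypothetical helper)."""
    # Assumption: wsl.exe writes UTF-16LE, as screenshot.py also assumes.
    text = raw.decode("utf-16-le", errors="replace")
    for line in text.splitlines():
        # Drop a stray BOM or NUL characters before trimming whitespace.
        name = line.replace("\ufeff", "").replace("\x00", "").strip()
        if name:
            return name
    return fallback


def wsl_to_unc(wsl_path: str, distro: str) -> PureWindowsPath:
    """Map an absolute WSL path to a \\\\wsl$\\<distro>\\... UNC path (hypothetical helper)."""
    if not wsl_path.startswith("/"):
        raise ValueError(f"not an absolute WSL path: {wsl_path!r}")
    # Split on '/' and skip empty segments so doubled or trailing slashes are harmless.
    parts = [part for part in wsl_path.split("/") if part]
    return PureWindowsPath(f"\\\\wsl$\\{distro}\\").joinpath(*parts)


if __name__ == "__main__":
    sample = "Ubuntu-24.04\r\nDebian\r\n".encode("utf-16-le")
    distro = parse_first_distro(sample)
    print(wsl_to_unc("/home/user/project", distro) / "workspace_screenshot.jpg")
```

`PureWindowsPath` is used so the conversion can be exercised on any platform without touching the filesystem; the server itself needs a concrete `Path` on Windows to write the file.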

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[project]
name = "screenshot-server"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "mcp[cli]>=1.4.1",
    "pyautogui>=0.9.54",
]
```

--------------------------------------------------------------------------------
/screenshot.py:
--------------------------------------------------------------------------------

```python
from mcp.server.fastmcp import FastMCP
import io
import os
from pathlib import Path
import pyautogui
import logging
import subprocess
import sys

# --- Logger Setup ---
log_file = "server.log"
logging.basicConfig(
    level=logging.INFO, # Set to INFO for general use, DEBUG for detailed troubleshooting
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler(log_file, mode='a'), # Append to log file
        logging.StreamHandler(sys.stderr)        # Console output goes to stderr; stdout carries the MCP stdio stream
    ]
)
logger = logging.getLogger(__name__)
logger.info("--- Screenshot Server Starting ---")
# --- End Logger Setup ---

# Create server instance
mcp = FastMCP("screenshot server")

# Note: Tools returning raw image data (like the original take_screenshot/take_screenshot_image)
# were removed because AI interpretation via MCP showed inconsistencies.
# The current approach focuses on saving the image to a file and returning the path or a status string.

@mcp.tool()
def take_screenshot_path(path: str = "./", name: str = "screenshot.jpg") -> str:
    """Takes a screenshot and saves it to a specified path and filename on the server machine.

    Provides flexibility for saving to specific Windows locations or WSL locations via UNC paths.
    For saving directly to the Host's WSL workspace, prefer using 'save_screenshot_to_host_workspace'.

    Args:
        path (str, optional): The target directory path (Windows path or UNC path like \\\\wsl$\\Distro\\...).
                              Defaults to the server's current working directory (`./`).
        name (str, optional): The desired filename for the screenshot.
                              Defaults to "screenshot.jpg".

    Returns:
        str: "success" if saved successfully, otherwise "failed: [error message]".
    """
    logger.info(f"take_screenshot_path called with path='{path}', name='{name}'")
    buffer = io.BytesIO()
    try:
        # Capture the screenshot
        screenshot = pyautogui.screenshot()
        # Convert and save to buffer as JPEG
        screenshot.convert("RGB").save(buffer, format="JPEG", quality=60, optimize=True)
        image_data = buffer.getvalue()
        logger.debug(f"Image data length: {len(image_data)}")

        # Process file saving
        try:
            # Resolve the path - this works for both Windows and UNC paths
            save_path_obj = Path(path) / name
            # Ensure the directory exists
            save_path_obj.parent.mkdir(parents=True, exist_ok=True)
            # Resolve after ensuring directory exists, especially for UNC
            save_path = save_path_obj.resolve()

            # Security check (more robust check might be needed for UNC paths if strict confinement is required)
            # For simple cases, checking if the resolved path is valid might suffice here.
            # A basic check could involve ensuring it's not trying to write to system dirs, but UNC makes it tricky.
            # For now, we rely on the OS permissions and the user providing a valid target.
            # Consider adding checks based on expected base paths if needed.

            # Write the image data to the file
            with open(save_path, "wb") as f:
                f.write(image_data)
            logger.info(f"Successfully saved screenshot to {save_path}")
            return "success"
        except Exception as e:
            # Log the specific path that failed if possible
            logger.error(f"Error writing screenshot to file '{path}/{name}': {e}", exc_info=True)
            return "failed: file write error"
    except Exception as e:
        # Handle errors during screenshot capture itself
        logger.error(f"Error capturing screenshot: {e}", exc_info=True)
        return "failed: screenshot capture error"

@mcp.tool()
def take_screenshot_and_return_path(name: str = "latest_screenshot.jpg") -> str:
    """Takes a screenshot, saves it to images/ directory, and returns the absolute path.

    Saves the screenshot with the specified filename within the 'images' subdirectory
    relative to the server's execution directory. This is the primary tool for
    workflows requiring the file path for subsequent processing.

    Args:
        name (str, optional): The filename for the screenshot (e.g., "current_view.jpg").
                              Defaults to "latest_screenshot.jpg".

    Returns:
        str: The absolute path (e.g., Windows path like C:\\...) to the saved screenshot file,
             or "failed: [error message]" if an error occurs.
    """
    logger.info(f"take_screenshot_and_return_path called with name='{name}'")
    buffer = io.BytesIO()
    try:
        # Capture the screenshot
        screenshot = pyautogui.screenshot()
        # Convert and save to buffer as JPEG
        screenshot.convert("RGB").save(buffer, format="JPEG", quality=60, optimize=True)
        image_data = buffer.getvalue()
        logger.debug(f"Image data length: {len(image_data)}")

        # Define the fixed save location relative to the script's execution directory
        save_dir = Path("images")
        # Use the provided 'name' argument for the filename
        save_path = (save_dir / name).resolve()

        # Create the 'images' directory if it doesn't exist
        save_dir.mkdir(parents=True, exist_ok=True)

        # Save the file
        with open(save_path, "wb") as f:
            f.write(image_data)
        logger.info(f"Screenshot saved to: {save_path}")

        # Return the absolute path as a string
        return str(save_path)

    except Exception as e:
        # Handle errors during screenshot capture or file saving
        logger.error(f"Error in take_screenshot_and_return_path: {e}", exc_info=True)
        return f"failed: {e}" # Return a failure indicator with the error

# --- New Tool to Save to Host Workspace ---
@mcp.tool()
def save_screenshot_to_host_workspace(host_workspace_path: str, name: str = "workspace_screenshot.jpg") -> str:
    """Takes a screenshot and saves it to the specified Host's WSL workspace path.

    The server (running on Windows) converts the provided WSL path
    (e.g., /home/user/project) to a UNC path (e.g., \\\\wsl$\\Distro\\home\\user\\project)
    before saving.

    Args:
        host_workspace_path (str): The absolute WSL path of the Host's workspace.
        name (str, optional): The desired filename for the screenshot.
                              Defaults to "workspace_screenshot.jpg".

    Returns:
        str: "success" if saved successfully, otherwise "failed: [error message]".
    """
    logger.info(f"save_screenshot_to_host_workspace called with host_path='{host_workspace_path}', name='{name}'")
    buffer = io.BytesIO()
    try:
        # --- Convert WSL path to UNC path (with auto-detection attempt) ---
        if host_workspace_path.startswith('/'):
            distro_name = None
            try:
                # Try to get the default WSL distribution name quietly
                result = subprocess.run(['wsl', '-l', '-q'], capture_output=True, text=True, check=True, encoding='utf-16le') # Use utf-16le for wsl output on Windows
                # Get the first line of the output, remove potential trailing "(Default)" and strip whitespace
                lines = result.stdout.strip().splitlines()
                if lines:
                    distro_name = lines[0].replace('(Default)', '').strip()
                    logger.info(f"Auto-detected WSL distribution: {distro_name}")
                else:
                    logger.warning("Could not auto-detect WSL distribution name from 'wsl -l -q'. Falling back to default.")
                    # Fallback to a common default if detection fails
                    distro_name = "Ubuntu-22.04"

            except FileNotFoundError:
                logger.error("'wsl.exe' command not found. Cannot auto-detect distribution. Falling back.")
                distro_name = "Ubuntu-22.04" # Fallback
            except subprocess.CalledProcessError as e:
                logger.error(f"Error running 'wsl -l -q': {e}. Falling back.")
                distro_name = "Ubuntu-22.04" # Fallback
            except Exception as e:
                logger.error(f"Unexpected error during WSL distro detection: {e}. Falling back.")
                distro_name = "Ubuntu-22.04" # Fallback

            if distro_name:
                unc_path_base = f"\\\\wsl$\\{distro_name}"
                windows_compatible_wsl_path = host_workspace_path.lstrip('/').replace('/', '\\')
                unc_save_dir = os.path.join(unc_path_base, windows_compatible_wsl_path)
                save_path_obj = Path(unc_save_dir) / name
                logger.info(f"Attempting to save to UNC path: {save_path_obj}")
            else:
                logger.error("Failed to determine WSL distribution name.")
                return "failed: could not determine WSL distribution"
        else:
            logger.error(f"Invalid WSL path provided: '{host_workspace_path}'. Path must start with '/'.")
            return "failed: invalid WSL path format"
        # --- End Path Conversion ---

        # Capture the screenshot
        screenshot = pyautogui.screenshot()
        # Convert and save to buffer as JPEG
        screenshot.convert("RGB").save(buffer, format="JPEG", quality=60, optimize=True)
        image_data = buffer.getvalue()
        logger.debug(f"Image data length: {len(image_data)}")

        # Process file saving using the UNC path
        try:
            # Create directory if it doesn't exist (using Path object)
            save_path_obj.parent.mkdir(parents=True, exist_ok=True)

            # Write the image data to the file
            with open(save_path_obj, "wb") as f:
                f.write(image_data)
            logger.info(f"Successfully saved screenshot to WSL path via UNC: {save_path_obj}")
            return "success"
        except Exception as e:
            logger.error(f"Error writing screenshot to UNC path '{save_path_obj}': {e}", exc_info=True)
            # Provide more specific error if possible (e.g., permission denied, path not found)
            return f"failed: file write error to WSL path ({e})"

    except Exception as e:
        # Handle errors during path conversion or screenshot capture
        logger.error(f"Error capturing screenshot: {e}", exc_info=True)
        return "failed: screenshot capture error"
# --- End New Tool ---

# --- Tool take_screenshot_and_return_base64 removed as direct interpretation was problematic ---

# Removed take_screenshot_and_create_resource as resource handling in mcp library was unclear

def run():
    """Starts the MCP server."""
    logger.info("Starting MCP server...")
    try:
        # Run the server, listening via stdio
        mcp.run(transport="stdio")
    except Exception as e:
        # Log critical errors if the server fails to start or run
        logger.critical(f"MCP server failed to run: {e}", exc_info=True)
    finally:
        # Log when the server stops
        logger.info("--- Screenshot Server Stopping ---")

# Removed test_run function

if __name__ == "__main__":
    # Entry point when the script is executed directly
    run()
```
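
The comments in `take_screenshot_path` note that a stricter confinement check "based on expected base paths" could be added. One possible shape for such a check is sketched below. It is an illustration under assumptions, not part of the project: `is_within` and `safe_join` are hypothetical names, and the sketch only compares resolved paths, so it has not been validated against UNC edge cases on Windows.

```python
from pathlib import Path


def is_within(base_dir: Path, candidate: Path) -> bool:
    """Return True if candidate resolves to base_dir or a location beneath it."""
    base = base_dir.resolve()
    target = candidate.resolve()
    return target == base or base in target.parents


def safe_join(base_dir: str, name: str) -> Path:
    """Join name onto base_dir, refusing results that escape base_dir (hypothetical helper)."""
    target = Path(base_dir) / name
    if not is_within(Path(base_dir), target):
        raise ValueError(f"refusing to write outside {base_dir!r}: {name!r}")
    return target.resolve()


if __name__ == "__main__":
    print(safe_join("images", "latest_screenshot.jpg"))
    try:
        safe_join("images", "../outside.jpg")
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Resolving both paths before comparing means `..` segments and absolute filenames are normalized first, so a `name` such as `../outside.jpg` is rejected rather than silently written next to the target directory.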