kunihiros/screenshot-server # codebase.md

# Directory Structure

```
├── .gitignore
├── .python-version
├── pyproject.toml
├── README.md
├── screenshot.py
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------

```
3.10

```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Python-generated files
__pycache__/
*.py[oc]
build/
dist/
wheels/
*.egg-info

# Virtual environments
.venv

```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
# Screenshot Server (File Path Focused)

This project provides an MCP (Model Context Protocol) server that captures screenshots and makes them available to other processes or AI assistants, primarily by **saving the screenshot to a file path specified by the client (Host)**.

## Core Problem & Solution

Directly interpreting screenshot image data sent via MCP by AI assistants proved unreliable in testing. This server adopts more robust workflows focused on file paths:

**Recommended Workflow (WSL Host -> Windows Server):**

1.  An MCP Host (like an AI assistant running in WSL) calls the `save_screenshot_to_host_workspace` tool, providing its **WSL workspace path** as an argument.
2.  This server (running on Windows) captures the screen.
3.  The server converts the received WSL path to a Windows-accessible UNC path (e.g., `\\wsl$\Distro\path`).
4.  The server saves the screenshot to the specified location within the Host's WSL filesystem via the UNC path.
5.  The server returns `"success"` or `"failed: [error message]"`.
6.  The MCP Host knows the file is saved in its workspace (or a sub-directory if specified in the path argument).
7.  The MCP Host can then pass the **WSL path** to another specialized MCP server (running in WSL) for image analysis.

**Alternative Workflow (General):**

1.  MCP Host calls `take_screenshot_and_return_path`, optionally specifying a filename.
2.  Server saves the screenshot to its local `images/` directory.
3.  Server returns the **absolute path** (e.g., Windows path) to the saved file.
4.  MCP Host receives the path and passes it (with potential conversion) to an analysis server.

## Available Tools

This server provides the following tools, ordered by recommended usage:

*   **`save_screenshot_to_host_workspace(host_workspace_path: str, name: str = "workspace_screenshot.jpg")`**
    *   **Recommended Use:** Saves a screenshot directly into the AI Assistant's (Host's) current WSL workspace. This is the preferred method for seamless integration.
    *   **Action:** Takes a screenshot, converts the provided WSL path to a UNC path, and saves the file to the Host's workspace. The WSL distribution name is detected with `wsl -l -q`; if detection fails, the server falls back to `Ubuntu-22.04`.
    *   **Args:**
        *   `host_workspace_path` (str): The absolute WSL path of the Host's workspace (e.g., `/home/user/project`).
        *   `name` (str, optional): Filename. Defaults to `workspace_screenshot.jpg`.
    *   **Returns:** `str` - `"success"` or `"failed: [error message]"`.

*   **`take_screenshot_and_return_path(name: str = "latest_screenshot.jpg")`**
    *   **Use Case:** Saves a screenshot to a fixed `images/` directory relative to the server's working directory and returns the absolute path (typically a Windows path). Useful if the caller needs the path for external processing.
    *   **Args:**
        *   `name` (str, optional): Filename. Defaults to `latest_screenshot.jpg`.
    *   **Returns:** `str` - Absolute path or `"failed: [error message]"`.

*   **`take_screenshot_path(path: str = "./", name: str = "screenshot.jpg")`**
    *   **Use Case:** Saves a screenshot to an arbitrary location specified by a Windows path or a UNC path (e.g., for saving outside the Host's workspace). Requires careful path specification by the caller.
    *   **Args:**
        *   `path` (str, optional): Target directory (Windows or UNC path). Defaults to server's working directory.
        *   `name` (str, optional): Filename. Defaults to `screenshot.jpg`.
    *   **Returns:** `str` - `"success"` or `"failed: [error message]"`.

## Setup and Usage

### 1. Prerequisites
*   **Python 3.10 or later:** Required on the machine where the server will run (`pyproject.toml` sets `requires-python = ">=3.10"`).
*   **Dependencies:** Install using `uv`:
    ```bash
    uv sync
    ```
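    `pyproject.toml` declares `mcp[cli]>=1.4.1` and `pyautogui>=0.9.54`. `Pillow` is also needed for `pyautogui.screenshot()`; it is not declared directly and is expected to be installed as a transitive dependency.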

### 2. Running the Server
This server is typically launched *by* an MCP Host based on its configuration.

### 3. Environment Considerations (Especially WSL2)

**Crucial Point:** To capture the **Windows screen**, this `screenshot.py` server **must run directly on Windows**.

**Recommended WSL2 Host -> Windows Server Setup:**

1.  **Project Location:** Place this `screenshot-server` project folder on your **Windows filesystem** (e.g., `C:\Users\YourUser\projects\screenshot-server`).
2.  **Windows Dependencies:** Install Python, `uv`, and project dependencies (`uv sync ...`) directly on **Windows** within the project folder.
3.  **MCP Host Configuration (in WSL):** Configure your MCP Host (running in WSL) to launch the server on Windows using PowerShell. Update `mcp_settings.json` (or equivalent):

    ```json
    {
      "mcpServers": {
        "Screenshot-server": {
          "command": "powershell.exe",
          "args": [
            "-Command",
            "Invoke-Command -ScriptBlock { cd '<YOUR_WINDOWS_PROJECT_PATH>'; & '<YOUR_WINDOWS_UV_PATH>' run screenshot.py }"
          ]
        }
        // ... other servers ...
      }
    }
    ```
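    *   Replace `<YOUR_WINDOWS_PROJECT_PATH>` and `<YOUR_WINDOWS_UV_PATH>` with your actual Windows paths, and delete the `// ... other servers ...` placeholder line, since strict JSON does not allow comments.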

### 4. Workflow Example (AI Assistant in WSL)
1.  AI Assistant identifies its current workspace path (e.g., `/home/user/current_project`).
2.  AI Assistant uses `use_mcp_tool` to call `save_screenshot_to_host_workspace` on `Screenshot-server`, passing `host_workspace_path="/home/user/current_project"` and optionally a `name`.
3.  The server returns `"success"`.
4.  AI Assistant knows the screenshot is now at `/home/user/current_project/workspace_screenshot.jpg` (or the specified name).
5.  AI Assistant uses `use_mcp_tool` to call an *image analysis* server/tool (also running in WSL), passing the WSL path `/home/user/current_project/workspace_screenshot.jpg`.
6.  The image analysis server reads the file and performs its task.

## File Structure

*   `screenshot.py`: The core MCP server script.
*   `README.md`: This documentation file.
*   `pyproject.toml`: Project definition and dependencies for `uv`.
*   `uv.lock`: Dependency lock file.
*   `.gitignore`: Git ignore configuration.
*   `.python-version`: (Optional) Python version specifier.
*   `server.log`: Log file generated by the server.
*   `images/`: Default directory for `take_screenshot_and_return_path`.
```
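
The two path conversions that the README's workflows rely on can be sketched with the standard library alone. This is a minimal illustration, not the server's own code: `wsl_to_unc` mirrors the WSL-to-UNC step of the recommended workflow, and `windows_to_wsl` is a hypothetical helper for the "potential conversion" mentioned in the alternative workflow. It assumes Windows drives are mounted under `/mnt/<drive letter>` inside WSL, which is the usual default but can be changed. The distribution name `Ubuntu` and the example paths are placeholders.

```python
from pathlib import PurePosixPath, PureWindowsPath


def wsl_to_unc(wsl_path: str, distro: str) -> str:
    """Map an absolute WSL path to a wsl$ UNC path (lexical only, no I/O)."""
    posix = PurePosixPath(wsl_path)
    if not posix.is_absolute():
        raise ValueError(f"expected an absolute WSL path, got {wsl_path!r}")
    # parts[0] is the root "/"; the remaining parts become UNC components.
    return "\\".join([f"\\\\wsl$\\{distro}", *posix.parts[1:]])


def windows_to_wsl(windows_path: str) -> str:
    """Map a drive-letter Windows path to its /mnt/<drive> equivalent.

    Hypothetical helper: assumes the default WSL drive mount point.
    """
    win = PureWindowsPath(windows_path)
    drive = win.drive  # e.g. "C:"; UNC paths yield a longer drive string
    if len(drive) != 2 or drive[1] != ":" or not win.root:
        raise ValueError(
            f"expected an absolute drive-letter path, got {windows_path!r}"
        )
    # parts[0] is the anchor (e.g. "C:\\"); the rest are ordinary components.
    return str(PurePosixPath("/mnt", drive[0].lower(), *win.parts[1:]))


if __name__ == "__main__":
    print(wsl_to_unc("/home/user/project", "Ubuntu"))
    print(windows_to_wsl(r"C:\Users\me\images\latest_screenshot.jpg"))
```

Both functions are purely lexical, so they can run on either side of the WSL boundary without touching the filesystem. Whether the resulting path is actually reachable still depends on the distribution running and on OS permissions.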

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[project]
name = "screenshot-server"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "mcp[cli]>=1.4.1",
    "pyautogui>=0.9.54",
]

```

--------------------------------------------------------------------------------
/screenshot.py:
--------------------------------------------------------------------------------

```python
from mcp.server.fastmcp import FastMCP, Image # Image might still be needed internally
import io
import os
from pathlib import Path
import pyautogui
from mcp.types import ImageContent # Keep ImageContent for potential future use or internal conversion
import base64 # Import base64 module
import logging
import sys
import datetime

# --- Logger Setup ---
log_file = "server.log"
logging.basicConfig(
    level=logging.INFO, # Set to INFO for general use, DEBUG for detailed troubleshooting
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler(log_file, mode='a'), # Append to log file
        logging.StreamHandler(sys.stdout)      # Also print to console (useful if run directly)
    ]
)
logger = logging.getLogger(__name__)
logger.info("--- Screenshot Server Starting ---")
# --- End Logger Setup ---

# Create server instance
mcp = FastMCP("screenshot server")

# Note: Tools returning raw image data (like the original take_screenshot/take_screenshot_image)
# were removed because AI interpretation via MCP showed inconsistencies.
# The current approach saves the image to a file and returns a status string or the file path.

@mcp.tool()
def take_screenshot_path(path: str = "./", name: str = "screenshot.jpg") -> str:
    """Takes a screenshot and saves it to a specified path and filename on the server machine.

    Provides flexibility for saving to specific Windows locations or WSL locations via UNC paths.
    For saving directly to the Host's WSL workspace, prefer using 'save_screenshot_to_host_workspace'.

    Args:
        path (str, optional): The target directory path (Windows path or UNC path like \\\\wsl$\\Distro\\...).
                              Defaults to the server's current working directory (`./`).
        name (str, optional): The desired filename for the screenshot.
                              Defaults to "screenshot.jpg".

    Returns:
        str: "success" if saved successfully, otherwise "failed: [error message]".
    """
    logger.info(f"take_screenshot_path called with path='{path}', name='{name}'")
    buffer = io.BytesIO()
    try:
        # Capture the screenshot
        screenshot = pyautogui.screenshot()
        # Convert and save to buffer as JPEG
        screenshot.convert("RGB").save(buffer, format="JPEG", quality=60, optimize=True)
        image_data = buffer.getvalue()
        logger.debug(f"Image data length: {len(image_data)}")

        # Process file saving
        try:
            # Resolve the path - this works for both Windows and UNC paths
            save_path_obj = Path(path) / name
            # Ensure the directory exists
            save_path_obj.parent.mkdir(parents=True, exist_ok=True)
            # Resolve after ensuring directory exists, especially for UNC
            save_path = save_path_obj.resolve()

            # Security check (more robust check might be needed for UNC paths if strict confinement is required)
            # For simple cases, checking if the resolved path is valid might suffice here.
            # A basic check could involve ensuring it's not trying to write to system dirs, but UNC makes it tricky.
            # For now, we rely on the OS permissions and the user providing a valid target.
            # Consider adding checks based on expected base paths if needed.

            # Write the image data to the file
            with open(save_path, "wb") as f:
                f.write(image_data)
            logger.info(f"Successfully saved screenshot to {save_path}")
            return "success"
        except Exception as e:
            # Log the specific path that failed if possible
            logger.error(f"Error writing screenshot to file '{path}/{name}': {e}", exc_info=True)
            return "failed: file write error"
    except Exception as e:
        # Handle errors during screenshot capture itself
        logger.error(f"Error capturing screenshot: {e}", exc_info=True)
        return "failed: screenshot capture error"

@mcp.tool()
def take_screenshot_and_return_path(name: str = "latest_screenshot.jpg") -> str:
    """Takes a screenshot, saves it to images/ directory, and returns the absolute path.

    Saves the screenshot with the specified filename within the 'images' subdirectory
    relative to the server's execution directory. This is the primary tool for
    workflows requiring the file path for subsequent processing.

    Args:
        name (str, optional): The filename for the screenshot (e.g., "current_view.jpg").
                              Defaults to "latest_screenshot.jpg".

    Returns:
        str: The absolute path (e.g., Windows path like C:\\...) to the saved screenshot file,
             or "failed: [error message]" if an error occurs.
    """
    logger.info(f"take_screenshot_and_return_path called with name='{name}'")
    buffer = io.BytesIO()
    try:
        # Capture the screenshot
        screenshot = pyautogui.screenshot()
        # Convert and save to buffer as JPEG
        screenshot.convert("RGB").save(buffer, format="JPEG", quality=60, optimize=True)
        image_data = buffer.getvalue()
        logger.debug(f"Image data length: {len(image_data)}")

        # Define the fixed save location relative to the script's execution directory
        save_dir = Path("images")
        # Use the provided 'name' argument for the filename
        save_path = (save_dir / name).resolve() # Use 'name' argument

        # Create the 'images' directory if it doesn't exist
        save_dir.mkdir(parents=True, exist_ok=True)

        # Save the file
        with open(save_path, "wb") as f:
            f.write(image_data)
        logger.info(f"Screenshot saved to: {save_path}")

        # Return the absolute path as a string
        return str(save_path)

    except Exception as e:
        # Handle errors during screenshot capture or file saving
        logger.error(f"Error in take_screenshot_and_return_path: {e}", exc_info=True)
        return f"failed: {e}" # Return a failure indicator with the error

# --- New Tool to Save to Host Workspace ---
@mcp.tool()
def save_screenshot_to_host_workspace(host_workspace_path: str, name: str = "workspace_screenshot.jpg") -> str:
    """Takes a screenshot and saves it to the specified Host's WSL workspace path.

    The server (running on Windows) converts the provided WSL path
    (e.g., /home/user/project) to a UNC path (e.g., \\\\wsl$\\Distro\\home\\user\\project)
    before saving.

    Args:
        host_workspace_path (str): The absolute WSL path of the Host's workspace.
        name (str, optional): The desired filename for the screenshot.
                              Defaults to "workspace_screenshot.jpg".

    Returns:
        str: "success" if saved successfully, otherwise "failed: [error message]".
    """
    logger.info(f"save_screenshot_to_host_workspace called with host_path='{host_workspace_path}', name='{name}'")
    buffer = io.BytesIO()
    try:
        # --- Convert WSL path to UNC path (with auto-detection attempt) ---
        if host_workspace_path.startswith('/'):
            distro_name = None
            try:
                import subprocess
                # Try to get the default WSL distribution name quietly
                result = subprocess.run(['wsl', '-l', '-q'], capture_output=True, text=True, check=True, encoding='utf-16le') # Use utf-16le for wsl output on Windows
                # Get the first line of the output, remove potential trailing "(Default)" and strip whitespace
                lines = result.stdout.strip().splitlines()
                if lines:
                    distro_name = lines[0].replace('(Default)', '').strip()
                    logger.info(f"Auto-detected WSL distribution: {distro_name}")
                else:
                    logger.warning("Could not auto-detect WSL distribution name from 'wsl -l -q'. Falling back to default.")
                    # Fallback to a common default if detection fails
                    distro_name = "Ubuntu-22.04"

            except FileNotFoundError:
                logger.error("'wsl.exe' command not found. Cannot auto-detect distribution. Falling back.")
                distro_name = "Ubuntu-22.04" # Fallback
            except subprocess.CalledProcessError as e:
                logger.error(f"Error running 'wsl -l -q': {e}. Falling back.")
                distro_name = "Ubuntu-22.04" # Fallback
            except Exception as e:
                logger.error(f"Unexpected error during WSL distro detection: {e}. Falling back.")
                distro_name = "Ubuntu-22.04" # Fallback

            if distro_name:
                unc_path_base = f"\\\\wsl$\\{distro_name}"
                windows_compatible_wsl_path = host_workspace_path.lstrip('/').replace('/', '\\')
                unc_save_dir = os.path.join(unc_path_base, windows_compatible_wsl_path)
                save_path_obj = Path(unc_save_dir) / name
                logger.info(f"Attempting to save to UNC path: {save_path_obj}")
            else:
                logger.error("Failed to determine WSL distribution name.")
                return "failed: could not determine WSL distribution"
        else:
            logger.error(f"Invalid WSL path provided: '{host_workspace_path}'. Path must start with '/'.")
            return "failed: invalid WSL path format"
        # --- End Path Conversion ---

        # Capture the screenshot
        screenshot = pyautogui.screenshot()
        # Convert and save to buffer as JPEG
        screenshot.convert("RGB").save(buffer, format="JPEG", quality=60, optimize=True)
        image_data = buffer.getvalue()
        logger.debug(f"Image data length: {len(image_data)}")

        # Process file saving using the UNC path
        try:
            # Create directory if it doesn't exist (using Path object)
            save_path_obj.parent.mkdir(parents=True, exist_ok=True)

            # Write the image data to the file
            with open(save_path_obj, "wb") as f:
                f.write(image_data)
            logger.info(f"Successfully saved screenshot to WSL path via UNC: {save_path_obj}")
            return "success"
        except Exception as e:
            logger.error(f"Error writing screenshot to UNC path '{save_path_obj}': {e}", exc_info=True)
            # Provide more specific error if possible (e.g., permission denied, path not found)
            return f"failed: file write error to WSL path ({e})"

    except Exception as e:
        # Handle errors during screenshot capture itself
        logger.error(f"Error capturing screenshot: {e}", exc_info=True)
        return "failed: screenshot capture error"
# --- End New Tool ---

# --- Tool take_screenshot_and_return_base64 removed as direct interpretation was problematic ---

# Removed take_screenshot_and_create_resource as resource handling in mcp library was unclear

def run():
    """Starts the MCP server."""
    logger.info("Starting MCP server...")
    try:
        # Run the server, listening via stdio
        mcp.run(transport="stdio")
    except Exception as e:
        # Log critical errors if the server fails to start or run
        logger.critical(f"MCP server failed to run: {e}", exc_info=True)
    finally:
        # Log when the server stops
        logger.info("--- Screenshot Server Stopping ---")

# Removed test_run function

if __name__ == "__main__":
    # Entry point when the script is executed directly
    run()
```
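
The distribution auto-detection in `save_screenshot_to_host_workspace` depends on decoding the UTF-16LE output of `wsl -l -q` and taking its first line. A minimal sketch of that parsing step, separated from the subprocess call so it can be exercised without WSL installed, might look like the following. `parse_wsl_distros` and `pick_distro` are hypothetical helpers that do not exist in `screenshot.py`, and the sample bytes are illustrative rather than captured output.

```python
from typing import List, Optional


def parse_wsl_distros(raw: bytes) -> List[str]:
    """Decode the bytes printed by `wsl -l -q` into distribution names."""
    # wsl.exe writes UTF-16LE; drop a byte-order mark if one is present.
    text = raw.decode("utf-16le", errors="replace").lstrip("\ufeff")
    names = []
    for line in text.splitlines():
        # Remove stray NULs and surrounding whitespace, skip blank lines.
        name = line.replace("\x00", "").strip()
        if name:
            names.append(name)
    return names


def pick_distro(raw: bytes, fallback: Optional[str] = None) -> Optional[str]:
    """Return the first listed distribution, or `fallback` if none was found.

    Like the original code, this treats the first line as the default
    distribution.
    """
    names = parse_wsl_distros(raw)
    return names[0] if names else fallback


if __name__ == "__main__":
    # Illustrative input only; real output comes from `wsl -l -q` on Windows.
    sample = "Ubuntu-22.04\r\nDebian\r\n".encode("utf-16le")
    print(pick_distro(sample))
```

One possible design choice: leaving `fallback` as `None` lets the caller report `"failed: could not determine WSL distribution"` instead of silently writing to a hardcoded `Ubuntu-22.04` share. In the current code every failure path assigns that fallback name, so the failure branch is never reached.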