# Directory Structure

```
├── .gitignore
├── bin
│   └── mcp-selenium.js
├── Dockerfile
├── LICENSE
├── package-lock.json
├── package.json
├── README.md
├── smithery.yaml
└── src
    └── lib
        └── server.js
```

# Files

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Virtual Environment
venv/
ENV/
env/

# IDE
.idea/
.vscode/
*.swp
*.swo

# Node
node_modules/
npm-debug.log*

# Misc
.DS_Store
.env
.env.local
.env.*.local

# Selenium
geckodriver.log
chromedriver.log
.goose/
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
[](https://mseep.ai/app/angiejones-mcp-selenium)

# MCP Selenium Server

<a href="https://glama.ai/mcp/servers/s2em7b2kwf">
  <img width="380" height="200" src="https://glama.ai/mcp/servers/s2em7b2kwf/badge" />
</a>

[](https://smithery.ai/server/@angiejones/mcp-selenium)

A Model Context Protocol (MCP) server implementation for Selenium WebDriver, enabling browser automation through standardized MCP clients.

## Video Demo (Click to Watch)

[](https://youtu.be/mRV0N8hcgYA)


## Features

- Start browser sessions with customizable options
- Navigate to URLs
- Find elements using various locator strategies
- Click, type, and interact with elements
- Perform mouse actions (hover, drag and drop)
- Handle keyboard input
- Take screenshots
- Upload files
- Support for headless mode

## Supported Browsers

- Chrome
- Firefox
- MS Edge

## Use with Goose

### Option 1: One-click install
Copy and paste the link below into a browser address bar to add this extension to goose desktop:

```
goose://extension?cmd=npx&arg=-y&arg=%40angiejones%2Fmcp-selenium&id=selenium-mcp&name=Selenium%20MCP&description=automates%20browser%20interactions
```


### Option 2: Add manually to desktop or CLI

* Name: `Selenium MCP`
* Description: `automates browser interactions`
* Command: `npx -y @angiejones/mcp-selenium`

## Use with other MCP clients (e.g. Claude Desktop, etc)
```json
{
  "mcpServers": {
    "selenium": {
      "command": "npx",
      "args": ["-y", "@angiejones/mcp-selenium"]
    }
  }
}
```

---

## Development

To work on this project:

1. Clone the repository
2. Install dependencies: `npm install`
3. Run the server: `node src/lib/server.js`

### Installation

#### Installing via Smithery

To install MCP Selenium for Claude Desktop automatically via [Smithery](https://smithery.ai/server/@angiejones/mcp-selenium):

```bash
npx -y @smithery/cli install @angiejones/mcp-selenium --client claude
```

#### Manual Installation
```bash
npm install -g @angiejones/mcp-selenium
```


### Usage

Start the server by running:

```bash
mcp-selenium
```

Or use with NPX in your MCP configuration:

```json
{
  "mcpServers": {
    "selenium": {
      "command": "npx",
      "args": [
        "-y",
        "@angiejones/mcp-selenium"
      ]
    }
  }
}
```



## Tools

### start_browser
Launches a browser session.

**Parameters:**
- `browser` (required): Browser to launch
  - Type: string
  - Enum: ["chrome", "firefox", "edge"]
- `options`: Browser configuration options
  - Type: object
  - Properties:
    - `headless`: Run browser in headless mode
      - Type: boolean
    - `arguments`: Additional browser arguments
      - Type: array of strings

**Example:**
```json
{
  "tool": "start_browser",
  "parameters": {
    "browser": "chrome",
    "options": {
      "headless": true,
      "arguments": ["--no-sandbox"]
    }
  }
}
```

### navigate
Navigates to a URL.

**Parameters:**
- `url` (required): URL to navigate to
  - Type: string

**Example:**
```json
{
  "tool": "navigate",
  "parameters": {
    "url": "https://www.example.com"
  }
}
```

### find_element
Finds an element on the page.

**Parameters:**
- `by` (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- `value` (required): Value for the locator strategy
  - Type: string
- `timeout`: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000

**Example:**
```json
{
  "tool": "find_element",
  "parameters": {
    "by": "id",
    "value": "search-input",
    "timeout": 5000
  }
}
```

### click_element
Clicks an element.

**Parameters:**
- `by` (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- `value` (required): Value for the locator strategy
  - Type: string
- `timeout`: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000

**Example:**
```json
{
  "tool": "click_element",
  "parameters": {
    "by": "css",
    "value": ".submit-button"
  }
}
```

### send_keys
Sends keys to an element (typing).

**Parameters:**
- `by` (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- `value` (required): Value for the locator strategy
  - Type: string
- `text` (required): Text to enter into the element
  - Type: string
- `timeout`: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000

**Example:**
```json
{
  "tool": "send_keys",
  "parameters": {
    "by": "name",
    "value": "username",
    "text": "testuser"
  }
}
```

### get_element_text
Gets the text() of an element.

**Parameters:**
- `by` (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- `value` (required): Value for the locator strategy
  - Type: string
- `timeout`: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000

**Example:**
```json
{
  "tool": "get_element_text",
  "parameters": {
    "by": "css",
    "value": ".message"
  }
}
```

### hover
Moves the mouse to hover over an element.

**Parameters:**
- `by` (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- `value` (required): Value for the locator strategy
  - Type: string
- `timeout`: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000

**Example:**
```json
{
  "tool": "hover",
  "parameters": {
    "by": "css",
    "value": ".dropdown-menu"
  }
}
```

### drag_and_drop
Drags an element and drops it onto another element.

**Parameters:**
- `by` (required): Locator strategy for source element
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- `value` (required): Value for the source locator strategy
  - Type: string
- `targetBy` (required): Locator strategy for target element
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- `targetValue` (required): Value for the target locator strategy
  - Type: string
- `timeout`: Maximum time to wait for elements in milliseconds
  - Type: number
  - Default: 10000

**Example:**
```json
{
  "tool": "drag_and_drop",
  "parameters": {
    "by": "id",
    "value": "draggable",
    "targetBy": "id",
    "targetValue": "droppable"
  }
}
```

### double_click
Performs a double click on an element.

**Parameters:**
- `by` (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- `value` (required): Value for the locator strategy
  - Type: string
- `timeout`: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000

**Example:**
```json
{
  "tool": "double_click",
  "parameters": {
    "by": "css",
    "value": ".editable-text"
  }
}
```

### right_click
Performs a right click (context click) on an element.

**Parameters:**
- `by` (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- `value` (required): Value for the locator strategy
  - Type: string
- `timeout`: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000

**Example:**
```json
{
  "tool": "right_click",
  "parameters": {
    "by": "css",
    "value": ".context-menu-trigger"
  }
}
```

### press_key
Simulates pressing a keyboard key.

**Parameters:**
- `key` (required): Key to press (e.g., 'Enter', 'Tab', 'a', etc.)
  - Type: string

**Example:**
```json
{
  "tool": "press_key",
  "parameters": {
    "key": "Enter"
  }
}
```

### upload_file
Uploads a file using a file input element.

**Parameters:**
- `by` (required): Locator strategy
  - Type: string
  - Enum: ["id", "css", "xpath", "name", "tag", "class"]
- `value` (required): Value for the locator strategy
  - Type: string
- `filePath` (required): Absolute path to the file to upload
  - Type: string
- `timeout`: Maximum time to wait for element in milliseconds
  - Type: number
  - Default: 10000

**Example:**
```json
{
  "tool": "upload_file",
  "parameters": {
    "by": "id",
    "value": "file-input",
    "filePath": "/path/to/file.pdf"
  }
}
```

### take_screenshot
Captures a screenshot of the current page.

**Parameters:**
- `outputPath` (optional): Path where to save the screenshot. If not provided, returns base64 data.
  - Type: string

**Example:**
```json
{
  "tool": "take_screenshot",
  "parameters": {
    "outputPath": "/path/to/screenshot.png"
  }
}
```

### close_session
Closes the current browser session and cleans up resources.

**Parameters:**
None required

**Example:**
```json
{
  "tool": "close_session",
  "parameters": {}
}
```


## License

MIT
```

--------------------------------------------------------------------------------
/smithery.yaml:
--------------------------------------------------------------------------------

```yaml
# Smithery configuration file: https://smithery.ai/docs/config#smitheryyaml

startCommand:
  type: stdio
  configSchema:
    # JSON Schema defining the configuration options for the MCP.
    type: object
    required: []
    properties: {}
  commandFunction:
    # A function that produces the CLI command to start the MCP on stdio.
    |-
    (config) => ({command:'node', args:['src/lib/server.js'], env:{}})
```

--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------

```json
{
  "name": "@angiejones/mcp-selenium",
  "version": "0.1.21",
  "description": "Selenium WebDriver MCP Server",
  "type": "module",
  "main": "src/lib/server.js",
  "bin": {
    "mcp-selenium": "./src/lib/server.js"
  },
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.7.0",
    "selenium-webdriver": "^4.18.1"
  }
}
```

--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------

```dockerfile
FROM node:18-alpine

# Install Chrome and dependencies
RUN apk update && apk add --no-cache \
    chromium \
    chromium-chromedriver \
    nss \
    freetype \
    freetype-dev \
    harfbuzz \
    ca-certificates \
    ttf-freefont \
    udev \
    ttf-opensans

# Set Chrome environment variables
ENV CHROME_BIN=/usr/bin/chromium-browser
ENV CHROME_PATH=/usr/lib/chromium/
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true

WORKDIR /app

# Copy package files
COPY package*.json ./

# Install dependencies
RUN npm install

# Copy application code
COPY . .

# Start the MCP server
CMD ["node", "src/lib/server.js"]
```

--------------------------------------------------------------------------------
/bin/mcp-selenium.js:
--------------------------------------------------------------------------------

```javascript
#!/usr/bin/env node

import { fileURLToPath } from 'url';
import { dirname, resolve } from 'path';
import { spawn } from 'child_process';

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

const serverPath = resolve(__dirname, '../src/lib/server.js');

// Start the server
const child = spawn('node', [serverPath], {
    stdio: 'inherit'
});

child.on('error', (error) => {
    console.error(`Error starting server: ${error.message}`);
    process.exit(1);
});

// Handle process termination
process.on('SIGTERM', () => {
    child.kill('SIGTERM');
});

process.on('SIGINT', () => {
    child.kill('SIGINT');
});
```

--------------------------------------------------------------------------------
/src/lib/server.js:
--------------------------------------------------------------------------------

```javascript
#!/usr/bin/env node

import { McpServer, ResourceTemplate } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import pkg from 'selenium-webdriver';
const { Builder, By, Key, until, Actions } = pkg;
import { Options as ChromeOptions } from 'selenium-webdriver/chrome.js';
import { Options as FirefoxOptions } from 'selenium-webdriver/firefox.js';
import { Options as EdgeOptions } from 'selenium-webdriver/edge.js';


// Create an MCP server
const server = new McpServer({
    name: "MCP Selenium",
    version: "1.0.0"
});

// Server state
const state = {
    drivers: new Map(),
    currentSession: null
};

// Helper functions
const getDriver = () => {
    const driver = state.drivers.get(state.currentSession);
    if (!driver) {
        throw new Error('No active browser session');
    }
    return driver;
};

const getLocator = (by, value) => {
    switch (by.toLowerCase()) {
        case 'id': return By.id(value);
        case 'css': return By.css(value);
        case 'xpath': return By.xpath(value);
        case 'name': return By.name(value);
        case 'tag': return By.css(value);
        case 'class': return By.className(value);
        default: throw new Error(`Unsupported locator strategy: ${by}`);
    }
};

// Common schemas
const browserOptionsSchema = z.object({
    headless: z.boolean().optional().describe("Run browser in headless mode"),
    arguments: z.array(z.string()).optional().describe("Additional browser arguments")
}).optional();

const locatorSchema = {
    by: z.enum(["id", "css", "xpath", "name", "tag", "class"]).describe("Locator strategy to find element"),
    value: z.string().describe("Value for the locator strategy"),
    timeout: z.number().optional().describe("Maximum time to wait for element in milliseconds")
};

// Browser Management Tools
server.tool(
    "start_browser",
    "launches browser",
    {
        browser: z.enum(["chrome", "firefox", "edge"]).describe("Browser to launch (chrome or firefox or microsoft edge)"),
        options: browserOptionsSchema
    },
    async ({ browser, options = {} }) => {
        try {
            let builder = new Builder();
            let driver;
            switch (browser) {
                case 'chrome': {
                    const chromeOptions = new ChromeOptions();
                    if (options.headless) {
                        chromeOptions.addArguments('--headless=new');
                    }
                    if (options.arguments) {
                        options.arguments.forEach(arg => chromeOptions.addArguments(arg));
                    }
                    driver = await builder
                        .forBrowser('chrome')
                        .setChromeOptions(chromeOptions)
                        .build();
                    break;
                }
                case 'edge': {
                    const edgeOptions = new EdgeOptions();
                    if (options.headless) {
                        edgeOptions.addArguments('--headless=new');
                    }
                    if (options.arguments) {
                        options.arguments.forEach(arg => edgeOptions.addArguments(arg));
                    }
                    driver = await builder
                        .forBrowser('edge')
                        .setEdgeOptions(edgeOptions)
                        .build();
                    break;
                }
                case 'firefox': {
                    const firefoxOptions = new FirefoxOptions();
                    if (options.headless) {
                        firefoxOptions.addArguments('--headless');
                    }
                    if (options.arguments) {
                        options.arguments.forEach(arg => firefoxOptions.addArguments(arg));
                    }
                    driver = await builder
                        .forBrowser('firefox')
                        .setFirefoxOptions(firefoxOptions)
                        .build();
                    break;
                }
                default: {
                    throw new Error(`Unsupported browser: ${browser}`);
                }
            }
            const sessionId = `${browser}_${Date.now()}`;
            state.drivers.set(sessionId, driver);
            state.currentSession = sessionId;

            return {
                content: [{ type: 'text', text: `Browser started with session_id: ${sessionId}` }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error starting browser: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "navigate",
    "navigates to a URL",
    {
        url: z.string().describe("URL to navigate to")
    },
    async ({ url }) => {
        try {
            const driver = getDriver();
            await driver.get(url);
            return {
                content: [{ type: 'text', text: `Navigated to ${url}` }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error navigating: ${e.message}` }]
            };
        }
    }
);

// Element Interaction Tools
server.tool(
    "find_element",
    "finds an element",
    {
        ...locatorSchema
    },
    async ({ by, value, timeout = 10000 }) => {
        try {
            const driver = getDriver();
            const locator = getLocator(by, value);
            await driver.wait(until.elementLocated(locator), timeout);
            return {
                content: [{ type: 'text', text: 'Element found' }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error finding element: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "click_element",
    "clicks an element",
    {
        ...locatorSchema
    },
    async ({ by, value, timeout = 10000 }) => {
        try {
            const driver = getDriver();
            const locator = getLocator(by, value);
            const element = await driver.wait(until.elementLocated(locator), timeout);
            await element.click();
            return {
                content: [{ type: 'text', text: 'Element clicked' }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error clicking element: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "send_keys",
    "sends keys to an element, aka typing",
    {
        ...locatorSchema,
        text: z.string().describe("Text to enter into the element")
    },
    async ({ by, value, text, timeout = 10000 }) => {
        try {
            const driver = getDriver();
            const locator = getLocator(by, value);
            const element = await driver.wait(until.elementLocated(locator), timeout);
            await element.clear();
            await element.sendKeys(text);
            return {
                content: [{ type: 'text', text: `Text "${text}" entered into element` }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error entering text: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "get_element_text",
    "gets the text() of an element",
    {
        ...locatorSchema
    },
    async ({ by, value, timeout = 10000 }) => {
        try {
            const driver = getDriver();
            const locator = getLocator(by, value);
            const element = await driver.wait(until.elementLocated(locator), timeout);
            const text = await element.getText();
            return {
                content: [{ type: 'text', text }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error getting element text: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "hover",
    "moves the mouse to hover over an element",
    {
        ...locatorSchema
    },
    async ({ by, value, timeout = 10000 }) => {
        try {
            const driver = getDriver();
            const locator = getLocator(by, value);
            const element = await driver.wait(until.elementLocated(locator), timeout);
            const actions = driver.actions({ bridge: true });
            await actions.move({ origin: element }).perform();
            return {
                content: [{ type: 'text', text: 'Hovered over element' }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error hovering over element: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "drag_and_drop",
    "drags an element and drops it onto another element",
    {
        ...locatorSchema,
        targetBy: z.enum(["id", "css", "xpath", "name", "tag", "class"]).describe("Locator strategy to find target element"),
        targetValue: z.string().describe("Value for the target locator strategy")
    },
    async ({ by, value, targetBy, targetValue, timeout = 10000 }) => {
        try {
            const driver = getDriver();
            const sourceLocator = getLocator(by, value);
            const targetLocator = getLocator(targetBy, targetValue);
            const sourceElement = await driver.wait(until.elementLocated(sourceLocator), timeout);
            const targetElement = await driver.wait(until.elementLocated(targetLocator), timeout);
            const actions = driver.actions({ bridge: true });
            await actions.dragAndDrop(sourceElement, targetElement).perform();
            return {
                content: [{ type: 'text', text: 'Drag and drop completed' }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error performing drag and drop: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "double_click",
    "performs a double click on an element",
    {
        ...locatorSchema
    },
    async ({ by, value, timeout = 10000 }) => {
        try {
            const driver = getDriver();
            const locator = getLocator(by, value);
            const element = await driver.wait(until.elementLocated(locator), timeout);
            const actions = driver.actions({ bridge: true });
            await actions.doubleClick(element).perform();
            return {
                content: [{ type: 'text', text: 'Double click performed' }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error performing double click: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "right_click",
    "performs a right click (context click) on an element",
    {
        ...locatorSchema
    },
    async ({ by, value, timeout = 10000 }) => {
        try {
            const driver = getDriver();
            const locator = getLocator(by, value);
            const element = await driver.wait(until.elementLocated(locator), timeout);
            const actions = driver.actions({ bridge: true });
            await actions.contextClick(element).perform();
            return {
                content: [{ type: 'text', text: 'Right click performed' }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error performing right click: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "press_key",
    "simulates pressing a keyboard key",
    {
        key: z.string().describe("Key to press (e.g., 'Enter', 'Tab', 'a', etc.)")
    },
    async ({ key }) => {
        try {
            const driver = getDriver();
            const actions = driver.actions({ bridge: true });
            await actions.keyDown(key).keyUp(key).perform();
            return {
                content: [{ type: 'text', text: `Key '${key}' pressed` }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error pressing key: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "upload_file",
    "uploads a file using a file input element",
    {
        ...locatorSchema,
        filePath: z.string().describe("Absolute path to the file to upload")
    },
    async ({ by, value, filePath, timeout = 10000 }) => {
        try {
            const driver = getDriver();
            const locator = getLocator(by, value);
            const element = await driver.wait(until.elementLocated(locator), timeout);
            await element.sendKeys(filePath);
            return {
                content: [{ type: 'text', text: 'File upload initiated' }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error uploading file: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "take_screenshot",
    "captures a screenshot of the current page",
    {
        outputPath: z.string().optional().describe("Optional path where to save the screenshot. If not provided, returns base64 data.")
    },
    async ({ outputPath }) => {
        try {
            const driver = getDriver();
            const screenshot = await driver.takeScreenshot();
            if (outputPath) {
                const fs = await import('fs');
                await fs.promises.writeFile(outputPath, screenshot, 'base64');
                return {
                    content: [{ type: 'text', text: `Screenshot saved to ${outputPath}` }]
                };
            } else {
                return {
                    content: [
                        { type: 'text', text: 'Screenshot captured as base64:' },
                        { type: 'text', text: screenshot }
                    ]
                };
            }
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error taking screenshot: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "close_session",
    "closes the current browser session",
    {},
    async () => {
        try {
            const driver = getDriver();
            await driver.quit();
            state.drivers.delete(state.currentSession);
            const sessionId = state.currentSession;
            state.currentSession = null;
            return {
                content: [{ type: 'text', text: `Browser session ${sessionId} closed` }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error closing session: ${e.message}` }]
            };
        }
    }
);

// Resources
server.resource(
    "browser-status",
    new ResourceTemplate("browser-status://current"),
    async (uri) => ({
        contents: [{
            uri: uri.href,
            text: state.currentSession
                ? `Active browser session: ${state.currentSession}`
                : "No active browser session"
        }]
    })
);

// Cleanup handler
async function cleanup() {
    for (const [sessionId, driver] of state.drivers) {
        try {
            await driver.quit();
        } catch (e) {
            console.error(`Error closing browser session ${sessionId}:`, e);
        }
    }
    state.drivers.clear();
    state.currentSession = null;
    process.exit(0);
}

process.on('SIGTERM', cleanup);
process.on('SIGINT', cleanup);

// Start the server
const transport = new StdioServerTransport();
await server.connect(transport);
```
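
Note on `press_key`: the handler above passes the caller's string (for example `"Enter"`) straight to `keyDown`/`keyUp`. To my knowledge, recent `selenium-webdriver` releases expect a single code point there, so a multi-character name like `"Enter"` may be rejected. The sketch below is one possible way to translate friendly names first. `NAMED_KEYS` and `resolveKey` are hypothetical names that do not exist in this repository, and the code points are the special-key values from the W3C WebDriver specification (the same values the library exposes as `Key.ENTER`, `Key.TAB`, and so on). It is an illustration under those assumptions, not the project's implementation.

```javascript
// Hypothetical helper, not part of the repository.
// Maps friendly key names to WebDriver special-key code points.
const NAMED_KEYS = new Map([
    ['backspace', '\uE003'],
    ['tab', '\uE004'],
    ['enter', '\uE007'],
    ['shift', '\uE008'],
    ['control', '\uE009'],
    ['alt', '\uE00A'],
    ['escape', '\uE00C'],
    ['space', '\uE00D'],
    ['pageup', '\uE00E'],
    ['pagedown', '\uE00F'],
    ['end', '\uE010'],
    ['home', '\uE011'],
    ['arrowleft', '\uE012'],
    ['arrowup', '\uE013'],
    ['arrowright', '\uE014'],
    ['arrowdown', '\uE015'],
    ['insert', '\uE016'],
    ['delete', '\uE017']
]);

function resolveKey(key) {
    if (typeof key !== 'string' || key.length === 0) {
        throw new TypeError('key must be a non-empty string');
    }
    // A single code point (e.g. 'a') can be used as-is.
    if (Array.from(key).length === 1) {
        return key;
    }
    // Normalize "Arrow_Left", "page down", "PAGE-DOWN" to a lookup key.
    const normalized = key.toLowerCase().replace(/[\s_-]/g, '');
    const codePoint = NAMED_KEYS.get(normalized);
    if (codePoint === undefined) {
        throw new Error(`Unsupported key name: ${key}`);
    }
    return codePoint;
}
```

If this approach were adopted, the `press_key` handler could resolve the key once (`const k = resolveKey(key);`) and then call `actions.keyDown(k).keyUp(k).perform()`. An unsupported name would throw inside the existing `try` block and come back as an "Error pressing key" message.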