# Directory Structure

```
├── .gitignore
├── bun.lock
├── package.json
├── README.md
├── src
│   ├── chrome-interface.ts
│   ├── runtime-templates
│   │   ├── ariaInteractiveElements.js
│   │   └── removeTargetAttributes.js
│   └── server.ts
└── tsconfig.json
```

# Files

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
node_modules/
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
# Chrome MCP Server

A Model Context Protocol (MCP) server that provides fine-grained control over a Chrome browser instance through the Chrome DevTools Protocol (CDP).

## Prerequisites

- [Bun](https://bun.sh/) (recommended) or Node.js (v14 or higher)
- Chrome browser with remote debugging enabled

## Setup

### Installing Bun

1. Install Bun (if not already installed):
   ```bash
   # macOS, Linux, or WSL
   curl -fsSL https://bun.sh/install | bash

   # Windows (using PowerShell)
   powershell -c "irm bun.sh/install.ps1 | iex"

   # Alternatively, using npm
   npm install -g bun
   ```

2. Start Chrome with remote debugging enabled:

   ```bash
   # macOS
   /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222

   # Windows
   start chrome --remote-debugging-port=9222

   # Linux
   google-chrome --remote-debugging-port=9222
   ```

3. Install dependencies:
   ```bash
   bun install
   ```

4. Start the server:
   ```bash
   bun start
   ```

   For development with hot reloading:
   ```bash
   bun dev
   ```

The server will start on port 3000 by default. You can change this by setting the `PORT` environment variable.

## Configuring Roo Code to use this MCP server

To use this Chrome MCP server with Roo Code:

1. Open Roo Code settings
2. Navigate to the MCP settings configuration file at:
   - macOS: `~/Library/Application Support/Code/User/globalStorage/rooveterinaryinc.roo-cline/settings/cline_mcp_settings.json`
   - Windows: `%APPDATA%\Code\User\globalStorage\rooveterinaryinc.roo-cline\settings\cline_mcp_settings.json`
   - Linux: `~/.config/Code/User/globalStorage/rooveterinaryinc.roo-cline/settings/cline_mcp_settings.json`

3. Add the following configuration to the `mcpServers` object:

```json
{
  "mcpServers": {
    "chrome-control": {
      "url": "http://localhost:3000/sse",
      "disabled": false,
      "alwaysAllow": []
    }
  }
}
```

4. Save the file and restart Roo Code to apply the changes.

5. You can now use the Chrome MCP tools in Roo Code to control the browser.

## Available Tools

The server provides the following tools for browser control:

### navigate
Navigate to a specific URL.

Parameters:
- `url` (string): The URL to navigate to

### click
Click at specific coordinates.

Parameters:
- `x` (number): X coordinate
- `y` (number): Y coordinate

### type
Type text at the current focus.

Parameters:
- `text` (string): Text to type

### clickElementByIndex
Click on an interactive element by its index in the page info.

Parameters:
- `index` (number): Element index as returned by `getPageInfo` (e.g., `0` for the first element)

### getText
Get text content of an element using a CSS selector.

Parameters:
- `selector` (string): CSS selector to find the element

### getPageInfo
Get semantic information about the page including interactive elements and text nodes.

### getPageState
Get current page state including URL, title, scroll position, and viewport size.

## Usage

The server implements the Model Context Protocol with SSE transport. Connect to the server at:
- SSE endpoint: `http://localhost:3000/sse`
- Messages endpoint: `http://localhost:3000/messages?sessionId=...`

When using with Roo Code, the configuration in the MCP settings file will handle the connection automatically.

## Development

To run the server in development mode with hot reloading:
```bash
bun dev
```

This uses Bun's built-in watch mode to automatically restart the server when files change.

## License

MIT
```

--------------------------------------------------------------------------------
/src/runtime-templates/removeTargetAttributes.js:
--------------------------------------------------------------------------------

```javascript
// Find all links and remove target attributes
document.querySelectorAll('a').forEach(link => {
  if (link.hasAttribute('target')) {
    link.removeAttribute('target');
  }
});
console.log('[MCP Browser] Modified all links to prevent opening in new windows/tabs');
```

--------------------------------------------------------------------------------
/tsconfig.json:
--------------------------------------------------------------------------------

```json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "ES2020",
    "moduleResolution": "node",
    "lib": ["ES2020", "DOM"],
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true,
    "resolveJsonModule": true,
    "declaration": true,
    "types": ["node", "express"]
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist"]
}
```
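
The README's Usage section lists an SSE endpoint and a messages endpoint. As a rough sketch of how a client might use the two together, assuming the usual MCP SSE handshake in which the server's first `endpoint` event carries the session-specific messages path, the helpers below resolve that path against the SSE URL and build a JSON-RPC `tools/call` request for the `navigate` tool. `resolveMessagesUrl` and `buildToolCall` are hypothetical names introduced for illustration; they are not part of this repository or of the MCP SDK, and the session id is made up.

```typescript
// Illustrative helpers only: these names are assumptions, not part of the repo or the MCP SDK.

// Shape of the JSON-RPC 2.0 request MCP uses to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Resolve the path announced in the SSE `endpoint` event against the SSE URL,
// so a relative path such as "/messages?sessionId=..." becomes an absolute URL.
function resolveMessagesUrl(sseUrl: string, endpointData: string): string {
  return new URL(endpointData, sseUrl).href;
}

// Build the request body for calling a tool by name with the given arguments.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Usage: the session id here is a placeholder; a real one comes from the server.
const messagesUrl = resolveMessagesUrl(
  "http://localhost:3000/sse",
  "/messages?sessionId=abc123"
);
const request = buildToolCall(1, "navigate", { url: "https://example.com" });
console.log(messagesUrl);
console.log(JSON.stringify(request));
```

A real client would POST the serialized request to the resolved URL and read the reply from the still-open SSE stream. MCP-aware clients such as Roo Code do this automatically, so the sketch is mainly useful for debugging the endpoints by hand.
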
--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------

```json
{
  "name": "chrome-mcp",
  "version": "1.0.0",
  "description": "MCP server for Chrome browser control",
  "type": "module",
  "scripts": {
    "start": "bun run src/server.ts",
    "dev": "bun --watch src/server.ts"
  },
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.8.0",
    "chrome-remote-interface": "^0.33.3",
    "cors": "^2.8.5",
    "diff": "^7.0.0",
    "express": "^4.21.2",
    "uuid": "^11.1.0"
  },
  "devDependencies": {
    "@types/chrome-remote-interface": "^0.31.14",
    "@types/cors": "^2.8.17",
    "@types/diff": "^7.0.2",
    "@types/express": "^5.0.1",
    "@types/uuid": "^10.0.0"
  }
}
```

--------------------------------------------------------------------------------
/src/server.ts:
--------------------------------------------------------------------------------

```typescript
import express, { Request, Response, NextFunction } from "express";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { SSEServerTransport } from "@modelcontextprotocol/sdk/server/sse.js";
import { z } from "zod";
import cors from "cors";
import * as diff from 'diff';
import { ChromeInterface } from './chrome-interface';

// Type for content items
type ContentItem = {
  type: "text";
  text: string;
};

// Type for tool responses
type ToolResponse = {
  content: ContentItem[];
};

// Helper function for logging
function logToolUsage<T extends Record<string, unknown>>(toolName: string, input: T, output: ToolResponse) {
  console.log(`\n[Tool Used] ${toolName}`);
  console.log('Input:', JSON.stringify(input, null, 2));
  console.log('Output:', JSON.stringify(output, null, 2));
  console.log('----------------------------------------');
}

async function startServer() {
  // Create Chrome interface
  const chrome = new ChromeInterface();
  let lastPageInfo: string | null = null;

  // Create an MCP server
  const server = new McpServer({
    name: "Chrome MCP Server",
    version: "1.0.0",
    description: "Chrome browser automation using MCP. When user is asking to 'navigate' or 'go to' a URL, use the tools provided by this server. If fails, try again."
  });

  // Connect to Chrome
  console.log("Connecting to Chrome...");
  await chrome.connect().catch(error => {
    console.error('Failed to connect to Chrome:', error);
    process.exit(1);
  });

  // Add Chrome tools
  server.tool(
    "navigate",
    "Navigate to a specified URL in the browser. Only use this if you have reasonably inferred the URL from the user's request. When navigating an existing session, prefer the other tools, like click, goBack, goForward, etc.",
    { url: z.string().url() },
    async ({ url }): Promise<ToolResponse> => {
      const result: ToolResponse = { content: [{ type: "text", text: `Navigated to ${url}` }] };
      await chrome.navigate(url);
      logToolUsage("navigate", { url }, result);
      return result;
    }
  );

  server.tool(
    "click",
    "Click at specific x,y coordinates in the browser window. IMPORTANT: Always check the page info after clicking. When interacting with dropdowns, use ArrowUp and ArrowDown keys. Try to figure out what the selected item is when interacting with the dropdowns and use that to navigate.",
    { x: z.number(), y: z.number() },
    async ({ x, y }): Promise<ToolResponse> => {
      await chrome.click(x, y);
      // Delay for 1 second
      await new Promise(resolve => setTimeout(resolve, 1000));
      const result: ToolResponse = { content: [{ type: "text", text: `Clicked at (${x}, ${y})` }] };
      logToolUsage("click", { x, y }, result);
      return result;
    }
  );

  server.tool(
    "clickElementByIndex",
    "Click an interactive element by its index in the page. Indices are returned by getPageInfo. Always check the page info after clicking. For text input fields, prefer using focusElementByIndex instead.",
    { index: z.number() },
    async ({ index }): Promise<ToolResponse> => {
      await chrome.clickElementByIndex(index);
      const result: ToolResponse = { content: [{ type: "text", text: `Clicked element at index: ${index}` }] };
      logToolUsage("clickElementByIndex", { index }, result);
      return result;
    }
  );

  server.tool(
    "focusElementByIndex",
    "Focus an interactive element by its index in the page. Indices are returned by getPageInfo. This is the preferred method for focusing text input fields before typing. Always check the page info after focusing.",
    { index: z.number() },
    async ({ index }): Promise<ToolResponse> => {
      await chrome.focusElementByIndex(index);
      const result: ToolResponse = { content: [{ type: "text", text: `Focused element at index: ${index}` }] };
      logToolUsage("focusElementByIndex", { index }, result);
      return result;
    }
  );

  server.tool(
    "type",
    "Type text into the currently focused element, with support for special keys like {Enter}, {Tab}, etc. Use {Enter} for newlines in textareas or to submit forms. NEVER USE \n\n IN THE TEXT YOU TYPE. Use {Ctrl+A} to select all text in the focused element. If you think you're in a rich text editor, you probably can use {Ctrl+B} to bold, {Ctrl+I} to italic, {Ctrl+U} to underline, etc. IMPORTANT: Always use focusElementByIndex on text input fields before typing. ALSO IMPORTANT. NEVER RELY ON TABS AT ALL TO FOCUS ELEMENTS. EXPLICITLY USE focusElementByIndex ON ELEMENTS BEFORE TYPING. ALSO, ALWAYS CHECK THE PAGE INFO AFTER TYPING. Always check the page info after typing.",
    { text: z.string() },
    async ({ text }): Promise<ToolResponse> => {
      await chrome.type(text);
      const result: ToolResponse = { content: [{ type: "text", text: `Typed: ${text}` }] };
      logToolUsage("type", { text }, result);
      return result;
    }
  );

  server.tool(
    "doubleClick",
    "Double click at specific x,y coordinates in the browser window. Useful for text selection or other double-click specific actions. Always check the page info after double clicking.",
    { x: z.number(), y: z.number() },
    async ({ x, y }): Promise<ToolResponse> => {
      await chrome.doubleClick(x, y);
      const result: ToolResponse = { content: [{ type: "text", text: `Double clicked at (${x}, ${y})` }] };
      logToolUsage("doubleClick", { x, y }, result);
      return result;
    }
  );

  server.tool(
    "tripleClick",
    "Triple click at specific x,y coordinates in the browser window. Useful for selecting entire paragraphs or lines of text. Always check the page info after triple clicking.",
    { x: z.number(), y: z.number() },
    async ({ x, y }): Promise<ToolResponse> => {
      await chrome.tripleClick(x, y);
      const result: ToolResponse = { content: [{ type: "text", text: `Triple clicked at (${x}, ${y})` }] };
      logToolUsage("tripleClick", { x, y }, result);
      return result;
    }
  );

  // server.tool(
  //   "getText",
  //   "Get text content of an element matching the specified CSS selector",
  //   { selector: z.string() },
  //   async ({ selector }) => {
  //     const text = await chrome.getElementText(selector);
  //     return { content: [{ type: "text", text }] };
  //   }
  // );

  server.tool(
    "getPageInfo",
    "Get semantic information about the current page, including interactive elements, their indices, and all the text content on the page. Returns a diff from one of the previous calls if available and if the diff is smaller than the full content. If you're missing context of the element indices, refer to one of your previous pageInfo results. If page info is fully incomplete, or you don't have context of the element indices, or previous page info results, use the force flag to try again. WARNING: don't use the force flag unless you're sure you need it. You can also use the search and percent flags to search for a specific term and navigate to a specific percentage of the page. Use evaluate to execute JavaScript code in order to understand where in the viewport you are and infer the percent if needed. This is useful when navigating anchor links.",
    {
      force: z.boolean().optional(),
      cursor: z.number().optional(),
      remainingPages: z.number().optional(),
      search: z.string().optional(),
      percent: z.number().optional()
    },
    async ({ force = false, cursor = 0, remainingPages = 1, search, percent }): Promise<ToolResponse> => {
      const PAGE_SIZE = 10 * 1024; // 10KB per page
      const CONTEXT_SIZE = 100; // Characters of context around search matches
      const currentPageInfo = await chrome.getPageInfo();

      // Helper function to convert a percentage into a character offset
      const getTextByPercent = (text: string, percentage: number) => {
        if (percentage < 0 || percentage > 100) return 0;
        return Math.floor((text.length * percentage) / 100);
      };

      // Helper function to get search results with context
      const getSearchResults = (text: string, searchTerm: string) => {
        if (!searchTerm) return null;

        const results: { start: number; end: number; text: string }[] = [];
        const regex = new RegExp(searchTerm, 'gi');
        let match;

        while ((match = regex.exec(text)) !== null) {
          const start = Math.max(0, match.index - CONTEXT_SIZE);
          const end = Math.min(text.length, match.index + match[0].length + CONTEXT_SIZE);

          // Merge with previous section if they overlap
          const last = results[results.length - 1];
          if (last && start <= last.end) {
            last.end = end;
            // Refresh the merged text so it covers the extended range
            last.text = text.slice(last.start, end);
          } else {
            results.push({ start, end, text: text.slice(start, end) });
          }
        }

        if (results.length === 0) return null;

        return results.map(({ start, end, text }) => {
          return `---- Match at position ${start}-${end} ----\n${text}`;
        }).join('\n');
      };

      // Helper function to paginate text
      const paginateText = (text: string, start: number, pageSize: number) => {
        const end = start + pageSize;
        const chunk = text.slice(start, end);
        const hasMore = end < text.length;
        const nextCursor = hasMore ? end : -1;
        const remainingSize = Math.ceil((text.length - end) / pageSize);
        return { chunk, nextCursor, remainingSize };
      };

      // Handle percentage-based navigation
      if (typeof percent === 'number') {
        cursor = getTextByPercent(currentPageInfo, percent);
      }

      // If force is true or there's no previous page info, return the paginated full content
      if (force || !lastPageInfo) {
        lastPageInfo = currentPageInfo;

        // If search is specified, return search results
        if (search) {
          const searchResults = getSearchResults(currentPageInfo, search);
          if (!searchResults) {
            return { content: [{ type: "text", text: `No matches found for "${search}"` }] };
          }
          const { chunk, nextCursor, remainingSize } = paginateText(searchResults, cursor, PAGE_SIZE);
          const paginationInfo = nextCursor >= 0 ?
            `\n[Page info: next_cursor=${nextCursor}, remaining_pages=${remainingSize}]` :
            '\n[Page info: end of content]';

          const result: ToolResponse = { content: [{ type: "text", text: chunk + paginationInfo }] };
          logToolUsage("getPageInfo", { force, cursor, remainingPages, search, percent }, result);
          return result;
        }

        const { chunk, nextCursor, remainingSize } = paginateText(currentPageInfo, cursor, PAGE_SIZE);
        const paginationInfo = nextCursor >= 0 ?
          `\n[Page info: next_cursor=${nextCursor}, remaining_pages=${remainingSize}]` :
          '\n[Page info: end of content]';

        const result: ToolResponse = { content: [{ type: "text", text: chunk + paginationInfo }] };
        logToolUsage("getPageInfo", { force, cursor, remainingPages, search, percent }, result);
        return result;
      }

      // Calculate the diff between the last and current page info
      const changes = diff.diffWords(lastPageInfo, currentPageInfo);
      const diffText = changes
        .filter(part => part.added || part.removed)
        .map(part => {
          if (part.added) return `[ADDED] ${part.value}`;
          if (part.removed) return `[REMOVED] ${part.value}`;
          return '';
        })
        .join('\n');

      // Helper function to check if diff is meaningful
      const isNonMeaningfulDiff = (diff: string) => {
        // Check if diff is mostly just numbers
        const lines = diff.split('\n');
        const numericLines = lines.filter(line => {
          const value = line.replace(/\[ADDED\]|\[REMOVED\]/, '').trim();
          return /^\d+$/.test(value);
        });

        if (numericLines.length / lines.length > 0.5) {
          return true;
        }

        // Check if diff is too fragmented (lots of tiny changes)
        if (lines.length > 10 && lines.every(line => line.length < 10)) {
          return true;
        }

        return false;
      };

      // If the diff is larger than the current content or not meaningful, return the paginated full content
      if (diffText.length > currentPageInfo.length || isNonMeaningfulDiff(diffText)) {
        lastPageInfo = currentPageInfo;

        // If search is specified, return search results
        if (search) {
          const searchResults = getSearchResults(currentPageInfo, search);
          if (!searchResults) {
            return { content: [{ type: "text", text: `No matches found for "${search}"` }] };
          }
          const { chunk, nextCursor, remainingSize } = paginateText(searchResults, cursor, PAGE_SIZE);
          const paginationInfo = nextCursor >= 0 ?
            `\n[Page info: next_cursor=${nextCursor}, remaining_pages=${remainingSize}]` :
            '\n[Page info: end of content]';

          const result: ToolResponse = { content: [{ type: "text", text: chunk + paginationInfo }] };
          logToolUsage("getPageInfo", { force, cursor, remainingPages, search, percent }, result);
          return result;
        }

        const { chunk, nextCursor, remainingSize } = paginateText(currentPageInfo, cursor, PAGE_SIZE);
        const paginationInfo = nextCursor >= 0 ?
          `\n[Page info: next_cursor=${nextCursor}, remaining_pages=${remainingSize}]` :
          '\n[Page info: end of content]';

        const result: ToolResponse = { content: [{ type: "text", text: chunk + paginationInfo }] };
        logToolUsage("getPageInfo", { force, cursor, remainingPages, search, percent }, result);
        return result;
      }

      // Update the last page info and return the paginated diff
      lastPageInfo = currentPageInfo;
      const baseText = diffText || 'No changes detected';

      // If search is specified, return search results from the diff
      if (search) {
        const searchResults = getSearchResults(baseText, search);
        if (!searchResults) {
          return { content: [{ type: "text", text: `No matches found for "${search}"` }] };
        }
        const { chunk, nextCursor, remainingSize } = paginateText(searchResults, cursor, PAGE_SIZE);
        const paginationInfo = nextCursor >= 0 ?
          `\n[Page info: next_cursor=${nextCursor}, remaining_pages=${remainingSize}]` :
          '\n[Page info: end of content]';

        const result: ToolResponse = { content: [{ type: "text", text: chunk + paginationInfo }] };
        logToolUsage("getPageInfo", { force, cursor, remainingPages, search, percent }, result);
        return result;
      }

      const { chunk, nextCursor, remainingSize } = paginateText(baseText, cursor, PAGE_SIZE);
      const paginationInfo = nextCursor >= 0 ?
        `\n[Page info: next_cursor=${nextCursor}, remaining_pages=${remainingSize}]` :
        '\n[Page info: end of content]';

      const result: ToolResponse = { content: [{ type: "text", text: chunk + paginationInfo }] };
      logToolUsage("getPageInfo", { force, cursor, remainingPages, search, percent }, result);
      return result;
    }
  );

  // server.tool(
  //   "getPageState",
  //   "Get current page state including URL, title, scroll position, and viewport size",
  //   {},
  //   async () => {
  //     const state = await chrome.getPageState();
  //     return { content: [{ type: "text", text: JSON.stringify(state) }] };
  //   }
  // );

  server.tool(
    "goBack",
    "Navigate back one step in the browser history",
    {},
    async (): Promise<ToolResponse> => {
      await chrome.goBack();
      const result: ToolResponse = { content: [{ type: "text", text: "Navigated back" }] };
      logToolUsage("goBack", {}, result);
      return result;
    }
  );

  server.tool(
    "goForward",
    "Navigate forward one step in the browser history",
    {},
    async (): Promise<ToolResponse> => {
      await chrome.goForward();
      const result: ToolResponse = { content: [{ type: "text", text: "Navigated forward" }] };
      logToolUsage("goForward", {}, result);
      return result;
    }
  );

  server.tool(
    "evaluate",
    "Execute JavaScript code in the context of the current page",
    { expression: z.string() },
    async ({ expression }): Promise<ToolResponse> => {
      const result = await chrome.evaluate(expression);
      const response: ToolResponse = { content: [{ type: "text", text: JSON.stringify(result) }] };
      logToolUsage("evaluate", { expression }, response);
      return response;
    }
  );

  // Create Express app
  const app = express();
  app.use(cors());

  // Store active transports
  const transports: {[sessionId: string]: SSEServerTransport} = {};

  // SSE endpoint for client connections
  app.get("/sse", async (_: Request, res: Response) => {
    const transport = new SSEServerTransport('/messages', res);
    transports[transport.sessionId] = transport;

    // Clean up when connection closes
    res.on("close", () => {
      delete transports[transport.sessionId];
    });

    // Connect the transport to our MCP server
    await server.connect(transport);
  });

  // Endpoint for receiving messages from clients
  app.post("/messages", async (req: Request, res: Response) => {
    const sessionId = req.query.sessionId as string;
    const transport = transports[sessionId];

    if (transport) {
      await transport.handlePostMessage(req, res);
    } else {
      res.status(400).send('No transport found for sessionId');
    }
  });

  // Start the server
  // Honor the PORT environment variable described in the README, defaulting to 3000
  const port = Number(process.env.PORT) || 3000;
  app.listen(port, '0.0.0.0', () => {
    console.log(`MCP Server running at http://localhost:${port}`);
    console.log(`SSE endpoint: http://localhost:${port}/sse`);
    console.log(`Messages endpoint: http://localhost:${port}/messages`);
  });

  // Handle cleanup
  process.on('SIGINT', async () => {
    await chrome.close();
    process.exit(0);
  });
}

// Start the server
startServer().catch(error => {
  console.error('Failed to start server:', error);
  process.exit(1);
});
```

--------------------------------------------------------------------------------
/src/chrome-interface.ts:
--------------------------------------------------------------------------------

```typescript
import CDP from 'chrome-remote-interface';
import fs from 'fs';
import path from 'path';
import diff from 'diff';

// Types for Chrome DevTools Protocol interactions
interface NavigationResult {
  navigation: string;
  pageInfo: string;
  pageState: {
    url: string;
    title: string;
    readyState: string;
    scrollPosition: { x: number; y: number };
    viewportSize: { width: number; height: number };
  };
}

type MouseButton = 'left' | 'right' | 'middle';

interface MouseEventOptions {
  type: 'mouseMoved' | 'mousePressed' | 'mouseReleased' | 'mouseWheel';
  x: number;
  y: number;
  button?: MouseButton;
  buttons?: number;
  clickCount?: number;
}

interface SpecialKeyConfig {
  key: string;
  code: string;
  text?: string;
  unmodifiedText?: string;
  windowsVirtualKeyCode: number;
  nativeVirtualKeyCode: number;
  autoRepeat: boolean;
  isKeypad: boolean;
  isSystemKey: boolean;
}

// Function to load template file
function loadAriaTemplate(): string {
  const TEMPLATES_DIR = path.join(__dirname, 'runtime-templates');
  try {
    return fs.readFileSync(path.join(TEMPLATES_DIR, 'ariaInteractiveElements.js'), 'utf-8');
  } catch (error) {
    console.error('Failed to load ariaInteractiveElements template:', error);
    throw error;
  }
}

// Chrome interface class to handle CDP interactions
export class ChromeInterface {
  private client: CDP.Client | null = null;
  private page: any | null = null;
  private ariaScriptTemplate: string = '';

  constructor() {
    this.ariaScriptTemplate =
      loadAriaTemplate();
  }

  /**
   * Connects to Chrome and sets up necessary event listeners
   */
  async connect() {
    try {
      this.client = await CDP();
      const { Page, DOM, Runtime, Network } = this.client;

      // Enable necessary domains
      await Promise.all([
        Page.enable(),
        DOM.enable(),
        Runtime.enable(),
        Network.enable(),
      ]);

      // Set up simple page load handler that injects the script
      Page.loadEventFired(async () => {
        console.log('[Page Load] Load event fired, injecting ARIA script');
        await this.injectAriaScript();
      });

      return true;
    } catch (error) {
      console.error('Failed to connect to Chrome:', error);
      // Rethrow so callers (such as the .catch handler in server.ts) can react to the failure
      throw error;
    }
  }

  /**
   * Injects the ARIA interactive elements script into the page
   */
  private async injectAriaScript() {
    if (!this.client?.Runtime) return;

    console.log('[ARIA] Injecting ARIA interactive elements script');

    await this.client.Runtime.evaluate({
      expression: this.ariaScriptTemplate
    });
  }

  /**
   * Navigates to a URL and waits for page load
   */
  async navigate(url: string): Promise<NavigationResult> {
    if (!this.client) throw new Error('Chrome not connected');

    console.log(`[Navigation] Starting navigation to ${url}`);

    try {
      const NAVIGATION_TIMEOUT = 30000; // 30 seconds timeout

      await Promise.race([
        this.client.Page.navigate({ url }),
        new Promise((_, reject) =>
          setTimeout(() => reject(new Error('Navigation timeout')), NAVIGATION_TIMEOUT)
        )
      ]);

      console.log('[Navigation] Navigation successful');

      const pageInfo = await this.getPageInfo();
      const pageState = await this.getPageState();

      return {
        navigation: `Successfully navigated to ${url}`,
        pageInfo,
        pageState
      };

    } catch (error) {
      console.error('[Navigation] Navigation error:', error);
      throw error;
    }
  }

  /**
   * Simulates a mouse click at specified coordinates with verification
   */
  async click(x: number, y: number) {
    if (!this.client) throw new Error('Chrome not connected');
    const { Input, Runtime } = this.client;

    // Get element info before clicking
    const preClickInfo = await Runtime.evaluate({
      expression: `
        (function() {
          const element = document.elementFromPoint(${x}, ${y});
          return element ? {
            tagName: element.tagName,
            href: element instanceof HTMLAnchorElement ? element.href : null,
            type: element instanceof HTMLInputElement ? element.type : null,
            isInteractive: (
              element instanceof HTMLButtonElement ||
              element instanceof HTMLAnchorElement ||
              element instanceof HTMLInputElement ||
              element.hasAttribute('role') ||
              window.getComputedStyle(element).cursor === 'pointer'
            )
          } : null;
        })()
      `,
      returnByValue: true
    });

    const elementInfo = preClickInfo.result.value;
    console.log('[Click] Clicking element:', elementInfo);

    // Dispatch a complete mouse event sequence
    const dispatchMouseEvent = async (options: MouseEventOptions) => {
      await Input.dispatchMouseEvent({
        button: 'left',
        buttons: options.type === 'mouseMoved' ? 0 : 1,
        clickCount: (options.type === 'mousePressed' || options.type === 'mouseReleased') ? 1 : 0,
        // Spread last so explicit values (e.g. buttons: 0 on release) override the defaults
        ...options,
      });
    };

    // Natural mouse movement sequence with hover first
    await dispatchMouseEvent({ type: 'mouseMoved', x: x - 50, y: y - 50 });
    await new Promise(resolve => setTimeout(resolve, 50)); // Small delay for hover
    await dispatchMouseEvent({ type: 'mouseMoved', x, y });
    await new Promise(resolve => setTimeout(resolve, 50)); // Small delay for hover effect

    // Click sequence
    await dispatchMouseEvent({ type: 'mousePressed', x, y });
    await new Promise(resolve => setTimeout(resolve, 50)); // Small delay between press and release
    await dispatchMouseEvent({ type: 'mouseReleased', x, y, buttons: 0 });

    // Verify the click had an effect and show visual feedback
    await Runtime.evaluate({
      expression: `
        (function() {
          const element = document.elementFromPoint(${x}, ${y});
          if (element) {
            // Add a brief flash to show where we clicked
            const div = document.createElement('div');
            div.style.position = 'fixed';
            div.style.left = '${x}px';
            div.style.top = '${y}px';
            div.style.width = '20px';
            div.style.height = '20px';
            div.style.backgroundColor = 'rgba(255, 255, 0, 0.5)';
            div.style.borderRadius = '50%';
            div.style.pointerEvents = 'none';
            div.style.zIndex = '999999';
            div.style.transition = 'all 0.3s ease-out';
            document.body.appendChild(div);

            // Animate the feedback
            setTimeout(() => {
              div.style.transform = 'scale(1.5)';
              div.style.opacity = '0';
              setTimeout(() => div.remove(), 300);
            }, 50);

            // For links, verify navigation started
            if (element instanceof HTMLAnchorElement) {
              element.dispatchEvent(new MouseEvent('click', {
                bubbles: true,
                cancelable: true,
                view: window
              }));
            }
          }
        })()
      `
    });

    // Additional delay for link clicks to start navigation
    if (elementInfo?.href) {
235 | await new Promise(resolve => setTimeout(resolve, 100)); 236 | } 237 | } 238 | 239 | /** 240 | * Simulates a double click at specified coordinates 241 | */ 242 | async doubleClick(x: number, y: number) { 243 | if (!this.client) throw new Error('Chrome not connected'); 244 | const { Input } = this.client; 245 | 246 | const dispatchMouseEvent = async (options: MouseEventOptions) => { 247 | await Input.dispatchMouseEvent({ 248 | ...options, 249 | button: 'left', 250 | buttons: options.type === 'mouseMoved' ? 0 : 1, 251 | clickCount: (options.type === 'mousePressed' || options.type === 'mouseReleased') ? 2 : 0, 252 | }); 253 | }; 254 | 255 | // Natural mouse movement sequence with double click 256 | await dispatchMouseEvent({ type: 'mouseMoved', x: x - 50, y: y - 50 }); 257 | await dispatchMouseEvent({ type: 'mouseMoved', x, y }); 258 | await dispatchMouseEvent({ type: 'mousePressed', x, y }); 259 | await dispatchMouseEvent({ type: 'mouseReleased', x, y, buttons: 0 }); 260 | } 261 | 262 | /** 263 | * Simulates a triple click at specified coordinates 264 | */ 265 | async tripleClick(x: number, y: number) { 266 | if (!this.client) throw new Error('Chrome not connected'); 267 | const { Input } = this.client; 268 | 269 | const dispatchMouseEvent = async (options: MouseEventOptions) => { 270 | await Input.dispatchMouseEvent({ 271 | ...options, 272 | button: 'left', 273 | buttons: options.type === 'mouseMoved' ? 0 : 1, 274 | clickCount: (options.type === 'mousePressed' || options.type === 'mouseReleased') ? 
3 : 0, 275 | }); 276 | }; 277 | 278 | // Natural mouse movement sequence with triple click 279 | await dispatchMouseEvent({ type: 'mouseMoved', x: x - 50, y: y - 50 }); 280 | await dispatchMouseEvent({ type: 'mouseMoved', x, y }); 281 | await dispatchMouseEvent({ type: 'mousePressed', x, y }); 282 | await dispatchMouseEvent({ type: 'mouseReleased', x, y, buttons: 0 }); 283 | } 284 | 285 | /** 286 | * Focuses an element by its index in the interactive elements array 287 | */ 288 | async focusElementByIndex(index: number) { 289 | if (!this.client) throw new Error('Chrome not connected'); 290 | const { Runtime } = this.client; 291 | 292 | // Get element and focus it 293 | const { result } = await Runtime.evaluate({ 294 | expression: ` 295 | (function() { 296 | const element = window.interactiveElements[${index}]; 297 | if (!element) throw new Error('Element not found at index ' + ${index}); 298 | 299 | // Scroll into view with smooth behavior 300 | element.scrollIntoView({ behavior: 'smooth', block: 'center' }); 301 | 302 | // Wait a bit for scroll to complete 303 | return new Promise(resolve => { 304 | setTimeout(() => { 305 | element.focus(); 306 | resolve(true); 307 | }, 1000); 308 | }); 309 | })() 310 | `, 311 | awaitPromise: true, 312 | returnByValue: true 313 | }); 314 | 315 | if (result.subtype === 'error') { 316 | throw new Error(result.description); 317 | } 318 | 319 | // Highlight the element after focusing 320 | await this.highlightElement(`window.interactiveElements[${index}]`); 321 | } 322 | 323 | /** 324 | * Clicks an element by its index in the interactive elements array 325 | */ 326 | async clickElementByIndex(index: number) { 327 | if (!this.client) throw new Error('Chrome not connected'); 328 | const { Runtime } = this.client; 329 | 330 | // Get element info and coordinates 331 | const elementInfo = await Runtime.evaluate({ 332 | expression: ` 333 | (function() { 334 | const element = window.interactiveElements[${index}]; 335 | if (!element) throw 
new Error('Element not found at index ' + ${index}); 336 | 337 | // Scroll into view with smooth behavior 338 | element.scrollIntoView({ behavior: 'smooth', block: 'center' }); 339 | 340 | return new Promise(resolve => { 341 | setTimeout(() => { 342 | const rect = element.getBoundingClientRect(); 343 | resolve({ 344 | rect: { 345 | x: Math.round(rect.left + (rect.width * 0.5)), // Click in center 346 | y: Math.round(rect.top + (rect.height * 0.5)) 347 | }, 348 | tagName: element.tagName, 349 | href: element instanceof HTMLAnchorElement ? element.href : null, 350 | type: element instanceof HTMLInputElement ? element.type : null 351 | }); 352 | }, 1000); // Wait for scroll 353 | }); 354 | })() 355 | `, 356 | awaitPromise: true, 357 | returnByValue: true 358 | }); 359 | 360 | if (elementInfo.result.subtype === 'error') { 361 | throw new Error(elementInfo.result.description); 362 | } 363 | 364 | const { x, y } = elementInfo.result.value.rect; 365 | 366 | // Highlight the element before clicking 367 | await this.highlightElement(`window.interactiveElements[${index}]`); 368 | 369 | // Add a small delay to make the highlight visible 370 | await new Promise(resolve => setTimeout(resolve, 300)); 371 | 372 | // Perform the physical click 373 | await this.click(x, y); 374 | 375 | // For inputs, ensure they're focused after click 376 | if (elementInfo.result.value.type) { 377 | await Runtime.evaluate({ 378 | expression: `window.interactiveElements[${index}].focus()` 379 | }); 380 | } 381 | } 382 | 383 | /** 384 | * Types text with support for special keys 385 | */ 386 | async type(text: string) { 387 | if (!this.client) throw new Error('Chrome not connected'); 388 | const { Input } = this.client; 389 | 390 | // Add random delay between keystrokes to simulate human typing 391 | const getRandomDelay = () => { 392 | // Uniform random delay of 20-40 ms per keystroke 393 | return Math.random() * 20 + 20; 394 | }; 395 | 396 | const specialKeys: Record<string, 
SpecialKeyConfig> = { 397 | Enter: { 398 | key: 'Enter', 399 | code: 'Enter', 400 | text: '\r', 401 | unmodifiedText: '\r', 402 | windowsVirtualKeyCode: 13, 403 | nativeVirtualKeyCode: 13, 404 | autoRepeat: false, 405 | isKeypad: false, 406 | isSystemKey: false, 407 | }, 408 | Tab: { 409 | key: 'Tab', 410 | code: 'Tab', 411 | windowsVirtualKeyCode: 9, 412 | nativeVirtualKeyCode: 9, 413 | autoRepeat: false, 414 | isKeypad: false, 415 | isSystemKey: false, 416 | }, 417 | Backspace: { 418 | key: 'Backspace', 419 | code: 'Backspace', 420 | windowsVirtualKeyCode: 8, 421 | nativeVirtualKeyCode: 8, 422 | autoRepeat: false, 423 | isKeypad: false, 424 | isSystemKey: false, 425 | }, 426 | ArrowUp: { 427 | key: 'ArrowUp', 428 | code: 'ArrowUp', 429 | windowsVirtualKeyCode: 38, 430 | nativeVirtualKeyCode: 38, 431 | autoRepeat: false, 432 | isKeypad: false, 433 | isSystemKey: false, 434 | }, 435 | ArrowDown: { 436 | key: 'ArrowDown', 437 | code: 'ArrowDown', 438 | windowsVirtualKeyCode: 40, 439 | nativeVirtualKeyCode: 40, 440 | autoRepeat: false, 441 | isKeypad: false, 442 | isSystemKey: false, 443 | }, 444 | ArrowLeft: { 445 | key: 'ArrowLeft', 446 | code: 'ArrowLeft', 447 | windowsVirtualKeyCode: 37, 448 | nativeVirtualKeyCode: 37, 449 | autoRepeat: false, 450 | isKeypad: false, 451 | isSystemKey: false, 452 | }, 453 | ArrowRight: { 454 | key: 'ArrowRight', 455 | code: 'ArrowRight', 456 | windowsVirtualKeyCode: 39, 457 | nativeVirtualKeyCode: 39, 458 | autoRepeat: false, 459 | isKeypad: false, 460 | isSystemKey: false, 461 | }, 462 | 'Ctrl+A': { key: 'a', code: 'KeyA', windowsVirtualKeyCode: 65, nativeVirtualKeyCode: 65, autoRepeat: false, isKeypad: false, isSystemKey: false }, 463 | 'Ctrl+B': { key: 'b', code: 'KeyB', windowsVirtualKeyCode: 66, nativeVirtualKeyCode: 66, autoRepeat: false, isKeypad: false, isSystemKey: false }, 464 | 'Ctrl+C': { key: 'c', code: 'KeyC', windowsVirtualKeyCode: 67, nativeVirtualKeyCode: 67, autoRepeat: false, isKeypad: false, isSystemKey: false 
}, 465 | 'Ctrl+I': { key: 'i', code: 'KeyI', windowsVirtualKeyCode: 73, nativeVirtualKeyCode: 73, autoRepeat: false, isKeypad: false, isSystemKey: false }, 466 | 'Ctrl+U': { key: 'u', code: 'KeyU', windowsVirtualKeyCode: 85, nativeVirtualKeyCode: 85, autoRepeat: false, isKeypad: false, isSystemKey: false }, 467 | 'Ctrl+V': { key: 'v', code: 'KeyV', windowsVirtualKeyCode: 86, nativeVirtualKeyCode: 86, autoRepeat: false, isKeypad: false, isSystemKey: false }, 468 | 'Ctrl+X': { key: 'x', code: 'KeyX', windowsVirtualKeyCode: 88, nativeVirtualKeyCode: 88, autoRepeat: false, isKeypad: false, isSystemKey: false }, 469 | 'Ctrl+Z': { key: 'z', code: 'KeyZ', windowsVirtualKeyCode: 90, nativeVirtualKeyCode: 90, autoRepeat: false, isKeypad: false, isSystemKey: false }, 470 | }; 471 | 472 | const handleModifierKey = async (keyConfig: SpecialKeyConfig, modifiers: { ctrl?: boolean; shift?: boolean; alt?: boolean; meta?: boolean }) => { 473 | if (!this.client) return; 474 | const { Input } = this.client; 475 | 476 | if (modifiers.ctrl) { 477 | await Input.dispatchKeyEvent({ 478 | type: 'keyDown', 479 | key: 'Control', 480 | code: 'ControlLeft', 481 | windowsVirtualKeyCode: 17, 482 | nativeVirtualKeyCode: 17, 483 | modifiers: 2, 484 | isSystemKey: false 485 | }); 486 | } 487 | 488 | await Input.dispatchKeyEvent({ 489 | type: 'keyDown', 490 | ...keyConfig, 491 | modifiers: modifiers.ctrl ? 2 : 0, 492 | }); 493 | 494 | await Input.dispatchKeyEvent({ 495 | type: 'keyUp', 496 | ...keyConfig, 497 | modifiers: modifiers.ctrl ? 
2 : 0, 498 | }); 499 | 500 | if (modifiers.ctrl) { 501 | await Input.dispatchKeyEvent({ 502 | type: 'keyUp', 503 | key: 'Control', 504 | code: 'ControlLeft', 505 | windowsVirtualKeyCode: 17, 506 | nativeVirtualKeyCode: 17, 507 | modifiers: 0, 508 | isSystemKey: false 509 | }); 510 | } 511 | }; 512 | 513 | const parts = text.split(/(\{[^}]+\})/); 514 | 515 | for (const part of parts) { 516 | if (part.startsWith('{') && part.endsWith('}')) { 517 | const keyName = part.slice(1, -1); 518 | if (keyName in specialKeys) { 519 | const keyConfig = specialKeys[keyName]; 520 | 521 | if (keyName.startsWith('Ctrl+')) { 522 | await handleModifierKey(keyConfig, { ctrl: true }); 523 | } else { 524 | await Input.dispatchKeyEvent({ 525 | type: 'keyDown', 526 | ...keyConfig, 527 | }); 528 | 529 | if (keyName === 'Enter') { 530 | await Input.dispatchKeyEvent({ 531 | type: 'char', 532 | text: '\r', 533 | unmodifiedText: '\r', 534 | windowsVirtualKeyCode: 13, 535 | nativeVirtualKeyCode: 13, 536 | autoRepeat: false, 537 | isKeypad: false, 538 | isSystemKey: false, 539 | }); 540 | } 541 | 542 | await Input.dispatchKeyEvent({ 543 | type: 'keyUp', 544 | ...keyConfig, 545 | }); 546 | 547 | await new Promise(resolve => setTimeout(resolve, 50)); 548 | 549 | if (keyName === 'Enter' || keyName === 'Tab') { 550 | await new Promise(resolve => setTimeout(resolve, 100)); 551 | } 552 | } 553 | } else { 554 | for (const char of part) { 555 | // Add random delay before each keystroke 556 | await new Promise(resolve => setTimeout(resolve, getRandomDelay())); 557 | 558 | await Input.dispatchKeyEvent({ 559 | type: 'keyDown', 560 | text: char, 561 | unmodifiedText: char, 562 | key: char, 563 | code: `Key${char.toUpperCase()}`, 564 | }); 565 | await Input.dispatchKeyEvent({ 566 | type: 'keyUp', 567 | text: char, 568 | unmodifiedText: char, 569 | key: char, 570 | code: `Key${char.toUpperCase()}`, 571 | }); 572 | } 573 | } 574 | } else { 575 | for (const char of part) { 576 | // Add random delay before each 
keystroke 577 | await new Promise(resolve => setTimeout(resolve, getRandomDelay())); 578 | 579 | await Input.dispatchKeyEvent({ 580 | type: 'keyDown', 581 | text: char, 582 | unmodifiedText: char, 583 | key: char, 584 | code: `Key${char.toUpperCase()}`, 585 | }); 586 | await Input.dispatchKeyEvent({ 587 | type: 'keyUp', 588 | text: char, 589 | unmodifiedText: char, 590 | key: char, 591 | code: `Key${char.toUpperCase()}`, 592 | }); 593 | } 594 | } 595 | } 596 | 597 | // Add a slightly longer delay after finishing typing 598 | await new Promise(resolve => setTimeout(resolve, 500)); 599 | } 600 | 601 | /** 602 | * Gets text content of an element by selector 603 | */ 604 | async getElementText(selector: string): Promise<string> { 605 | if (!this.client) throw new Error('Chrome not connected'); 606 | const { Runtime } = this.client; 607 | 608 | const result = await Runtime.evaluate({ 609 | expression: `document.querySelector(${JSON.stringify(selector)})?.textContent || ''`, 610 | }); 611 | 612 | return result.result.value; 613 | } 614 | 615 | /** 616 | * Closes the Chrome connection 617 | */ 618 | async close() { 619 | if (this.client) { 620 | await this.client.close(); 621 | this.client = null; 622 | this.page = null; 623 | } 624 | } 625 | 626 | /** 627 | * Gets semantic information about the page 628 | */ 629 | async getPageInfo() { 630 | if (!this.client) throw new Error('Chrome not connected'); 631 | const { Runtime } = this.client; 632 | 633 | const { result } = await Runtime.evaluate({ 634 | expression: 'window.createTextRepresentation(); window.textRepresentation || "Page text representation not available"', 635 | returnByValue: true 636 | }); 637 | 638 | return result.value; 639 | } 640 | 641 | /** 642 | * Highlights an element briefly before interaction 643 | */ 644 | private async highlightElement(element: string) { 645 | if (!this.client) throw new Error('Chrome not connected'); 646 | const { Runtime } = this.client; 647 | 648 | await Runtime.evaluate({ 649 | expression: ` 
650 | (function() { 651 | const el = ${element}; 652 | if (!el) return; 653 | 654 | // Store original styles 655 | const originalOutline = el.style.outline; 656 | const originalOutlineOffset = el.style.outlineOffset; 657 | 658 | // Add highlight effect 659 | el.style.outline = '2px solid #007AFF'; 660 | el.style.outlineOffset = '2px'; 661 | 662 | // Remove highlight after animation 663 | setTimeout(() => { 664 | el.style.outline = originalOutline; 665 | el.style.outlineOffset = originalOutlineOffset; 666 | }, 500); 667 | })() 668 | ` 669 | }); 670 | } 671 | 672 | /** 673 | * Gets the current page state 674 | */ 675 | async getPageState() { 676 | if (!this.client) throw new Error('Chrome not connected'); 677 | const { Runtime } = this.client; 678 | 679 | const result = await Runtime.evaluate({ 680 | expression: ` 681 | (function() { 682 | return { 683 | url: window.location.href, 684 | title: document.title, 685 | readyState: document.readyState, 686 | scrollPosition: { 687 | x: window.scrollX, 688 | y: window.scrollY 689 | }, 690 | viewportSize: { 691 | width: window.innerWidth, 692 | height: window.innerHeight 693 | } 694 | }; 695 | })() 696 | `, 697 | returnByValue: true, 698 | }); 699 | 700 | return result.result.value; 701 | } 702 | 703 | /** 704 | * Navigates back in history 705 | */ 706 | async goBack(): Promise<NavigationResult> { 707 | if (!this.client) throw new Error('Chrome not connected'); 708 | 709 | console.log('[Navigation] Going back in history'); 710 | await this.client.Page.navigate({ url: 'javascript:history.back()' }); 711 | 712 | const pageInfo = await this.getPageInfo(); 713 | const pageState = await this.getPageState(); 714 | 715 | return { 716 | navigation: 'Navigated back in history', 717 | pageInfo, 718 | pageState 719 | }; 720 | } 721 | 722 | /** 723 | * Navigates forward in history 724 | */ 725 | async goForward(): Promise<NavigationResult> { 726 | if (!this.client) throw new Error('Chrome not connected'); 727 | 728 | 
console.log('[Navigation] Going forward in history'); 729 | await this.client.Page.navigate({ url: 'javascript:history.forward()' }); 730 | 731 | const pageInfo = await this.getPageInfo(); 732 | const pageState = await this.getPageState(); 733 | 734 | return { 735 | navigation: 'Navigated forward in history', 736 | pageInfo, 737 | pageState 738 | }; 739 | } 740 | 741 | /** 742 | * Evaluates JavaScript code in the page context 743 | */ 744 | async evaluate(expression: string) { 745 | if (!this.client) throw new Error('Chrome not connected'); 746 | const { Runtime } = this.client; 747 | 748 | const result = await Runtime.evaluate({ 749 | expression, 750 | returnByValue: true 751 | }); 752 | 753 | return result.result.value; 754 | } 755 | } 756 | ``` -------------------------------------------------------------------------------- /src/runtime-templates/ariaInteractiveElements.js: -------------------------------------------------------------------------------- ```javascript 1 | (function () { 2 | function createTextRepresentation() { 3 | // Native interactive HTML elements that are inherently focusable/clickable 4 | const INTERACTIVE_ELEMENTS = [ 5 | 'a[href]', 6 | 'button', 7 | 'input:not([type="hidden"])', 8 | 'select', 9 | 'textarea', 10 | 'summary', 11 | 'video[controls]', 12 | 'audio[controls]', 13 | ]; 14 | 15 | // Interactive ARIA roles that make elements programmatically interactive 16 | const INTERACTIVE_ROLES = [ 17 | 'button', 18 | 'checkbox', 19 | 'combobox', 20 | 'gridcell', 21 | 'link', 22 | 'listbox', 23 | 'menuitem', 24 | 'menuitemcheckbox', 25 | 'menuitemradio', 26 | 'option', 27 | 'radio', 28 | 'searchbox', 29 | 'slider', 30 | 'spinbutton', 31 | 'switch', 32 | 'tab', 33 | 'textbox', 34 | 'treeitem', 35 | ]; 36 | 37 | // Build complete selector for all interactive elements 38 | const completeSelector = [...INTERACTIVE_ELEMENTS, ...INTERACTIVE_ROLES.map((role) => `[role="${role}"]`)].join( 39 | ',' 40 | ); 41 | 42 | // Helper to get accessible name of 
an element following ARIA naming specs 43 | const getAccessibleName = (el) => { 44 | // First try explicit labels 45 | const explicitLabel = el.getAttribute('aria-label'); 46 | if (explicitLabel) return explicitLabel; 47 | 48 | // Then try labelledby 49 | const labelledBy = el.getAttribute('aria-labelledby'); 50 | if (labelledBy) { 51 | const labelElements = labelledBy.split(' ').map((id) => document.getElementById(id)); 52 | const labelText = labelElements.map((labelEl) => (labelEl ? labelEl.textContent.trim() : '')).join(' '); 53 | if (labelText) return labelText; 54 | } 55 | 56 | // Then try associated label element 57 | const label = el.labels ? el.labels[0] : null; 58 | if (label) return label.textContent.trim(); 59 | 60 | // Then try placeholder 61 | const placeholder = el.getAttribute('placeholder'); 62 | if (placeholder) return placeholder; 63 | 64 | // Then try title 65 | const title = el.getAttribute('title'); 66 | if (title) return title; 67 | 68 | // For inputs, use value 69 | if (el.tagName.toLowerCase() === 'input') { 70 | return el.getAttribute('value') || el.value || ''; 71 | } 72 | 73 | // For other elements, get all text content including from child elements 74 | let textContent = ''; 75 | const walker = document.createTreeWalker(el, NodeFilter.SHOW_TEXT, { 76 | acceptNode: (node) => { 77 | // Skip text in hidden elements 78 | let parent = node.parentElement; 79 | while (parent && parent !== el) { 80 | const style = window.getComputedStyle(parent); 81 | if (style.display === 'none' || style.visibility === 'hidden') { 82 | return NodeFilter.FILTER_REJECT; 83 | } 84 | parent = parent.parentElement; 85 | } 86 | return NodeFilter.FILTER_ACCEPT; 87 | } 88 | }); 89 | 90 | let node; 91 | while ((node = walker.nextNode())) { 92 | const text = node.textContent.trim(); 93 | if (text) textContent += (textContent ? 
' ' : '') + text; 94 | } 95 | return textContent || ''; 96 | }; 97 | 98 | 99 | const interactiveElements = []; 100 | 101 | // Find all interactive elements in DOM order 102 | const findInteractiveElements = () => { 103 | // Clear existing elements 104 | interactiveElements.length = 0; 105 | 106 | // First find all native buttons and interactive elements 107 | document.querySelectorAll(completeSelector).forEach(node => { 108 | if ( 109 | node.getAttribute('aria-hidden') !== 'true' && 110 | !node.hasAttribute('disabled') && 111 | !node.hasAttribute('inert') && 112 | window.getComputedStyle(node).display !== 'none' && 113 | window.getComputedStyle(node).visibility !== 'hidden' 114 | ) { 115 | interactiveElements.push(node); 116 | } 117 | }); 118 | 119 | // Then use TreeWalker for any we might have missed 120 | const walker = document.createTreeWalker(document.body, NodeFilter.SHOW_ELEMENT, { 121 | acceptNode: (node) => { 122 | if ( 123 | !interactiveElements.includes(node) && // Skip if already found 124 | node.matches(completeSelector) && 125 | node.getAttribute('aria-hidden') !== 'true' && 126 | !node.hasAttribute('disabled') && 127 | !node.hasAttribute('inert') && 128 | window.getComputedStyle(node).display !== 'none' && 129 | window.getComputedStyle(node).visibility !== 'hidden' 130 | ) { 131 | return NodeFilter.FILTER_ACCEPT; 132 | } 133 | return NodeFilter.FILTER_SKIP; 134 | } 135 | }); 136 | 137 | let node; 138 | while ((node = walker.nextNode())) { 139 | if (!interactiveElements.includes(node)) { 140 | interactiveElements.push(node); 141 | } 142 | } 143 | }; 144 | 145 | // Create text representation of the page with interactive elements 146 | const createTextRepresentation = () => { 147 | const USE_ELEMENT_POSITION_FOR_TEXT_REPRESENTATION = false; // Flag to control text representation method 148 | 149 | if (USE_ELEMENT_POSITION_FOR_TEXT_REPRESENTATION) { 150 | // Position-based text representation (existing implementation) 151 | const output = []; 152 | const 
processedElements = new Set(); 153 | const LINE_HEIGHT = 20; // Base line height 154 | const MIN_GAP_FOR_NEWLINE = LINE_HEIGHT * 1.2; // Gap threshold for newline 155 | const HORIZONTAL_GAP = 50; // Minimum horizontal gap to consider elements on different lines 156 | 157 | // Helper to get element's bounding box 158 | const getBoundingBox = (node) => { 159 | if (node.nodeType === Node.TEXT_NODE) { 160 | const range = document.createRange(); 161 | range.selectNodeContents(node); 162 | return range.getBoundingClientRect(); 163 | } 164 | return node.getBoundingClientRect(); 165 | }; 166 | 167 | // Store nodes with their positions for sorting 168 | const nodePositions = []; 169 | 170 | // Process all nodes in DOM order 171 | const walker = document.createTreeWalker(document.body, NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_TEXT, { 172 | acceptNode: (node) => { 173 | // Skip script/style elements and their contents 174 | if ( 175 | node.nodeType === Node.ELEMENT_NODE && 176 | (node.tagName.toLowerCase() === 'script' || 177 | node.tagName.toLowerCase() === 'style' || 178 | node.tagName.toLowerCase() === 'head' || 179 | node.tagName.toLowerCase() === 'meta' || 180 | node.tagName.toLowerCase() === 'link') 181 | ) { 182 | return NodeFilter.FILTER_REJECT; 183 | } 184 | return NodeFilter.FILTER_ACCEPT; 185 | }, 186 | }); 187 | 188 | let node; 189 | while ((node = walker.nextNode())) { 190 | // Handle text nodes 191 | if (node.nodeType === Node.TEXT_NODE) { 192 | const text = node.textContent.trim(); 193 | if (!text) continue; 194 | 195 | // Skip text in hidden elements 196 | let parent = node.parentElement; 197 | let isHidden = false; 198 | let isInsideProcessedInteractive = false; 199 | let computedStyles = new Map(); // Cache computed styles 200 | 201 | while (parent) { 202 | // Cache and reuse computed styles 203 | let style = computedStyles.get(parent); 204 | if (!style) { 205 | style = window.getComputedStyle(parent); 206 | computedStyles.set(parent, style); 207 | } 208 | 
209 | if ( 210 | style.display === 'none' || 211 | style.visibility === 'hidden' || 212 | parent.getAttribute('aria-hidden') === 'true' 213 | ) { 214 | isHidden = true; 215 | break; 216 | } 217 | if (processedElements.has(parent)) { 218 | isInsideProcessedInteractive = true; 219 | break; 220 | } 221 | parent = parent.parentElement; 222 | } 223 | if (isHidden || isInsideProcessedInteractive) continue; 224 | 225 | // Skip if this is just a number inside a highlight element 226 | if (/^\d+$/.test(text)) { 227 | parent = node.parentElement; 228 | while (parent) { 229 | if (parent.classList && parent.classList.contains('claude-highlight')) { 230 | isHidden = true; 231 | break; 232 | } 233 | parent = parent.parentElement; 234 | } 235 | if (isHidden) continue; 236 | } 237 | 238 | // Check if this text is inside an interactive element 239 | let isInsideInteractive = false; 240 | let interactiveParent = null; 241 | parent = node.parentElement; 242 | while (parent) { 243 | if (parent.matches(completeSelector)) { 244 | isInsideInteractive = true; 245 | interactiveParent = parent; 246 | break; 247 | } 248 | parent = parent.parentElement; 249 | } 250 | 251 | // If inside an interactive element, add it to the interactive element's content 252 | if (isInsideInteractive && interactiveParent) { 253 | const index = interactiveElements.indexOf(interactiveParent); 254 | if (index !== -1 && !processedElements.has(interactiveParent)) { 255 | const role = interactiveParent.getAttribute('role') || interactiveParent.tagName.toLowerCase(); 256 | const name = getAccessibleName(interactiveParent); 257 | if (name) { 258 | const box = getBoundingBox(interactiveParent); 259 | if (box.width > 0 && box.height > 0) { 260 | nodePositions.push({ 261 | type: 'interactive', 262 | content: `[${index}]{${role}}(${name})`, 263 | box, 264 | y: box.top + window.pageYOffset, 265 | x: box.left + window.pageXOffset 266 | }); 267 | } 268 | } 269 | processedElements.add(interactiveParent); 270 | } 271 | 
continue; 272 | } 273 | 274 | // If not inside an interactive element, add as regular text 275 | const box = getBoundingBox(node); 276 | if (box.width > 0 && box.height > 0) { 277 | nodePositions.push({ 278 | type: 'text', 279 | content: text, 280 | box, 281 | y: box.top + window.pageYOffset, 282 | x: box.left + window.pageXOffset 283 | }); 284 | } 285 | } 286 | 287 | // Handle interactive elements 288 | if (node.nodeType === Node.ELEMENT_NODE && node.matches(completeSelector)) { 289 | const index = interactiveElements.indexOf(node); 290 | if (index !== -1 && !processedElements.has(node)) { 291 | const role = node.getAttribute('role') || node.tagName.toLowerCase(); 292 | const name = getAccessibleName(node); 293 | if (name) { 294 | const box = getBoundingBox(node); 295 | if (box.width > 0 && box.height > 0) { 296 | nodePositions.push({ 297 | type: 'interactive', 298 | content: `[${index}]{${role}}(${name})`, 299 | box, 300 | y: box.top + window.pageYOffset, 301 | x: box.left + window.pageXOffset 302 | }); 303 | } 304 | } 305 | processedElements.add(node); 306 | } 307 | } 308 | } 309 | 310 | // Sort nodes by vertical position first, then horizontal 311 | nodePositions.sort((a, b) => { 312 | const yDiff = a.y - b.y; 313 | if (Math.abs(yDiff) < MIN_GAP_FOR_NEWLINE) { 314 | return a.x - b.x; 315 | } 316 | return yDiff; 317 | }); 318 | 319 | // Group nodes into lines 320 | let currentLine = []; 321 | let lastY = 0; 322 | let lastX = 0; 323 | 324 | const flushLine = () => { 325 | if (currentLine.length > 0) { 326 | // Sort line by x position 327 | currentLine.sort((a, b) => a.x - b.x); 328 | output.push(currentLine.map(node => node.content).join(' ')); 329 | currentLine = []; 330 | } 331 | }; 332 | 333 | for (const node of nodePositions) { 334 | // Start new line if significant vertical gap or if horizontal position is before previous element 335 | if (currentLine.length > 0 && 336 | (Math.abs(node.y - lastY) > MIN_GAP_FOR_NEWLINE || 337 | node.x < lastX - 
HORIZONTAL_GAP)) { 338 | flushLine(); 339 | output.push('\n'); 340 | } 341 | 342 | currentLine.push(node); 343 | lastY = node.y; 344 | lastX = node.x + node.box.width; 345 | } 346 | 347 | // Flush final line 348 | flushLine(); 349 | 350 | // Join all text with appropriate spacing 351 | return output 352 | .join('\n') 353 | .replace(/\n\s+/g, '\n') // Clean up newline spacing 354 | .replace(/\n{3,}/g, '\n\n') // Limit consecutive newlines to 2 355 | .trim(); 356 | } else { 357 | // DOM-based text representation 358 | const output = []; 359 | const processedElements = new Set(); 360 | 361 | // Process all nodes in DOM order 362 | const walker = document.createTreeWalker(document.body, NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_TEXT, { 363 | acceptNode: (node) => { 364 | // Skip script/style elements and their contents 365 | if ( 366 | node.nodeType === Node.ELEMENT_NODE && 367 | (node.tagName.toLowerCase() === 'script' || 368 | node.tagName.toLowerCase() === 'style' || 369 | node.tagName.toLowerCase() === 'head' || 370 | node.tagName.toLowerCase() === 'meta' || 371 | node.tagName.toLowerCase() === 'link') 372 | ) { 373 | return NodeFilter.FILTER_REJECT; 374 | } 375 | return NodeFilter.FILTER_ACCEPT; 376 | }, 377 | }); 378 | 379 | let node; 380 | let currentBlock = []; 381 | 382 | const flushBlock = () => { 383 | if (currentBlock.length > 0) { 384 | output.push(currentBlock.join(' ')); 385 | currentBlock = []; 386 | } 387 | }; 388 | 389 | while ((node = walker.nextNode())) { 390 | // Skip hidden elements 391 | let parent = node.parentElement; 392 | let isHidden = false; 393 | while (parent) { 394 | const style = window.getComputedStyle(parent); 395 | if ( 396 | style.display === 'none' || 397 | style.visibility === 'hidden' || 398 | parent.getAttribute('aria-hidden') === 'true' 399 | ) { 400 | isHidden = true; 401 | break; 402 | } 403 | parent = parent.parentElement; 404 | } 405 | if (isHidden) continue; 406 | 407 | // Handle text nodes 408 | if (node.nodeType === 
Node.TEXT_NODE) { 409 | const text = node.textContent.trim(); 410 | if (!text) continue; 411 | 412 | // Skip if this is just a number inside a highlight element 413 | if (/^\d+$/.test(text)) { 414 | parent = node.parentElement; 415 | while (parent) { 416 | if (parent.classList && parent.classList.contains('claude-highlight')) { 417 | isHidden = true; 418 | break; 419 | } 420 | parent = parent.parentElement; 421 | } 422 | if (isHidden) continue; 423 | } 424 | 425 | // Check if this text is inside an interactive element 426 | let isInsideInteractive = false; 427 | let interactiveParent = null; 428 | parent = node.parentElement; 429 | while (parent) { 430 | if (parent.matches(completeSelector)) { 431 | isInsideInteractive = true; 432 | interactiveParent = parent; 433 | break; 434 | } 435 | parent = parent.parentElement; 436 | } 437 | 438 | // If inside an interactive element, add it to the interactive element's content 439 | if (isInsideInteractive && interactiveParent) { 440 | if (!processedElements.has(interactiveParent)) { 441 | const index = interactiveElements.indexOf(interactiveParent); 442 | if (index !== -1) { 443 | const role = interactiveParent.getAttribute('role') || interactiveParent.tagName.toLowerCase(); 444 | const name = getAccessibleName(interactiveParent); 445 | if (name) { 446 | currentBlock.push(`[${index}]{${role}}(${name})`); 447 | } 448 | processedElements.add(interactiveParent); 449 | } 450 | } 451 | continue; 452 | } 453 | 454 | // Add text to current block 455 | currentBlock.push(text); 456 | } 457 | 458 | // Handle block-level elements and interactive elements 459 | if (node.nodeType === Node.ELEMENT_NODE) { 460 | const style = window.getComputedStyle(node); 461 | const isBlockLevel = style.display === 'block' || 462 | style.display === 'flex' || 463 | style.display === 'grid' || 464 | node.tagName.toLowerCase() === 'br'; 465 | 466 | // Handle interactive elements 467 | if (node.matches(completeSelector) && !processedElements.has(node)) { 
468 | const index = interactiveElements.indexOf(node); 469 | if (index !== -1) { 470 | const role = node.getAttribute('role') || node.tagName.toLowerCase(); 471 | const name = getAccessibleName(node); 472 | if (name) { 473 | currentBlock.push(`[${index}]{${role}}(${name})`); 474 | } 475 | processedElements.add(node); 476 | } 477 | } 478 | 479 | // Add newline for block-level elements 480 | if (isBlockLevel) { 481 | flushBlock(); 482 | output.push(''); 483 | } 484 | } 485 | } 486 | 487 | // Flush final block 488 | flushBlock(); 489 | 490 | // Join all text with appropriate spacing 491 | return output 492 | .join('\n') 493 | .replace(/\n\s+/g, '\n') // Clean up newline spacing 494 | .replace(/\n{3,}/g, '\n\n') // Limit consecutive newlines to 2 495 | .trim(); 496 | } 497 | }; 498 | 499 | // Helper functions for accurate clicking 500 | const isElementClickable = (element) => { 501 | if (!element) return false; 502 | 503 | const style = window.getComputedStyle(element); 504 | const rect = element.getBoundingClientRect(); 505 | 506 | return ( 507 | // Element must be visible 508 | style.display !== 'none' && 509 | style.visibility !== 'hidden' && 510 | style.opacity !== '0' && 511 | // Must have non-zero dimensions 512 | rect.width > 0 && 513 | rect.height > 0 && 514 | // Must be within viewport bounds 515 | rect.top >= 0 && 516 | rect.left >= 0 && 517 | rect.bottom <= (window.innerHeight || document.documentElement.clientHeight) && 518 | rect.right <= (window.innerWidth || document.documentElement.clientWidth) && 519 | // Must not be disabled 520 | !element.hasAttribute('disabled') && 521 | !element.hasAttribute('aria-disabled') && 522 | element.getAttribute('aria-hidden') !== 'true' 523 | ); 524 | }; 525 | 526 | const getClickableCenter = (element) => { 527 | const rect = element.getBoundingClientRect(); 528 | // Get the actual visible area accounting for overflow 529 | const style = window.getComputedStyle(element); 530 | const overflowX = style.overflowX; 531 | 
      const overflowY = style.overflowY;

      let width = rect.width;
      let height = rect.height;

      // Adjust for overflow
      if (overflowX === 'hidden') {
        width = Math.min(width, element.clientWidth);
      }
      if (overflowY === 'hidden') {
        height = Math.min(height, element.clientHeight);
      }

      // Calculate center coordinates
      const x = rect.left + (width / 2);
      const y = rect.top + (height / 2);

      return {
        x: Math.round(x + window.pageXOffset),
        y: Math.round(y + window.pageYOffset)
      };
    };

    // Expose helper functions to window for use by MCP
    window.isElementClickable = isElementClickable;
    window.getClickableCenter = getClickableCenter;

    // Main execution
    findInteractiveElements();
    const textRepresentation = createTextRepresentation();

    if (false)
      requestAnimationFrame(() => {
        // Clear existing highlights
        document.querySelectorAll('.claude-highlight').forEach((el) => el.remove());

        // Create main overlay container
        const overlay = document.createElement('div');
        overlay.className = 'claude-highlight';
        overlay.style.cssText = `
          position: absolute;
          top: 0;
          left: 0;
          width: 100%;
          height: ${Math.max(document.body.scrollHeight, document.documentElement.scrollHeight)}px;
          pointer-events: none;
          z-index: 2147483647;
        `;
        document.body.appendChild(overlay);

        // Batch DOM operations and reduce reflows
        const fragment = document.createDocumentFragment();
        const pageXOffset = window.pageXOffset;
        const pageYOffset = window.pageYOffset;

        // Create highlights in a batch
        interactiveElements.forEach((el, index) => {
          const rect = el.getBoundingClientRect();

          if (rect.width <= 0 || rect.height <= 0) return;

          const highlight = document.createElement('div');
          highlight.className = 'claude-highlight';
          highlight.style.cssText = `
            position: absolute;
            left: ${pageXOffset + rect.left}px;
            top: ${pageYOffset + rect.top}px;
            width: ${rect.width}px;
            height: ${rect.height}px;
            background-color: hsla(${(index * 30) % 360}, 80%, 50%, 0.3);
            display: flex;
            align-items: center;
            justify-content: center;
            font-size: 10px;
            font-weight: bold;
            color: #000;
            pointer-events: none;
            border: none;
            z-index: 2147483647;
          `;

          highlight.textContent = index;
          fragment.appendChild(highlight);
        });

        // Single DOM update
        overlay.appendChild(fragment);
      });

    // Return the results
    const result = {
      interactiveElements,
      textRepresentation,
    };

    window.interactiveElements = interactiveElements;
    window.textRepresentation = textRepresentation;

    console.log(`Generated ${interactiveElements.length} interactive elements`);
    console.log(`Text representation size: ${textRepresentation.length} characters`);

    return result;
  }

  // // Debounce helper function
  // function debounce(func, wait) {
  //   let timeout;
  //   return function executedFunction(...args) {
  //     const later = () => {
  //       clearTimeout(timeout);
  //       func(...args);
  //     };
  //     clearTimeout(timeout);
  //     timeout = setTimeout(later, wait);
  //   };
  // }

  // // Create a debounced version of the text representation creation
  // const debouncedCreateTextRepresentation = debounce(() => {
  //   const result = createTextRepresentation();
  //   // Dispatch a custom event with the new text representation
  //   const event = new CustomEvent('textRepresentationUpdated', {
  //     detail: result,
  //   });
  //   document.dispatchEvent(event);
  // }, 250); // 250ms debounce time

  // //
  // Set up mutation observer to watch for DOM changes
  // const observer = new MutationObserver((mutations) => {
  //   // Check if any mutation is relevant (affects visibility, attributes, or structure)
  //   const isRelevantMutation = mutations.some((mutation) => {
  //     // Check if the mutation affects visibility or attributes
  //     if (
  //       mutation.type === 'attributes' &&
  //       (mutation.attributeName === 'aria-hidden' ||
  //         mutation.attributeName === 'disabled' ||
  //         mutation.attributeName === 'inert' ||
  //         mutation.attributeName === 'style' ||
  //         mutation.attributeName === 'class')
  //     ) {
  //       return true;
  //     }

  //     // Check if the mutation affects the DOM structure
  //     if (mutation.type === 'childList') {
  //       return true;
  //     }

  //     return false;
  //   });

  //   if (isRelevantMutation) {
  //     debouncedCreateTextRepresentation();
  //   }
  // });

  // // Start observing the document with the configured parameters
  // observer.observe(document.body, {
  //   childList: true,
  //   subtree: true,
  //   attributes: true,
  //   characterData: true,
  //   attributeFilter: ['aria-hidden', 'disabled', 'inert', 'style', 'class', 'role', 'aria-label', 'aria-labelledby'],
  // });

  window.createTextRepresentation = createTextRepresentation;

  // Initial creation
  createTextRepresentation();

  // // Also rerun when dynamic content might be loaded
  // window.addEventListener('load', createTextRepresentation);
  // document.addEventListener('DOMContentLoaded', createTextRepresentation);

  // // Handle dynamic updates like dialogs
  // const dynamicUpdateEvents = ['dialog', 'popstate', 'pushstate', 'replacestate'];
  // dynamicUpdateEvents.forEach(event => {
  //   window.addEventListener(event, () => {
  //     setTimeout(createTextRepresentation, 100); //
  // Small delay to let content settle
  //   });
  // });

  console.log('Aria Interactive Elements script loaded');
})();
```
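
The script above writes each interactive element into the text representation as an `[index]{role}(name)` marker, where `index` is the element's position in `interactiveElements`. As a minimal sketch of how a consumer might read those markers back out, the snippet below uses a hypothetical helper, `parseInteractiveMarkers`, which is not part of this repository. It assumes accessible names do not contain a closing parenthesis; a name that does would be cut short.

```javascript
// Hypothetical helper (not part of the repository): extracts the
// `[index]{role}(name)` markers emitted into the text representation.
// Assumption: accessible names contain no ')' character.
function parseInteractiveMarkers(text) {
  const pattern = /\[(\d+)\]\{([^}]+)\}\(([^)]*)\)/g;
  const markers = [];
  for (const match of text.matchAll(pattern)) {
    markers.push({
      index: Number(match[1]), // position in interactiveElements
      role: match[2],          // ARIA role, or the lowercased tag name
      name: match[3],          // accessible name
    });
  }
  return markers;
}

// Example input shaped like the script's output (made-up content).
const sample = 'Welcome back\n[0]{button}(Sign in) [1]{a}(Forgot password?)';
console.log(parseInteractiveMarkers(sample));
```

The extracted `index` values are the same indices the README's `clickElement` tool expects in its `selector` parameter.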