# Directory Structure
```
├── .github
│ └── workflows
│ ├── publish-mcp.yml
│ └── release.yml
├── .gitignore
├── .npmignore
├── aria_snapshot_filter.js
├── assets
│ ├── Demo.gif
│ ├── Demo2.gif
│ ├── Demo3.gif
│ ├── logo.png
│ └── Tools.md
├── brightdata-mcp-extension.dxt
├── browser_session.js
├── browser_tools.js
├── CHANGELOG.md
├── Dockerfile
├── examples
│ └── README.md
├── LICENSE
├── package-lock.json
├── package.json
├── README.md
├── server.js
├── server.json
└── smithery.yaml
```
# Files
--------------------------------------------------------------------------------
/.npmignore:
--------------------------------------------------------------------------------
```
1 | *.dxt
2 | smithery.yaml
3 | Dockerfile
4 | examples
5 | assets
6 | CHANGELOG.md
7 |
```
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
```
1 | # build output
2 | dist/
3 |
4 | # generated types
5 | .astro/
6 |
7 | # dependencies
8 | node_modules/
9 |
10 | # logs
11 | npm-debug.log*
12 | yarn-debug.log*
13 | yarn-error.log*
14 | pnpm-debug.log*
15 |
16 | # environment variables
17 | .env
18 | .env.production
19 |
20 | # macOS-specific files
21 | .DS_Store
22 |
23 | # jetbrains setting folder
24 | .idea/
25 |
```
--------------------------------------------------------------------------------
/examples/README.md:
--------------------------------------------------------------------------------
```markdown
1 | # MCP Usage Examples
2 |
3 | A curated list of community demos using Bright Data's MCP server.
4 |
5 | ## 🧠 Notable Examples
6 |
7 | - **AI voice agent that closed 4 deals & made $596 overnight 🤑**
8 | [📹 YouTube Demo](https://www.youtube.com/watch?v=YGzT3sVdwdY)
9 |
10 | [💻 GitHub Repo](https://github.com/llSourcell/my_ai_intern)
11 |
12 | - **Langgraph with mcp-adapters demo**
13 |
14 | [📹 YouTube Demo](https://www.youtube.com/watch?v=6DXuadyaJ4g)
15 |
16 | [💻 Source Code](https://github.com/techwithtim/BrightDataMCPServerAgent)
17 |
18 | - **Researcher Agent built with Google ADK that is connected to Bright Data's MCP to fetch real-time data**
19 |
20 | [📹 YouTube Demo](https://www.youtube.com/watch?v=r7WG6dXWdUI)
21 |
22 | [💻 Source Code](https://github.com/MeirKaD/MCP_ADK)
23 |
24 | - **Replacing 3 MCP servers with our MCP server to avoid getting blocked 🤯**
25 |
26 | [📹 YouTube Demo](https://www.youtube.com/watch?v=0xmE0OJrNmg)
27 |
28 | - **Scrape ANY Website In Realtime With This Powerful AI MCP Server**
29 |
30 | [📹 YouTube Demo](https://www.youtube.com/watch?v=bL5JIeGL3J0)
31 |
32 | - **Multi-Agent job finder using Bright Data MCP and TypeScript from SCRATCH**
33 |
34 | [📹 YouTube Demo](https://www.youtube.com/watch?v=45OtteCGFiI)
35 |
36 | [💻 Source Code](https://github.com/bitswired/jobwizard)
37 |
38 | - **Usage example with Gemini CLI**
39 |
40 | [📹 YouTube Tutorial](https://www.youtube.com/watch?v=FE1LChbgFEw)
41 | ---
42 |
43 | Got a cool example? Open a PR or contact us!
44 |
```
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
```markdown
1 | <div align="center">
2 | <a href="https://brightdata.com/ai/mcp-server">
3 | <img src="https://github.com/user-attachments/assets/c21b3f7b-7ff1-40c3-b3d8-66706913d62f" alt="Bright Data Logo">
4 | </a>
5 |
6 | <h1>The Web MCP</h1>
7 |
8 | <p>
9 | <strong>🌐 Give your AI real-time web superpowers</strong><br/>
10 | <i>Seamlessly connect LLMs to the live web without getting blocked</i>
11 | </p>
12 |
13 | <p>
14 | <a href="https://www.npmjs.com/package/@brightdata/mcp">
15 | <img src="https://img.shields.io/npm/v/@brightdata/mcp?style=for-the-badge&color=blue" alt="npm version"/>
16 | </a>
17 | <a href="https://www.npmjs.com/package/@brightdata/mcp">
18 | <img src="https://img.shields.io/npm/dw/@brightdata/mcp?style=for-the-badge&color=green" alt="npm downloads"/>
19 | </a>
20 | <a href="https://github.com/brightdata-com/brightdata-mcp/blob/main/LICENSE">
21 | <img src="https://img.shields.io/badge/license-MIT-purple?style=for-the-badge" alt="License"/>
22 | </a>
23 | </p>
24 |
25 | <p>
26 | <a href="#-quick-start">Quick Start</a> •
27 | <a href="#-features">Features</a> •
28 | <a href="#-pricing--modes">Pricing</a> •
29 | <a href="#-demos">Demos</a> •
30 | <a href="#-documentation">Docs</a> •
31 | <a href="#-support">Support</a>
32 | </p>
33 |
34 | <div>
35 | <h3>🎉 <strong>Free Tier Available!</strong> 🎉</h3>
36 | <p><strong>5,000 requests/month FREE</strong> <br/>
37 | <sub>Perfect for prototyping and everyday AI workflows</sub></p>
38 | </div>
39 | </div>
40 |
41 | ---
42 |
43 | ## 🌟 Overview
44 |
45 | **The Web MCP** is your gateway to giving AI assistants true web capabilities. No more outdated responses, no more "I can't access real-time information" - just seamless, reliable web access that actually works.
46 |
47 | Built by [Bright Data](https://brightdata.com), the world's #1 web data platform, this MCP server ensures your AI never gets blocked, rate-limited, or served CAPTCHAs.
48 |
49 | <div align="center">
50 | <table>
51 | <tr>
52 | <td align="center">✅ <strong>Works with Any LLM</strong><br/><sub>Claude, GPT, Gemini, Llama</sub></td>
53 | <td align="center">🛡️ <strong>Never Gets Blocked</strong><br/><sub>Enterprise-grade unblocking</sub></td>
54 | <td align="center">🚀 <strong>5,000 Free Requests</strong><br/><sub>Monthly</sub></td>
55 | <td align="center">⚡ <strong>Zero Config</strong><br/><sub>Works out of the box</sub></td>
56 | </tr>
57 | </table>
58 | </div>
59 |
60 | ---
61 |
62 | ## 🎯 Perfect For
63 |
64 | - 🔍 **Real-time Research** - Get current prices, news, and live data
65 | - 🛍️ **E-commerce Intelligence** - Monitor products, prices, and availability
66 | - 📊 **Market Analysis** - Track competitors and industry trends
67 | - 🤖 **AI Agents** - Build agents that can actually browse the web
68 | - 📝 **Content Creation** - Access up-to-date information for writing
69 | - 🎓 **Academic Research** - Gather data from multiple sources efficiently
70 |
71 | ---
72 |
73 | ## ⚡ Quick Start
74 |
75 |
76 | ### 📡 Option 1: Use our hosted server (no installation needed)
77 |
78 | Perfect for users who want zero setup. Just add this URL to your MCP client:
79 |
80 | ```
81 | https://mcp.brightdata.com/mcp?token=YOUR_API_TOKEN_HERE
82 | ```
83 |
84 | **Setup in Claude Desktop:**
85 | 1. Go to: Settings → Connectors → Add custom connector
86 | 2. Name: `Bright Data Web`
87 | 3. URL: `https://mcp.brightdata.com/mcp?token=YOUR_API_TOKEN`
88 | 4. Click "Add" and you're done! ✨
89 |
90 |
91 | ### Option 2: Run locally on your machine
92 |
93 | ```json
94 | {
95 | "mcpServers": {
96 | "Bright Data": {
97 | "command": "npx",
98 | "args": ["@brightdata/mcp"],
99 | "env": {
100 | "API_TOKEN": "<your-api-token-here>"
101 | }
102 | }
103 | }
104 | }
105 | ```
106 |
107 |
108 | ---
109 |
110 | ## 🚀 Pricing & Modes
111 |
112 | <div align="center">
113 | <table>
114 | <tr>
115 | <th width="50%">⚡ Rapid Mode (Free tier)</th>
116 | <th width="50%">💎 Pro Mode</th>
117 | </tr>
118 | <tr>
119 | <td align="center">
120 | <h3>$0/month</h3>
121 | <p><strong>5,000 requests</strong></p>
122 | <hr/>
123 | <p>✅ Web Search<br/>
124 | ✅ Scraping with Web unlocker<br/>
125 | ❌ Browser Automation<br/>
126 | ❌ Web data tools</p>
127 | <br/>
128 | <code>Default Mode</code>
129 | </td>
130 | <td align="center">
131 | <h3>Pay-as-you-go</h3>
132 | <p><strong>Everything in Rapid mode, plus 60+ advanced tools</strong></p>
133 | <hr/>
134 | <p>✅ Browser Control<br/>
135 | ✅ Web Data APIs<br/>
136 | <br/>
137 | <br/>
138 | <br/>
139 | <code>PRO_MODE=true</code>
140 | </td>
141 | </tr>
142 | </table>
143 | </div>
144 |
145 | > **💡 Note:** Pro mode is **not included** in the free tier and incurs additional charges based on usage.
146 |
147 | ---
148 |
149 | ## ✨ Features
150 |
151 | ### 🔥 Core Capabilities
152 |
153 | <table>
154 | <tr>
155 | <td>🔍 <b>Smart Web Search</b><br/>Google-quality results optimized for AI</td>
156 | <td>📄 <b>Clean Markdown</b><br/>AI-ready content extraction</td>
157 | </tr>
158 | <tr>
159 | <td>🌍 <b>Global Access</b><br/>Bypass geo-restrictions automatically</td>
160 | <td>🛡️ <b>Anti-Bot Protection</b><br/>Never get blocked or rate-limited</td>
161 | </tr>
162 | <tr>
163 | <td>🤖 <b>Browser Automation</b><br/>Control real browsers remotely (Pro)</td>
164 | <td>⚡ <b>Lightning Fast</b><br/>Optimized for minimal latency</td>
165 | </tr>
166 | </table>
167 |
168 | ### 🎯 Example Queries That Just Work
169 |
170 | ```yaml
171 | ✅ "What's Tesla's current stock price?"
172 | ✅ "Find the best-rated restaurants in Tokyo right now"
173 | ✅ "Get today's weather forecast for New York"
174 | ✅ "What movies are releasing this week?"
175 | ✅ "What are the trending topics on Twitter today?"
176 | ```
177 |
178 | ---
179 |
180 | ## 🎬 Demos
181 |
182 | > **Note:** These videos show earlier versions. New demos coming soon! 🎥
183 |
184 | <details>
185 | <summary><b>View Demo Videos</b></summary>
186 |
187 | ### Basic Web Search Demo
188 | https://github.com/user-attachments/assets/59f6ebba-801a-49ab-8278-1b2120912e33
189 |
190 | ### Advanced Scraping Demo
191 | https://github.com/user-attachments/assets/61ab0bee-fdfa-4d50-b0de-5fab96b4b91d
192 |
193 | [📺 More demos and tutorials →](https://github.com/brightdata-com/brightdata-mcp/blob/main/examples/README.md)
194 |
195 | </details>
196 |
197 | ---
198 |
199 | ## 🔧 Available Tools
200 |
201 | ### ⚡ Rapid Mode Tools (Default - Free)
202 |
203 | | Tool | Description | Use Case |
204 | |------|-------------|----------|
205 | | 🔍 `search_engine` | Web search with AI-optimized results | Research, fact-checking, current events |
206 | | 📄 `scrape_as_markdown` | Convert any webpage to clean markdown | Content extraction, documentation |
207 |
208 | ### 💎 Pro Mode Tools (60+ Tools)
209 |
210 | <details>
211 | <summary><b>Click to see all Pro tools</b></summary>
212 |
213 | | Category | Tools | Description |
214 | |----------|-------|-------------|
215 | | **Browser Control** | `scraping_browser.*` | Full browser automation |
216 | | **Web Data APIs** | `web_data_*` | Structured data extraction |
217 | | **E-commerce** | Product scrapers | Amazon, eBay, Walmart data |
218 | | **Social Media** | Social scrapers | Twitter, LinkedIn, Instagram |
219 | | **Maps & Local** | Location tools | Google Maps, business data |
220 |
221 | [📚 View complete tool documentation →](https://github.com/brightdata-com/brightdata-mcp/blob/main/assets/Tools.md)
222 |
223 | </details>
224 |
225 | ---
226 |
227 | ## 🎮 Try It Now!
228 |
229 | ### 🧪 Online Playground
230 | Try the Web MCP without any setup:
231 |
232 | <div align="center">
233 | <a href="https://brightdata.com/ai/playground-chat">
234 | <img src="https://img.shields.io/badge/Try_on-Playground-00C7B7?style=for-the-badge&logo=data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMjQiIGhlaWdodD0iMjQiIHZpZXdCb3g9IjAgMCAyNCAyNCIgZmlsbD0ibm9uZSIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj4KPHBhdGggZD0iTTEyIDJMMyA3VjE3TDEyIDIyTDIxIDE3VjdMMTIgMloiIHN0cm9rZT0id2hpdGUiIHN0cm9rZS13aWR0aD0iMiIvPgo8L3N2Zz4=" alt="Playground"/>
235 | </a>
236 | </div>
237 |
238 | ---
239 |
240 | ## 🔧 Configuration
241 |
242 | ### Basic Setup
243 | ```json
244 | {
245 | "mcpServers": {
246 | "Bright Data": {
247 | "command": "npx",
248 | "args": ["@brightdata/mcp"],
249 | "env": {
250 | "API_TOKEN": "your-token-here"
251 | }
252 | }
253 | }
254 | }
255 | ```
256 |
257 | ### Advanced Configuration
258 | ```json
259 | {
260 | "mcpServers": {
261 | "Bright Data": {
262 | "command": "npx",
263 | "args": ["@brightdata/mcp"],
264 | "env": {
265 | "API_TOKEN": "your-token-here",
266 | "PRO_MODE": "true", // Enable all 60+ tools
267 | "RATE_LIMIT": "100/1h", // Custom rate limiting
268 | "WEB_UNLOCKER_ZONE": "custom", // Custom unlocker zone
269 | "BROWSER_ZONE": "custom_browser" // Custom browser zone
270 | }
271 | }
272 | }
273 | }
274 | ```
275 |
276 | ---
277 |
278 | ## 📚 Documentation
279 |
280 | <div align="center">
281 | <table>
282 | <tr>
283 | <td align="center">
284 | <a href="https://docs.brightdata.com/mcp-server/overview">
285 | <img src="https://img.shields.io/badge/📖-API_Docs-blue?style=for-the-badge" alt="API Docs"/>
286 | </a>
287 | </td>
288 | <td align="center">
289 | <a href="https://github.com/brightdata-com/brightdata-mcp/blob/main/examples">
290 | <img src="https://img.shields.io/badge/💡-Examples-green?style=for-the-badge" alt="Examples"/>
291 | </a>
292 | </td>
293 | <td align="center">
294 | <a href="https://github.com/brightdata-com/brightdata-mcp/blob/main/CHANGELOG.md">
295 | <img src="https://img.shields.io/badge/📝-Changelog-orange?style=for-the-badge" alt="Changelog"/>
296 | </a>
297 | </td>
298 | <td align="center">
299 | <a href="https://brightdata.com/blog/ai/web-scraping-with-mcp">
300 | <img src="https://img.shields.io/badge/📚-Tutorial-purple?style=for-the-badge" alt="Tutorial"/>
301 | </a>
302 | </td>
303 | </tr>
304 | </table>
305 | </div>
306 |
307 | ---
308 |
309 | ## 🚨 Common Issues & Solutions
310 |
311 | <details>
312 | <summary><b>🔧 Troubleshooting Guide</b></summary>
313 |
314 | ### ❌ "spawn npx ENOENT" Error
315 | **Solution:** Install Node.js or use the full path to node:
316 | ```json
317 | "command": "/usr/local/bin/node" // macOS/Linux
318 | "command": "C:\\Program Files\\nodejs\\node.exe" // Windows
319 | ```
320 |
321 | ### ⏱️ Timeouts on Complex Sites
322 | **Solution:** Increase timeout in your client settings to 180s
323 |
324 | ### 🔑 Authentication Issues
325 | **Solution:** Ensure your API token is valid and has proper permissions
326 |
327 | ### 📡 Remote Server Connection
328 | **Solution:** Check your internet connection and firewall settings
329 |
330 | [More troubleshooting →](https://github.com/brightdata-com/brightdata-mcp#troubleshooting)
331 |
332 | </details>
333 |
334 | ---
335 |
336 | ## 🤝 Contributing
337 |
338 | We love contributions! Here's how you can help:
339 |
340 | - 🐛 [Report bugs](https://github.com/brightdata-com/brightdata-mcp/issues)
341 | - 💡 [Suggest features](https://github.com/brightdata-com/brightdata-mcp/issues)
342 | - 🔧 [Submit PRs](https://github.com/brightdata-com/brightdata-mcp/pulls)
343 | - ⭐ Star this repo!
344 |
345 | Please follow [Bright Data's coding standards](https://brightdata.com/dna/js_code).
346 |
347 | ---
348 |
349 | ## 📞 Support
350 |
351 | <div align="center">
352 | <table>
353 | <tr>
354 | <td align="center">
355 | <a href="https://github.com/brightdata-com/brightdata-mcp/issues">
356 | <strong>🐛 GitHub Issues</strong><br/>
357 | <sub>Report bugs & features</sub>
358 | </a>
359 | </td>
360 | <td align="center">
361 | <a href="https://docs.brightdata.com/mcp-server/overview">
362 | <strong>📚 Documentation</strong><br/>
363 | <sub>Complete guides</sub>
364 | </a>
365 | </td>
366 | <td align="center">
367 | <a href="mailto:[email protected]">
368 | <strong>✉️ Email</strong><br/>
369 | <sub>[email protected]</sub>
370 | </a>
371 | </td>
372 | </tr>
373 | </table>
374 | </div>
375 |
376 | ---
377 |
378 | ## 📜 License
379 |
380 | MIT © [Bright Data Ltd.](https://brightdata.com)
381 |
382 | ---
383 |
384 | <div align="center">
385 | <p>
386 | <strong>Built with ❤️ by</strong><br/>
387 | <a href="https://brightdata.com">
388 | <img src="https://idsai.net.technion.ac.il/files/2022/01/Logo-600.png" alt="Bright Data" height="30"/>
389 | </a>
390 | </p>
391 | <p>
392 | <sub>The world's #1 web data platform</sub>
393 | </p>
394 |
395 | <br/>
396 |
397 | <p>
398 | <a href="https://github.com/brightdata-com/brightdata-mcp">⭐ Star us on GitHub</a> •
399 | <a href="https://brightdata.com/blog">Read our Blog</a>
400 | </p>
401 | </div>
402 |
```
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
```dockerfile
1 | FROM node:22.12-alpine AS builder
2 |
3 |
4 | COPY . /app
5 | WORKDIR /app
6 |
7 |
8 | RUN --mount=type=cache,target=/root/.npm npm install
9 |
10 | FROM node:22-alpine AS release
11 |
12 | WORKDIR /app
13 |
14 |
15 | COPY --from=builder /app/server.js /app/
16 | COPY --from=builder /app/browser_tools.js /app/
17 | COPY --from=builder /app/browser_session.js /app/
18 | COPY --from=builder /app/aria_snapshot_filter.js /app/
19 | COPY --from=builder /app/package.json /app/
20 | COPY --from=builder /app/package-lock.json /app/
21 |
22 | ENV NODE_ENV=production
23 |
24 |
25 | RUN npm ci --ignore-scripts --omit=dev
26 |
27 |
28 | ENTRYPOINT ["node", "server.js"]
29 |
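30 | # Example usage (illustrative; the image tag and token are placeholders):
31 | #   docker build -t brightdata-mcp .
32 | #   docker run -i --rm -e API_TOKEN=<your-api-token> brightdata-mcp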
```
--------------------------------------------------------------------------------
/.github/workflows/release.yml:
--------------------------------------------------------------------------------
```yaml
1 | name: Release
2 | on:
3 | push:
4 | tags: v*
5 |
6 | jobs:
7 | release:
8 | name: Release
9 | runs-on: ubuntu-latest
10 | permissions:
11 | contents: read
12 | id-token: write
13 | steps:
14 | - uses: actions/checkout@v5
15 | - uses: actions/setup-node@v5
16 | with:
17 | node-version: 22
18 | cache: "npm"
19 | registry-url: 'https://registry.npmjs.org'
20 | scope: '@brightdata'
21 | - run: npm ci
22 | - run: npm audit signatures
23 | - run: npm publish
24 | env:
25 | NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
26 |
```
--------------------------------------------------------------------------------
/smithery.yaml:
--------------------------------------------------------------------------------
```yaml
1 | startCommand:
2 | type: stdio
3 | configSchema:
4 | type: object
5 | properties:
6 | apiToken:
7 | type: string
8 | description: "Bright Data API key, available in https://brightdata.com/cp/setting/users"
9 | webUnlockerZone:
10 | type: string
11 | default: 'mcp_unlocker'
12 | description: "Optional: The Web Unlocker zone name (defaults to 'mcp_unlocker')"
13 | browserZone:
14 | type: string
15 | default: 'mcp_browser'
16 | description: "Optional: Zone name for the Browser API (enables browser control tools, deafults to 'mcp_browser')"
17 | commandFunction: |-
18 | config => ({
19 | command: 'node',
20 | args: ['server.js'],
21 | env: {
22 | API_TOKEN: config.apiToken,
23 | WEB_UNLOCKER_ZONE: config.webUnlockerZone,
24 | BROWSER_ZONE: config.browserZone
25 | }
26 | })
27 |
```
--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------
```json
1 | {
2 | "name": "@brightdata/mcp",
3 | "version": "2.6.0",
4 | "description": "An MCP interface into the Bright Data toolset",
5 | "type": "module",
6 | "main": "./server.js",
7 | "bin": {
8 | "@brightdata/mcp": "./server.js"
9 | },
10 | "scripts": {
11 | "start": "node server.js"
12 | },
13 | "keywords": [
14 | "mcp",
15 | "brightdata"
16 | ],
17 | "author": "Bright Data",
18 | "repository": {
19 | "type": "git",
20 | "url": "https://github.com/brightdata/brightdata-mcp.git"
21 | },
22 | "bugs": {
23 | "url": "https://github.com/brightdata/brightdata-mcp/issues"
24 | },
25 | "license": "MIT",
26 | "dependencies": {
27 | "axios": "^1.11.0",
28 | "fastmcp": "^3.1.1",
29 | "playwright": "^1.51.1",
30 | "zod": "^3.24.2"
31 | },
32 | "publishConfig": {
33 | "access": "public"
34 | },
35 | "files": [
36 | "server.js",
37 | "browser_tools.js",
38 | "browser_session.js",
39 | "aria_snapshot_filter.js"
40 | ],
41 | "mcpName": "io.github.brightdata/brightdata-mcp"
42 | }
43 |
```
--------------------------------------------------------------------------------
/.github/workflows/publish-mcp.yml:
--------------------------------------------------------------------------------
```yaml
1 | name: Publish to MCP Registry
2 |
3 | on:
4 | push:
5 | tags: ["v*"]
6 | workflow_dispatch:
7 |
8 | jobs:
9 | publish:
10 | runs-on: ubuntu-latest
11 | permissions:
12 | id-token: write
13 | contents: read
14 |
15 | steps:
16 | - name: Checkout code
17 | uses: actions/checkout@v4
18 |
19 | - name: Setup Node.js
20 | uses: actions/setup-node@v4
21 | with:
22 | node-version: "22"
23 |
24 | - name: Sync version in server.json with package.json
25 | run: |
26 | VERSION=$(node -p "require('./package.json').version")
27 | echo "Syncing version to: $VERSION"
28 | jq --arg v "$VERSION" '.version = $v | .packages[0].version = $v' server.json > tmp.json && mv tmp.json server.json
29 | echo "Updated server.json:"
30 | cat server.json
31 |
32 | - name: Install MCP Publisher
33 | run: |
34 | curl -L "https://github.com/modelcontextprotocol/registry/releases/latest/download/mcp-publisher_$(uname -s | tr '[:upper:]' '[:lower:]')_$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/').tar.gz" | tar xz mcp-publisher
35 | chmod +x mcp-publisher
36 |
37 | - name: Login to MCP Registry
38 | run: ./mcp-publisher login github-oidc
39 |
40 | - name: Publish to MCP Registry
41 | run: ./mcp-publisher publish
42 |
```
--------------------------------------------------------------------------------
/server.json:
--------------------------------------------------------------------------------
```json
1 | {
2 | "$schema": "https://static.modelcontextprotocol.io/schemas/2025-10-17/server.schema.json",
3 | "name": "io.github.brightdata/brightdata-mcp",
4 | "description": "Bright Data's Web MCP server enabling AI agents to search, extract & navigate the web",
5 | "repository": {
6 | "url": "https://github.com/brightdata/brightdata-mcp",
7 | "source": "github"
8 | },
9 | "version": "2.5.0",
10 | "packages": [
11 | {
12 | "registryType": "npm",
13 | "registryBaseUrl": "https://registry.npmjs.org",
14 | "identifier": "@brightdata/mcp",
15 | "version": "2.5.0",
16 | "transport": {
17 | "type": "stdio"
18 | },
19 | "environmentVariables": [
20 | {
21 | "name": "API_TOKEN",
22 | "description": "Your API key for Bright Data",
23 | "isRequired": true,
24 | "isSecret": true,
25 | "format": "string"
26 | },
27 | {
28 | "name": "WEB_UNLOCKER_ZONE",
29 | "description": "Your unlocker zone name",
30 | "isRequired": false,
31 | "isSecret": false,
32 | "format": "string"
33 | },
34 | {
35 | "name": "BROWSER_ZONE",
36 | "description": "Your browser zone name",
37 | "isRequired": false,
38 | "isSecret": false,
39 | "format": "string"
40 | },
41 | {
42 | "name": "PRO_MODE",
43 | "description": "To enable PRO_MODE - set to true",
44 | "isRequired": false,
45 | "isSecret": false,
46 | "format": "boolean"
47 | }
48 | ]
49 | }
50 | ]
51 | }
52 |
```
--------------------------------------------------------------------------------
/aria_snapshot_filter.js:
--------------------------------------------------------------------------------
```javascript
1 | // LICENSE_CODE ZON
2 | 'use strict'; /*jslint node:true es9:true*/
3 |
4 | export class Aria_snapshot_filter {
5 | static INTERACTIVE_ROLES = new Set([
6 | 'button', 'link', 'textbox', 'searchbox', 'combobox', 'checkbox',
7 | 'radio', 'switch', 'slider', 'tab', 'menuitem', 'option',
8 | ]);
9 | static parse_playwright_snapshot(snapshot_text){
10 | const lines = snapshot_text.split('\n');
11 | const elements = [];
12 | for (let i = 0; i < lines.length; i++)
13 | {
14 | const trimmed = lines[i].trim();
15 | if (!trimmed || !trimmed.startsWith('-'))
16 | continue;
17 | const ref_match = trimmed.match(/\[ref=([^\]]+)\]/);
18 | if (!ref_match)
19 | continue;
20 | const ref = ref_match[1];
21 | const role_match = trimmed.match(/^-\s+([a-zA-Z]+)/);
22 | if (!role_match)
23 | continue;
24 | const role = role_match[1];
25 | if (!this.INTERACTIVE_ROLES.has(role))
26 | continue;
27 | const name_match = trimmed.match(/"([^"]*)"/);
28 | const name = name_match ? name_match[1] : '';
29 | let url = null;
30 | const next_line_index = i+1;
31 | if (next_line_index<lines.length)
32 | {
33 | const next_line = lines[next_line_index];
34 | const url_match = next_line.match(/\/url:\s*(.+)/);
35 | if (url_match)
36 | url = url_match[1].trim().replace(/^["']|["']$/g, '');
37 | }
38 | elements.push({ref, role, name, url});
39 | }
40 | return elements;
41 | }
42 |
43 | static format_compact(elements){
44 | const lines = [];
45 | for (const el of elements)
46 | {
47 | const parts = [`[${el.ref}]`, el.role];
48 | if (el.name && el.name.length>0)
49 | {
50 | const name = el.name.length>60 ?
51 | el.name.substring(0, 57)+'...' : el.name;
52 | parts.push(`"${name}"`);
53 | }
54 | if (el.url && el.url.length>0 && !el.url.startsWith('#'))
55 | {
56 | let url = el.url;
57 | if (url.length>50)
58 | url = url.substring(0, 47)+'...';
59 | parts.push(`→ ${url}`);
60 | }
61 | lines.push(parts.join(' '));
62 | }
63 | return lines.join('\n');
64 | }
65 |
66 | static filter_snapshot(snapshot_text){
67 | try {
68 | const elements = this.parse_playwright_snapshot(snapshot_text);
69 | if (elements.length===0)
70 | return 'No interactive elements found';
71 | return this.format_compact(elements);
72 | } catch(e){
73 | return `Error filtering snapshot: ${e.message}\n${e.stack}`;
74 | }
75 | }
76 | }
77 |
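78 | // Illustrative usage (not part of the module itself): given snapshot text in
79 | // the format parsed above (role, quoted name, a [ref=...] marker and an
80 | // optional "/url:" child line), filter_snapshot() keeps only interactive
81 | // elements in a compact form:
82 | //
83 | //   const sample = [
84 | //       '- heading "Results" [level=1]',
85 | //       '- link "Docs" [ref=e3]:',
86 | //       '  - /url: https://example.com/docs',
87 | //       '- button "Search" [ref=e7]',
88 | //   ].join('\n');
89 | //   Aria_snapshot_filter.filter_snapshot(sample);
90 | //   // => '[e3] link "Docs" → https://example.com/docs\n[e7] button "Search"'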
```
--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
```markdown
1 | # Changelog
2 |
3 | All notable changes to this project will be documented in this file.
4 |
5 | ## [2.6.0] - 2025-10-27
6 |
7 | ### Added
8 | - Client name logging and header passthrough for improved observability (PR #75)
9 | - ARIA ref-based browser automation for more reliable element interactions (PR #65)
10 | - ARIA snapshot filtering for better element targeting
11 | - Network request tracking in browser sessions
12 | - MCP Registry support (PR #71)
13 | - `scraping_browser_snapshot` tool to capture ARIA snapshots
14 | - `scraping_browser_click_ref`, `scraping_browser_type_ref`, `scraping_browser_wait_for_ref` tools using ref-based selectors
15 | - `scraping_browser_network_requests` tool to track HTTP requests
16 |
17 | ### Changed
18 | - Enhanced search engine tool to return JSON with only relevant fields (PR #57)
19 | - Added `fixed_values` parameter to reduce token usage (PR #60)
20 | - Browser tools now use ARIA refs instead of CSS selectors for better reliability
21 |
22 | ### Fixed
23 | - Stop polling on HTTP 400 errors in web data tools (PR #64)
24 |
25 | ### Deprecated
26 | - Legacy selector-based tools (`scraping_browser_click`, `scraping_browser_type`, `scraping_browser_wait_for`) replaced by ref-based equivalents
27 | - `scraping_browser_links` tool deprecated in favor of snapshot-based approach
28 |
29 |
30 | ## [2.0.0] - 2025-05-26
31 |
32 | ### Changed
33 | - Updated browser authentication to use API_TOKEN instead of previous authentication method
34 | - BROWSER_ZONE is now an optional parameter; the default zone is `mcp_browser`
35 | - Removed duplicate web_data_ tools
36 |
37 | ## [1.9.2] - 2025-05-23
38 |
39 | ### Fixed
40 | - Fixed GitHub references and repository settings
41 |
42 | ## [1.9.1] - 2025-05-21
43 |
44 | ### Fixed
45 | - Fixed spelling errors and improved coding conventions
46 | - Converted files back to Unix line endings for consistency
47 |
48 | ## [1.9.0] - 2025-05-21
49 |
50 | ### Added
51 | - Added 23 new web data tools for enhanced data collection capabilities
52 | - Added progress reporting functionality for better user feedback
53 | - Added default parameter handling for improved tool usability
54 |
55 | ### Changed
56 | - Improved coding conventions and file formatting
57 | - Enhanced web data API endpoints integration
58 |
59 | ## [1.8.3] - 2025-05-21
60 |
61 | ### Added
62 | - Added Bright Data MCP with Claude demo video to README.md
63 |
64 | ### Changed
65 | - Updated documentation with video demonstrations
66 |
67 | ## [1.8.2] - 2025-05-13
68 |
69 | ### Changed
70 | - Bumped FastMCP version for improved performance
71 | - Updated README.md with additional documentation
72 |
73 | ## [1.8.1] - 2025-05-05
74 |
75 | ### Added
76 | - Added 12 new WSAPI endpoints for enhanced functionality
77 | - Changed to polling mechanism for better reliability
78 |
79 | ### Changed
80 | - Applied dos2unix formatting for consistency
81 | - Updated Docker configuration
82 | - Updated smithery.yaml configuration
83 |
84 | ## [1.8.0] - 2025-05-03
85 |
86 | ### Added
87 | - Added domain-based browser sessions to avoid navigation limit issues
88 | - Added automatic creation of required unlocker zone when not present
89 |
90 | ### Fixed
91 | - Fixed browser context maintenance across tool calls with current domain tracking
92 | - Minor lint fixes
93 |
94 | ## [1.0.0] - 2025-04-29
95 |
96 | ### Added
97 | - Initial release of Bright Data MCP server
98 | - Browser automation capabilities with Bright Data integration
99 | - Core web scraping and data collection tools
100 | - Smithery.yaml configuration for deployment in Smithery.ai
101 | - MIT License
102 | - Demo materials and documentation
103 |
104 | ### Documentation
105 | - Created comprehensive README.md
106 | - Added demo.md with usage examples
107 | - Created examples/README.md for sample implementations
108 | - Added Tools.md documentation for available tools
109 |
110 | ---
111 |
112 | ## Release Notes
113 |
114 | ### Version 1.9.x Series
115 | The 1.9.x series focuses on expanding web data collection capabilities and improving authentication mechanisms. Key highlights include the addition of 23 new web data tools.
116 |
117 | ### Version 1.8.x Series
118 | The 1.8.x series introduced significant improvements to browser session management, WSAPI endpoints, and overall system reliability. Notable features include domain-based sessions and automatic zone creation.
119 |
120 | ### Version 1.0.0
121 | Initial stable release providing core MCP server functionality for Bright Data integration with comprehensive browser automation and web scraping capabilities.
122 |
123 |
```
--------------------------------------------------------------------------------
/browser_session.js:
--------------------------------------------------------------------------------
```javascript
1 | 'use strict'; /*jslint node:true es9:true*/
2 | import * as playwright from 'playwright';
3 | import {Aria_snapshot_filter} from './aria_snapshot_filter.js';
4 |
5 | export class Browser_session {
6 | constructor({cdp_endpoint}){
7 | this.cdp_endpoint = cdp_endpoint;
8 | this._domainSessions = new Map();
9 | this._currentDomain = 'default';
10 | }
11 |
12 | _getDomain(url){
13 | try {
14 | const urlObj = new URL(url);
15 | return urlObj.hostname;
16 | } catch(e){
17 | console.error(`Error extracting domain from ${url}:`, e);
18 | return 'default';
19 | }
20 | }
21 |
22 | async _getDomainSession(domain, {log}={}){
23 | if (!this._domainSessions.has(domain))
24 | {
25 | this._domainSessions.set(domain, {
26 | browser: null,
27 | page: null,
28 | browserClosed: true,
29 | requests: new Map()
30 | });
31 | }
32 | return this._domainSessions.get(domain);
33 | }
34 |
35 | async get_browser({log, domain='default'}={}){
36 | try {
37 | const session = await this._getDomainSession(domain, {log});
38 | if (session.browser)
39 | {
40 | try { await session.browser.contexts(); }
41 | catch(e){
42 | log?.(`Browser connection lost for domain ${domain} (${e.message}), `
43 | +`reconnecting...`);
44 | session.browser = null;
45 | session.page = null;
46 | session.browserClosed = true;
47 | }
48 | }
49 | if (!session.browser)
50 | {
51 | log?.(`Connecting to Bright Data Scraping Browser for domain ${domain}.`);
52 | session.browser = await playwright.chromium.connectOverCDP(
53 | this.cdp_endpoint);
54 | session.browserClosed = false;
55 | session.browser.on('disconnected', ()=>{
56 | log?.(`Browser disconnected for domain ${domain}`);
57 | session.browser = null;
58 | session.page = null;
59 | session.browserClosed = true;
60 | });
61 | log?.(`Connected to Bright Data Scraping Browser for domain ${domain}`);
62 | }
63 | return session.browser;
64 | } catch(e){
65 | console.error(`Error connecting to browser for domain ${domain}:`, e);
66 | const session = this._domainSessions.get(domain);
67 | if (session)
68 | {
69 | session.browser = null;
70 | session.page = null;
71 | session.browserClosed = true;
72 | }
73 | throw e;
74 | }
75 | }
76 |
77 | async get_page({url=null}={}){
78 | if (url)
79 | {
80 | this._currentDomain = this._getDomain(url);
81 | }
82 | const domain = this._currentDomain;
83 | try {
84 | const session = await this._getDomainSession(domain);
85 | if (session.browserClosed || !session.page)
86 | {
87 | const browser = await this.get_browser({domain});
88 | const existingContexts = browser.contexts();
89 | if (existingContexts.length === 0)
90 | {
91 | const context = await browser.newContext();
92 | session.page = await context.newPage();
93 | }
94 | else
95 | {
96 | const existingPages = existingContexts[0]?.pages();
97 | if (existingPages && existingPages.length > 0)
98 | session.page = existingPages[0];
99 | else
100 | session.page = await existingContexts[0].newPage();
101 | }
102 | session.page.on('request', request=>
103 | session.requests.set(request, null));
104 | session.page.on('response', response=>
105 | session.requests.set(response.request(), response));
106 | session.browserClosed = false;
107 | session.page.once('close', ()=>{
108 | session.page = null;
109 | });
110 | }
111 | return session.page;
112 | } catch(e){
113 | console.error(`Error getting page for domain ${domain}:`, e);
114 | const session = this._domainSessions.get(domain);
115 | if (session)
116 | {
117 | session.browser = null;
118 | session.page = null;
119 | session.browserClosed = true;
120 | }
121 | throw e;
122 | }
123 | }
124 |
125 | async capture_snapshot({filtered=true}={}){
126 | const page = await this.get_page();
127 | try {
128 | const full_snapshot = await page._snapshotForAI();
129 | if (!filtered)
130 | {
131 | return {
132 | url: page.url(),
133 | title: await page.title(),
134 | aria_snapshot: full_snapshot,
135 | };
136 | }
137 | const filtered_snapshot = Aria_snapshot_filter.filter_snapshot(
138 | full_snapshot);
139 | return {
140 | url: page.url(),
141 | title: await page.title(),
142 | aria_snapshot: filtered_snapshot,
143 | };
144 | } catch(e){
145 | throw new Error(`Error capturing ARIA snapshot: ${e.message}`);
146 | }
147 | }
148 |
149 | async ref_locator({element, ref}){
150 | const page = await this.get_page();
151 | try {
152 | const snapshot = await page._snapshotForAI();
153 | if (!snapshot.includes(`[ref=${ref}]`))
154 | throw new Error('Ref '+ref+' not found in the current page '
155 | +'snapshot. Try capturing new snapshot.');
156 | return page.locator(`aria-ref=${ref}`).describe(element);
157 | } catch(e){
158 | throw new Error(`Error creating ref locator for ${element} with ref ${ref}: ${e.message}`);
159 | }
160 | }
161 |
162 | async get_requests(){
163 | const domain = this._currentDomain;
164 | const session = await this._getDomainSession(domain);
165 | return session.requests;
166 | }
167 |
168 | async clear_requests(){
169 | const domain = this._currentDomain;
170 | const session = await this._getDomainSession(domain);
171 | session.requests.clear();
172 | }
173 |
174 | async close(domain=null){
175 | if (domain){
176 | const session = this._domainSessions.get(domain);
177 | if (session && session.browser)
178 | {
179 | try { await session.browser.close(); }
180 | catch(e){ console.error(`Error closing browser for domain ${domain}:`, e); }
181 | session.browser = null;
182 | session.page = null;
183 | session.browserClosed = true;
184 | session.requests.clear();
185 | this._domainSessions.delete(domain);
186 | }
187 | }
188 | else {
189 | for (const [domain, session] of this._domainSessions.entries()) {
190 | if (session.browser)
191 | {
192 | try { await session.browser.close(); }
193 | catch(e){ console.error(`Error closing browser for domain ${domain}:`, e); }
194 | session.browser = null;
195 | session.page = null;
196 | session.browserClosed = true;
197 | session.requests.clear();
198 | }
199 | }
200 | this._domainSessions.clear();
201 | }
202 | if (!domain)
203 | {
204 | this._currentDomain = 'default';
205 | }
206 | }
207 | }
208 |
209 |
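210 | // Minimal usage sketch (mirrors how browser_tools.js drives this class; the
211 | // CDP endpoint below is a placeholder):
212 | //
213 | //   const session = new Browser_session({cdp_endpoint: 'wss://user:pass@host:9222'});
214 | //   const page = await session.get_page({url: 'https://example.com'});
215 | //   await page.goto('https://example.com', {waitUntil: 'domcontentloaded'});
216 | //   const {aria_snapshot} = await session.capture_snapshot();
217 | //   await session.close();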
```
--------------------------------------------------------------------------------
/assets/Tools.md:
--------------------------------------------------------------------------------
```markdown
1 | |Feature|Description|
2 | |---|---|
3 | |search_engine|Scrape search results from Google, Bing, or Yandex. Returns SERP results in JSON for Google and Markdown for Bing/Yandex; supports pagination with the cursor parameter.|
4 | |scrape_as_markdown|Scrape a single webpage with advanced extraction and return Markdown. Uses Bright Data's unlocker to handle bot protection and CAPTCHA.|
5 | |search_engine_batch|Run up to 10 search queries in parallel. Returns JSON for Google results and Markdown for Bing/Yandex.|
6 | |scrape_batch|Scrape up to 10 webpages in one request and return an array of URL/content pairs in Markdown format.|
7 | |scrape_as_html|Scrape a single webpage with advanced extraction and return the HTML response body. Handles sites protected by bot detection or CAPTCHA.|
8 | |extract|Scrape a webpage as Markdown and convert it to structured JSON using AI sampling, with an optional custom extraction prompt.|
9 | |session_stats|Report how many times each tool has been called during the current MCP session.|
10 | |web_data_amazon_product|Quickly read structured Amazon product data. Requires a valid product URL containing /dp/. Often faster and more reliable than scraping.|
11 | |web_data_amazon_product_reviews|Quickly read structured Amazon product review data. Requires a valid product URL containing /dp/. Often faster and more reliable than scraping.|
12 | |web_data_amazon_product_search|Retrieve structured Amazon search results. Requires a search keyword and Amazon domain URL; limited to the first page of results.|
13 | |web_data_walmart_product|Quickly read structured Walmart product data. Requires a product URL containing /ip/. Often faster and more reliable than scraping.|
14 | |web_data_walmart_seller|Quickly read structured Walmart seller data. Requires a valid Walmart seller URL. Often faster and more reliable than scraping.|
15 | |web_data_ebay_product|Quickly read structured eBay product data. Requires a valid eBay product URL. Often faster and more reliable than scraping.|
16 | |web_data_homedepot_products|Quickly read structured Home Depot product data. Requires a valid homedepot.com product URL. Often faster and more reliable than scraping.|
17 | |web_data_zara_products|Quickly read structured Zara product data. Requires a valid Zara product URL. Often faster and more reliable than scraping.|
18 | |web_data_etsy_products|Quickly read structured Etsy product data. Requires a valid Etsy product URL. Often faster and more reliable than scraping.|
19 | |web_data_bestbuy_products|Quickly read structured Best Buy product data. Requires a valid Best Buy product URL. Often faster and more reliable than scraping.|
20 | |web_data_linkedin_person_profile|Quickly read structured LinkedIn people profile data. Requires a valid LinkedIn profile URL. Often faster and more reliable than scraping.|
21 | |web_data_linkedin_company_profile|Quickly read structured LinkedIn company profile data. Requires a valid LinkedIn company URL. Often faster and more reliable than scraping.|
22 | |web_data_linkedin_job_listings|Quickly read structured LinkedIn job listings data. Requires a valid LinkedIn jobs URL or search URL. Often faster and more reliable than scraping.|
23 | |web_data_linkedin_posts|Quickly read structured LinkedIn posts data. Requires a valid LinkedIn post URL. Often faster and more reliable than scraping.|
24 | |web_data_linkedin_people_search|Quickly read structured LinkedIn people search data. Requires a LinkedIn people search URL. Often faster and more reliable than scraping.|
25 | |web_data_crunchbase_company|Quickly read structured Crunchbase company data. Requires a valid Crunchbase company URL. Often faster and more reliable than scraping.|
26 | |web_data_zoominfo_company_profile|Quickly read structured ZoomInfo company profile data. Requires a valid ZoomInfo company URL. Often faster and more reliable than scraping.|
27 | |web_data_instagram_profiles|Quickly read structured Instagram profile data. Requires a valid Instagram profile URL. Often faster and more reliable than scraping.|
28 | |web_data_instagram_posts|Quickly read structured Instagram post data. Requires a valid Instagram post URL. Often faster and more reliable than scraping.|
29 | |web_data_instagram_reels|Quickly read structured Instagram reel data. Requires a valid Instagram reel URL. Often faster and more reliable than scraping.|
30 | |web_data_instagram_comments|Quickly read structured Instagram comments data. Requires a valid Instagram URL. Often faster and more reliable than scraping.|
31 | |web_data_facebook_posts|Quickly read structured Facebook post data. Requires a valid Facebook post URL. Often faster and more reliable than scraping.|
32 | |web_data_facebook_marketplace_listings|Quickly read structured Facebook Marketplace listing data. Requires a valid Marketplace listing URL. Often faster and more reliable than scraping.|
33 | |web_data_facebook_company_reviews|Quickly read structured Facebook company reviews data. Requires a valid Facebook company URL and review count. Often faster and more reliable than scraping.|
34 | |web_data_facebook_events|Quickly read structured Facebook events data. Requires a valid Facebook event URL. Often faster and more reliable than scraping.|
35 | |web_data_tiktok_profiles|Quickly read structured TikTok profile data. Requires a valid TikTok profile URL. Often faster and more reliable than scraping.|
36 | |web_data_tiktok_posts|Quickly read structured TikTok post data. Requires a valid TikTok post URL. Often faster and more reliable than scraping.|
37 | |web_data_tiktok_shop|Quickly read structured TikTok Shop product data. Requires a valid TikTok Shop product URL. Often faster and more reliable than scraping.|
38 | |web_data_tiktok_comments|Quickly read structured TikTok comments data. Requires a valid TikTok video URL. Often faster and more reliable than scraping.|
39 | |web_data_google_maps_reviews|Quickly read structured Google Maps reviews data. Requires a valid Google Maps URL and optional days_limit (default 3). Often faster and more reliable than scraping.|
40 | |web_data_google_shopping|Quickly read structured Google Shopping product data. Requires a valid Google Shopping product URL. Often faster and more reliable than scraping.|
41 | |web_data_google_play_store|Quickly read structured Google Play Store app data. Requires a valid Play Store app URL. Often faster and more reliable than scraping.|
42 | |web_data_apple_app_store|Quickly read structured Apple App Store app data. Requires a valid App Store app URL. Often faster and more reliable than scraping.|
43 | |web_data_reuter_news|Quickly read structured Reuters news data. Requires a valid Reuters news article URL. Often faster and more reliable than scraping.|
44 | |web_data_github_repository_file|Quickly read structured GitHub repository file data. Requires a valid GitHub file URL. Often faster and more reliable than scraping.|
45 | |web_data_yahoo_finance_business|Quickly read structured Yahoo Finance company profile data. Requires a valid Yahoo Finance business URL. Often faster and more reliable than scraping.|
46 | |web_data_x_posts|Quickly read structured X (Twitter) post data. Requires a valid X post URL. Often faster and more reliable than scraping.|
47 | |web_data_zillow_properties_listing|Quickly read structured Zillow property listing data. Requires a valid Zillow listing URL. Often faster and more reliable than scraping.|
48 | |web_data_booking_hotel_listings|Quickly read structured Booking.com hotel listing data. Requires a valid Booking.com listing URL. Often faster and more reliable than scraping.|
49 | |web_data_youtube_profiles|Quickly read structured YouTube channel profile data. Requires a valid YouTube channel URL. Often faster and more reliable than scraping.|
50 | |web_data_youtube_comments|Quickly read structured YouTube comments data. Requires a valid YouTube video URL and optional num_of_comments (default 10). Often faster and more reliable than scraping.|
51 | |web_data_reddit_posts|Quickly read structured Reddit post data. Requires a valid Reddit post URL. Often faster and more reliable than scraping.|
52 | |web_data_youtube_videos|Quickly read structured YouTube video metadata. Requires a valid YouTube video URL. Often faster and more reliable than scraping.|
53 | |scraping_browser_navigate|Open or reuse a scraping-browser session and navigate to the provided URL, resetting tracked network requests.|
54 | |scraping_browser_go_back|Navigate the active scraping-browser session back to the previous page and report the new URL and title.|
55 | |scraping_browser_go_forward|Navigate the active scraping-browser session forward to the next page and report the new URL and title.|
56 | |scraping_browser_snapshot|Capture an ARIA snapshot of the current page listing interactive elements and their refs for later ref-based actions.|
57 | |scraping_browser_click_ref|Click an element using its ref from the latest ARIA snapshot; requires a ref and human-readable element description.|
58 | |scraping_browser_type_ref|Fill an element identified by ref from the ARIA snapshot, optionally pressing Enter to submit after typing.|
59 | |scraping_browser_screenshot|Capture a screenshot of the current page; supports optional full_page mode for full-length images.|
60 | |scraping_browser_network_requests|List the network requests recorded since page load with HTTP method, URL, and response status for debugging.|
61 | |scraping_browser_wait_for_ref|Wait until an element identified by ARIA ref becomes visible, with an optional timeout in milliseconds.|
62 | |scraping_browser_get_text|Return the text content of the current page's body element.|
63 | |scraping_browser_get_html|Return the HTML content of the current page; avoid the full_page option unless head or script tags are required.|
64 | |scraping_browser_scroll|Scroll to the bottom of the current page in the scraping-browser session.|
65 | |scraping_browser_scroll_to_ref|Scroll the page until the element referenced in the ARIA snapshot is in view.|
66 |
```
--------------------------------------------------------------------------------
/browser_tools.js:
--------------------------------------------------------------------------------
```javascript
1 | 'use strict'; /*jslint node:true es9:true*/
2 | import {UserError, imageContent as image_content} from 'fastmcp';
3 | import {z} from 'zod';
4 | import axios from 'axios';
5 | import {Browser_session} from './browser_session.js';
6 | let browser_zone = process.env.BROWSER_ZONE || 'mcp_browser';
7 |
8 | let open_session;
9 | const require_browser = async()=>{
10 | if (!open_session)
11 | {
12 | open_session = new Browser_session({
13 | cdp_endpoint: await calculate_cdp_endpoint(),
14 | });
15 | }
16 | return open_session;
17 | };
18 |
19 | const calculate_cdp_endpoint = async()=>{
20 | try {
21 | const status_response = await axios({
22 | url: 'https://api.brightdata.com/status',
23 | method: 'GET',
24 | headers: {authorization: `Bearer ${process.env.API_TOKEN}`},
25 | });
26 | const customer = status_response.data.customer;
27 | const password_response = await axios({
28 | url: `https://api.brightdata.com/zone/passwords?zone=${browser_zone}`,
29 | method: 'GET',
30 | headers: {authorization: `Bearer ${process.env.API_TOKEN}`},
31 | });
32 | const password = password_response.data.passwords[0];
33 |
34 | return `wss://brd-customer-${customer}-zone-${browser_zone}:`
35 | +`${password}@brd.superproxy.io:9222`;
36 | } catch(e){
37 | if (e.response?.status===422)
38 | throw new Error(`Browser zone '${browser_zone}' does not exist`);
39 | throw new Error(`Error retrieving browser credentials: ${e.message}`);
40 | }
41 | };
42 |
43 | let scraping_browser_navigate = {
44 | name: 'scraping_browser_navigate',
45 | description: 'Navigate a scraping browser session to a new URL',
46 | parameters: z.object({
47 | url: z.string().describe('The URL to navigate to'),
48 | }),
49 | execute: async({url})=>{
50 | const browser_session = await require_browser();
51 | const page = await browser_session.get_page({url});
52 | await browser_session.clear_requests();
53 | try {
54 | await page.goto(url, {
55 | timeout: 120000,
56 | waitUntil: 'domcontentloaded',
57 | });
58 | return [
59 | `Successfully navigated to ${url}`,
60 | `Title: ${await page.title()}`,
61 | `URL: ${page.url()}`,
62 | ].join('\n');
63 | } catch(e){
64 | throw new UserError(`Error navigating to ${url}: ${e}`);
65 | }
66 | },
67 | };
68 |
69 | let scraping_browser_go_back = {
70 | name: 'scraping_browser_go_back',
71 | description: 'Go back to the previous page',
72 | parameters: z.object({}),
73 | execute: async()=>{
74 | const page = await (await require_browser()).get_page();
75 | try {
76 | await page.goBack();
77 | return [
78 | 'Successfully navigated back',
79 | `Title: ${await page.title()}`,
80 | `URL: ${page.url()}`,
81 | ].join('\n');
82 | } catch(e){
83 | throw new UserError(`Error navigating back: ${e}`);
84 | }
85 | },
86 | };
87 |
88 | const scraping_browser_go_forward = {
89 | name: 'scraping_browser_go_forward',
90 | description: 'Go forward to the next page',
91 | parameters: z.object({}),
92 | execute: async()=>{
93 | const page = await (await require_browser()).get_page();
94 | try {
95 | await page.goForward();
96 | return [
97 | 'Successfully navigated forward',
98 | `Title: ${await page.title()}`,
99 | `URL: ${page.url()}`,
100 | ].join('\n');
101 | } catch(e){
102 | throw new UserError(`Error navigating forward: ${e}`);
103 | }
104 | },
105 | };
106 |
107 | let scraping_browser_snapshot = {
108 | name: 'scraping_browser_snapshot',
109 | description: [
110 | 'Capture an ARIA snapshot of the current page showing all interactive '
111 | +'elements with their refs.',
112 | 'This provides accurate element references that can be used with '
113 | +'ref-based tools.',
114 | 'Use this before interacting with elements to get proper refs instead '
115 | +'of guessing selectors.'
116 | ].join('\n'),
117 | parameters: z.object({}),
118 | execute: async()=>{
119 | const browser_session = await require_browser();
120 | try {
121 | const snapshot = await browser_session.capture_snapshot();
122 | return [
123 | `Page: ${snapshot.url}`,
124 | `Title: ${snapshot.title}`,
125 | '',
126 | 'Interactive Elements:',
127 | snapshot.aria_snapshot
128 | ].join('\n');
129 | } catch(e){
130 | throw new UserError(`Error capturing snapshot: ${e}`);
131 | }
132 | },
133 | };
134 |
135 | let scraping_browser_click_ref = {
136 | name: 'scraping_browser_click_ref',
137 | description: [
138 | 'Click on an element using its ref from the ARIA snapshot.',
139 | 'Use scraping_browser_snapshot first to get the correct ref values.',
140 | 'This is more reliable than CSS selectors.'
141 | ].join('\n'),
142 | parameters: z.object({
143 | ref: z.string().describe('The ref attribute from the ARIA snapshot (e.g., "23")'),
144 | element: z.string().describe('Description of the element being clicked for context'),
145 | }),
146 | execute: async({ref, element})=>{
147 | const browser_session = await require_browser();
148 | try {
149 | const locator = await browser_session.ref_locator({element, ref});
150 | await locator.click({timeout: 5000});
151 | return `Successfully clicked element: ${element} (ref=${ref})`;
152 | } catch(e){
153 | throw new UserError(`Error clicking element ${element} with ref ${ref}: ${e}`);
154 | }
155 | },
156 | };
157 |
158 | let scraping_browser_type_ref = {
159 | name: 'scraping_browser_type_ref',
160 | description: [
161 | 'Type text into an element using its ref from the ARIA snapshot.',
162 | 'Use scraping_browser_snapshot first to get the correct ref values.',
163 | 'This is more reliable than CSS selectors.'
164 | ].join('\n'),
165 | parameters: z.object({
166 | ref: z.string().describe('The ref attribute from the ARIA snapshot (e.g., "23")'),
167 | element: z.string().describe('Description of the element being typed into for context'),
168 | text: z.string().describe('Text to type'),
169 | submit: z.boolean().optional()
170 | .describe('Whether to submit the form after typing (press Enter)'),
171 | }),
172 | execute: async({ref, element, text, submit})=>{
173 | const browser_session = await require_browser();
174 | try {
175 | const locator = await browser_session.ref_locator({element, ref});
176 | await locator.fill(text);
177 | if (submit)
178 | await locator.press('Enter');
179 | const suffix = submit ? ' and submitted the form' : '';
180 | return 'Successfully typed "'+text+'" into element: '+element
181 | +' (ref='+ref+')'+suffix;
182 | } catch(e){
183 | throw new UserError(`Error typing into element ${element} with ref ${ref}: ${e}`);
184 | }
185 | },
186 | };
187 |
188 | let scraping_browser_screenshot = {
189 | name: 'scraping_browser_screenshot',
190 | description: 'Take a screenshot of the current page',
191 | parameters: z.object({
192 | full_page: z.boolean().optional().describe([
193 | 'Whether to screenshot the full page (default: false)',
194 | 'You should avoid full_page if it\'s not important, since the '
195 | +'images can be quite large',
196 | ].join('\n')),
197 | }),
198 | execute: async({full_page=false})=>{
199 | const page = await (await require_browser()).get_page();
200 | try {
201 | const buffer = await page.screenshot({fullPage: full_page});
202 | return image_content({buffer});
203 | } catch(e){
204 | throw new UserError(`Error taking screenshot: ${e}`);
205 | }
206 | },
207 | };
208 |
209 | let scraping_browser_get_html = {
210 | name: 'scraping_browser_get_html',
211 | description: 'Get the HTML content of the current page. Avoid the '
212 | +'full_page option unless it is important to see things like head or '
213 | +'script tags, since the full-page HTML can be quite large',
214 | parameters: z.object({
215 | full_page: z.boolean().optional().describe([
216 | 'Whether to get the full page HTML including head and script tags',
217 | 'Avoid this unless you need the extra HTML, since it can be '
218 | +'quite large',
219 | ].join('\n')),
220 | }),
221 | execute: async({full_page=false})=>{
222 | const page = await (await require_browser()).get_page();
223 | try {
224 | if (!full_page)
225 | return await page.$eval('body', body=>body.innerHTML);
226 | const html = await page.content();
227 | if (!full_page && html)
228 | return html.split('<body>')[1].split('</body>')[0];
229 | return html;
230 | } catch(e){
231 | throw new UserError(`Error getting HTML content: ${e}`);
232 | }
233 | },
234 | };
235 |
236 | let scraping_browser_get_text = {
237 | name: 'scraping_browser_get_text',
238 | description: 'Get the text content of the current page',
239 | parameters: z.object({}),
240 | execute: async()=>{
241 | const page = await (await require_browser()).get_page();
242 | try { return await page.$eval('body', body=>body.innerText); }
243 | catch(e){ throw new UserError(`Error getting text content: ${e}`); }
244 | },
245 | };
246 |
247 | let scraping_browser_scroll = {
248 | name: 'scraping_browser_scroll',
249 | description: 'Scroll to the bottom of the current page',
250 | parameters: z.object({}),
251 | execute: async()=>{
252 | const page = await (await require_browser()).get_page();
253 | try {
254 | await page.evaluate(()=>{
255 | window.scrollTo(0, document.body.scrollHeight);
256 | });
257 | return 'Successfully scrolled to the bottom of the page';
258 | } catch(e){
259 | throw new UserError(`Error scrolling page: ${e}`);
260 | }
261 | },
262 | };
263 |
264 | let scraping_browser_scroll_to_ref = {
265 | name: 'scraping_browser_scroll_to_ref',
266 | description: [
267 | 'Scroll to a specific element using its ref from the ARIA snapshot.',
268 | 'Use scraping_browser_snapshot first to get the correct ref values.',
269 | 'This is more reliable than CSS selectors.'
270 | ].join('\n'),
271 | parameters: z.object({
272 | ref: z.string().describe('The ref attribute from the ARIA snapshot (e.g., "23")'),
273 | element: z.string().describe('Description of the element to scroll to'),
274 | }),
275 | execute: async({ref, element})=>{
276 | const browser_session = await require_browser();
277 | try {
278 | const locator = await browser_session.ref_locator({element, ref});
279 | await locator.scrollIntoViewIfNeeded();
280 | return `Successfully scrolled to element: ${element} (ref=${ref})`;
281 | } catch(e){
282 | throw new UserError(`Error scrolling to element ${element} with `
283 | +`ref ${ref}: ${e}`);
284 | }
285 | },
286 | };
287 |
288 | let scraping_browser_network_requests = {
289 | name: 'scraping_browser_network_requests',
290 | description: [
291 | 'Get all network requests made since loading the current page.',
292 | 'Shows HTTP method, URL, status code and status text for each request.',
293 | 'Useful for debugging API calls, tracking data fetching, and '
294 | +'understanding page behavior.'
295 | ].join('\n'),
296 | parameters: z.object({}),
297 | execute: async()=>{
298 | const browser_session = await require_browser();
299 | try {
300 | const requests = await browser_session.get_requests();
301 | if (requests.size==0)
302 | return 'No network requests recorded for the current page.';
303 |
304 | const results = [];
305 | requests.forEach((response, request)=>{
306 | const result = [];
307 | result.push(`[${request.method().toUpperCase()}] ${request.url()}`);
308 | if (response)
309 | result.push(`=> [${response.status()}] ${response.statusText()}`);
310 |
311 | results.push(result.join(' '));
312 | });
313 |
314 | return [
315 | `Network Requests (${results.length} total):`,
316 | '',
317 | ...results
318 | ].join('\n');
319 | } catch(e){
320 | throw new UserError(`Error getting network requests: ${e}`);
321 | }
322 | },
323 | };
324 |
325 | let scraping_browser_wait_for_ref = {
326 | name: 'scraping_browser_wait_for_ref',
327 | description: [
328 | 'Wait for an element to be visible using its ref from the ARIA snapshot.',
329 | 'Use scraping_browser_snapshot first to get the correct ref values.',
330 | 'This is more reliable than CSS selectors.'
331 | ].join('\n'),
332 | parameters: z.object({
333 | ref: z.string().describe('The ref attribute from the ARIA snapshot (e.g., "23")'),
334 | element: z.string().describe('Description of the element being waited for'),
335 | timeout: z.number().optional()
336 | .describe('Maximum time to wait in milliseconds (default: 30000)'),
337 | }),
338 | execute: async({ref, element, timeout})=>{
339 | const browser_session = await require_browser();
340 | try {
341 | const locator = await browser_session.ref_locator({element, ref});
342 | await locator.waitFor({timeout: timeout || 30000});
343 | return `Successfully waited for element: ${element} (ref=${ref})`;
344 | } catch(e){
345 | throw new UserError(`Error waiting for element ${element} with ref ${ref}: ${e}`);
346 | }
347 | },
348 | };
349 |
350 | export const tools = [
351 | scraping_browser_navigate,
352 | scraping_browser_go_back,
353 | scraping_browser_go_forward,
354 | scraping_browser_snapshot,
355 | scraping_browser_click_ref,
356 | scraping_browser_type_ref,
357 | scraping_browser_screenshot,
358 | scraping_browser_network_requests,
359 | scraping_browser_wait_for_ref,
360 | scraping_browser_get_text,
361 | scraping_browser_get_html,
362 | scraping_browser_scroll,
363 | scraping_browser_scroll_to_ref,
364 | ];
365 |
```
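Every tool in `browser_tools.js` follows the same contract that `server.js` consumes below: a `name`, a `description`, a zod `parameters` schema, and an async `execute` handler that throws `UserError` on failure. A minimal sketch of that shape (the tool name and body here are hypothetical, shown only to illustrate the contract; the `UserError` import assumes it is exported by `fastmcp`):

```javascript
// Hypothetical tool illustrating the shape registered via addTool in server.js.
import {z} from 'zod';
import {UserError} from 'fastmcp';

let scraping_browser_example = {
    name: 'scraping_browser_example',
    description: 'One or two sentences telling the model when to use the tool',
    parameters: z.object({
        url: z.string().url().describe('Example input parameter'),
    }),
    execute: async({url})=>{
        // Real tools acquire the shared browser session here and act on it
        try { return `Would operate on ${url}`; }
        catch(e){ throw new UserError(`Error in example tool: ${e}`); }
    },
};
```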
--------------------------------------------------------------------------------
/server.js:
--------------------------------------------------------------------------------
```javascript
1 | #!/usr/bin/env node
2 | 'use strict'; /*jslint node:true es9:true*/
3 | import {FastMCP} from 'fastmcp';
4 | import {z} from 'zod';
5 | import axios from 'axios';
6 | import {tools as browser_tools} from './browser_tools.js';
7 | import {createRequire} from 'node:module';
8 | const require = createRequire(import.meta.url);
9 | const package_json = require('./package.json');
10 | const api_token = process.env.API_TOKEN;
11 | const unlocker_zone = process.env.WEB_UNLOCKER_ZONE || 'mcp_unlocker';
12 | const browser_zone = process.env.BROWSER_ZONE || 'mcp_browser';
13 | const pro_mode = process.env.PRO_MODE === 'true';
14 | const pro_mode_tools = ['search_engine', 'scrape_as_markdown',
15 | 'search_engine_batch', 'scrape_batch'];
16 | function parse_rate_limit(rate_limit_str) {
17 | if (!rate_limit_str)
18 | return null;
19 |
20 | const match = rate_limit_str.match(/^(\d+)\/(\d+)([mhs])$/);
21 | if (!match)
22 | throw new Error('Invalid RATE_LIMIT format. Use: 100/1h or 50/30m');
23 |
24 | const [, limit, time, unit] = match;
25 | const multiplier = unit==='h' ? 3600 : unit==='m' ? 60 : 1;
26 |
27 | return {
28 | limit: parseInt(limit),
29 | window: parseInt(time) * multiplier * 1000,
30 | display: rate_limit_str
31 | };
32 | }
33 |
34 | const rate_limit_config = parse_rate_limit(process.env.RATE_LIMIT);
35 |
36 | if (!api_token)
37 | throw new Error('Cannot run MCP server without API_TOKEN env');
38 |
39 | const api_headers = (clientName=null)=>({
40 | 'user-agent': `${package_json.name}/${package_json.version}`,
41 | authorization: `Bearer ${api_token}`,
42 | ...(clientName ? {'x-mcp-client-name': clientName} : {}),
43 | });
44 |
45 | function check_rate_limit(){
46 | if (!rate_limit_config)
47 | return true;
48 |
49 | const now = Date.now();
50 | const window_start = now - rate_limit_config.window;
51 |
52 | debug_stats.call_timestamps = debug_stats.call_timestamps.filter(timestamp=>timestamp>window_start);
53 |
54 | if (debug_stats.call_timestamps.length>=rate_limit_config.limit)
55 | throw new Error(`Rate limit exceeded: ${rate_limit_config.display}`);
56 |
57 | debug_stats.call_timestamps.push(now);
58 | return true;
59 | }
60 |
61 | async function ensure_required_zones(){
62 | try {
63 | console.error('Checking for required zones...');
64 | let response = await axios({
65 | url: 'https://api.brightdata.com/zone/get_active_zones',
66 | method: 'GET',
67 | headers: api_headers(),
68 | });
69 | let zones = response.data || [];
70 | let has_unlocker_zone = zones.some(zone=>zone.name==unlocker_zone);
71 | let has_browser_zone = zones.some(zone=>zone.name==browser_zone);
72 |
73 | if (!has_unlocker_zone)
74 | {
75 | console.error(`Required zone "${unlocker_zone}" not found, `
76 | +`creating it...`);
77 | await axios({
78 | url: 'https://api.brightdata.com/zone',
79 | method: 'POST',
80 | headers: {
81 | ...api_headers(),
82 | 'Content-Type': 'application/json',
83 | },
84 | data: {
85 | zone: {name: unlocker_zone, type: 'unblocker'},
86 | plan: {type: 'unblocker'},
87 | },
88 | });
89 | console.error(`Zone "${unlocker_zone}" created successfully`);
90 | }
91 | else
92 | console.error(`Required zone "${unlocker_zone}" already exists`);
93 |
94 | if (!has_browser_zone)
95 | {
96 | console.error(`Required zone "${browser_zone}" not found, `
97 | +`creating it...`);
98 | await axios({
99 | url: 'https://api.brightdata.com/zone',
100 | method: 'POST',
101 | headers: {
102 | ...api_headers(),
103 | 'Content-Type': 'application/json',
104 | },
105 | data: {
106 | zone: {name: browser_zone, type: 'browser_api'},
107 | plan: {type: 'browser_api'},
108 | },
109 | });
110 | console.error(`Zone "${browser_zone}" created successfully`);
111 | }
112 | else
113 | console.error(`Required zone "${browser_zone}" already exists`);
114 | } catch(e){
115 | console.error('Error checking/creating zones:',
116 | e.response?.data||e.message);
117 | }
118 | }
119 |
120 | await ensure_required_zones();
121 |
122 | let server = new FastMCP({
123 | name: 'Bright Data',
124 | version: package_json.version,
125 | });
126 | let debug_stats = {tool_calls: {}, session_calls: 0, call_timestamps: []};
127 |
128 | const addTool = (tool) => {
129 | if (!pro_mode && !pro_mode_tools.includes(tool.name))
130 | return;
131 | server.addTool(tool);
132 | };
133 |
134 | addTool({
135 | name: 'search_engine',
136 | description: 'Scrape search results from Google, Bing or Yandex. Returns '
137 | +'SERP results in JSON or Markdown (URL, title, description). Ideal for '
138 | +'gathering current information, news, and detailed search results.',
139 | parameters: z.object({
140 | query: z.string(),
141 | engine: z.enum(['google', 'bing', 'yandex'])
142 | .optional()
143 | .default('google'),
144 | cursor: z.string()
145 | .optional()
146 | .describe('Pagination cursor for next page'),
147 | }),
148 | execute: tool_fn('search_engine', async ({query, engine, cursor}, ctx)=>{
149 | const is_google = engine=='google';
150 | const url = search_url(engine, query, cursor);
151 | let response = await axios({
152 | url: 'https://api.brightdata.com/request',
153 | method: 'POST',
154 | data: {
155 | url: url,
156 | zone: unlocker_zone,
157 | format: 'raw',
158 | data_format: is_google ? 'parsed' : 'markdown',
159 | },
160 | headers: api_headers(ctx.clientName),
161 | responseType: 'text',
162 | });
163 | if (!is_google)
164 | return response.data;
165 | try {
166 | const searchData = JSON.parse(response.data);
167 | return JSON.stringify({
168 | organic: searchData.organic || [],
169 | images: searchData.images
170 | ? searchData.images.map(img=>img.link) : [],
171 | current_page: searchData.pagination?.current_page || {},
172 | related: searchData.related || [],
173 | ai_overview: searchData.ai_overview || null,
174 | });
175 | } catch(e){
176 | return JSON.stringify({
177 | organic: [],
178 | images: [],
179 | pagination: {},
180 | related: [],
181 | });
182 | }
183 | }),
184 | });
185 |
186 | addTool({
187 | name: 'scrape_as_markdown',
188 | description: 'Scrape a single webpage URL with advanced options for '
189 | +'content extraction and get back the results in Markdown format. '
190 | +'This tool can unlock any webpage even if it uses bot detection or '
191 | +'CAPTCHA.',
192 | parameters: z.object({url: z.string().url()}),
193 | execute: tool_fn('scrape_as_markdown', async({url}, ctx)=>{
194 | let response = await axios({
195 | url: 'https://api.brightdata.com/request',
196 | method: 'POST',
197 | data: {
198 | url,
199 | zone: unlocker_zone,
200 | format: 'raw',
201 | data_format: 'markdown',
202 | },
203 | headers: api_headers(ctx.clientName),
204 | responseType: 'text',
205 | });
206 | return response.data;
207 | }),
208 | });
209 |
210 | addTool({
211 | name: 'search_engine_batch',
212 | description: 'Run multiple search queries simultaneously. Returns '
213 | +'JSON for Google, Markdown for Bing/Yandex.',
214 | parameters: z.object({
215 | queries: z.array(z.object({
216 | query: z.string(),
217 | engine: z.enum(['google', 'bing', 'yandex'])
218 | .optional()
219 | .default('google'),
220 | cursor: z.string()
221 | .optional(),
222 | })).min(1).max(10),
223 | }),
224 | execute: tool_fn('search_engine_batch', async ({queries}, ctx)=>{
225 | const search_promises = queries.map(({query, engine, cursor})=>{
226 | const is_google = (engine || 'google') === 'google';
227 | const url = is_google
228 | ? `${search_url(engine || 'google', query, cursor)}&brd_json=1`
229 | : search_url(engine || 'google', query, cursor);
230 |
231 | return axios({
232 | url: 'https://api.brightdata.com/request',
233 | method: 'POST',
234 | data: {
235 | url,
236 | zone: unlocker_zone,
237 | format: 'raw',
238 | data_format: is_google ? undefined : 'markdown',
239 | },
240 | headers: api_headers(ctx.clientName),
241 | responseType: 'text',
242 | }).then(response => {
243 | if (is_google) {
244 | const search_data = JSON.parse(response.data);
245 | return {
246 | query,
247 | engine: engine || 'google',
248 | result: {
249 | organic: search_data.organic || [],
250 | images: search_data.images ? search_data.images.map(img => img.link) : [],
251 | current_page: search_data.pagination?.current_page || {},
252 | related: search_data.related || [],
253 | ai_overview: search_data.ai_overview || null
254 | }
255 | };
256 | }
257 | return {
258 | query,
259 | engine: engine || 'google',
260 | result: response.data
261 | };
262 | });
263 | });
264 |
265 | const results = await Promise.all(search_promises);
266 | return JSON.stringify(results, null, 2);
267 | }),
268 | });
269 |
270 | addTool({
271 | name: 'scrape_batch',
272 | description: 'Scrape multiple webpage URLs with advanced options for '
273 | +'content extraction and get back the results in Markdown format. '
274 | +'This tool can unlock any webpage even if it uses bot detection or '
275 | +'CAPTCHA.',
276 | parameters: z.object({
277 | urls: z.array(z.string().url()).min(1).max(10).describe('Array of URLs to scrape (max 10)')
278 | }),
279 | execute: tool_fn('scrape_batch', async ({urls}, ctx)=>{
280 | const scrapePromises = urls.map(url =>
281 | axios({
282 | url: 'https://api.brightdata.com/request',
283 | method: 'POST',
284 | data: {
285 | url,
286 | zone: unlocker_zone,
287 | format: 'raw',
288 | data_format: 'markdown',
289 | },
290 | headers: api_headers(ctx.clientName),
291 | responseType: 'text',
292 | }).then(response => ({
293 | url,
294 | content: response.data
295 | }))
296 | );
297 |
298 | const results = await Promise.all(scrapePromises);
299 | return JSON.stringify(results, null, 2);
300 | }),
301 | });
302 |
303 | addTool({
304 | name: 'scrape_as_html',
305 | description: 'Scrape a single webpage URL with advanced options for '
306 | +'content extraction and get back the results in HTML. '
307 | +'This tool can unlock any webpage even if it uses bot detection or '
308 | +'CAPTCHA.',
309 | parameters: z.object({url: z.string().url()}),
310 | execute: tool_fn('scrape_as_html', async({url}, ctx)=>{
311 | let response = await axios({
312 | url: 'https://api.brightdata.com/request',
313 | method: 'POST',
314 | data: {
315 | url,
316 | zone: unlocker_zone,
317 | format: 'raw',
318 | },
319 | headers: api_headers(ctx.clientName),
320 | responseType: 'text',
321 | });
322 | return response.data;
323 | }),
324 | });
325 |
326 | addTool({
327 | name: 'extract',
328 | description: 'Scrape a webpage and extract structured data as JSON. '
329 | + 'First scrapes the page as markdown, then uses AI sampling to convert '
330 | + 'it to structured JSON format. This tool can unlock any webpage even '
331 | + 'if it uses bot detection or CAPTCHA.',
332 | parameters: z.object({
333 | url: z.string().url(),
334 | extraction_prompt: z.string().optional().describe(
335 | 'Custom prompt to guide the extraction process. If not provided, '
336 | + 'will extract general structured data from the page.'
337 | ),
338 | }),
339 | execute: tool_fn('extract', async ({ url, extraction_prompt }, ctx) => {
340 | let scrape_response = await axios({
341 | url: 'https://api.brightdata.com/request',
342 | method: 'POST',
343 | data: {
344 | url,
345 | zone: unlocker_zone,
346 | format: 'raw',
347 | data_format: 'markdown',
348 | },
349 | headers: api_headers(ctx.clientName),
350 | responseType: 'text',
351 | });
352 |
353 | let markdown_content = scrape_response.data;
354 |
355 | let system_prompt = 'You are a data extraction specialist. You MUST respond with ONLY valid JSON, no other text or formatting. '
356 | + 'Extract the requested information from the markdown content and return it as a properly formatted JSON object. '
357 | + 'Do not include any explanations, markdown formatting, or text outside the JSON response.';
358 |
359 | let user_prompt = extraction_prompt ||
360 | 'Extract the requested information from this markdown content and return ONLY a JSON object:';
361 |
362 | let session = server.sessions[0]; // Get the first active session
363 | if (!session) throw new Error('No active session available for sampling');
364 |
365 | let sampling_response = await session.requestSampling({
366 | messages: [
367 | {
368 | role: "user",
369 | content: {
370 | type: "text",
371 | text: `${user_prompt}\n\nMarkdown content:\n${markdown_content}\n\nRemember: Respond with ONLY valid JSON, no other text.`,
372 | },
373 | },
374 | ],
375 | systemPrompt: system_prompt,
376 | includeContext: "thisServer",
377 | });
378 |
379 | return sampling_response.content.text;
380 | }),
381 | });
382 |
383 | addTool({
384 | name: 'session_stats',
385 | description: 'Tell the user about the tool usage during this session',
386 | parameters: z.object({}),
387 | execute: tool_fn('session_stats', async()=>{
388 | let used_tools = Object.entries(debug_stats.tool_calls);
389 | let lines = ['Tool calls this session:'];
390 | for (let [name, calls] of used_tools)
391 | lines.push(`- ${name} tool: called ${calls} times`);
392 | return lines.join('\n');
393 | }),
394 | });
395 |
396 | const datasets = [{
397 | id: 'amazon_product',
398 | dataset_id: 'gd_l7q7dkf244hwjntr0',
399 | description: [
400 | 'Quickly read structured amazon product data.',
401 | 'Requires a valid product URL with /dp/ in it.',
402 | 'This can be a cache lookup, so it can be more reliable than scraping',
403 | ].join('\n'),
404 | inputs: ['url'],
405 | }, {
406 | id: 'amazon_product_reviews',
407 | dataset_id: 'gd_le8e811kzy4ggddlq',
408 | description: [
409 | 'Quickly read structured amazon product review data.',
410 | 'Requires a valid product URL with /dp/ in it.',
411 | 'This can be a cache lookup, so it can be more reliable than scraping',
412 | ].join('\n'),
413 | inputs: ['url'],
414 | }, {
415 | id: 'amazon_product_search',
416 | dataset_id: 'gd_lwdb4vjm1ehb499uxs',
417 | description: [
418 | 'Quickly read structured amazon product search data.',
419 | 'Requires a valid search keyword and amazon domain URL.',
420 | 'This can be a cache lookup, so it can be more reliable than scraping',
421 | ].join('\n'),
422 | inputs: ['keyword', 'url'],
423 | fixed_values: {pages_to_search: '1'},
424 | }, {
425 | id: 'walmart_product',
426 | dataset_id: 'gd_l95fol7l1ru6rlo116',
427 | description: [
428 | 'Quickly read structured walmart product data.',
429 | 'Requires a valid product URL with /ip/ in it.',
430 | 'This can be a cache lookup, so it can be more reliable than scraping',
431 | ].join('\n'),
432 | inputs: ['url'],
433 | }, {
434 | id: 'walmart_seller',
435 | dataset_id: 'gd_m7ke48w81ocyu4hhz0',
436 | description: [
437 | 'Quickly read structured walmart seller data.',
438 | 'Requires a valid walmart seller URL.',
439 | 'This can be a cache lookup, so it can be more reliable than scraping',
440 | ].join('\n'),
441 | inputs: ['url'],
442 | }, {
443 | id: 'ebay_product',
444 | dataset_id: 'gd_ltr9mjt81n0zzdk1fb',
445 | description: [
446 | 'Quickly read structured ebay product data.',
447 | 'Requires a valid ebay product URL.',
448 | 'This can be a cache lookup, so it can be more reliable than scraping',
449 | ].join('\n'),
450 | inputs: ['url'],
451 | }, {
452 | id: 'homedepot_products',
453 | dataset_id: 'gd_lmusivh019i7g97q2n',
454 | description: [
455 | 'Quickly read structured homedepot product data.',
456 | 'Requires a valid homedepot product URL.',
457 | 'This can be a cache lookup, so it can be more reliable than scraping',
458 | ].join('\n'),
459 | inputs: ['url'],
460 | }, {
461 | id: 'zara_products',
462 | dataset_id: 'gd_lct4vafw1tgx27d4o0',
463 | description: [
464 | 'Quickly read structured zara product data.',
465 | 'Requires a valid zara product URL.',
466 | 'This can be a cache lookup, so it can be more reliable than scraping',
467 | ].join('\n'),
468 | inputs: ['url'],
469 | }, {
470 | id: 'etsy_products',
471 | dataset_id: 'gd_ltppk0jdv1jqz25mz',
472 | description: [
473 | 'Quickly read structured etsy product data.',
474 | 'Requires a valid etsy product URL.',
475 | 'This can be a cache lookup, so it can be more reliable than scraping',
476 | ].join('\n'),
477 | inputs: ['url'],
478 | }, {
479 | id: 'bestbuy_products',
480 | dataset_id: 'gd_ltre1jqe1jfr7cccf',
481 | description: [
482 | 'Quickly read structured bestbuy product data.',
483 | 'Requires a valid bestbuy product URL.',
484 | 'This can be a cache lookup, so it can be more reliable than scraping',
485 | ].join('\n'),
486 | inputs: ['url'],
487 | }, {
488 | id: 'linkedin_person_profile',
489 | dataset_id: 'gd_l1viktl72bvl7bjuj0',
490 | description: [
491 | 'Quickly read structured linkedin people profile data.',
492 | 'This can be a cache lookup, so it can be more reliable than scraping',
493 | ].join('\n'),
494 | inputs: ['url'],
495 | }, {
496 | id: 'linkedin_company_profile',
497 | dataset_id: 'gd_l1vikfnt1wgvvqz95w',
498 | description: [
499 | 'Quickly read structured linkedin company profile data',
500 | 'This can be a cache lookup, so it can be more reliable than scraping',
501 | ].join('\n'),
502 | inputs: ['url'],
503 | }, {
504 | id: 'linkedin_job_listings',
505 | dataset_id: 'gd_lpfll7v5hcqtkxl6l',
506 | description: [
507 | 'Quickly read structured linkedin job listings data',
508 | 'This can be a cache lookup, so it can be more reliable than scraping',
509 | ].join('\n'),
510 | inputs: ['url'],
511 | }, {
512 | id: 'linkedin_posts',
513 | dataset_id: 'gd_lyy3tktm25m4avu764',
514 | description: [
515 | 'Quickly read structured linkedin posts data',
516 | 'This can be a cache lookup, so it can be more reliable than scraping',
517 | ].join('\n'),
518 | inputs: ['url'],
519 | }, {
520 | id: 'linkedin_people_search',
521 | dataset_id: 'gd_m8d03he47z8nwb5xc',
522 | description: [
523 | 'Quickly read structured linkedin people search data',
524 | 'This can be a cache lookup, so it can be more reliable than scraping',
525 | ].join('\n'),
526 | inputs: ['url', 'first_name', 'last_name'],
527 | }, {
528 | id: 'crunchbase_company',
529 | dataset_id: 'gd_l1vijqt9jfj7olije',
530 | description: [
531 | 'Quickly read structured crunchbase company data',
532 | 'This can be a cache lookup, so it can be more reliable than scraping',
533 | ].join('\n'),
534 | inputs: ['url'],
535 | },
536 | {
537 | id: 'zoominfo_company_profile',
538 | dataset_id: 'gd_m0ci4a4ivx3j5l6nx',
539 | description: [
540 | 'Quickly read structured ZoomInfo company profile data.',
541 | 'Requires a valid ZoomInfo company URL.',
542 | 'This can be a cache lookup, so it can be more reliable than scraping',
543 | ].join('\n'),
544 | inputs: ['url'],
545 | },
546 | {
547 | id: 'instagram_profiles',
548 | dataset_id: 'gd_l1vikfch901nx3by4',
549 | description: [
550 | 'Quickly read structured Instagram profile data.',
551 | 'Requires a valid Instagram URL.',
552 | 'This can be a cache lookup, so it can be more reliable than scraping',
553 | ].join('\n'),
554 | inputs: ['url'],
555 | },
556 | {
557 | id: 'instagram_posts',
558 | dataset_id: 'gd_lk5ns7kz21pck8jpis',
559 | description: [
560 | 'Quickly read structured Instagram post data.',
561 | 'Requires a valid Instagram URL.',
562 | 'This can be a cache lookup, so it can be more reliable than scraping',
563 | ].join('\n'),
564 | inputs: ['url'],
565 | },
566 | {
567 | id: 'instagram_reels',
568 | dataset_id: 'gd_lyclm20il4r5helnj',
569 | description: [
570 | 'Quickly read structured Instagram reel data.',
571 | 'Requires a valid Instagram URL.',
572 | 'This can be a cache lookup, so it can be more reliable than scraping',
573 | ].join('\n'),
574 | inputs: ['url'],
575 | },
576 | {
577 | id: 'instagram_comments',
578 | dataset_id: 'gd_ltppn085pokosxh13',
579 | description: [
580 | 'Quickly read structured Instagram comments data.',
581 | 'Requires a valid Instagram URL.',
582 | 'This can be a cache lookup, so it can be more reliable than scraping',
583 | ].join('\n'),
584 | inputs: ['url'],
585 | },
586 | {
587 | id: 'facebook_posts',
588 | dataset_id: 'gd_lyclm1571iy3mv57zw',
589 | description: [
590 | 'Quickly read structured Facebook post data.',
591 | 'Requires a valid Facebook post URL.',
592 | 'This can be a cache lookup, so it can be more reliable than scraping',
593 | ].join('\n'),
594 | inputs: ['url'],
595 | },
596 | {
597 | id: 'facebook_marketplace_listings',
598 | dataset_id: 'gd_lvt9iwuh6fbcwmx1a',
599 | description: [
600 | 'Quickly read structured Facebook marketplace listing data.',
601 | 'Requires a valid Facebook marketplace listing URL.',
602 | 'This can be a cache lookup, so it can be more reliable than scraping',
603 | ].join('\n'),
604 | inputs: ['url'],
605 | },
606 | {
607 | id: 'facebook_company_reviews',
608 | dataset_id: 'gd_m0dtqpiu1mbcyc2g86',
609 | description: [
610 | 'Quickly read structured Facebook company reviews data.',
611 | 'Requires a valid Facebook company URL and number of reviews.',
612 | 'This can be a cache lookup, so it can be more reliable than scraping',
613 | ].join('\n'),
614 | inputs: ['url', 'num_of_reviews'],
615 | }, {
616 | id: 'facebook_events',
617 | dataset_id: 'gd_m14sd0to1jz48ppm51',
618 | description: [
619 | 'Quickly read structured Facebook events data.',
620 | 'Requires a valid Facebook event URL.',
621 | 'This can be a cache lookup, so it can be more reliable than scraping',
622 | ].join('\n'),
623 | inputs: ['url'],
624 | }, {
625 | id: 'tiktok_profiles',
626 | dataset_id: 'gd_l1villgoiiidt09ci',
627 | description: [
628 | 'Quickly read structured Tiktok profiles data.',
629 | 'Requires a valid Tiktok profile URL.',
630 | 'This can be a cache lookup, so it can be more reliable than scraping',
631 | ].join('\n'),
632 | inputs: ['url'],
633 | }, {
634 | id: 'tiktok_posts',
635 | dataset_id: 'gd_lu702nij2f790tmv9h',
636 | description: [
637 | 'Quickly read structured Tiktok post data.',
638 | 'Requires a valid Tiktok post URL.',
639 | 'This can be a cache lookup, so it can be more reliable than scraping',
640 | ].join('\n'),
641 | inputs: ['url'],
642 | }, {
643 | id: 'tiktok_shop',
644 | dataset_id: 'gd_m45m1u911dsa4274pi',
645 | description: [
646 | 'Quickly read structured Tiktok shop data.',
647 | 'Requires a valid Tiktok shop product URL.',
648 | 'This can be a cache lookup, so it can be more reliable than scraping',
649 | ].join('\n'),
650 | inputs: ['url'],
651 | }, {
652 | id: 'tiktok_comments',
653 | dataset_id: 'gd_lkf2st302ap89utw5k',
654 | description: [
655 | 'Quickly read structured Tiktok comments data.',
656 | 'Requires a valid Tiktok video URL.',
657 | 'This can be a cache lookup, so it can be more reliable than scraping',
658 | ].join('\n'),
659 | inputs: ['url'],
660 | }, {
661 | id: 'google_maps_reviews',
662 | dataset_id: 'gd_luzfs1dn2oa0teb81',
663 | description: [
664 | 'Quickly read structured Google maps reviews data.',
665 | 'Requires a valid Google maps URL.',
666 | 'This can be a cache lookup, so it can be more reliable than scraping',
667 | ].join('\n'),
668 | inputs: ['url', 'days_limit'],
669 | defaults: {days_limit: '3'},
670 | }, {
671 | id: 'google_shopping',
672 | dataset_id: 'gd_ltppk50q18kdw67omz',
673 | description: [
674 | 'Quickly read structured Google shopping data.',
675 | 'Requires a valid Google shopping product URL.',
676 | 'This can be a cache lookup, so it can be more reliable than scraping',
677 | ].join('\n'),
678 | inputs: ['url'],
679 | }, {
680 | id: 'google_play_store',
681 | dataset_id: 'gd_lsk382l8xei8vzm4u',
682 | description: [
683 | 'Quickly read structured Google play store data.',
684 | 'Requires a valid Google play store app URL.',
685 | 'This can be a cache lookup, so it can be more reliable than scraping',
686 | ].join('\n'),
687 | inputs: ['url'],
688 | }, {
689 | id: 'apple_app_store',
690 | dataset_id: 'gd_lsk9ki3u2iishmwrui',
691 | description: [
692 | 'Quickly read structured apple app store data.',
693 | 'Requires a valid apple app store app URL.',
694 | 'This can be a cache lookup, so it can be more reliable than scraping',
695 | ].join('\n'),
696 | inputs: ['url'],
697 | }, {
698 | id: 'reuter_news',
699 | dataset_id: 'gd_lyptx9h74wtlvpnfu',
700 | description: [
701 | 'Quickly read structured Reuters news data.',
702 | 'Requires a valid Reuters news article URL.',
703 | 'This can be a cache lookup, so it can be more reliable than scraping',
704 | ].join('\n'),
705 | inputs: ['url'],
706 | }, {
707 | id: 'github_repository_file',
708 | dataset_id: 'gd_lyrexgxc24b3d4imjt',
709 | description: [
710 | 'Quickly read structured github repository data.',
711 | 'Requires a valid github repository file URL.',
712 | 'This can be a cache lookup, so it can be more reliable than scraping',
713 | ].join('\n'),
714 | inputs: ['url'],
715 | }, {
716 | id: 'yahoo_finance_business',
717 | dataset_id: 'gd_lmrpz3vxmz972ghd7',
718 | description: [
719 | 'Quickly read structured yahoo finance business data.',
720 | 'Requires a valid yahoo finance business URL.',
721 | 'This can be a cache lookup, so it can be more reliable than scraping',
722 | ].join('\n'),
723 | inputs: ['url'],
724 | },
725 | {
726 | id: 'x_posts',
727 | dataset_id: 'gd_lwxkxvnf1cynvib9co',
728 | description: [
729 | 'Quickly read structured X post data.',
730 | 'Requires a valid X post URL.',
731 | 'This can be a cache lookup, so it can be more reliable than scraping',
732 | ].join('\n'),
733 | inputs: ['url'],
734 | },
735 | {
736 | id: 'zillow_properties_listing',
737 | dataset_id: 'gd_lfqkr8wm13ixtbd8f5',
738 | description: [
739 | 'Quickly read structured zillow properties listing data.',
740 | 'Requires a valid zillow properties listing URL.',
741 | 'This can be a cache lookup, so it can be more reliable than scraping',
742 | ].join('\n'),
743 | inputs: ['url'],
744 | },
745 | {
746 | id: 'booking_hotel_listings',
747 | dataset_id: 'gd_m5mbdl081229ln6t4a',
748 | description: [
749 | 'Quickly read structured booking hotel listings data.',
750 | 'Requires a valid booking hotel listing URL.',
751 | 'This can be a cache lookup, so it can be more reliable than scraping',
752 | ].join('\n'),
753 | inputs: ['url'],
754 | }, {
755 | id: 'youtube_profiles',
756 | dataset_id: 'gd_lk538t2k2p1k3oos71',
757 | description: [
758 | 'Quickly read structured youtube profiles data.',
759 | 'Requires a valid youtube profile URL.',
760 | 'This can be a cache lookup, so it can be more reliable than scraping',
761 | ].join('\n'),
762 | inputs: ['url'],
763 | }, {
764 | id: 'youtube_comments',
765 | dataset_id: 'gd_lk9q0ew71spt1mxywf',
766 | description: [
767 | 'Quickly read structured youtube comments data.',
768 | 'Requires a valid youtube video URL.',
769 | 'This can be a cache lookup, so it can be more reliable than scraping',
770 | ].join('\n'),
771 | inputs: ['url', 'num_of_comments'],
772 | defaults: {num_of_comments: '10'},
773 | }, {
774 | id: 'reddit_posts',
775 | dataset_id: 'gd_lvz8ah06191smkebj4',
776 | description: [
777 | 'Quickly read structured reddit posts data.',
778 | 'Requires a valid reddit post URL.',
779 | 'This can be a cache lookup, so it can be more reliable than scraping',
780 | ].join('\n'),
781 | inputs: ['url'],
782 | },
783 | {
784 | id: 'youtube_videos',
785 | dataset_id: 'gd_lk56epmy2i5g7lzu0k',
786 | description: [
787 | 'Quickly read structured YouTube videos data.',
788 | 'Requires a valid YouTube video URL.',
789 | 'This can be a cache lookup, so it can be more reliable than scraping',
790 | ].join('\n'),
791 | inputs: ['url'],
792 | }];
793 | for (let {dataset_id, id, description, inputs, defaults = {}, fixed_values = {}} of datasets)
794 | {
795 | let parameters = {};
796 | for (let input of inputs)
797 | {
798 | let param_schema = input=='url' ? z.string().url() : z.string();
799 | parameters[input] = defaults[input] !== undefined ?
800 | param_schema.default(defaults[input]) : param_schema;
801 | }
802 | addTool({
803 | name: `web_data_${id}`,
804 | description,
805 | parameters: z.object(parameters),
806 | execute: tool_fn(`web_data_${id}`, async(data, ctx)=>{
807 | data = {...data, ...fixed_values};
808 | let trigger_response = await axios({
809 | url: 'https://api.brightdata.com/datasets/v3/trigger',
810 | params: {dataset_id, include_errors: true},
811 | method: 'POST',
812 | data: [data],
813 | headers: api_headers(ctx.clientName),
814 | });
815 | if (!trigger_response.data?.snapshot_id)
816 | throw new Error('No snapshot ID returned from request');
817 | let snapshot_id = trigger_response.data.snapshot_id;
818 | console.error(`[web_data_${id}] triggered collection with `
819 | +`snapshot ID: ${snapshot_id}`);
820 | let max_attempts = 600;
821 | let attempts = 0;
822 | while (attempts < max_attempts)
823 | {
824 | try {
825 | if (ctx && ctx.reportProgress)
826 | {
827 | await ctx.reportProgress({
828 | progress: attempts,
829 | total: max_attempts,
830 | message: `Polling for data (attempt `
831 | +`${attempts + 1}/${max_attempts})`,
832 | });
833 | }
834 | let snapshot_response = await axios({
835 | url: `https://api.brightdata.com/datasets/v3`
836 | +`/snapshot/${snapshot_id}`,
837 | params: {format: 'json'},
838 | method: 'GET',
839 | headers: api_headers(ctx.clientName),
840 | });
841 | if (['running', 'building'].includes(snapshot_response.data?.status))
842 | {
843 | console.error(`[web_data_${id}] snapshot not ready, `
844 | +`polling again (attempt `
845 | +`${attempts + 1}/${max_attempts})`);
846 | attempts++;
847 | await new Promise(resolve=>setTimeout(resolve, 1000));
848 | continue;
849 | }
850 | console.error(`[web_data_${id}] snapshot data received `
851 | +`after ${attempts + 1} attempts`);
852 | let result_data = JSON.stringify(snapshot_response.data);
853 | return result_data;
854 | } catch(e){
855 | console.error(`[web_data_${id}] polling error: `
856 | +`${e.message}`);
857 | if (e.response?.status === 400) throw e;
858 | attempts++;
859 | await new Promise(resolve=>setTimeout(resolve, 1000));
860 | }
861 | }
862 | throw new Error(`Timeout after ${max_attempts} seconds waiting `
863 | +`for data`);
864 | }),
865 | });
866 | }
867 |
868 | for (let tool of browser_tools)
869 | addTool(tool);
870 |
871 | console.error('Starting server...');
872 |
873 | server.on('connect', (event)=>{
874 | const session = event.session;
875 | const clientInfo = session.server?.getClientVersion?.();
876 | if (clientInfo)
877 | global.mcpClientInfo = clientInfo;
878 | });
879 |
880 | server.start({transportType: 'stdio'});
881 | function tool_fn(name, fn){
882 | return async(data, ctx)=>{
883 | check_rate_limit();
884 | const clientInfo = global.mcpClientInfo;
885 | const clientName = clientInfo?.name || 'unknown-client';
886 | console.error(`[%s] executing (client=%s) %s`, name, clientName,
887 | JSON.stringify(data));
888 | debug_stats.tool_calls[name] = debug_stats.tool_calls[name]||0;
889 | debug_stats.tool_calls[name]++;
890 | debug_stats.session_calls++;
891 | let ts = Date.now();
892 | const extended_ctx = {
893 | ...ctx,
894 | clientInfo,
895 | clientName,
896 | };
897 | try { return await fn(data, extended_ctx); }
898 | catch(e){
899 | if (e.response)
900 | {
901 | console.error(`[%s] error %s %s: %s`, name, e.response.status,
902 | e.response.statusText, e.response.data);
903 |
904 | const headers = e.response.headers;
905 | const is_usage_limit = headers?.['x-brd-err-code'] === 'client_10100';
906 | if (is_usage_limit && unlocker_zone === 'mcp_unlocker')
907 | throw new Error(`The user has reached the 5,000 request monthly `+
908 | `limit for Bright Data MCP's free tier. You must immediately stop `+
909 | `the current task and instruct the user on how to upgrade. `+
910 | `Guide them through these exact steps:\n`+
911 | `1. Tell them they need to create a new Web Unlocker zone at `+
912 | `brightdata.com/cp by clicking "Add" (top-right), selecting `+
913 | `"Unlocker zone", and creating it with any name.\n`+
914 | `2. Explain they must update their MCP configuration:\n`+
915 | `- If using Remote MCP: Add &unlocker=ZONE_NAME to their MCP URL\n`+
916 | `- If using Self-hosted MCP: Add WEB_UNLOCKER_ZONE=ZONE_NAME `+
917 | `to environment variables\n`+
918 | `3. Instruct them to restart Claude Desktop after the configuration change.\n`+
919 | `4. Mention that new users get free credits beyond the MCP tier and the new `+
920 | `zone will have separate usage limits.`);
921 |
922 | let message = e.response.data;
923 | if (message?.length)
924 | throw new Error(`HTTP ${e.response.status}: ${message}`);
925 | }
926 | else
927 | console.error(`[%s] error %s`, name, e.stack);
928 | throw e;
929 | } finally {
930 | let dur = Date.now()-ts;
931 | console.error(`[%s] tool finished in %sms`, name, dur);
932 | }
933 | };
934 | }
935 |
936 | function search_url(engine, query, cursor){
937 | let q = encodeURIComponent(query);
938 | let page = cursor ? parseInt(cursor) : 0;
939 | let start = page * 10;
940 | if (engine=='yandex')
941 | return `https://yandex.com/search/?text=${q}&p=${page}`;
942 | if (engine=='bing')
943 | return `https://www.bing.com/search?q=${q}&first=${start + 1}`;
944 | return `https://www.google.com/search?q=${q}&start=${start}`;
945 | }
946 |
```
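For reference, the two small helpers at the end of `server.js` behave as follows: `parse_rate_limit` converts the `RATE_LIMIT` env string into a request limit and a window in milliseconds, and `search_url` maps the pagination `cursor` onto each engine's paging parameter. An illustrative sketch of the expected values, derived from the code above (assumes `parse_rate_limit` and `search_url` from `server.js` are in scope):

```javascript
// Illustrative only: expected results of the helpers defined in server.js.
parse_rate_limit('100/1h');
// => {limit: 100, window: 3600000, display: '100/1h'}   (1h = 3,600,000 ms)
parse_rate_limit('50/30m');
// => {limit: 50, window: 1800000, display: '50/30m'}

search_url('google', 'web scraping', '2');
// => 'https://www.google.com/search?q=web%20scraping&start=20'
search_url('bing', 'web scraping', '2');
// => 'https://www.bing.com/search?q=web%20scraping&first=21'
search_url('yandex', 'web scraping', '2');
// => 'https://yandex.com/search/?text=web%20scraping&p=2'
```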