# Directory Structure

```
├── .github
│   └── workflows
│       ├── publish-mcp.yml
│       └── release.yml
├── .gitignore
├── .npmignore
├── aria_snapshot_filter.js
├── assets
│   ├── Demo.gif
│   ├── Demo2.gif
│   ├── Demo3.gif
│   ├── logo.png
│   └── Tools.md
├── brightdata-mcp-extension.dxt
├── browser_session.js
├── browser_tools.js
├── CHANGELOG.md
├── Dockerfile
├── examples
│   └── README.md
├── LICENSE
├── package-lock.json
├── package.json
├── README.md
├── server.js
├── server.json
└── smithery.yaml
```

# Files

--------------------------------------------------------------------------------
/.npmignore:
--------------------------------------------------------------------------------

```
*.dxt
smithery.yaml
Dockerfile
examples
assets
CHANGELOG.md

```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# build output
dist/

# generated types
.astro/

# dependencies
node_modules/

# logs
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*

# environment variables
.env
.env.production

# macOS-specific files
.DS_Store

# jetbrains setting folder
.idea/

```

--------------------------------------------------------------------------------
/examples/README.md:
--------------------------------------------------------------------------------

```markdown
# MCP Usage Examples

A curated list of community demos using Bright Data's MCP server.

## 🧠 Notable Examples

- **AI voice agent that closed 4 deals & made $596 overnight 🤑**

  [📹 YouTube Demo](https://www.youtube.com/watch?v=YGzT3sVdwdY)

  [💻 GitHub Repo](https://github.com/llSourcell/my_ai_intern)

- **Langgraph with mcp-adapters demo**

  [📹 YouTube Demo](https://www.youtube.com/watch?v=6DXuadyaJ4g)

  [💻 Source Code](https://github.com/techwithtim/BrightDataMCPServerAgent)

- **Researcher Agent built with Google ADK that is connected to Bright Data's MCP to fetch real-time data**

  [📹 YouTube Demo](https://www.youtube.com/watch?v=r7WG6dXWdUI)

  [💻 Source Code](https://github.com/MeirKaD/MCP_ADK)

- **Replacing 3 MCP servers with our MCP server to avoid getting blocked 🤯**

  [📹 YouTube Demo](https://www.youtube.com/watch?v=0xmE0OJrNmg)

- **Scrape ANY Website In Realtime With This Powerful AI MCP Server**

  [📹 YouTube Demo](https://www.youtube.com/watch?v=bL5JIeGL3J0)

- **Multi-Agent job finder using Bright Data MCP and TypeScript from SCRATCH**

  [📹 YouTube Demo](https://www.youtube.com/watch?v=45OtteCGFiI)

  [💻 Source Code](https://github.com/bitswired/jobwizard)

- **Usage example with Gemini CLI**

  [📹 YouTube Tutorial](https://www.youtube.com/watch?v=FE1LChbgFEw)

---

Got a cool example? Open a PR or contact us!

```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
<div align="center">
  <a href="https://brightdata.com/ai/mcp-server">
    <img src="https://github.com/user-attachments/assets/c21b3f7b-7ff1-40c3-b3d8-66706913d62f" alt="Bright Data Logo">
  </a>

  <h1>The Web MCP</h1>
  
  <p>
    <strong>🌐 Give your AI real-time web superpowers</strong><br/>
    <i>Seamlessly connect LLMs to the live web without getting blocked</i>
  </p>

  <p>
    <a href="https://www.npmjs.com/package/@brightdata/mcp">
      <img src="https://img.shields.io/npm/v/@brightdata/mcp?style=for-the-badge&color=blue" alt="npm version"/>
    </a>
    <a href="https://www.npmjs.com/package/@brightdata/mcp">
      <img src="https://img.shields.io/npm/dw/@brightdata/mcp?style=for-the-badge&color=green" alt="npm downloads"/>
    </a>
    <a href="https://github.com/brightdata-com/brightdata-mcp/blob/main/LICENSE">
      <img src="https://img.shields.io/badge/license-MIT-purple?style=for-the-badge" alt="License"/>
    </a>
  </p>

  <p>
    <a href="#-quick-start">Quick Start</a> •
    <a href="#-features">Features</a> •
    <a href="#-pricing--modes">Pricing</a> •
    <a href="#-demos">Demos</a> •
    <a href="#-documentation">Docs</a> •
    <a href="#-support">Support</a>
  </p>

  <div>
    <h3>🎉 <strong>Free Tier Available!</strong> 🎉</h3>
    <p><strong>5,000 requests/month FREE</strong> <br/>
    <sub>Perfect for prototyping and everyday AI workflows</sub></p>
  </div>
</div>

---

## 🌟 Overview

**The Web MCP** is your gateway to giving AI assistants true web capabilities. No more outdated responses, no more "I can't access real-time information" - just seamless, reliable web access that actually works.

Built by [Bright Data](https://brightdata.com), the world's #1 web data platform, this MCP server ensures your AI never gets blocked, rate-limited, or served CAPTCHAs.

<div align="center">
  <table>
    <tr>
      <td align="center">✅ <strong>Works with Any LLM</strong><br/><sub>Claude, GPT, Gemini, Llama</sub></td>
      <td align="center">🛡️ <strong>Never Gets Blocked</strong><br/><sub>Enterprise-grade unblocking</sub></td>
      <td align="center">🚀 <strong>5,000 Free Requests</strong><br/><sub>Monthly</sub></td>
      <td align="center">⚡ <strong>Zero Config</strong><br/><sub>Works out of the box</sub></td>
    </tr>
  </table>
</div>

---

## 🎯 Perfect For

- 🔍 **Real-time Research** - Get current prices, news, and live data
- 🛍️ **E-commerce Intelligence** - Monitor products, prices, and availability  
- 📊 **Market Analysis** - Track competitors and industry trends
- 🤖 **AI Agents** - Build agents that can actually browse the web
- 📝 **Content Creation** - Access up-to-date information for writing
- 🎓 **Academic Research** - Gather data from multiple sources efficiently

---

## ⚡ Quick Start


<details open>
<summary><b>📡 Use our hosted server - No installation needed!</b></summary>

Perfect for users who want zero setup. Just add this URL to your MCP client:

```
https://mcp.brightdata.com/mcp?token=YOUR_API_TOKEN_HERE
```

**Setup in Claude Desktop:**
1. Go to: Settings → Connectors → Add custom connector
2. Name: `Bright Data Web`
3. URL: `https://mcp.brightdata.com/mcp?token=YOUR_API_TOKEN`
4. Click "Add" and you're done! ✨


</details>

<details open>
<summary><b>Run locally on your machine</b></summary>

```json
{
  "mcpServers": {
    "Bright Data": {
      "command": "npx",
      "args": ["@brightdata/mcp"],
      "env": {
        "API_TOKEN": "<your-api-token-here>"
      }
    }
  }
}
```


</details>

---

## 🚀 Pricing & Modes

<div align="center">
  <table>
    <tr>
      <th width="33%">⚡ Rapid Mode (Free tier)</th>
      <th width="33%">💎 Pro Mode</th>
    </tr>
    <tr>
      <td align="center">
        <h3>$0/month</h3>
        <p><strong>5,000 requests</strong></p>
        <hr/>
        <p>✅ Web Search<br/>
        ✅ Scraping with Web unlocker<br/>
        ❌ Browser Automation<br/>
        ❌ Web data tools</p>
        <br/>
        <code>Default Mode</code>
      </td>
      <td align="center">
        <h3>Pay-as-you-go</h3>
        <p><strong>Everything in Rapid, plus 60+ advanced tools</strong></p>
        <hr/>
        <p>✅ Browser Control<br/>
        ✅ Web Data APIs</p>
        <br/>
        <br/>
        <br/>
        <code>PRO_MODE=true</code>
      </td>
    </tr>
  </table>
</div>

> **💡 Note:** Pro mode is **not included** in the free tier and incurs additional charges based on usage.

---

## ✨ Features

### 🔥 Core Capabilities

<table>
  <tr>
    <td>🔍 <b>Smart Web Search</b><br/>Google-quality results optimized for AI</td>
    <td>📄 <b>Clean Markdown</b><br/>AI-ready content extraction</td>
  </tr>
  <tr>
    <td>🌍 <b>Global Access</b><br/>Bypass geo-restrictions automatically</td>
    <td>🛡️ <b>Anti-Bot Protection</b><br/>Never get blocked or rate-limited</td>
  </tr>
  <tr>
    <td>🤖 <b>Browser Automation</b><br/>Control real browsers remotely (Pro)</td>
    <td>⚡ <b>Lightning Fast</b><br/>Optimized for minimal latency</td>
  </tr>
</table>

### 🎯 Example Queries That Just Work

```yaml
✅ "What's Tesla's current stock price?"
✅ "Find the best-rated restaurants in Tokyo right now"
✅ "Get today's weather forecast for New York"
✅ "What movies are releasing this week?"
✅ "What are the trending topics on Twitter today?"
```

---

## 🎬 Demos

> **Note:** These videos show earlier versions. New demos coming soon! 🎥

<details>
<summary><b>View Demo Videos</b></summary>

### Basic Web Search Demo
https://github.com/user-attachments/assets/59f6ebba-801a-49ab-8278-1b2120912e33

### Advanced Scraping Demo
https://github.com/user-attachments/assets/61ab0bee-fdfa-4d50-b0de-5fab96b4b91d

[📺 More tutorials on YouTube →](https://github.com/brightdata-com/brightdata-mcp/blob/main/examples/README.md)

</details>

---

## 🔧 Available Tools

### ⚡ Rapid Mode Tools (Default - Free)

| Tool | Description | Use Case |
|------|-------------|----------|
| 🔍 `search_engine` | Web search with AI-optimized results | Research, fact-checking, current events |
| 📄 `scrape_as_markdown` | Convert any webpage to clean markdown | Content extraction, documentation |
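
To try the rapid-mode tools outside of a chat client, you can launch the server over stdio and call them directly. The snippet below is a minimal sketch using the TypeScript MCP SDK; the import paths and the `url` argument name for `scrape_as_markdown` are assumptions, so check them against the schemas returned by `listTools()`.

```javascript
// Sketch: launch @brightdata/mcp via npx and call a rapid-mode tool over stdio.
import {Client} from '@modelcontextprotocol/sdk/client/index.js';
import {StdioClientTransport} from '@modelcontextprotocol/sdk/client/stdio.js';

const transport = new StdioClientTransport({
    command: 'npx',
    args: ['@brightdata/mcp'],
    env: {...process.env, API_TOKEN: 'your-token-here'}, // PRO_MODE unset => rapid mode
});
const client = new Client({name: 'rapid-mode-demo', version: '1.0.0'});
await client.connect(transport);

// The "url" parameter name is an assumption; check the tool's input schema.
const page = await client.callTool({
    name: 'scrape_as_markdown',
    arguments: {url: 'https://example.com'},
});
console.log(page.content);
await client.close();
```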

### 💎 Pro Mode Tools (60+ Tools)

<details>
<summary><b>Click to see all Pro tools</b></summary>

| Category | Tools | Description |
|----------|-------|-------------|
| **Browser Control** | `scraping_browser.*` | Full browser automation |
| **Web Data APIs** | `web_data_*` | Structured data extraction |
| **E-commerce** | Product scrapers | Amazon, eBay, Walmart data |
| **Social Media** | Social scrapers | Twitter, LinkedIn, Instagram |
| **Maps & Local** | Location tools | Google Maps, business data |

[📚 View complete tool documentation →](https://github.com/brightdata-com/brightdata-mcp/blob/main/assets/Tools.md)

</details>

---

## 🎮 Try It Now!

### 🧪 Online Playground
Try the Web MCP without any setup:

<div align="center">
  <a href="https://brightdata.com/ai/playground-chat">
    <img src="https://img.shields.io/badge/Try_on-Playground-00C7B7?style=for-the-badge&logo=data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMjQiIGhlaWdodD0iMjQiIHZpZXdCb3g9IjAgMCAyNCAyNCIgZmlsbD0ibm9uZSIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj4KPHBhdGggZD0iTTEyIDJMMyA3VjE3TDEyIDIyTDIxIDE3VjdMMTIgMloiIHN0cm9rZT0id2hpdGUiIHN0cm9rZS13aWR0aD0iMiIvPgo8L3N2Zz4=" alt="Playground"/>
  </a>
</div>

---

## 🔧 Configuration

### Basic Setup
```json
{
  "mcpServers": {
    "Bright Data": {
      "command": "npx",
      "args": ["@brightdata/mcp"],
      "env": {
        "API_TOKEN": "your-token-here"
      }
    }
  }
}
```

### Advanced Configuration
```json
{
  "mcpServers": {
    "Bright Data": {
      "command": "npx",
      "args": ["@brightdata/mcp"],
      "env": {
        "API_TOKEN": "your-token-here",
        "PRO_MODE": "true",              // Enable all 60+ tools
        "RATE_LIMIT": "100/1h",          // Custom rate limiting
        "WEB_UNLOCKER_ZONE": "custom",   // Custom unlocker zone
        "BROWSER_ZONE": "custom_browser" // Custom browser zone
      }
    }
  }
}
```

---

## 📚 Documentation

<div align="center">
  <table>
    <tr>
      <td align="center">
        <a href="https://docs.brightdata.com/mcp-server/overview">
          <img src="https://img.shields.io/badge/📖-API_Docs-blue?style=for-the-badge" alt="API Docs"/>
        </a>
      </td>
      <td align="center">
        <a href="https://github.com/brightdata-com/brightdata-mcp/blob/main/examples">
          <img src="https://img.shields.io/badge/💡-Examples-green?style=for-the-badge" alt="Examples"/>
        </a>
      </td>
      <td align="center">
        <a href="https://github.com/brightdata-com/brightdata-mcp/blob/main/CHANGELOG.md">
          <img src="https://img.shields.io/badge/📝-Changelog-orange?style=for-the-badge" alt="Changelog"/>
        </a>
      </td>
      <td align="center">
        <a href="https://brightdata.com/blog/ai/web-scraping-with-mcp">
          <img src="https://img.shields.io/badge/📚-Tutorial-purple?style=for-the-badge" alt="Tutorial"/>
        </a>
      </td>
    </tr>
  </table>
</div>

---

## 🚨 Common Issues & Solutions

<details>
<summary><b>🔧 Troubleshooting Guide</b></summary>

### ❌ "spawn npx ENOENT" Error
**Solution:** Install Node.js or use the full path to node:
```json
"command": "/usr/local/bin/node"  // macOS/Linux
"command": "C:\\Program Files\\nodejs\\node.exe"  // Windows
```

### ⏱️ Timeouts on Complex Sites
**Solution:** Increase timeout in your client settings to 180s

### 🔑 Authentication Issues
**Solution:** Ensure your API token is valid and has proper permissions

### 📡 Remote Server Connection
**Solution:** Check your internet connection and firewall settings

[More troubleshooting →](https://github.com/brightdata-com/brightdata-mcp#troubleshooting)

</details>

---

## 🤝 Contributing

We love contributions! Here's how you can help:

- 🐛 [Report bugs](https://github.com/brightdata-com/brightdata-mcp/issues)
- 💡 [Suggest features](https://github.com/brightdata-com/brightdata-mcp/issues)
- 🔧 [Submit PRs](https://github.com/brightdata-com/brightdata-mcp/pulls)
- ⭐ Star this repo!

Please follow [Bright Data's coding standards](https://brightdata.com/dna/js_code).

---

## 📞 Support

<div align="center">
  <table>
    <tr>
      <td align="center">
        <a href="https://github.com/brightdata-com/brightdata-mcp/issues">
          <strong>🐛 GitHub Issues</strong><br/>
          <sub>Report bugs & features</sub>
        </a>
      </td>
      <td align="center">
        <a href="https://docs.brightdata.com/mcp-server/overview">
          <strong>📚 Documentation</strong><br/>
          <sub>Complete guides</sub>
        </a>
      </td>
      <td align="center">
        <a href="mailto:[email protected]">
          <strong>✉️ Email</strong><br/>
          <sub>[email protected]</sub>
        </a>
      </td>
    </tr>
  </table>
</div>

---

## 📜 License

MIT © [Bright Data Ltd.](https://brightdata.com)

---

<div align="center">
  <p>
    <strong>Built with ❤️ by</strong><br/>
    <a href="https://brightdata.com">
      <img src="https://idsai.net.technion.ac.il/files/2022/01/Logo-600.png" alt="Bright Data" height="30"/>
    </a>
  </p>
  <p>
    <sub>The world's #1 web data platform</sub>
  </p>
  
  <br/>
  
  <p>
    <a href="https://github.com/brightdata-com/brightdata-mcp">⭐ Star us on GitHub</a> • 
    <a href="https://brightdata.com/blog">Read our Blog</a>
  </p>
</div>

```

--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------

```dockerfile
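# Multi-stage build: install dependencies in a builder stage, then copy only
# the runtime files and production dependencies into the release image.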
FROM node:22.12-alpine AS builder


COPY . /app
WORKDIR /app


RUN --mount=type=cache,target=/root/.npm npm install

FROM node:22-alpine AS release

WORKDIR /app


COPY --from=builder /app/server.js /app/
COPY --from=builder /app/browser_tools.js /app/
COPY --from=builder /app/browser_session.js /app/
COPY --from=builder /app/aria_snapshot_filter.js /app/
COPY --from=builder /app/package.json /app/
COPY --from=builder /app/package-lock.json /app/


ENV NODE_ENV=production


RUN npm ci --ignore-scripts --omit=dev


ENTRYPOINT ["node", "server.js"]

```

--------------------------------------------------------------------------------
/.github/workflows/release.yml:
--------------------------------------------------------------------------------

```yaml
name: Release
on:
  push:
    tags: ["v*"]

jobs:
  release:
    name: Release
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write
    steps:
      - uses: actions/checkout@v5
      - uses: actions/setup-node@v5
        with:
          node-version: 22
          cache: "npm"
          registry-url: 'https://registry.npmjs.org'
          scope: '@brightdata'
      - run: npm ci
      - run: npm audit signatures
      - run: npm publish
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}

```

--------------------------------------------------------------------------------
/smithery.yaml:
--------------------------------------------------------------------------------

```yaml
startCommand:
  type: stdio
  configSchema:
    type: object
    properties:
      apiToken:
        type: string
        description: "Bright Data API key, available in https://brightdata.com/cp/setting/users"
      webUnlockerZone:
        type: string
        default: 'mcp_unlocker'
        description: "Optional: The Web Unlocker zone name (defaults to 'mcp_unlocker')"
      browserZone:
        type: string
        default: 'mcp_browser'
        description: "Optional: Zone name for the Browser API (enables browser control tools, deafults to 'mcp_browser')"
  commandFunction: |-
    config => ({ 
      command: 'node', 
      args: ['server.js'], 
      env: { 
        API_TOKEN: config.apiToken,
        WEB_UNLOCKER_ZONE: config.webUnlockerZone,
        BROWSER_ZONE: config.browserZone
      } 
    })

```

--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------

```json
{
    "name": "@brightdata/mcp",
    "version": "2.6.0",
    "description": "An MCP interface into the Bright Data toolset",
    "type": "module",
    "main": "./server.js",
    "bin": {
        "@brightdata/mcp": "./server.js"
    },
    "scripts": {
        "start": "node server.js"
    },
    "keywords": [
        "mcp",
        "brightdata"
    ],
    "author": "Bright Data",
    "repository": {
        "type": "git",
        "url": "https://github.com/brightdata/brightdata-mcp.git"
    },
    "bugs": {
        "url": "https://github.com/brightdata/brightdata-mcp/issues"
    },
    "license": "MIT",
    "dependencies": {
        "axios": "^1.11.0",
        "fastmcp": "^3.1.1",
        "playwright": "^1.51.1",
        "zod": "^3.24.2"
    },
    "publishConfig": {
        "access": "public"
    },
    "files": [
        "server.js",
        "browser_tools.js",
        "browser_session.js",
        "aria_snapshot_filter.js"
    ],
    "mcpName": "io.github.brightdata/brightdata-mcp"
}

```

--------------------------------------------------------------------------------
/.github/workflows/publish-mcp.yml:
--------------------------------------------------------------------------------

```yaml
name: Publish to MCP Registry

on:
  push:
    tags: ["v*"] 
  workflow_dispatch: 

jobs:
  publish:
    runs-on: ubuntu-latest
    permissions:
      id-token: write  
      contents: read

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Node.js 
        uses: actions/setup-node@v4
        with:
          node-version: "22"

      - name: Sync version in server.json with package.json
        run: |
          VERSION=$(node -p "require('./package.json').version")
          echo "Syncing version to: $VERSION"
          jq --arg v "$VERSION" '.version = $v | .packages[0].version = $v' server.json > tmp.json && mv tmp.json server.json
          echo "Updated server.json:"
          cat server.json

      - name: Install MCP Publisher
        run: |
          curl -L "https://github.com/modelcontextprotocol/registry/releases/latest/download/mcp-publisher_$(uname -s | tr '[:upper:]' '[:lower:]')_$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/').tar.gz" | tar xz mcp-publisher
          chmod +x mcp-publisher

      - name: Login to MCP Registry
        run: ./mcp-publisher login github-oidc

      - name: Publish to MCP Registry
        run: ./mcp-publisher publish

```

--------------------------------------------------------------------------------
/server.json:
--------------------------------------------------------------------------------

```json
{
  "$schema": "https://static.modelcontextprotocol.io/schemas/2025-10-17/server.schema.json",
  "name": "io.github.brightdata/brightdata-mcp",
  "description": "Bright Data's Web MCP server enabling AI agents to search, extract & navigate the web",
  "repository": {
    "url": "https://github.com/brightdata/brightdata-mcp",
    "source": "github"
  },
  "version": "2.5.0",
  "packages": [
    {
      "registryType": "npm",
      "registryBaseUrl": "https://registry.npmjs.org",
      "identifier": "@brightdata/mcp",
      "version": "2.5.0",
      "transport": {
        "type": "stdio"
      },
      "environmentVariables": [
        {
          "name": "API_TOKEN",
          "description": "Your API key for Bright Data",
          "isRequired": true,
          "isSecret": true,
          "format": "string"
        },
        {
          "name": "WEB_UNLOCKER_ZONE",
          "description": "Your unlocker zone name",
          "isRequired": false,
          "isSecret": false,
          "format": "string"
        },
        {
          "name": "BROWSER_ZONE",
          "description": "Your browser zone name",
          "isRequired": false,
          "isSecret": false,
          "format": "string"
        },
        {
          "name": "PRO_MODE",
          "description": "To enable PRO_MODE - set to true",
          "isRequired": false,
          "isSecret": false,
          "format": "boolean"
        }
      ]
    }
  ]
}

```

--------------------------------------------------------------------------------
/aria_snapshot_filter.js:
--------------------------------------------------------------------------------

```javascript
// LICENSE_CODE ZON
'use strict'; /*jslint node:true es9:true*/

export class Aria_snapshot_filter {
    static INTERACTIVE_ROLES = new Set([
        'button', 'link', 'textbox', 'searchbox', 'combobox', 'checkbox',
        'radio', 'switch', 'slider', 'tab', 'menuitem', 'option',
    ]);
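    // Parse a Playwright ARIA snapshot: one '- role "name" [ref=...]' entry
    // per line, optionally followed by a '/url: ...' line, keeping only the
    // interactive roles listed above.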
    static parse_playwright_snapshot(snapshot_text){
        const lines = snapshot_text.split('\n');
        const elements = [];
        for (const line of lines)
        {
            const trimmed = line.trim();
            if (!trimmed || !trimmed.startsWith('-'))
                continue;
            const ref_match = trimmed.match(/\[ref=([^\]]+)\]/);
            if (!ref_match)
                continue;
            const ref = ref_match[1];
            const role_match = trimmed.match(/^-\s+([a-zA-Z]+)/);
            if (!role_match)
                continue;
            const role = role_match[1];
            if (!this.INTERACTIVE_ROLES.has(role))
                continue;
            const name_match = trimmed.match(/"([^"]*)"/);
            const name = name_match ? name_match[1] : '';
            let url = null;
            const next_line_index = lines.indexOf(line)+1;
            if (next_line_index<lines.length)
            {
                const next_line = lines[next_line_index];
                const url_match = next_line.match(/\/url:\s*(.+)/);
                if (url_match)
                    url = url_match[1].trim().replace(/^["']|["']$/g, '');
            }
            elements.push({ref, role, name, url});
        }
        return elements;
    }

    static format_compact(elements){
        const lines = [];
        for (const el of elements)
        {
            const parts = [`[${el.ref}]`, el.role];
            if (el.name && el.name.length>0)
            {
                const name = el.name.length>60 ?
                    el.name.substring(0, 57)+'...' : el.name;
                parts.push(`"${name}"`);
            }
            if (el.url && el.url.length>0 && !el.url.startsWith('#'))
            {
                let url = el.url;
                if (url.length>50)
                    url = url.substring(0, 47)+'...';
                parts.push(`→ ${url}`);
            }
            lines.push(parts.join(' '));
        }
        return lines.join('\n');
    }

    static filter_snapshot(snapshot_text){
        try {
            const elements = this.parse_playwright_snapshot(snapshot_text);
            if (elements.length===0)
                return 'No interactive elements found';
            return this.format_compact(elements);
        } catch(e){
            return `Error filtering snapshot: ${e.message}\n${e.stack}`;
        }
    }
}

```

--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------

```markdown
# Changelog

All notable changes to this project will be documented in this file.

## [2.6.0] - 2025-10-27

### Added
- Client name logging and header passthrough for improved observability (PR #75)
- ARIA ref-based browser automation for more reliable element interactions (PR #65)
- ARIA snapshot filtering for better element targeting
- Network request tracking in browser sessions
- MCP Registry support (PR #71)
- `scraping_browser_snapshot` tool to capture ARIA snapshots
- `scraping_browser_click_ref`, `scraping_browser_type_ref`, `scraping_browser_wait_for_ref` tools using ref-based selectors
- `scraping_browser_network_requests` tool to track HTTP requests

### Changed
- Enhanced search engine tool to return JSON with only relevant fields (PR #57)
- Added `fixed_values` parameter to reduce token usage (PR #60)
- Browser tools now use ARIA refs instead of CSS selectors for better reliability

### Fixed
- Stop polling on HTTP 400 errors in web data tools (PR #64)

### Deprecated
- Legacy selector-based tools (`scraping_browser_click`, `scraping_browser_type`, `scraping_browser_wait_for`) replaced by ref-based equivalents
- `scraping_browser_links` tool deprecated in favor of snapshot-based approach


## [2.0.0] - 2025-05-26

### Changed
- Updated browser authentication to use API_TOKEN instead of previous authentication method
- BROWSER_ZONE is now an optional parameter; the default zone is `mcp_browser`
- Removed duplicate web_data_ tools

## [1.9.2] - 2025-05-23

### Fixed
- Fixed GitHub references and repository settings

## [1.9.1] - 2025-05-21

### Fixed
- Fixed spelling errors and improved coding conventions
- Converted files back to Unix line endings for consistency

## [1.9.0] - 2025-05-21

### Added
- Added 23 new web data tools for enhanced data collection capabilities
- Added progress reporting functionality for better user feedback
- Added default parameter handling for improved tool usability

### Changed
- Improved coding conventions and file formatting
- Enhanced web data API endpoints integration

## [1.8.3] - 2025-05-21

### Added
- Added Bright Data MCP with Claude demo video to README.md

### Changed
- Updated documentation with video demonstrations

## [1.8.2] - 2025-05-13

### Changed
- Bumped FastMCP version for improved performance
- Updated README.md with additional documentation

## [1.8.1] - 2025-05-05

### Added
- Added 12 new WSAPI endpoints for enhanced functionality
- Changed to polling mechanism for better reliability

### Changed
- Applied dos2unix formatting for consistency
- Updated Docker configuration
- Updated smithery.yaml configuration

## [1.8.0] - 2025-05-03

### Added
- Added domain-based browser sessions to avoid navigation limit issues
- Added automatic creation of required unlocker zone when not present

### Fixed
- Fixed browser context maintenance across tool calls with current domain tracking
- Minor lint fixes

## [1.0.0] - 2025-04-29

### Added
- Initial release of Bright Data MCP server
- Browser automation capabilities with Bright Data integration
- Core web scraping and data collection tools
- Smithery.yaml configuration for deployment in Smithery.ai
- MIT License
- Demo materials and documentation

### Documentation
- Created comprehensive README.md
- Added demo.md with usage examples
- Created examples/README.md for sample implementations
- Added Tools.md documentation for available tools

---

## Release Notes

### Version 1.9.x Series
The 1.9.x series focuses on expanding web data collection capabilities and improving authentication mechanisms. Key highlights include the addition of 23 new web data tools.

### Version 1.8.x Series  
The 1.8.x series introduced significant improvements to browser session management, WSAPI endpoints, and overall system reliability. Notable features include domain-based sessions and automatic zone creation.

### Version 1.0.0
Initial stable release providing core MCP server functionality for Bright Data integration with comprehensive browser automation and web scraping capabilities.


```

--------------------------------------------------------------------------------
/browser_session.js:
--------------------------------------------------------------------------------

```javascript
'use strict'; /*jslint node:true es9:true*/
import * as playwright from 'playwright';
import {Aria_snapshot_filter} from './aria_snapshot_filter.js';
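
// Browser sessions are kept per domain so that navigating across domains does
// not run into the Scraping Browser's per-session navigation limits (see
// CHANGELOG 1.8.0); each domain gets its own CDP connection, page and
// tracked-request map.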

export class Browser_session {
    constructor({cdp_endpoint}){
        this.cdp_endpoint = cdp_endpoint;
        this._domainSessions = new Map();
        this._currentDomain = 'default';
    }

    _getDomain(url){
        try {
            const urlObj = new URL(url);
            return urlObj.hostname;
        } catch(e){
            console.error(`Error extracting domain from ${url}:`, e);
            return 'default';
        }
    }

    async _getDomainSession(domain, {log}={}){
        if (!this._domainSessions.has(domain)) 
        {
            this._domainSessions.set(domain, {
                browser: null,
                page: null,
                browserClosed: true,
                requests: new Map()
            });
        }
        return this._domainSessions.get(domain);
    }

    async get_browser({log, domain='default'}={}){
        try {
            const session = await this._getDomainSession(domain, {log});
            if (session.browser)
            {
                try { await session.browser.contexts(); }
                catch(e){
                    log?.(`Browser connection lost for domain ${domain} (${e.message}), `
                        +`reconnecting...`);
                    session.browser = null;
                    session.page = null;
                    session.browserClosed = true;
                }
            }
            if (!session.browser)
            {
                log?.(`Connecting to Bright Data Scraping Browser for domain ${domain}.`);
                session.browser = await playwright.chromium.connectOverCDP(
                    this.cdp_endpoint);
                session.browserClosed = false;
                session.browser.on('disconnected', ()=>{
                    log?.(`Browser disconnected for domain ${domain}`);
                    session.browser = null;
                    session.page = null;
                    session.browserClosed = true;
                });
                log?.(`Connected to Bright Data Scraping Browser for domain ${domain}`);
            }
            return session.browser;
        } catch(e){
            console.error(`Error connecting to browser for domain ${domain}:`, e);
            const session = this._domainSessions.get(domain);
            if (session) 
            {
                session.browser = null;
                session.page = null;
                session.browserClosed = true;
            }
            throw e;
        }
    }

    async get_page({url=null}={}){
        if (url) 
        {
            this._currentDomain = this._getDomain(url);
        }
        const domain = this._currentDomain;
        try {
            const session = await this._getDomainSession(domain);
            if (session.browserClosed || !session.page)
            {
                const browser = await this.get_browser({domain});
                const existingContexts = browser.contexts();
                if (existingContexts.length === 0)
                {
                    const context = await browser.newContext();
                    session.page = await context.newPage();
                }
                else
                {
                    const existingPages = existingContexts[0]?.pages();
                    if (existingPages && existingPages.length > 0)
                        session.page = existingPages[0];
                    else
                        session.page = await existingContexts[0].newPage();
                }
                session.page.on('request', request=>
                    session.requests.set(request, null));
                session.page.on('response', response=>
                    session.requests.set(response.request(), response));
                session.browserClosed = false;
                session.page.once('close', ()=>{
                    session.page = null;
                });
            }
            return session.page;
        } catch(e){
            console.error(`Error getting page for domain ${domain}:`, e);
            const session = this._domainSessions.get(domain);
            if (session) 
            {
                session.browser = null;
                session.page = null;
                session.browserClosed = true;
            }
            throw e;
        }
    }

    async capture_snapshot({filtered=true}={}){
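        // Uses Playwright's internal _snapshotForAI() helper (a private API);
        // the result is optionally reduced to interactive elements only via
        // Aria_snapshot_filter.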
        const page = await this.get_page();
        try {
            const full_snapshot = await page._snapshotForAI();
            if (!filtered)
            {
                return {
                    url: page.url(),
                    title: await page.title(),
                    aria_snapshot: full_snapshot,
                };
            }
            const filtered_snapshot = Aria_snapshot_filter.filter_snapshot(
                full_snapshot);
            return {
                url: page.url(),
                title: await page.title(),
                aria_snapshot: filtered_snapshot,
            };
        } catch(e){
            throw new Error(`Error capturing ARIA snapshot: ${e.message}`);
        }
    }

    async ref_locator({element, ref}){
        const page = await this.get_page();
        try {
            const snapshot = await page._snapshotForAI();
            if (!snapshot.includes(`[ref=${ref}]`))
                throw new Error('Ref '+ref+' not found in the current page '
                    +'snapshot. Try capturing new snapshot.');
            return page.locator(`aria-ref=${ref}`).describe(element);
        } catch(e){
            throw new Error(`Error creating ref locator for ${element} with ref ${ref}: ${e.message}`);
        }
    }

    async get_requests(){
        const domain = this._currentDomain;
        const session = await this._getDomainSession(domain);
        return session.requests;
    }

    async clear_requests(){
        const domain = this._currentDomain;
        const session = await this._getDomainSession(domain);
        session.requests.clear();
    }

    async close(domain=null){
        if (domain){
            const session = this._domainSessions.get(domain);
            if (session && session.browser) 
            {
                try { await session.browser.close(); }
                catch(e){ console.error(`Error closing browser for domain ${domain}:`, e); }
                session.browser = null;
                session.page = null;
                session.browserClosed = true;
                session.requests.clear();
                this._domainSessions.delete(domain);
            }
        }
        else {
            for (const [domain, session] of this._domainSessions.entries()) {
                if (session.browser) 
                {
                    try { await session.browser.close(); }
                    catch(e){ console.error(`Error closing browser for domain ${domain}:`, e); }
                    session.browser = null;
                    session.page = null;
                    session.browserClosed = true;
                    session.requests.clear();
                }
            }
            this._domainSessions.clear();
        }
        if (!domain) 
        {
            this._currentDomain = 'default';
        }
    }
}


```

--------------------------------------------------------------------------------
/assets/Tools.md:
--------------------------------------------------------------------------------

```markdown
|Feature|Description|
|---|---|
|search_engine|Scrape search results from Google, Bing, or Yandex. Returns SERP results in JSON for Google and Markdown for Bing/Yandex; supports pagination with the cursor parameter.|
|scrape_as_markdown|Scrape a single webpage with advanced extraction and return Markdown. Uses Bright Data's unlocker to handle bot protection and CAPTCHA.|
|search_engine_batch|Run up to 10 search queries in parallel. Returns JSON for Google results and Markdown for Bing/Yandex.|
|scrape_batch|Scrape up to 10 webpages in one request and return an array of URL/content pairs in Markdown format.|
|scrape_as_html|Scrape a single webpage with advanced extraction and return the HTML response body. Handles sites protected by bot detection or CAPTCHA.|
|extract|Scrape a webpage as Markdown and convert it to structured JSON using AI sampling, with an optional custom extraction prompt.|
|session_stats|Report how many times each tool has been called during the current MCP session.|
|web_data_amazon_product|Quickly read structured Amazon product data. Requires a valid product URL containing /dp/. Often faster and more reliable than scraping.|
|web_data_amazon_product_reviews|Quickly read structured Amazon product review data. Requires a valid product URL containing /dp/. Often faster and more reliable than scraping.|
|web_data_amazon_product_search|Retrieve structured Amazon search results. Requires a search keyword and Amazon domain URL; limited to the first page of results.|
|web_data_walmart_product|Quickly read structured Walmart product data. Requires a product URL containing /ip/. Often faster and more reliable than scraping.|
|web_data_walmart_seller|Quickly read structured Walmart seller data. Requires a valid Walmart seller URL. Often faster and more reliable than scraping.|
|web_data_ebay_product|Quickly read structured eBay product data. Requires a valid eBay product URL. Often faster and more reliable than scraping.|
|web_data_homedepot_products|Quickly read structured Home Depot product data. Requires a valid homedepot.com product URL. Often faster and more reliable than scraping.|
|web_data_zara_products|Quickly read structured Zara product data. Requires a valid Zara product URL. Often faster and more reliable than scraping.|
|web_data_etsy_products|Quickly read structured Etsy product data. Requires a valid Etsy product URL. Often faster and more reliable than scraping.|
|web_data_bestbuy_products|Quickly read structured Best Buy product data. Requires a valid Best Buy product URL. Often faster and more reliable than scraping.|
|web_data_linkedin_person_profile|Quickly read structured LinkedIn people profile data. Requires a valid LinkedIn profile URL. Often faster and more reliable than scraping.|
|web_data_linkedin_company_profile|Quickly read structured LinkedIn company profile data. Requires a valid LinkedIn company URL. Often faster and more reliable than scraping.|
|web_data_linkedin_job_listings|Quickly read structured LinkedIn job listings data. Requires a valid LinkedIn jobs URL or search URL. Often faster and more reliable than scraping.|
|web_data_linkedin_posts|Quickly read structured LinkedIn posts data. Requires a valid LinkedIn post URL. Often faster and more reliable than scraping.|
|web_data_linkedin_people_search|Quickly read structured LinkedIn people search data. Requires a LinkedIn people search URL. Often faster and more reliable than scraping.|
|web_data_crunchbase_company|Quickly read structured Crunchbase company data. Requires a valid Crunchbase company URL. Often faster and more reliable than scraping.|
|web_data_zoominfo_company_profile|Quickly read structured ZoomInfo company profile data. Requires a valid ZoomInfo company URL. Often faster and more reliable than scraping.|
|web_data_instagram_profiles|Quickly read structured Instagram profile data. Requires a valid Instagram profile URL. Often faster and more reliable than scraping.|
|web_data_instagram_posts|Quickly read structured Instagram post data. Requires a valid Instagram post URL. Often faster and more reliable than scraping.|
|web_data_instagram_reels|Quickly read structured Instagram reel data. Requires a valid Instagram reel URL. Often faster and more reliable than scraping.|
|web_data_instagram_comments|Quickly read structured Instagram comments data. Requires a valid Instagram URL. Often faster and more reliable than scraping.|
|web_data_facebook_posts|Quickly read structured Facebook post data. Requires a valid Facebook post URL. Often faster and more reliable than scraping.|
|web_data_facebook_marketplace_listings|Quickly read structured Facebook Marketplace listing data. Requires a valid Marketplace listing URL. Often faster and more reliable than scraping.|
|web_data_facebook_company_reviews|Quickly read structured Facebook company reviews data. Requires a valid Facebook company URL and review count. Often faster and more reliable than scraping.|
|web_data_facebook_events|Quickly read structured Facebook events data. Requires a valid Facebook event URL. Often faster and more reliable than scraping.|
|web_data_tiktok_profiles|Quickly read structured TikTok profile data. Requires a valid TikTok profile URL. Often faster and more reliable than scraping.|
|web_data_tiktok_posts|Quickly read structured TikTok post data. Requires a valid TikTok post URL. Often faster and more reliable than scraping.|
|web_data_tiktok_shop|Quickly read structured TikTok Shop product data. Requires a valid TikTok Shop product URL. Often faster and more reliable than scraping.|
|web_data_tiktok_comments|Quickly read structured TikTok comments data. Requires a valid TikTok video URL. Often faster and more reliable than scraping.|
|web_data_google_maps_reviews|Quickly read structured Google Maps reviews data. Requires a valid Google Maps URL and optional days_limit (default 3). Often faster and more reliable than scraping.|
|web_data_google_shopping|Quickly read structured Google Shopping product data. Requires a valid Google Shopping product URL. Often faster and more reliable than scraping.|
|web_data_google_play_store|Quickly read structured Google Play Store app data. Requires a valid Play Store app URL. Often faster and more reliable than scraping.|
|web_data_apple_app_store|Quickly read structured Apple App Store app data. Requires a valid App Store app URL. Often faster and more reliable than scraping.|
|web_data_reuter_news|Quickly read structured Reuters news data. Requires a valid Reuters news article URL. Often faster and more reliable than scraping.|
|web_data_github_repository_file|Quickly read structured GitHub repository file data. Requires a valid GitHub file URL. Often faster and more reliable than scraping.|
|web_data_yahoo_finance_business|Quickly read structured Yahoo Finance company profile data. Requires a valid Yahoo Finance business URL. Often faster and more reliable than scraping.|
|web_data_x_posts|Quickly read structured X (Twitter) post data. Requires a valid X post URL. Often faster and more reliable than scraping.|
|web_data_zillow_properties_listing|Quickly read structured Zillow property listing data. Requires a valid Zillow listing URL. Often faster and more reliable than scraping.|
|web_data_booking_hotel_listings|Quickly read structured Booking.com hotel listing data. Requires a valid Booking.com listing URL. Often faster and more reliable than scraping.|
|web_data_youtube_profiles|Quickly read structured YouTube channel profile data. Requires a valid YouTube channel URL. Often faster and more reliable than scraping.|
|web_data_youtube_comments|Quickly read structured YouTube comments data. Requires a valid YouTube video URL and optional num_of_comments (default 10). Often faster and more reliable than scraping.|
|web_data_reddit_posts|Quickly read structured Reddit post data. Requires a valid Reddit post URL. Often faster and more reliable than scraping.|
|web_data_youtube_videos|Quickly read structured YouTube video metadata. Requires a valid YouTube video URL. Often faster and more reliable than scraping.|
|scraping_browser_navigate|Open or reuse a scraping-browser session and navigate to the provided URL, resetting tracked network requests.|
|scraping_browser_go_back|Navigate the active scraping-browser session back to the previous page and report the new URL and title.|
|scraping_browser_go_forward|Navigate the active scraping-browser session forward to the next page and report the new URL and title.|
|scraping_browser_snapshot|Capture an ARIA snapshot of the current page listing interactive elements and their refs for later ref-based actions.|
|scraping_browser_click_ref|Click an element using its ref from the latest ARIA snapshot; requires a ref and human-readable element description.|
|scraping_browser_type_ref|Fill an element identified by ref from the ARIA snapshot, optionally pressing Enter to submit after typing.|
|scraping_browser_screenshot|Capture a screenshot of the current page; supports optional full_page mode for full-length images.|
|scraping_browser_network_requests|List the network requests recorded since page load with HTTP method, URL, and response status for debugging.|
|scraping_browser_wait_for_ref|Wait until an element identified by ARIA ref becomes visible, with an optional timeout in milliseconds.|
|scraping_browser_get_text|Return the text content of the current page's body element.|
|scraping_browser_get_html|Return the HTML content of the current page; avoid the full_page option unless head or script tags are required.|
|scraping_browser_scroll|Scroll to the bottom of the current page in the scraping-browser session.|
|scraping_browser_scroll_to_ref|Scroll the page until the element referenced in the ARIA snapshot is in view.|

```

--------------------------------------------------------------------------------
/browser_tools.js:
--------------------------------------------------------------------------------

```javascript
'use strict'; /*jslint node:true es9:true*/
import {UserError, imageContent as image_content} from 'fastmcp';
import {z} from 'zod';
import axios from 'axios';
import {Browser_session} from './browser_session.js';
let browser_zone = process.env.BROWSER_ZONE || 'mcp_browser';

let open_session;
const require_browser = async()=>{
    if (!open_session)
    {
        open_session = new Browser_session({
            cdp_endpoint: await calculate_cdp_endpoint(),
        });
    }
    return open_session;
};

const calculate_cdp_endpoint = async()=>{
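    // Look up the customer ID and the zone password via the Bright Data API,
    // then assemble the wss:// CDP connection string for brd.superproxy.io.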
    try {
        const status_response = await axios({
            url: 'https://api.brightdata.com/status',
            method: 'GET',
            headers: {authorization: `Bearer ${process.env.API_TOKEN}`},
        });
        const customer = status_response.data.customer;
        const password_response = await axios({
            url: `https://api.brightdata.com/zone/passwords?zone=${browser_zone}`,
            method: 'GET',
            headers: {authorization: `Bearer ${process.env.API_TOKEN}`},
        });
        const password = password_response.data.passwords[0];

        return `wss://brd-customer-${customer}-zone-${browser_zone}:`
            +`${password}@brd.superproxy.io:9222`;
    } catch(e){
        if (e.response?.status===422)
            throw new Error(`Browser zone '${browser_zone}' does not exist`);
        throw new Error(`Error retrieving browser credentials: ${e.message}`);
    }
};

let scraping_browser_navigate = {
    name: 'scraping_browser_navigate',
    description: 'Navigate a scraping browser session to a new URL',
    parameters: z.object({
        url: z.string().describe('The URL to navigate to'),
    }),
    execute: async({url})=>{
        const browser_session = await require_browser();
        const page = await browser_session.get_page({url});
        await browser_session.clear_requests();
        try {
            await page.goto(url, {
                timeout: 120000,
                waitUntil: 'domcontentloaded',
            });
            return [
                `Successfully navigated to ${url}`,
                `Title: ${await page.title()}`,
                `URL: ${page.url()}`,
            ].join('\n');
        } catch(e){
            throw new UserError(`Error navigating to ${url}: ${e}`);
        }
    },
};

let scraping_browser_go_back = {
    name: 'scraping_browser_go_back',
    description: 'Go back to the previous page',
    parameters: z.object({}),
    execute: async()=>{
        const page = await (await require_browser()).get_page();
        try {
            await page.goBack();
            return [
                'Successfully navigated back',
                `Title: ${await page.title()}`,
                `URL: ${page.url()}`,
            ].join('\n');
        } catch(e){
            throw new UserError(`Error navigating back: ${e}`);
        }
    },
};

const scraping_browser_go_forward = {
    name: 'scraping_browser_go_forward',
    description: 'Go forward to the next page',
    parameters: z.object({}),
    execute: async()=>{
        const page = await (await require_browser()).get_page();
        try {
            await page.goForward();
            return [
                'Successfully navigated forward',
                `Title: ${await page.title()}`,
                `URL: ${page.url()}`,
            ].join('\n');
        } catch(e){
            throw new UserError(`Error navigating forward: ${e}`);
        }
    },
};

let scraping_browser_snapshot = {
    name: 'scraping_browser_snapshot',
    description: [
        'Capture an ARIA snapshot of the current page showing all interactive '
        +'elements with their refs.',
        'This provides accurate element references that can be used with '
        +'ref-based tools.',
        'Use this before interacting with elements to get proper refs instead '
        +'of guessing selectors.'
    ].join('\n'),
    parameters: z.object({}),
    execute: async()=>{
        const browser_session = await require_browser();
        try {
            const snapshot = await browser_session.capture_snapshot();
            return [
                `Page: ${snapshot.url}`,
                `Title: ${snapshot.title}`,
                '',
                'Interactive Elements:',
                snapshot.aria_snapshot
            ].join('\n');
        } catch(e){
            throw new UserError(`Error capturing snapshot: ${e}`);
        }
    },
};

let scraping_browser_click_ref = {
    name: 'scraping_browser_click_ref',
    description: [
        'Click on an element using its ref from the ARIA snapshot.',
        'Use scraping_browser_snapshot first to get the correct ref values.',
        'This is more reliable than CSS selectors.'
    ].join('\n'),
    parameters: z.object({
        ref: z.string().describe('The ref attribute from the ARIA snapshot (e.g., "23")'),
        element: z.string().describe('Description of the element being clicked for context'),
    }),
    execute: async({ref, element})=>{
        const browser_session = await require_browser();
        try {
            const locator = await browser_session.ref_locator({element, ref});
            await locator.click({timeout: 5000});
            return `Successfully clicked element: ${element} (ref=${ref})`;
        } catch(e){
            throw new UserError(`Error clicking element ${element} with ref ${ref}: ${e}`);
        }
    },
};

let scraping_browser_type_ref = {
    name: 'scraping_browser_type_ref',
    description: [
        'Type text into an element using its ref from the ARIA snapshot.',
        'Use scraping_browser_snapshot first to get the correct ref values.',
        'This is more reliable than CSS selectors.'
    ].join('\n'),
    parameters: z.object({
        ref: z.string().describe('The ref attribute from the ARIA snapshot (e.g., "23")'),
        element: z.string().describe('Description of the element being typed into for context'),
        text: z.string().describe('Text to type'),
        submit: z.boolean().optional()
            .describe('Whether to submit the form after typing (press Enter)'),
    }),
    execute: async({ref, element, text, submit})=>{
        const browser_session = await require_browser();
        try {
            const locator = await browser_session.ref_locator({element, ref});
            await locator.fill(text);
            if (submit)
                await locator.press('Enter');
            const suffix = submit ? ' and submitted the form' : '';
            return 'Successfully typed "'+text+'" into element: '+element
                +' (ref='+ref+')'+suffix;
        } catch(e){
            throw new UserError(`Error typing into element ${element} with ref ${ref}: ${e}`);
        }
    },
};

let scraping_browser_screenshot = {
    name: 'scraping_browser_screenshot',
    description: 'Take a screenshot of the current page',
    parameters: z.object({
        full_page: z.boolean().optional().describe([
            'Whether to screenshot the full page (default: false)',
            'You should avoid fullscreen if it\'s not important, since the '
            +'images can be quite large',
        ].join('\n')),
    }),
    execute: async({full_page=false})=>{
        const page = await (await require_browser()).get_page();
        try {
            const buffer = await page.screenshot({fullPage: full_page});
            return image_content({buffer});
        } catch(e){
            throw new UserError(`Error taking screenshot: ${e}`);
        }
    },
};

let scraping_browser_get_html = {
    name: 'scraping_browser_get_html',
    description: 'Get the HTML content of the current page. Avoid using this '
    +'tool if possible and, if used, avoid the full_page option unless it is '
    +'important to see things like script tags, since the output can be large',
    parameters: z.object({
        full_page: z.boolean().optional().describe([
            'Whether to get the full page HTML including head and script tags',
            'Avoid this if you only need the body HTML, since the full page '
            +'can be quite large',
        ].join('\n')),
    }),
    execute: async({full_page=false})=>{
        const page = await (await require_browser()).get_page();
        try {
            if (!full_page)
                return await page.$eval('body', body=>body.innerHTML);
            return await page.content();
        } catch(e){
            throw new UserError(`Error getting HTML content: ${e}`);
        }
    },
};

let scraping_browser_get_text = {
    name: 'scraping_browser_get_text',
    description: 'Get the text content of the current page',
    parameters: z.object({}),
    execute: async()=>{
        const page = await (await require_browser()).get_page();
        try { return await page.$eval('body', body=>body.innerText); }
        catch(e){ throw new UserError(`Error getting text content: ${e}`); }
    },
};

let scraping_browser_scroll = {
    name: 'scraping_browser_scroll',
    description: 'Scroll to the bottom of the current page',
    parameters: z.object({}),
    execute: async()=>{
        const page = await (await require_browser()).get_page();
        try {
            await page.evaluate(()=>{
                window.scrollTo(0, document.body.scrollHeight);
            });
            return 'Successfully scrolled to the bottom of the page';
        } catch(e){
            throw new UserError(`Error scrolling page: ${e}`);
        }
    },
};

let scraping_browser_scroll_to_ref = {
    name: 'scraping_browser_scroll_to_ref',
    description: [
        'Scroll to a specific element using its ref from the ARIA snapshot.',
        'Use scraping_browser_snapshot first to get the correct ref values.',
        'This is more reliable than CSS selectors.'
    ].join('\n'),
    parameters: z.object({
        ref: z.string().describe('The ref attribute from the ARIA snapshot (e.g., "23")'),
        element: z.string().describe('Description of the element to scroll to'),
    }),
    execute: async({ref, element})=>{
        const browser_session = await require_browser();
        try {
            const locator = await browser_session.ref_locator({element, ref});
            await locator.scrollIntoViewIfNeeded();
            return `Successfully scrolled to element: ${element} (ref=${ref})`;
        } catch(e){
            throw new UserError(`Error scrolling to element ${element} with `
                +`ref ${ref}: ${e}`);
        }
    },
};

let scraping_browser_network_requests = {
    name: 'scraping_browser_network_requests',
    description: [
        'Get all network requests made since loading the current page.',
        'Shows HTTP method, URL, status code and status text for each request.',
        'Useful for debugging API calls, tracking data fetching, and '
        +'understanding page behavior.'
    ].join('\n'),
    parameters: z.object({}),
    execute: async()=>{
        const browser_session = await require_browser();
        try {
            const requests = await browser_session.get_requests();
            if (requests.size==0) 
                return 'No network requests recorded for the current page.';

            const results = [];
            requests.forEach((response, request)=>{
                const result = [];
                result.push(`[${request.method().toUpperCase()}] ${request.url()}`);
                if (response)
                    result.push(`=> [${response.status()}] ${response.statusText()}`);

                results.push(result.join(' '));
            });
            
            return [
                `Network Requests (${results.length} total):`,
                '',
                ...results
            ].join('\n');
        } catch(e){
            throw new UserError(`Error getting network requests: ${e}`);
        }
    },
};

let scraping_browser_wait_for_ref = {
    name: 'scraping_browser_wait_for_ref',
    description: [
        'Wait for an element to be visible using its ref from the ARIA snapshot.',
        'Use scraping_browser_snapshot first to get the correct ref values.',
        'This is more reliable than CSS selectors.'
    ].join('\n'),
    parameters: z.object({
        ref: z.string().describe('The ref attribute from the ARIA snapshot (e.g., "23")'),
        element: z.string().describe('Description of the element being waited for'),
        timeout: z.number().optional()
            .describe('Maximum time to wait in milliseconds (default: 30000)'),
    }),
    execute: async({ref, element, timeout})=>{
        const browser_session = await require_browser();
        try {
            const locator = await browser_session.ref_locator({element, ref});
            await locator.waitFor({timeout: timeout || 30000});
            return `Successfully waited for element: ${element} (ref=${ref})`;
        } catch(e){
            throw new UserError(`Error waiting for element ${element} with ref ${ref}: ${e}`);
        }
    },
};

export const tools = [
    scraping_browser_navigate,
    scraping_browser_go_back,
    scraping_browser_go_forward,
    scraping_browser_snapshot,
    scraping_browser_click_ref,
    scraping_browser_type_ref,
    scraping_browser_screenshot,
    scraping_browser_network_requests,
    scraping_browser_wait_for_ref,
    scraping_browser_get_text,
    scraping_browser_get_html,
    scraping_browser_scroll,
    scraping_browser_scroll_to_ref,
];

```

--------------------------------------------------------------------------------
/server.js:
--------------------------------------------------------------------------------

```javascript
#!/usr/bin/env node
'use strict'; /*jslint node:true es9:true*/
import {FastMCP} from 'fastmcp';
import {z} from 'zod';
import axios from 'axios';
import {tools as browser_tools} from './browser_tools.js';
import {createRequire} from 'node:module';
const require = createRequire(import.meta.url);
const package_json = require('./package.json');
const api_token = process.env.API_TOKEN;
const unlocker_zone = process.env.WEB_UNLOCKER_ZONE || 'mcp_unlocker';
const browser_zone = process.env.BROWSER_ZONE || 'mcp_browser';
const pro_mode = process.env.PRO_MODE === 'true';
const pro_mode_tools = ['search_engine', 'scrape_as_markdown', 
    'search_engine_batch', 'scrape_batch'];
function parse_rate_limit(rate_limit_str) {
    if (!rate_limit_str) 
        return null;
    
    const match = rate_limit_str.match(/^(\d+)\/(\d+)([mhs])$/);
    if (!match) 
        throw new Error('Invalid RATE_LIMIT format. Use: 100/1h or 50/30m');
    
    const [, limit, time, unit] = match;
    const multiplier = unit==='h' ? 3600 : unit==='m' ? 60 : 1;
    
    return {
        limit: parseInt(limit),
        window: parseInt(time) * multiplier * 1000, 
        display: rate_limit_str
    };
}
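// For illustration: parse_rate_limit('100/1h') yields
// {limit: 100, window: 3600000, display: '100/1h'}, and parse_rate_limit('50/30m')
// yields {limit: 50, window: 1800000, display: '50/30m'}. Strings that do not
// match <count>/<number><s|m|h> throw the format error above.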

const rate_limit_config = parse_rate_limit(process.env.RATE_LIMIT);

if (!api_token)
    throw new Error('Cannot run MCP server without API_TOKEN env');

const api_headers = (clientName=null)=>({
    'user-agent': `${package_json.name}/${package_json.version}`,
    authorization: `Bearer ${api_token}`,
    ...(clientName ? {'x-mcp-client-name': clientName} : {}),
});
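// For illustration, api_headers('claude-desktop') produces:
// {'user-agent': '<package name>/<version>', authorization: 'Bearer <API_TOKEN>',
//  'x-mcp-client-name': 'claude-desktop'}; calling api_headers() with no argument
// omits the x-mcp-client-name header.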

function check_rate_limit(){
    if (!rate_limit_config) 
        return true;
    
    const now = Date.now();
    const window_start = now - rate_limit_config.window;
    
    debug_stats.call_timestamps = debug_stats.call_timestamps.filter(timestamp=>timestamp>window_start);
    
    if (debug_stats.call_timestamps.length>=rate_limit_config.limit)
        throw new Error(`Rate limit exceeded: ${rate_limit_config.display}`);
    
    debug_stats.call_timestamps.push(now);
    return true;
}
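// Sliding-window behavior: with RATE_LIMIT=100/1h, timestamps older than one hour
// are dropped on each call, and the 101st call still inside the window throws
// "Rate limit exceeded: 100/1h" before the tool handler runs.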

async function ensure_required_zones(){
    try {
        console.error('Checking for required zones...');
        let response = await axios({
            url: 'https://api.brightdata.com/zone/get_active_zones',
            method: 'GET',
            headers: api_headers(),
        });
        let zones = response.data || [];
        let has_unlocker_zone = zones.some(zone=>zone.name==unlocker_zone);
        let has_browser_zone = zones.some(zone=>zone.name==browser_zone);
        
        if (!has_unlocker_zone)
        {
            console.error(`Required zone "${unlocker_zone}" not found, `
                +`creating it...`);
            await axios({
                url: 'https://api.brightdata.com/zone',
                method: 'POST',
                headers: {
                    ...api_headers(),
                    'Content-Type': 'application/json',
                },
                data: {
                    zone: {name: unlocker_zone, type: 'unblocker'},
                    plan: {type: 'unblocker'},
                },
            });
            console.error(`Zone "${unlocker_zone}" created successfully`);
        }
        else
            console.error(`Required zone "${unlocker_zone}" already exists`);
            
        if (!has_browser_zone)
        {
            console.error(`Required zone "${browser_zone}" not found, `
                +`creating it...`);
            await axios({
                url: 'https://api.brightdata.com/zone',
                method: 'POST',
                headers: {
                    ...api_headers(),
                    'Content-Type': 'application/json',
                },
                data: {
                    zone: {name: browser_zone, type: 'browser_api'},
                    plan: {type: 'browser_api'},
                },
            });
            console.error(`Zone "${browser_zone}" created successfully`);
        }
        else
            console.error(`Required zone "${browser_zone}" already exists`);
    } catch(e){
        console.error('Error checking/creating zones:',
            e.response?.data||e.message);
    }
}

await ensure_required_zones();

let server = new FastMCP({
    name: 'Bright Data',
    version: package_json.version,
});
let debug_stats = {tool_calls: {}, session_calls: 0, call_timestamps: []};

const addTool = (tool) => {
    if (!pro_mode && !pro_mode_tools.includes(tool.name)) 
        return;
    server.addTool(tool);
};
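// When PRO_MODE is not set to 'true', only the tools listed in pro_mode_tools
// (search_engine, scrape_as_markdown, search_engine_batch, scrape_batch) are
// registered; every other addTool() call below is a no-op.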

addTool({
    name: 'search_engine',
    description: 'Scrape search results from Google, Bing or Yandex. Returns '
        +'SERP results in JSON or Markdown (URL, title, description). Ideal '
        +'for gathering current information, news, and detailed search results.',
    parameters: z.object({
        query: z.string(),
        engine: z.enum(['google', 'bing', 'yandex'])
            .optional()
            .default('google'),
        cursor: z.string()
            .optional()
            .describe('Pagination cursor for next page'),
    }),
    execute: tool_fn('search_engine', async ({query, engine, cursor}, ctx)=>{
        const is_google = engine=='google';
        const url = search_url(engine, query, cursor);
        let response = await axios({
            url: 'https://api.brightdata.com/request',
            method: 'POST',
            data: {
                url: url,
                zone: unlocker_zone,
                format: 'raw',
                data_format: is_google ? 'parsed' : 'markdown',
            },
            headers: api_headers(ctx.clientName),
            responseType: 'text',
        });
        if (!is_google)
            return response.data;
        try {
            const searchData = JSON.parse(response.data);
            return JSON.stringify({
                organic: searchData.organic || [],
                images: searchData.images
                    ? searchData.images.map(img=>img.link) : [],
                current_page: searchData.pagination?.current_page || {},
                related: searchData.related || [],
                ai_overview: searchData.ai_overview || null,
            });
        } catch(e){
            return JSON.stringify({
                organic: [],
                images: [],
                current_page: {},
                related: [],
                ai_overview: null,
            });
        }
    }),
});

addTool({
    name: 'scrape_as_markdown',
    description: 'Scrape a single webpage URL with advanced options for '
    +'content extraction and get back the results in Markdown. '
    +'This tool can unlock any webpage even if it uses bot detection or '
    +'CAPTCHA.',
    parameters: z.object({url: z.string().url()}),
    execute: tool_fn('scrape_as_markdown', async({url}, ctx)=>{
        let response = await axios({
            url: 'https://api.brightdata.com/request',
            method: 'POST',
            data: {
                url,
                zone: unlocker_zone,
                format: 'raw',
                data_format: 'markdown',
            },
            headers: api_headers(ctx.clientName),
            responseType: 'text',
        });
        return response.data;
    }),
});

addTool({
    name: 'search_engine_batch',
    description: 'Run multiple search queries simultaneously. Returns '
    +'JSON for Google, Markdown for Bing/Yandex.',
    parameters: z.object({
        queries: z.array(z.object({
            query: z.string(),
            engine: z.enum(['google', 'bing', 'yandex'])
                .optional()
                .default('google'),
            cursor: z.string()
                .optional(),
        })).min(1).max(10),
    }),
    execute: tool_fn('search_engine_batch', async ({queries}, ctx)=>{
        const search_promises = queries.map(({query, engine, cursor})=>{
            const is_google = (engine || 'google') === 'google';
            const url = is_google
                ? `${search_url(engine || 'google', query, cursor)}&brd_json=1`
                : search_url(engine || 'google', query, cursor);

            return axios({
                url: 'https://api.brightdata.com/request',
                method: 'POST',
                data: {
                    url,
                    zone: unlocker_zone,
                    format: 'raw',
                    data_format: is_google ? undefined : 'markdown',
                },
                headers: api_headers(ctx.clientName),
                responseType: 'text',
            }).then(response => {
                if (is_google) {
                    const search_data = JSON.parse(response.data);
                    return {
                        query,
                        engine: engine || 'google',
                        result: {
                            organic: search_data.organic || [],
                            images: search_data.images ? search_data.images.map(img => img.link) : [],
                            current_page: search_data.pagination?.current_page || {},
                            related: search_data.related || [],
                            ai_overview: search_data.ai_overview || null
                        }
                    };
                }
                return {
                    query,
                    engine: engine || 'google',
                    result: response.data
                };
            });
        });

        const results = await Promise.all(search_promises);
        return JSON.stringify(results, null, 2);
    }),
});

addTool({
    name: 'scrape_batch',
    description: 'Scrape multiple webpage URLs with advanced options for '
        +'content extraction and get back the results in Markdown. '
        +'This tool can unlock any webpage even if it uses bot detection or '
        +'CAPTCHA.',
    parameters: z.object({
        urls: z.array(z.string().url()).min(1).max(10)
            .describe('Array of URLs to scrape (max 10)'),
    }),
    execute: tool_fn('scrape_batch', async ({urls}, ctx)=>{
        const scrapePromises = urls.map(url=>
            axios({
                url: 'https://api.brightdata.com/request',
                method: 'POST',
                data: {
                    url,
                    zone: unlocker_zone,
                    format: 'raw',
                    data_format: 'markdown',
                },
                headers: api_headers(ctx.clientName),
                responseType: 'text',
            }).then(response=>({
                url,
                content: response.data,
            }))
        );

        const results = await Promise.all(scrapePromises);
        return JSON.stringify(results, null, 2);
    }),
});

addTool({
    name: 'scrape_as_html',
    description: 'Scrape a single webpage URL with advanced options for '
    +'content extraction and get back the results in HTML. '
    +'This tool can unlock any webpage even if it uses bot detection or '
    +'CAPTCHA.',
    parameters: z.object({url: z.string().url()}),
    execute: tool_fn('scrape_as_html', async({url}, ctx)=>{
        let response = await axios({
            url: 'https://api.brightdata.com/request',
            method: 'POST',
            data: {
                url,
                zone: unlocker_zone,
                format: 'raw',
            },
            headers: api_headers(ctx.clientName),
            responseType: 'text',
        });
        return response.data;
    }),
});

addTool({
    name: 'extract',
    description: 'Scrape a webpage and extract structured data as JSON. '
        + 'First scrapes the page as markdown, then uses AI sampling to convert '
        + 'it to structured JSON format. This tool can unlock any webpage even '
        + 'if it uses bot detection or CAPTCHA.',
    parameters: z.object({
        url: z.string().url(),
        extraction_prompt: z.string().optional().describe(
            'Custom prompt to guide the extraction process. If not provided, '
            + 'will extract general structured data from the page.'
        ),
    }),
    execute: tool_fn('extract', async ({ url, extraction_prompt }, ctx) => {
        let scrape_response = await axios({
            url: 'https://api.brightdata.com/request',
            method: 'POST',
            data: {
                url,
                zone: unlocker_zone,
                format: 'raw',
                data_format: 'markdown',
            },
            headers: api_headers(ctx.clientName),
            responseType: 'text',
        });

        let markdown_content = scrape_response.data;

        let system_prompt = 'You are a data extraction specialist. You MUST respond with ONLY valid JSON, no other text or formatting. '
            + 'Extract the requested information from the markdown content and return it as a properly formatted JSON object. '
            + 'Do not include any explanations, markdown formatting, or text outside the JSON response.';

        let user_prompt = extraction_prompt ||
            'Extract the requested information from this markdown content and return ONLY a JSON object:';

        let session = server.sessions[0]; // Get the first active session
        if (!session) throw new Error('No active session available for sampling');

        let sampling_response = await session.requestSampling({
            messages: [
                {
                    role: "user",
                    content: {
                        type: "text",
                        text: `${user_prompt}\n\nMarkdown content:\n${markdown_content}\n\nRemember: Respond with ONLY valid JSON, no other text.`,
                    },
                },
            ],
            systemPrompt: system_prompt,
            includeContext: "thisServer",
        });

        return sampling_response.content.text;
    }),
});

addTool({
    name: 'session_stats',
    description: 'Tell the user about the tool usage during this session',
    parameters: z.object({}),
    execute: tool_fn('session_stats', async()=>{
        let used_tools = Object.entries(debug_stats.tool_calls);
        let lines = ['Tool calls this session:'];
        for (let [name, calls] of used_tools)
            lines.push(`- ${name} tool: called ${calls} times`);
        return lines.join('\n');
    }),
});

const datasets = [{
    id: 'amazon_product',
    dataset_id: 'gd_l7q7dkf244hwjntr0',
    description: [
        'Quickly read structured amazon product data.',
        'Requires a valid product URL with /dp/ in it.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'amazon_product_reviews',
    dataset_id: 'gd_le8e811kzy4ggddlq',
    description: [
        'Quickly read structured amazon product review data.',
        'Requires a valid product URL with /dp/ in it.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'amazon_product_search',
    dataset_id: 'gd_lwdb4vjm1ehb499uxs',
    description: [
        'Quickly read structured amazon product search data.',
        'Requires a valid search keyword and amazon domain URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['keyword', 'url'],
    fixed_values: {pages_to_search: '1'}, 
}, {
    id: 'walmart_product',
    dataset_id: 'gd_l95fol7l1ru6rlo116',
    description: [
        'Quickly read structured walmart product data.',
        'Requires a valid product URL with /ip/ in it.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'walmart_seller',
    dataset_id: 'gd_m7ke48w81ocyu4hhz0',
    description: [
        'Quickly read structured walmart seller data.',
        'Requires a valid walmart seller URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'ebay_product',
    dataset_id: 'gd_ltr9mjt81n0zzdk1fb',
    description: [
        'Quickly read structured ebay product data.',
        'Requires a valid ebay product URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'homedepot_products',
    dataset_id: 'gd_lmusivh019i7g97q2n',
    description: [
        'Quickly read structured homedepot product data.',
        'Requires a valid homedepot product URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'zara_products',
    dataset_id: 'gd_lct4vafw1tgx27d4o0',
    description: [
        'Quickly read structured zara product data.',
        'Requires a valid zara product URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'etsy_products',
    dataset_id: 'gd_ltppk0jdv1jqz25mz',
    description: [
        'Quickly read structured etsy product data.',
        'Requires a valid etsy product URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'bestbuy_products',
    dataset_id: 'gd_ltre1jqe1jfr7cccf',
    description: [
        'Quickly read structured bestbuy product data.',
        'Requires a valid bestbuy product URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'linkedin_person_profile',
    dataset_id: 'gd_l1viktl72bvl7bjuj0',
    description: [
        'Quickly read structured linkedin people profile data.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'linkedin_company_profile',
    dataset_id: 'gd_l1vikfnt1wgvvqz95w',
    description: [
        'Quickly read structured linkedin company profile data',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'linkedin_job_listings',
    dataset_id: 'gd_lpfll7v5hcqtkxl6l',
    description: [
        'Quickly read structured linkedin job listings data',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'linkedin_posts',
    dataset_id: 'gd_lyy3tktm25m4avu764',
    description: [
        'Quickly read structured linkedin posts data',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'linkedin_people_search',
    dataset_id: 'gd_m8d03he47z8nwb5xc',
    description: [
        'Quickly read structured linkedin people search data',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url', 'first_name', 'last_name'],
}, {
    id: 'crunchbase_company',
    dataset_id: 'gd_l1vijqt9jfj7olije',
    description: [
        'Quickly read structured crunchbase company data',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
},
{
    id: 'zoominfo_company_profile',
    dataset_id: 'gd_m0ci4a4ivx3j5l6nx',
    description: [
        'Quickly read structured ZoomInfo company profile data.',
        'Requires a valid ZoomInfo company URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
},
{
    id: 'instagram_profiles',
    dataset_id: 'gd_l1vikfch901nx3by4',
    description: [
        'Quickly read structured Instagram profile data.',
        'Requires a valid Instagram URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
},
{
    id: 'instagram_posts',
    dataset_id: 'gd_lk5ns7kz21pck8jpis',
    description: [
        'Quickly read structured Instagram post data.',
        'Requires a valid Instagram URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
},
{
    id: 'instagram_reels',
    dataset_id: 'gd_lyclm20il4r5helnj',
    description: [
        'Quickly read structured Instagram reel data.',
        'Requires a valid Instagram URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
},
{
    id: 'instagram_comments',
    dataset_id: 'gd_ltppn085pokosxh13',
    description: [
        'Quickly read structured Instagram comments data.',
        'Requires a valid Instagram URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
},
{
    id: 'facebook_posts',
    dataset_id: 'gd_lyclm1571iy3mv57zw',
    description: [
        'Quickly read structured Facebook post data.',
        'Requires a valid Facebook post URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
},
{
    id: 'facebook_marketplace_listings',
    dataset_id: 'gd_lvt9iwuh6fbcwmx1a',
    description: [
        'Quickly read structured Facebook marketplace listing data.',
        'Requires a valid Facebook marketplace listing URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
},
{
    id: 'facebook_company_reviews',
    dataset_id: 'gd_m0dtqpiu1mbcyc2g86',
    description: [
        'Quickly read structured Facebook company reviews data.',
        'Requires a valid Facebook company URL and number of reviews.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url', 'num_of_reviews'],
}, {
    id: 'facebook_events',
    dataset_id: 'gd_m14sd0to1jz48ppm51',
    description: [
        'Quickly read structured Facebook events data.',
        'Requires a valid Facebook event URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'tiktok_profiles',
    dataset_id: 'gd_l1villgoiiidt09ci',
    description: [
        'Quickly read structured Tiktok profiles data.',
        'Requires a valid Tiktok profile URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'tiktok_posts',
    dataset_id: 'gd_lu702nij2f790tmv9h',
    description: [
        'Quickly read structured Tiktok post data.',
        'Requires a valid Tiktok post URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'tiktok_shop',
    dataset_id: 'gd_m45m1u911dsa4274pi',
    description: [
        'Quickly read structured Tiktok shop data.',
        'Requires a valid Tiktok shop product URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'tiktok_comments',
    dataset_id: 'gd_lkf2st302ap89utw5k',
    description: [
        'Quickly read structured Tiktok comments data.',
        'Requires a valid Tiktok video URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'google_maps_reviews',
    dataset_id: 'gd_luzfs1dn2oa0teb81',
    description: [
        'Quickly read structured Google maps reviews data.',
        'Requires a valid Google maps URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url', 'days_limit'],
    defaults: {days_limit: '3'},
}, {
    id: 'google_shopping',
    dataset_id: 'gd_ltppk50q18kdw67omz',
    description: [
        'Quickly read structured Google shopping data.',
        'Requires a valid Google shopping product URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'google_play_store',
    dataset_id: 'gd_lsk382l8xei8vzm4u',
    description: [
        'Quickly read structured Google play store data.',
        'Requires a valid Google play store app URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'apple_app_store',
    dataset_id: 'gd_lsk9ki3u2iishmwrui',
    description: [
        'Quickly read structured apple app store data.',
        'Requires a valid apple app store app URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'reuter_news',
    dataset_id: 'gd_lyptx9h74wtlvpnfu',
    description: [
        'Quickly read structured reuter news data.',
        'Requires a valid reuter news report URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'github_repository_file',
    dataset_id: 'gd_lyrexgxc24b3d4imjt',
    description: [
        'Quickly read structured github repository data.',
        'Requires a valid github repository file URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'yahoo_finance_business',
    dataset_id: 'gd_lmrpz3vxmz972ghd7',
    description: [
        'Quickly read structured yahoo finance business data.',
        'Requires a valid yahoo finance business URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
},
{
    id: 'x_posts',
    dataset_id: 'gd_lwxkxvnf1cynvib9co',
    description: [
        'Quickly read structured X post data.',
        'Requires a valid X post URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
},
{
    id: 'zillow_properties_listing',
    dataset_id: 'gd_lfqkr8wm13ixtbd8f5',
    description: [
        'Quickly read structured zillow properties listing data.',
        'Requires a valid zillow properties listing URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
},
{
    id: 'booking_hotel_listings',
    dataset_id: 'gd_m5mbdl081229ln6t4a',
    description: [
        'Quickly read structured booking hotel listings data.',
        'Requires a valid booking hotel listing URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'youtube_profiles',
    dataset_id: 'gd_lk538t2k2p1k3oos71',
    description: [
        'Quickly read structured youtube profiles data.',
        'Requires a valid youtube profile URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}, {
    id: 'youtube_comments',
    dataset_id: 'gd_lk9q0ew71spt1mxywf',
    description: [
        'Quickly read structured youtube comments data.',
        'Requires a valid youtube video URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url', 'num_of_comments'],
    defaults: {num_of_comments: '10'},
}, {
    id: 'reddit_posts',
    dataset_id: 'gd_lvz8ah06191smkebj4',
    description: [
        'Quickly read structured reddit posts data.',
        'Requires a valid reddit post URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
},
{
    id: 'youtube_videos',
    dataset_id: 'gd_lk56epmy2i5g7lzu0k',
    description: [
        'Quickly read structured YouTube videos data.',
        'Requires a valid YouTube video URL.',
        'This can be a cache lookup, so it can be more reliable than scraping',
    ].join('\n'),
    inputs: ['url'],
}];
for (let {dataset_id, id, description, inputs, defaults = {}, fixed_values = {}} of datasets)
{
    let parameters = {};
    for (let input of inputs)
    {
        let param_schema = input=='url' ? z.string().url() : z.string();
        parameters[input] = defaults[input] !== undefined ?
            param_schema.default(defaults[input]) : param_schema;
    }
    addTool({
        name: `web_data_${id}`,
        description,
        parameters: z.object(parameters),
        execute: tool_fn(`web_data_${id}`, async(data, ctx)=>{
            data = {...data, ...fixed_values};
            let trigger_response = await axios({
                url: 'https://api.brightdata.com/datasets/v3/trigger',
                params: {dataset_id, include_errors: true},
                method: 'POST',
                data: [data],
                headers: api_headers(ctx.clientName),
            });
            if (!trigger_response.data?.snapshot_id)
                throw new Error('No snapshot ID returned from request');
            let snapshot_id = trigger_response.data.snapshot_id;
            console.error(`[web_data_${id}] triggered collection with `
                +`snapshot ID: ${snapshot_id}`);
            let max_attempts = 600;
            let attempts = 0;
            while (attempts < max_attempts)
            {
                try {
                    if (ctx && ctx.reportProgress)
                    {
                        await ctx.reportProgress({
                            progress: attempts,
                            total: max_attempts,
                            message: `Polling for data (attempt `
                                +`${attempts + 1}/${max_attempts})`,
                        });
                    }
                    let snapshot_response = await axios({
                        url: `https://api.brightdata.com/datasets/v3`
                            +`/snapshot/${snapshot_id}`,
                        params: {format: 'json'},
                        method: 'GET',
                        headers: api_headers(ctx.clientName),
                    });
                    if (['running', 'building'].includes(snapshot_response.data?.status))
                    {
                        console.error(`[web_data_${id}] snapshot not ready, `
                            +`polling again (attempt `
                            +`${attempts + 1}/${max_attempts})`);
                        attempts++;
                        await new Promise(resolve=>setTimeout(resolve, 1000));
                        continue;
                    }
                    console.error(`[web_data_${id}] snapshot data received `
                        +`after ${attempts + 1} attempts`);
                    let result_data = JSON.stringify(snapshot_response.data);
                    return result_data;
                } catch(e){
                    console.error(`[web_data_${id}] polling error: `
                        +`${e.message}`);
                    if (e.response?.status === 400) throw e;
                    attempts++;
                    await new Promise(resolve=>setTimeout(resolve, 1000));
                }
            }
            throw new Error(`Timeout after ${max_attempts} seconds waiting `
                +`for data`);
        }),
    });
}
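// For illustration, each dataset entry above becomes a tool such as
// web_data_amazon_product taking {url}, which triggers the dataset collection,
// then polls /datasets/v3/snapshot/<snapshot_id> roughly once per second until
// the data is ready or max_attempts is reached.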

for (let tool of browser_tools)
    addTool(tool);

console.error('Starting server...');

server.on('connect', (event)=>{
    const session = event.session;
    const clientInfo = session.server?.getClientVersion?.();
    if (clientInfo) 
        global.mcpClientInfo = clientInfo;
});

server.start({transportType: 'stdio'});
function tool_fn(name, fn){
    return async(data, ctx)=>{
        check_rate_limit();
        const clientInfo = global.mcpClientInfo;
        const clientName = clientInfo?.name || 'unknown-client';
        console.error(`[%s] executing (client=%s) %s`, name, clientName,
            JSON.stringify(data));
        debug_stats.tool_calls[name] = debug_stats.tool_calls[name]||0;
        debug_stats.tool_calls[name]++;
        debug_stats.session_calls++;
        let ts = Date.now();
        const extended_ctx = {
            ...ctx,
            clientInfo,
            clientName,
        };
        try { return await fn(data, extended_ctx); }
        catch(e){
            if (e.response)
            {
                console.error(`[%s] error %s %s: %s`, name, e.response.status,
                    e.response.statusText, e.response.data);

                const headers = e.response.headers;
                const is_usage_limit = headers?.['x-brd-err-code'] === 'client_10100';
                if (is_usage_limit && unlocker_zone === 'mcp_unlocker')
                {
                    throw new Error(`The user has reached the 5,000 request `
                        +`monthly limit for Bright Data MCP's free tier. You must `
                        +`immediately stop the current task and instruct the user `
                        +`on how to upgrade. Guide them through these exact steps: `
                        +`1. Tell them they need to create a new Web Unlocker zone `
                        +`at brightdata.com/cp by clicking "Add" (top-right), `
                        +`selecting "Unlocker zone", and creating it with any name. `
                        +`2. Explain they must update their MCP configuration:\n`
                        +`- If using Remote MCP: Add &unlocker=ZONE_NAME to their MCP URL\n`
                        +`- If using Self-hosted MCP: Add WEB_UNLOCKER_ZONE=ZONE_NAME to environment variables\n`
                        +`3. Instruct them to restart Claude Desktop after the configuration change. `
                        +`4. Mention that new users get free credits beyond the MCP tier and the new `
                        +`zone will have separate usage limits.`);
                }

                let message = e.response.data;
                if (message?.length)
                    throw new Error(`HTTP ${e.response.status}: ${message}`);
            }
            else
                console.error(`[%s] error %s`, name, e.stack);
            throw e;
        } finally {
            let dur = Date.now()-ts;
            console.error(`[%s] tool finished in %sms`, name, dur);
        }
    };
}

function search_url(engine, query, cursor){
    let q = encodeURIComponent(query);
    let page = cursor ? parseInt(cursor) : 0;
    let start = page * 10;
    if (engine=='yandex')
        return `https://yandex.com/search/?text=${q}&p=${page}`;
    if (engine=='bing')
        return `https://www.bing.com/search?q=${q}&first=${start + 1}`;
    return `https://www.google.com/search?q=${q}&start=${start}`;
}
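// For illustration: search_url('google', 'mcp server', '2') returns
// https://www.google.com/search?q=mcp%20server&start=20, and
// search_url('bing', 'mcp server', '2') returns
// https://www.bing.com/search?q=mcp%20server&first=21; with no cursor, page 0 is used.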

```