# Directory Structure

```
├── .cursor
│   └── rules
│       ├── implementation-plan.mdc
│       └── mcp-development-protocol.mdc
├── .gitignore
├── .venv
│   ├── Include
│   │   └── site
│   │       └── python3.12
│   │           └── greenlet
│   │               └── greenlet.h
│   ├── pyvenv.cfg
│   └── Scripts
│       ├── activate
│       ├── activate.bat
│       ├── Activate.ps1
│       ├── cchardetect
│       ├── crawl4ai-doctor.exe
│       ├── crawl4ai-download-models.exe
│       ├── crawl4ai-migrate.exe
│       ├── crawl4ai-setup.exe
│       ├── crwl.exe
│       ├── deactivate.bat
│       ├── distro.exe
│       ├── docs-scraper.exe
│       ├── dotenv.exe
│       ├── f2py.exe
│       ├── httpx.exe
│       ├── huggingface-cli.exe
│       ├── jsonschema.exe
│       ├── litellm.exe
│       ├── markdown-it.exe
│       ├── mcp.exe
│       ├── nltk.exe
│       ├── normalizer.exe
│       ├── numpy-config.exe
│       ├── openai.exe
│       ├── pip.exe
│       ├── pip3.12.exe
│       ├── pip3.exe
│       ├── playwright.exe
│       ├── py.test.exe
│       ├── pygmentize.exe
│       ├── pytest.exe
│       ├── python.exe
│       ├── pythonw.exe
│       ├── tqdm.exe
│       ├── typer.exe
│       └── uvicorn.exe
├── input_files
│   └── .gitkeep
├── LICENSE
├── pyproject.toml
├── README.md
├── requirements.txt
├── scraped_docs
│   └── .gitkeep
├── src
│   └── docs_scraper
│       ├── __init__.py
│       ├── cli.py
│       ├── crawlers
│       │   ├── __init__.py
│       │   ├── menu_crawler.py
│       │   ├── multi_url_crawler.py
│       │   ├── single_url_crawler.py
│       │   └── sitemap_crawler.py
│       ├── server.py
│       └── utils
│           ├── __init__.py
│           ├── html_parser.py
│           └── request_handler.py
└── tests
    ├── conftest.py
    ├── test_crawlers
    │   ├── test_menu_crawler.py
    │   ├── test_multi_url_crawler.py
    │   ├── test_single_url_crawler.py
    │   └── test_sitemap_crawler.py
    └── test_utils
        ├── test_html_parser.py
        └── test_request_handler.py
```

# Files

--------------------------------------------------------------------------------
/input_files/.gitkeep:
--------------------------------------------------------------------------------

```

```

--------------------------------------------------------------------------------
/scraped_docs/.gitkeep:
--------------------------------------------------------------------------------

```
 
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Virtual Environment
venv/
ENV/
.env

# IDE
.idea/
.vscode/
*.swp
*.swo
.DS_Store

# Scraped Docs - ignore contents but keep directory
scraped_docs/*
!scraped_docs/.gitkeep

# Input Files - ignore contents but keep directory
input_files/*
!input_files/.gitkeep 
```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

`````markdown
# Crawl4AI Documentation Scraper

Keep your dependency documentation lean, current, and AI-ready. This toolkit helps you extract clean, focused documentation from any framework or library website, perfect for both human readers and LLM consumption.

## Why This Tool?

In today's fast-paced development environment, you need:
- 📚 Quick access to dependency documentation without the bloat
- 🤖 Documentation in a format that's ready for RAG systems and LLMs
- 🎯 Focused content without navigation elements, ads, or irrelevant sections
- ⚡ Fast, efficient way to keep documentation up-to-date
- 🧹 Clean Markdown output for easy integration with documentation tools

Traditional web scraping often gives you everything - including navigation menus, footers, ads, and other noise. This toolkit is specifically designed to extract only what matters: the actual documentation content.

### Key Benefits

1. **Clean Documentation Output**
   - Markdown format for content-focused documentation
   - JSON format for structured menu data
   - Perfect for documentation sites, wikis, and knowledge bases
   - Ideal format for LLM training and RAG systems

2. **Smart Content Extraction**
   - Automatically identifies main content areas
   - Strips away navigation, ads, and irrelevant sections
   - Preserves code blocks and technical formatting
   - Maintains proper Markdown structure

3. **Flexible Crawling Strategies**
   - Single page for quick reference docs
   - Multi-page for comprehensive library documentation
   - Sitemap-based for complete framework coverage
   - Menu-based for structured documentation hierarchies

4. **LLM and RAG Ready**
   - Clean Markdown text suitable for embeddings
   - Preserved code blocks for technical accuracy
   - Structured menu data in JSON format
   - Consistent formatting for reliable processing

This is a comprehensive Python toolkit for scraping documentation websites using different crawling strategies, built on the Crawl4AI library for efficient web crawling.

[![Powered by Crawl4AI](https://img.shields.io/badge/Powered%20by-Crawl4AI-blue?style=flat-square)](https://github.com/unclecode/crawl4ai)

## Features

### Core Features
- 🚀 Multiple crawling strategies
- 📑 Automatic nested menu expansion
- 🔄 Handles dynamic content and lazy-loaded elements
- 🎯 Configurable selectors
- 📝 Clean Markdown output for documentation
- 📊 JSON output for menu structure
- 🎨 Colorful terminal feedback
- 🔍 Smart URL processing
- ⚡ Asynchronous execution

### Available Crawlers
1. **Single URL Crawler** (`single_url_crawler.py`)
   - Extracts content from a single documentation page
   - Outputs clean Markdown format
   - Perfect for targeted content extraction
   - Configurable content selectors

2. **Multi URL Crawler** (`multi_url_crawler.py`)
   - Processes multiple URLs in parallel
   - Generates individual Markdown files per page
   - Efficient batch processing
   - Shared browser session for better performance

3. **Sitemap Crawler** (`sitemap_crawler.py`)
   - Automatically discovers and crawls sitemap.xml
   - Creates Markdown files for each page
   - Supports recursive sitemap parsing
   - Handles gzipped sitemaps

4. **Menu Crawler** (`menu_crawler.py`)
   - Extracts all menu links from documentation
   - Outputs structured JSON format
   - Handles nested and dynamic menus
   - Smart menu expansion

## Requirements

- Python 3.7+
- Virtual Environment (recommended)

## Installation

1. Clone the repository:
```bash
git clone https://github.com/felores/crawl4ai_docs_scraper.git
cd crawl4ai_docs_scraper
```

2. Create and activate a virtual environment:
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. Install dependencies:
```bash
pip install -r requirements.txt
```

## Usage

### 1. Single URL Crawler

```bash
python single_url_crawler.py https://docs.example.com/page
```

Arguments:
- URL: Target documentation URL (required, first argument)

Note: Use quotes only if your URL contains special characters or spaces.

Output format (Markdown):
````markdown
# Page Title

## Section 1
Content with preserved formatting, including:
- Lists
- Links
- Tables

### Code Examples
```python
def example():
    return "Code blocks are preserved"
```
````

### 2. Multi URL Crawler

```bash
# Using a text file with URLs
python multi_url_crawler.py urls.txt

# Using JSON output from menu crawler
python multi_url_crawler.py menu_links.json

# Using custom output prefix
python multi_url_crawler.py menu_links.json --output-prefix custom_name
```

Arguments:
- URLs file: Path to file containing URLs (required, first argument)
  - Can be .txt with one URL per line
  - Or .json from menu crawler output
- `--output-prefix`: Custom prefix for output markdown file (optional)

Note: Use quotes only if your file path contains spaces.

Output filename format:
- Without `--output-prefix`: `domain_path_docs_content_timestamp.md` (e.g., `cloudflare_agents_docs_content_20240323_223656.md`)
- With `--output-prefix`: `custom_prefix_docs_content_timestamp.md` (e.g., `custom_name_docs_content_20240323_223656.md`)

The crawler accepts two types of input files:
1. Text file with one URL per line:
```text
https://docs.example.com/page1
https://docs.example.com/page2
https://docs.example.com/page3
```

2. JSON file (compatible with menu crawler output):
```json
{
    "menu_links": [
        "https://docs.example.com/page1",
        "https://docs.example.com/page2"
    ]
}
```

### 3. Sitemap Crawler

```bash
python sitemap_crawler.py https://docs.example.com/sitemap.xml
```

Options:
- `--max-depth`: Maximum sitemap recursion depth (optional)
- `--patterns`: URL patterns to include (optional)

### 4. Menu Crawler

```bash
python menu_crawler.py https://docs.example.com
```

Options:
- `--selectors`: Custom menu selectors (optional)

The menu crawler now saves its output to the `input_files` directory, making it ready for immediate use with the multi-url crawler. The output JSON has this format:
```json
{
    "start_url": "https://docs.example.com/",
    "total_links_found": 42,
    "menu_links": [
        "https://docs.example.com/page1",
        "https://docs.example.com/page2"
    ]
}
```

After running the menu crawler, you'll get a command to run the multi-url crawler with the generated file.
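
If you want to script this hand-off yourself instead of copying the printed command, the JSON shape above is all you need. A minimal sketch (the `input_files/menu_links.json` filename is an assumed example; the menu crawler writes a timestamped file):

```python
import json
from pathlib import Path

# Assumed example path; adjust to the timestamped file the menu crawler actually wrote.
menu_file = Path("input_files") / "menu_links.json"

data = json.loads(menu_file.read_text(encoding="utf-8"))
urls = data["menu_links"]  # list of absolute documentation URLs

print(f"Found {len(urls)} links starting from {data['start_url']}")
# The same file can be passed straight to the multi URL crawler:
#   python multi_url_crawler.py input_files/menu_links.json
```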

## Directory Structure

```
crawl4ai_docs_scraper/
├── input_files/           # Input files for URL processing
│   ├── urls.txt          # Text file with URLs
│   └── menu_links.json   # JSON output from menu crawler
├── scraped_docs/         # Output directory for markdown files
│   └── docs_timestamp.md # Generated documentation
├── multi_url_crawler.py
├── menu_crawler.py
└── requirements.txt
```

## Error Handling

All crawlers include comprehensive error handling with colored terminal output:
- 🟢 Green: Success messages
- 🔵 Cyan: Processing status
- 🟡 Yellow: Warnings
- 🔴 Red: Error messages

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Attribution

This project uses [Crawl4AI](https://github.com/unclecode/crawl4ai) for web data extraction.

## Acknowledgments

- Built with [Crawl4AI](https://github.com/unclecode/crawl4ai)
- Uses [termcolor](https://pypi.org/project/termcolor/) for colorful terminal output
`````

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------

```
crawl4ai
aiohttp
termcolor
playwright
```

--------------------------------------------------------------------------------
/src/docs_scraper/utils/__init__.py:
--------------------------------------------------------------------------------

```python
"""
Utility modules for web crawling and HTML parsing.
"""
from .request_handler import RequestHandler
from .html_parser import HTMLParser

__all__ = [
    'RequestHandler',
    'HTMLParser'
] 
```

--------------------------------------------------------------------------------
/src/docs_scraper/__init__.py:
--------------------------------------------------------------------------------

```python
"""
Documentation scraper MCP server package.
"""
# Import subpackages but not modules to avoid circular imports
from . import crawlers
from . import utils

# Expose important items at package level
__all__ = ['crawlers', 'utils'] 
```

--------------------------------------------------------------------------------
/src/docs_scraper/crawlers/__init__.py:
--------------------------------------------------------------------------------

```python
"""
Web crawler implementations for documentation scraping.
"""
from .single_url_crawler import SingleURLCrawler
from .multi_url_crawler import MultiURLCrawler
from .sitemap_crawler import SitemapCrawler
from .menu_crawler import MenuCrawler

__all__ = [
    'SingleURLCrawler',
    'MultiURLCrawler',
    'SitemapCrawler',
    'MenuCrawler'
] 
```

--------------------------------------------------------------------------------
/.venv/Scripts/deactivate.bat:
--------------------------------------------------------------------------------

```
@echo off

if defined _OLD_VIRTUAL_PROMPT (
    set "PROMPT=%_OLD_VIRTUAL_PROMPT%"
)
set _OLD_VIRTUAL_PROMPT=

if defined _OLD_VIRTUAL_PYTHONHOME (
    set "PYTHONHOME=%_OLD_VIRTUAL_PYTHONHOME%"
    set _OLD_VIRTUAL_PYTHONHOME=
)

if defined _OLD_VIRTUAL_PATH (
    set "PATH=%_OLD_VIRTUAL_PATH%"
)

set _OLD_VIRTUAL_PATH=

set VIRTUAL_ENV=
set VIRTUAL_ENV_PROMPT=

:END

```

--------------------------------------------------------------------------------
/src/docs_scraper/cli.py:
--------------------------------------------------------------------------------

```python
"""
Command line interface for the docs_scraper package.
"""
import logging

def main():
    """Entry point for the package when run from the command line."""
    from docs_scraper.server import main as server_main
    
    # Configure logging
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
    )
    
    # Run the server
    server_main()

if __name__ == "__main__":
    main() 
```
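
For reference, this `main()` is the function that the `docs-scraper` console script declared in `pyproject.toml` (`docs_scraper.cli:main`) resolves to. A minimal sketch of calling it directly, assuming the package has been installed (for example with `pip install -e .`):

```python
# Equivalent to running the installed `docs-scraper` command.
from docs_scraper.cli import main

main()  # configures INFO-level logging, then starts the MCP server via docs_scraper.server.main
```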

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "docs_scraper"
version = "0.1.0"
authors = [
    { name = "Your Name", email = "[email protected]" }
]
description = "A documentation scraping tool"
requires-python = ">=3.7"
dependencies = [
    "beautifulsoup4",
    "requests",
    "aiohttp",
    "lxml",
    "termcolor",
    "crawl4ai"
]
classifiers = [
    "Programming Language :: Python :: 3",
    "Operating System :: OS Independent",
]

[project.optional-dependencies]
test = [
    "pytest",
    "pytest-asyncio",
    "aioresponses"
]

[project.scripts]
docs-scraper = "docs_scraper.cli:main"

[tool.setuptools.packages.find]
where = ["src"]
include = ["docs_scraper*"]
namespaces = false

[tool.hatch.build]
packages = ["src/docs_scraper"] 
```

--------------------------------------------------------------------------------
/.venv/Scripts/activate.bat:
--------------------------------------------------------------------------------

```
@echo off

rem This file is UTF-8 encoded, so we need to update the current code page while executing it
for /f "tokens=2 delims=:." %%a in ('"%SystemRoot%\System32\chcp.com"') do (
    set _OLD_CODEPAGE=%%a
)
if defined _OLD_CODEPAGE (
    "%SystemRoot%\System32\chcp.com" 65001 > nul
)

set "VIRTUAL_ENV=D:\AI-DEV\mcp\docs_scraper_mcp\.venv"

if not defined PROMPT set PROMPT=$P$G

if defined _OLD_VIRTUAL_PROMPT set PROMPT=%_OLD_VIRTUAL_PROMPT%
if defined _OLD_VIRTUAL_PYTHONHOME set PYTHONHOME=%_OLD_VIRTUAL_PYTHONHOME%

set _OLD_VIRTUAL_PROMPT=%PROMPT%
set PROMPT=(.venv) %PROMPT%

if defined PYTHONHOME set _OLD_VIRTUAL_PYTHONHOME=%PYTHONHOME%
set PYTHONHOME=

if defined _OLD_VIRTUAL_PATH set PATH=%_OLD_VIRTUAL_PATH%
if not defined _OLD_VIRTUAL_PATH set _OLD_VIRTUAL_PATH=%PATH%

set "PATH=%VIRTUAL_ENV%\Scripts;%PATH%"
set "VIRTUAL_ENV_PROMPT=(.venv) "

:END
if defined _OLD_CODEPAGE (
    "%SystemRoot%\System32\chcp.com" %_OLD_CODEPAGE% > nul
    set _OLD_CODEPAGE=
)

```

--------------------------------------------------------------------------------
/tests/conftest.py:
--------------------------------------------------------------------------------

```python
"""
Test configuration and fixtures for the docs_scraper package.
"""
import os
import pytest
import aiohttp
from typing import AsyncGenerator, Dict, Any
from aioresponses import aioresponses
from bs4 import BeautifulSoup

@pytest.fixture
def mock_aiohttp() -> aioresponses:
    """Fixture for mocking aiohttp requests."""
    with aioresponses() as m:
        yield m

@pytest.fixture
def sample_html() -> str:
    """Sample HTML content for testing."""
    return """
    <!DOCTYPE html>
    <html>
    <head>
        <title>Test Page</title>
        <meta name="description" content="Test description">
    </head>
    <body>
        <nav class="menu">
            <ul>
                <li><a href="/page1">Page 1</a></li>
                <li>
                    <a href="/section1">Section 1</a>
                    <ul>
                        <li><a href="/section1/page1">Section 1.1</a></li>
                        <li><a href="/section1/page2">Section 1.2</a></li>
                    </ul>
                </li>
            </ul>
        </nav>
        <main>
            <h1>Welcome</h1>
            <p>Test content</p>
            <a href="/test1">Link 1</a>
            <a href="/test2">Link 2</a>
        </main>
    </body>
    </html>
    """

@pytest.fixture
def sample_sitemap() -> str:
    """Sample sitemap.xml content for testing."""
    return """<?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <url>
            <loc>https://example.com/</loc>
            <lastmod>2024-03-24</lastmod>
        </url>
        <url>
            <loc>https://example.com/page1</loc>
            <lastmod>2024-03-24</lastmod>
        </url>
        <url>
            <loc>https://example.com/page2</loc>
            <lastmod>2024-03-24</lastmod>
        </url>
    </urlset>
    """

@pytest.fixture
def mock_website(mock_aiohttp, sample_html, sample_sitemap) -> None:
    """Set up a mock website with various pages and a sitemap."""
    base_url = "https://example.com"
    pages = {
        "/": sample_html,
        "/page1": sample_html.replace("Test Page", "Page 1"),
        "/page2": sample_html.replace("Test Page", "Page 2"),
        "/section1": sample_html.replace("Test Page", "Section 1"),
        "/section1/page1": sample_html.replace("Test Page", "Section 1.1"),
        "/section1/page2": sample_html.replace("Test Page", "Section 1.2"),
        "/robots.txt": "User-agent: *\nAllow: /",
        "/sitemap.xml": sample_sitemap
    }
    
    for path, content in pages.items():
        mock_aiohttp.get(f"{base_url}{path}", status=200, body=content, repeat=True)  # repeat=True so tests can hit the same URL more than once

@pytest.fixture
async def aiohttp_session() -> AsyncGenerator[aiohttp.ClientSession, None]:
    """Create an aiohttp ClientSession for testing."""
    async with aiohttp.ClientSession() as session:
        yield session

@pytest.fixture
def test_urls() -> Dict[str, Any]:
    """Test URLs and related data for testing."""
    base_url = "https://example.com"
    return {
        "base_url": base_url,
        "valid_urls": [
            f"{base_url}/",
            f"{base_url}/page1",
            f"{base_url}/page2"
        ],
        "invalid_urls": [
            "not_a_url",
            "ftp://example.com",
            "https://nonexistent.example.com"
        ],
        "menu_selector": "nav.menu",
        "sitemap_url": f"{base_url}/sitemap.xml"
    } 
```
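
The fixtures above compose: `mock_website` registers canned responses with `aioresponses`, so any request made through `aiohttp_session` for the example.com URLs is intercepted in-process. A minimal sketch of a test using them (a hypothetical test, not part of the suite):

```python
import pytest

@pytest.mark.asyncio
async def test_mock_website_serves_sample_html(mock_website, aiohttp_session, test_urls):
    # mock_website registered https://example.com/* with aioresponses,
    # so this request never leaves the process.
    async with aiohttp_session.get(test_urls["base_url"] + "/") as response:
        assert response.status == 200
        assert "Test Page" in await response.text()
```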

--------------------------------------------------------------------------------
/tests/test_crawlers/test_single_url_crawler.py:
--------------------------------------------------------------------------------

```python
"""
Tests for the SingleURLCrawler class.
"""
import pytest
from docs_scraper.crawlers import SingleURLCrawler
from docs_scraper.utils import RequestHandler, HTMLParser

@pytest.mark.asyncio
async def test_single_url_crawler_successful_crawl(mock_website, test_urls, aiohttp_session):
    """Test successful crawling of a single URL."""
    url = test_urls["valid_urls"][0]
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = SingleURLCrawler(request_handler=request_handler, html_parser=html_parser)
    
    result = await crawler.crawl(url)
    
    assert result["success"] is True
    assert result["url"] == url
    assert "content" in result
    assert "title" in result["metadata"]
    assert "description" in result["metadata"]
    assert len(result["links"]) > 0
    assert result["status_code"] == 200
    assert result["error"] is None

@pytest.mark.asyncio
async def test_single_url_crawler_invalid_url(mock_website, test_urls, aiohttp_session):
    """Test crawling with an invalid URL."""
    url = test_urls["invalid_urls"][0]
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = SingleURLCrawler(request_handler=request_handler, html_parser=html_parser)
    
    result = await crawler.crawl(url)
    
    assert result["success"] is False
    assert result["url"] == url
    assert result["content"] is None
    assert result["metadata"] == {}
    assert result["links"] == []
    assert result["error"] is not None

@pytest.mark.asyncio
async def test_single_url_crawler_nonexistent_url(mock_website, test_urls, aiohttp_session):
    """Test crawling a URL that doesn't exist."""
    url = test_urls["invalid_urls"][2]
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = SingleURLCrawler(request_handler=request_handler, html_parser=html_parser)
    
    result = await crawler.crawl(url)
    
    assert result["success"] is False
    assert result["url"] == url
    assert result["content"] is None
    assert result["metadata"] == {}
    assert result["links"] == []
    assert result["error"] is not None

@pytest.mark.asyncio
async def test_single_url_crawler_metadata_extraction(mock_website, test_urls, aiohttp_session):
    """Test extraction of metadata from a crawled page."""
    url = test_urls["valid_urls"][0]
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = SingleURLCrawler(request_handler=request_handler, html_parser=html_parser)
    
    result = await crawler.crawl(url)
    
    assert result["success"] is True
    assert result["metadata"]["title"] == "Test Page"
    assert result["metadata"]["description"] == "Test description"

@pytest.mark.asyncio
async def test_single_url_crawler_link_extraction(mock_website, test_urls, aiohttp_session):
    """Test extraction of links from a crawled page."""
    url = test_urls["valid_urls"][0]
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = SingleURLCrawler(request_handler=request_handler, html_parser=html_parser)
    
    result = await crawler.crawl(url)
    
    assert result["success"] is True
    assert len(result["links"]) >= 6  # Number of links in sample HTML
    assert "/page1" in result["links"]
    assert "/section1" in result["links"]
    assert "/test1" in result["links"]
    assert "/test2" in result["links"]

@pytest.mark.asyncio
async def test_single_url_crawler_rate_limiting(mock_website, test_urls, aiohttp_session):
    """Test rate limiting functionality."""
    url = test_urls["valid_urls"][0]
    request_handler = RequestHandler(session=aiohttp_session, rate_limit=1)  # 1 request per second
    html_parser = HTMLParser()
    crawler = SingleURLCrawler(request_handler=request_handler, html_parser=html_parser)
    
    import time
    start_time = time.time()
    
    # Make multiple requests
    for _ in range(3):
        result = await crawler.crawl(url)
        assert result["success"] is True
    
    end_time = time.time()
    elapsed_time = end_time - start_time
    
    # Should take at least 2 seconds due to rate limiting
    assert elapsed_time >= 2.0 
```

--------------------------------------------------------------------------------
/tests/test_crawlers/test_multi_url_crawler.py:
--------------------------------------------------------------------------------

```python
"""
Tests for the MultiURLCrawler class.
"""
import pytest
from docs_scraper.crawlers import MultiURLCrawler
from docs_scraper.utils import RequestHandler, HTMLParser

@pytest.mark.asyncio
async def test_multi_url_crawler_successful_crawl(mock_website, test_urls, aiohttp_session):
    """Test successful crawling of multiple URLs."""
    urls = test_urls["valid_urls"]
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = MultiURLCrawler(request_handler=request_handler, html_parser=html_parser)
    
    results = await crawler.crawl(urls)
    
    assert len(results) == len(urls)
    for result, url in zip(results, urls):
        assert result["success"] is True
        assert result["url"] == url
        assert "content" in result
        assert "title" in result["metadata"]
        assert "description" in result["metadata"]
        assert len(result["links"]) > 0
        assert result["status_code"] == 200
        assert result["error"] is None

@pytest.mark.asyncio
async def test_multi_url_crawler_mixed_urls(mock_website, test_urls, aiohttp_session):
    """Test crawling a mix of valid and invalid URLs."""
    urls = test_urls["valid_urls"][:1] + test_urls["invalid_urls"][:1]
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = MultiURLCrawler(request_handler=request_handler, html_parser=html_parser)
    
    results = await crawler.crawl(urls)
    
    assert len(results) == len(urls)
    # Valid URL
    assert results[0]["success"] is True
    assert results[0]["url"] == urls[0]
    assert "content" in results[0]
    # Invalid URL
    assert results[1]["success"] is False
    assert results[1]["url"] == urls[1]
    assert results[1]["content"] is None

@pytest.mark.asyncio
async def test_multi_url_crawler_concurrent_limit(mock_website, test_urls, aiohttp_session):
    """Test concurrent request limiting."""
    urls = test_urls["valid_urls"] * 2  # Duplicate URLs to have more requests
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = MultiURLCrawler(
        request_handler=request_handler,
        html_parser=html_parser,
        concurrent_limit=2
    )
    
    import time
    start_time = time.time()
    
    results = await crawler.crawl(urls)
    
    end_time = time.time()
    elapsed_time = end_time - start_time
    
    assert len(results) == len(urls)
    # With concurrent_limit=2, processing 6 URLs should take at least 3 time units
    assert elapsed_time >= (len(urls) / 2) * 0.1  # Assuming each request takes ~0.1s

@pytest.mark.asyncio
async def test_multi_url_crawler_empty_urls(mock_website, aiohttp_session):
    """Test crawling with empty URL list."""
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = MultiURLCrawler(request_handler=request_handler, html_parser=html_parser)
    
    results = await crawler.crawl([])
    
    assert len(results) == 0

@pytest.mark.asyncio
async def test_multi_url_crawler_duplicate_urls(mock_website, test_urls, aiohttp_session):
    """Test crawling with duplicate URLs."""
    url = test_urls["valid_urls"][0]
    urls = [url, url, url]  # Same URL multiple times
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = MultiURLCrawler(request_handler=request_handler, html_parser=html_parser)
    
    results = await crawler.crawl(urls)
    
    assert len(results) == len(urls)
    for result in results:
        assert result["success"] is True
        assert result["url"] == url
        assert result["metadata"]["title"] == "Test Page"

@pytest.mark.asyncio
async def test_multi_url_crawler_rate_limiting(mock_website, test_urls, aiohttp_session):
    """Test rate limiting with multiple URLs."""
    urls = test_urls["valid_urls"]
    request_handler = RequestHandler(session=aiohttp_session, rate_limit=1)  # 1 request per second
    html_parser = HTMLParser()
    crawler = MultiURLCrawler(request_handler=request_handler, html_parser=html_parser)
    
    import time
    start_time = time.time()
    
    results = await crawler.crawl(urls)
    
    end_time = time.time()
    elapsed_time = end_time - start_time
    
    assert len(results) == len(urls)
    # Should take at least (len(urls) - 1) seconds due to rate limiting
    assert elapsed_time >= len(urls) - 1 
```

--------------------------------------------------------------------------------
/.venv/Include/site/python3.12/greenlet/greenlet.h:
--------------------------------------------------------------------------------

```
/* -*- indent-tabs-mode: nil; tab-width: 4; -*- */

/* Greenlet object interface */

#ifndef Py_GREENLETOBJECT_H
#define Py_GREENLETOBJECT_H


#include <Python.h>

#ifdef __cplusplus
extern "C" {
#endif

/* This is deprecated and undocumented. It does not change. */
#define GREENLET_VERSION "1.0.0"

#ifndef GREENLET_MODULE
#define implementation_ptr_t void*
#endif

typedef struct _greenlet {
    PyObject_HEAD
    PyObject* weakreflist;
    PyObject* dict;
    implementation_ptr_t pimpl;
} PyGreenlet;

#define PyGreenlet_Check(op) (op && PyObject_TypeCheck(op, &PyGreenlet_Type))


/* C API functions */

/* Total number of symbols that are exported */
#define PyGreenlet_API_pointers 12

#define PyGreenlet_Type_NUM 0
#define PyExc_GreenletError_NUM 1
#define PyExc_GreenletExit_NUM 2

#define PyGreenlet_New_NUM 3
#define PyGreenlet_GetCurrent_NUM 4
#define PyGreenlet_Throw_NUM 5
#define PyGreenlet_Switch_NUM 6
#define PyGreenlet_SetParent_NUM 7

#define PyGreenlet_MAIN_NUM 8
#define PyGreenlet_STARTED_NUM 9
#define PyGreenlet_ACTIVE_NUM 10
#define PyGreenlet_GET_PARENT_NUM 11

#ifndef GREENLET_MODULE
/* This section is used by modules that uses the greenlet C API */
static void** _PyGreenlet_API = NULL;

#    define PyGreenlet_Type \
        (*(PyTypeObject*)_PyGreenlet_API[PyGreenlet_Type_NUM])

#    define PyExc_GreenletError \
        ((PyObject*)_PyGreenlet_API[PyExc_GreenletError_NUM])

#    define PyExc_GreenletExit \
        ((PyObject*)_PyGreenlet_API[PyExc_GreenletExit_NUM])

/*
 * PyGreenlet_New(PyObject *args)
 *
 * greenlet.greenlet(run, parent=None)
 */
#    define PyGreenlet_New                                        \
        (*(PyGreenlet * (*)(PyObject * run, PyGreenlet * parent)) \
             _PyGreenlet_API[PyGreenlet_New_NUM])

/*
 * PyGreenlet_GetCurrent(void)
 *
 * greenlet.getcurrent()
 */
#    define PyGreenlet_GetCurrent \
        (*(PyGreenlet * (*)(void)) _PyGreenlet_API[PyGreenlet_GetCurrent_NUM])

/*
 * PyGreenlet_Throw(
 *         PyGreenlet *greenlet,
 *         PyObject *typ,
 *         PyObject *val,
 *         PyObject *tb)
 *
 * g.throw(...)
 */
#    define PyGreenlet_Throw                 \
        (*(PyObject * (*)(PyGreenlet * self, \
                          PyObject * typ,    \
                          PyObject * val,    \
                          PyObject * tb))    \
             _PyGreenlet_API[PyGreenlet_Throw_NUM])

/*
 * PyGreenlet_Switch(PyGreenlet *greenlet, PyObject *args)
 *
 * g.switch(*args, **kwargs)
 */
#    define PyGreenlet_Switch                                              \
        (*(PyObject *                                                      \
           (*)(PyGreenlet * greenlet, PyObject * args, PyObject * kwargs)) \
             _PyGreenlet_API[PyGreenlet_Switch_NUM])

/*
 * PyGreenlet_SetParent(PyObject *greenlet, PyObject *new_parent)
 *
 * g.parent = new_parent
 */
#    define PyGreenlet_SetParent                                 \
        (*(int (*)(PyGreenlet * greenlet, PyGreenlet * nparent)) \
             _PyGreenlet_API[PyGreenlet_SetParent_NUM])

/*
 * PyGreenlet_GetParent(PyObject* greenlet)
 *
 * return greenlet.parent;
 *
 * This could return NULL even if there is no exception active.
 * If it does not return NULL, you are responsible for decrementing the
 * reference count.
 */
#     define PyGreenlet_GetParent                                    \
    (*(PyGreenlet* (*)(PyGreenlet*))                                 \
     _PyGreenlet_API[PyGreenlet_GET_PARENT_NUM])

/*
 * deprecated, undocumented alias.
 */
#     define PyGreenlet_GET_PARENT PyGreenlet_GetParent

#     define PyGreenlet_MAIN                                         \
    (*(int (*)(PyGreenlet*))                                         \
     _PyGreenlet_API[PyGreenlet_MAIN_NUM])

#     define PyGreenlet_STARTED                                      \
    (*(int (*)(PyGreenlet*))                                         \
     _PyGreenlet_API[PyGreenlet_STARTED_NUM])

#     define PyGreenlet_ACTIVE                                       \
    (*(int (*)(PyGreenlet*))                                         \
     _PyGreenlet_API[PyGreenlet_ACTIVE_NUM])




/* Macro that imports greenlet and initializes C API */
/* NOTE: This has actually moved to ``greenlet._greenlet._C_API``, but we
   keep the older definition to be sure older code that might have a copy of
   the header still works. */
#    define PyGreenlet_Import()                                               \
        {                                                                     \
            _PyGreenlet_API = (void**)PyCapsule_Import("greenlet._C_API", 0); \
        }

#endif /* GREENLET_MODULE */

#ifdef __cplusplus
}
#endif
#endif /* !Py_GREENLETOBJECT_H */

```

--------------------------------------------------------------------------------
/src/docs_scraper/utils/html_parser.py:
--------------------------------------------------------------------------------

```python
"""
HTML parser module for extracting content and links from HTML documents.
"""
from typing import List, Dict, Any, Optional
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

class HTMLParser:
    def __init__(self, base_url: str):
        """
        Initialize the HTML parser.
        
        Args:
            base_url: Base URL for resolving relative links
        """
        self.base_url = base_url

    def parse_content(self, html: str) -> Dict[str, Any]:
        """
        Parse HTML content and extract useful information.
        
        Args:
            html: Raw HTML content
            
        Returns:
            Dict containing:
                - title: Page title
                - description: Meta description
                - text_content: Main text content
                - links: List of links found
                - headers: List of headers found
        """
        soup = BeautifulSoup(html, 'lxml')
        
        # Extract title
        title = soup.title.string if soup.title else None
        
        # Extract meta description
        meta_desc = None
        meta_tag = soup.find('meta', attrs={'name': 'description'})
        if meta_tag:
            meta_desc = meta_tag.get('content')
        
        # Extract main content (remove script, style, etc.)
        for tag in soup(['script', 'style', 'nav', 'footer', 'header']):
            tag.decompose()
        
        # Get text content
        text_content = ' '.join(soup.stripped_strings)
        
        # Extract headers
        headers = []
        for tag in soup.find_all(['h1', 'h2', 'h3', 'h4', 'h5', 'h6']):
            headers.append({
                'level': int(tag.name[1]),
                'text': tag.get_text(strip=True)
            })
        
        # Extract links
        links = self._extract_links(soup)
        
        return {
            'title': title,
            'description': meta_desc,
            'text_content': text_content,
            'links': links,
            'headers': headers
        }

    def parse_menu(self, html: str, menu_selector: str) -> List[Dict[str, Any]]:
        """
        Parse navigation menu from HTML using a CSS selector.
        
        Args:
            html: Raw HTML content
            menu_selector: CSS selector for the menu element
            
        Returns:
            List of menu items with their structure
        """
        soup = BeautifulSoup(html, 'lxml')
        menu = soup.select_one(menu_selector)
        
        if not menu:
            return []
            
        return self._extract_menu_items(menu)

    def _extract_links(self, soup: BeautifulSoup) -> List[Dict[str, str]]:
        """Extract and normalize all links from the document."""
        links = []
        for a in soup.find_all('a', href=True):
            href = a['href']
            text = a.get_text(strip=True)
            
            # Skip empty or javascript links
            if not href or href.startswith(('javascript:', '#')):
                continue
                
            # Resolve relative URLs
            absolute_url = urljoin(self.base_url, href)
            
            # Only include links to the same domain
            if urlparse(absolute_url).netloc == urlparse(self.base_url).netloc:
                links.append({
                    'url': absolute_url,
                    'text': text
                })
                
        return links

    def _extract_menu_items(self, element: BeautifulSoup) -> List[Dict[str, Any]]:
        """Recursively extract menu structure."""
        items = []
        
        for item in element.find_all(['li', 'a'], recursive=False):
            if item.name == 'a':
                # Single link item
                href = item.get('href')
                if href and not href.startswith(('javascript:', '#')):
                    items.append({
                        'type': 'link',
                        'url': urljoin(self.base_url, href),
                        'text': item.get_text(strip=True)
                    })
            else:
                # Potentially nested menu item
                link = item.find('a')
                if link and link.get('href'):
                    menu_item = {
                        'type': 'menu',
                        'text': link.get_text(strip=True),
                        'url': urljoin(self.base_url, link['href']),
                        'children': []
                    }
                    
                    # Look for nested lists
                    nested = item.find(['ul', 'ol'])
                    if nested:
                        menu_item['children'] = self._extract_menu_items(nested)
                        
                    items.append(menu_item)
                    
        return items 
```
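
A minimal usage sketch of the parser above (the HTML snippet and URLs are invented examples):

```python
from docs_scraper.utils import HTMLParser

html = """
<html>
  <head><title>Demo</title><meta name="description" content="Example page"></head>
  <body>
    <nav class="menu"><ul><li><a href="/guide">Guide</a></li></ul></nav>
    <main><h1>Demo</h1><p>Hello world</p><a href="/guide">Guide</a></main>
  </body>
</html>
"""

parser = HTMLParser(base_url="https://docs.example.com")

page = parser.parse_content(html)
print(page["title"])    # Demo
print(page["headers"])  # [{'level': 1, 'text': 'Demo'}]
print(page["links"])    # [{'url': 'https://docs.example.com/guide', 'text': 'Guide'}]

# _extract_menu_items walks direct <li>/<a> children, so select the <ul> itself.
menu = parser.parse_menu(html, "nav.menu ul")
print(menu)  # [{'type': 'menu', 'text': 'Guide', 'url': 'https://docs.example.com/guide', 'children': []}]
```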

--------------------------------------------------------------------------------
/tests/test_utils/test_request_handler.py:
--------------------------------------------------------------------------------

```python
"""
Tests for the RequestHandler class.
"""
import asyncio
import pytest
import aiohttp
import time
from docs_scraper.utils import RequestHandler

@pytest.mark.asyncio
async def test_request_handler_successful_get(mock_website, test_urls, aiohttp_session):
    """Test successful GET request."""
    url = test_urls["valid_urls"][0]
    handler = RequestHandler(session=aiohttp_session)
    
    response = await handler.get(url)
    
    assert response.status == 200
    assert "<!DOCTYPE html>" in await response.text()

@pytest.mark.asyncio
async def test_request_handler_invalid_url(mock_website, test_urls, aiohttp_session):
    """Test handling of invalid URL."""
    url = test_urls["invalid_urls"][0]
    handler = RequestHandler(session=aiohttp_session)
    
    with pytest.raises(aiohttp.ClientError):
        await handler.get(url)

@pytest.mark.asyncio
async def test_request_handler_nonexistent_url(mock_website, test_urls, aiohttp_session):
    """Test handling of nonexistent URL."""
    url = test_urls["invalid_urls"][2]
    handler = RequestHandler(session=aiohttp_session)
    
    with pytest.raises(aiohttp.ClientError):
        await handler.get(url)

@pytest.mark.asyncio
async def test_request_handler_rate_limiting(mock_website, test_urls, aiohttp_session):
    """Test rate limiting functionality."""
    url = test_urls["valid_urls"][0]
    rate_limit = 2  # 2 requests per second
    handler = RequestHandler(session=aiohttp_session, rate_limit=rate_limit)
    
    start_time = time.time()
    
    # Make multiple requests
    for _ in range(3):
        response = await handler.get(url)
        assert response.status == 200
    
    end_time = time.time()
    elapsed_time = end_time - start_time
    
    # Should take at least 1 second due to rate limiting
    assert elapsed_time >= 1.0

@pytest.mark.asyncio
async def test_request_handler_custom_headers(mock_website, test_urls, aiohttp_session):
    """Test custom headers in requests."""
    url = test_urls["valid_urls"][0]
    custom_headers = {
        "User-Agent": "Custom Bot 1.0",
        "Accept-Language": "en-US,en;q=0.9"
    }
    handler = RequestHandler(session=aiohttp_session, headers=custom_headers)
    
    response = await handler.get(url)
    
    assert response.status == 200
    # Headers should be merged with default headers
    assert handler.headers["User-Agent"] == "Custom Bot 1.0"
    assert handler.headers["Accept-Language"] == "en-US,en;q=0.9"

@pytest.mark.asyncio
async def test_request_handler_timeout(mock_website, test_urls, aiohttp_session):
    """Test request timeout handling."""
    url = test_urls["valid_urls"][0]
    handler = RequestHandler(session=aiohttp_session, timeout=0.001)  # Very short timeout
    
    # Mock a delayed response
    mock_website.get(url, status=200, body="Delayed response", delay=0.1)
    
    with pytest.raises(aiohttp.ClientTimeout):
        await handler.get(url)

@pytest.mark.asyncio
async def test_request_handler_retry(mock_website, test_urls, aiohttp_session):
    """Test request retry functionality."""
    url = test_urls["valid_urls"][0]
    handler = RequestHandler(session=aiohttp_session, max_retries=3)
    
    # Mock temporary failures followed by success
    mock_website.get(url, status=500)  # First attempt fails
    mock_website.get(url, status=500)  # Second attempt fails
    mock_website.get(url, status=200, body="Success")  # Third attempt succeeds
    
    response = await handler.get(url)
    
    assert response.status == 200
    assert await response.text() == "Success"

@pytest.mark.asyncio
async def test_request_handler_max_retries_exceeded(mock_website, test_urls, aiohttp_session):
    """Test behavior when max retries are exceeded."""
    url = test_urls["valid_urls"][0]
    handler = RequestHandler(session=aiohttp_session, max_retries=2)
    
    # Mock consistent failures
    mock_website.get(url, status=500)
    mock_website.get(url, status=500)
    mock_website.get(url, status=500)
    
    with pytest.raises(aiohttp.ClientError):
        await handler.get(url)

@pytest.mark.asyncio
async def test_request_handler_session_management(mock_website, test_urls):
    """Test session management."""
    url = test_urls["valid_urls"][0]
    
    # Test with context manager
    async with aiohttp.ClientSession() as session:
        handler = RequestHandler(session=session)
        response = await handler.get(url)
        assert response.status == 200
    
    # Test with closed session
    with pytest.raises(aiohttp.ClientError):
        await handler.get(url)

@pytest.mark.asyncio
async def test_request_handler_concurrent_requests(mock_website, test_urls, aiohttp_session):
    """Test handling of concurrent requests."""
    urls = test_urls["valid_urls"]
    handler = RequestHandler(session=aiohttp_session)
    
    # Make concurrent requests
    tasks = [handler.get(url) for url in urls]
    responses = await asyncio.gather(*tasks)
    
    assert all(response.status == 200 for response in responses) 
```

--------------------------------------------------------------------------------
/tests/test_crawlers/test_menu_crawler.py:
--------------------------------------------------------------------------------

```python
"""
Tests for the MenuCrawler class.
"""
import pytest
from docs_scraper.crawlers import MenuCrawler
from docs_scraper.utils import RequestHandler, HTMLParser

@pytest.mark.asyncio
async def test_menu_crawler_successful_crawl(mock_website, test_urls, aiohttp_session):
    """Test successful crawling of menu links."""
    url = test_urls["valid_urls"][0]
    menu_selector = test_urls["menu_selector"]
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = MenuCrawler(request_handler=request_handler, html_parser=html_parser)
    
    results = await crawler.crawl(url, menu_selector)
    
    assert len(results) >= 4  # Number of menu links in sample HTML
    for result in results:
        assert result["success"] is True
        assert result["url"].startswith("https://example.com")
        assert "content" in result
        assert "title" in result["metadata"]
        assert "description" in result["metadata"]
        assert len(result["links"]) > 0
        assert result["status_code"] == 200
        assert result["error"] is None

@pytest.mark.asyncio
async def test_menu_crawler_invalid_url(mock_website, test_urls, aiohttp_session):
    """Test crawling with an invalid URL."""
    url = test_urls["invalid_urls"][0]
    menu_selector = test_urls["menu_selector"]
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = MenuCrawler(request_handler=request_handler, html_parser=html_parser)
    
    results = await crawler.crawl(url, menu_selector)
    
    assert len(results) == 1
    assert results[0]["success"] is False
    assert results[0]["url"] == url
    assert results[0]["error"] is not None

@pytest.mark.asyncio
async def test_menu_crawler_invalid_selector(mock_website, test_urls, aiohttp_session):
    """Test crawling with an invalid CSS selector."""
    url = test_urls["valid_urls"][0]
    invalid_selector = "#nonexistent-menu"
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = MenuCrawler(request_handler=request_handler, html_parser=html_parser)
    
    results = await crawler.crawl(url, invalid_selector)
    
    assert len(results) == 1
    assert results[0]["success"] is False
    assert results[0]["url"] == url
    assert "No menu links found" in results[0]["error"]

@pytest.mark.asyncio
async def test_menu_crawler_nested_menu(mock_website, test_urls, aiohttp_session):
    """Test crawling nested menu structure."""
    url = test_urls["valid_urls"][0]
    menu_selector = test_urls["menu_selector"]
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = MenuCrawler(
        request_handler=request_handler,
        html_parser=html_parser,
        max_depth=2  # Crawl up to 2 levels deep
    )
    
    results = await crawler.crawl(url, menu_selector)
    
    # Check if nested menu items were crawled
    urls = {result["url"] for result in results}
    assert "https://example.com/section1" in urls
    assert "https://example.com/section1/page1" in urls
    assert "https://example.com/section1/page2" in urls

@pytest.mark.asyncio
async def test_menu_crawler_concurrent_limit(mock_website, test_urls, aiohttp_session):
    """Test concurrent request limiting for menu crawling."""
    url = test_urls["valid_urls"][0]
    menu_selector = test_urls["menu_selector"]
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = MenuCrawler(
        request_handler=request_handler,
        html_parser=html_parser,
        concurrent_limit=1  # Process one URL at a time
    )
    
    import time
    start_time = time.time()
    
    results = await crawler.crawl(url, menu_selector)
    
    end_time = time.time()
    elapsed_time = end_time - start_time
    
    assert len(results) >= 4
    # With concurrent_limit=1, processing should take at least 0.4 seconds
    assert elapsed_time >= 0.4

@pytest.mark.asyncio
async def test_menu_crawler_rate_limiting(mock_website, test_urls, aiohttp_session):
    """Test rate limiting for menu crawling."""
    url = test_urls["valid_urls"][0]
    menu_selector = test_urls["menu_selector"]
    request_handler = RequestHandler(session=aiohttp_session, rate_limit=1)  # 1 request per second
    html_parser = HTMLParser()
    crawler = MenuCrawler(request_handler=request_handler, html_parser=html_parser)
    
    import time
    start_time = time.time()
    
    results = await crawler.crawl(url, menu_selector)
    
    end_time = time.time()
    elapsed_time = end_time - start_time
    
    assert len(results) >= 4
    # Should take at least 3 seconds due to rate limiting
    assert elapsed_time >= 3.0

@pytest.mark.asyncio
async def test_menu_crawler_max_depth(mock_website, test_urls, aiohttp_session):
    """Test max depth limitation for menu crawling."""
    url = test_urls["valid_urls"][0]
    menu_selector = test_urls["menu_selector"]
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser()
    crawler = MenuCrawler(
        request_handler=request_handler,
        html_parser=html_parser,
        max_depth=1  # Only crawl top-level menu items
    )
    
    results = await crawler.crawl(url, menu_selector)
    
    # Should only include top-level menu items
    urls = {result["url"] for result in results}
    assert "https://example.com/section1" in urls
    assert "https://example.com/page1" in urls
    assert "https://example.com/section1/page1" not in urls  # Nested item should not be included 
```

--------------------------------------------------------------------------------
/src/docs_scraper/utils/request_handler.py:
--------------------------------------------------------------------------------

```python
"""
Request handler module for managing HTTP requests with rate limiting and error handling.
"""
import asyncio
import logging
from typing import Optional, Dict, Any
import aiohttp
from urllib.robotparser import RobotFileParser
from urllib.parse import urljoin

logger = logging.getLogger(__name__)

class RequestHandler:
    def __init__(
        self,
        rate_limit: float = 1.0,
        concurrent_limit: int = 5,
        user_agent: str = "DocsScraperBot/1.0",
        timeout: int = 30,
        session: Optional[aiohttp.ClientSession] = None
    ):
        """
        Initialize the request handler.
        
        Args:
            rate_limit: Minimum time between requests to the same domain (in seconds)
            concurrent_limit: Maximum number of concurrent requests
            user_agent: User agent string to use for requests
            timeout: Request timeout in seconds
            session: Optional aiohttp.ClientSession to use. If not provided, one will be created.
        """
        self.rate_limit = rate_limit
        self.concurrent_limit = concurrent_limit
        self.user_agent = user_agent
        self.timeout = timeout
        self._provided_session = session
        
        self._domain_locks: Dict[str, asyncio.Lock] = {}
        self._domain_last_request: Dict[str, float] = {}
        self._semaphore = asyncio.Semaphore(concurrent_limit)
        self._session: Optional[aiohttp.ClientSession] = None
        self._robot_parsers: Dict[str, RobotFileParser] = {}

    async def __aenter__(self):
        """Set up the aiohttp session."""
        if self._provided_session:
            self._session = self._provided_session
        else:
            self._session = aiohttp.ClientSession(
                headers={"User-Agent": self.user_agent},
                timeout=aiohttp.ClientTimeout(total=self.timeout)
            )
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        """Clean up the aiohttp session."""
        if self._session and not self._provided_session:
            await self._session.close()

    async def _check_robots_txt(self, url: str) -> bool:
        """
        Check if the URL is allowed by robots.txt.
        
        Args:
            url: URL to check
            
        Returns:
            bool: True if allowed, False if disallowed
        """
        from urllib.parse import urlparse
        parsed = urlparse(url)
        domain = f"{parsed.scheme}://{parsed.netloc}"
        
        if domain not in self._robot_parsers:
            parser = RobotFileParser()
            parser.set_url(urljoin(domain, "/robots.txt"))
            try:
                async with self._session.get(parser.url) as response:
                    content = await response.text()
                    parser.parse(content.splitlines())
            except Exception as e:
                logger.warning(f"Failed to fetch robots.txt for {domain}: {e}")
                return True
            self._robot_parsers[domain] = parser
            
        return self._robot_parsers[domain].can_fetch(self.user_agent, url)

    async def get(self, url: str, **kwargs) -> Dict[str, Any]:
        """
        Make a GET request with rate limiting and error handling.
        
        Args:
            url: URL to request
            **kwargs: Additional arguments to pass to aiohttp.ClientSession.get()
            
        Returns:
            Dict containing:
                - success: bool indicating if request was successful
                - status: HTTP status code if available
                - content: Response content if successful
                - error: Error message if unsuccessful
        """
        from urllib.parse import urlparse
        parsed = urlparse(url)
        domain = parsed.netloc

        # Get or create domain lock
        if domain not in self._domain_locks:
            self._domain_locks[domain] = asyncio.Lock()

        # Check robots.txt
        if not await self._check_robots_txt(url):
            return {
                "success": False,
                "status": None,
                "error": "URL disallowed by robots.txt",
                "content": None
            }

        try:
            async with self._semaphore:  # Limit concurrent requests
                async with self._domain_locks[domain]:  # Lock per domain
                    # Rate limiting
                    if domain in self._domain_last_request:
                        elapsed = asyncio.get_event_loop().time() - self._domain_last_request[domain]
                        if elapsed < self.rate_limit:
                            await asyncio.sleep(self.rate_limit - elapsed)
                    
                    self._domain_last_request[domain] = asyncio.get_event_loop().time()
                    
                    # Make request
                    async with self._session.get(url, **kwargs) as response:
                        content = await response.text()
                        return {
                            "success": response.status < 400,
                            "status": response.status,
                            "content": content,
                            "error": None if response.status < 400 else f"HTTP {response.status}"
                        }

        except asyncio.TimeoutError:
            return {
                "success": False,
                "status": None,
                "error": "Request timed out",
                "content": None
            }
        except Exception as e:
            return {
                "success": False,
                "status": None,
                "error": str(e),
                "content": None
            } 
```
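
A minimal usage sketch of the handler above as a standalone script (the target URL is an arbitrary example):

```python
import asyncio

from docs_scraper.utils import RequestHandler

async def main() -> None:
    # The handler checks robots.txt, applies the per-domain rate limit, and
    # reports failures in the returned dict instead of raising.
    async with RequestHandler(rate_limit=1.0, concurrent_limit=2) as handler:
        result = await handler.get("https://example.com/")
        if result["success"]:
            print(result["status"], f"{len(result['content'])} characters of HTML")
        else:
            print("Request failed:", result["error"])

if __name__ == "__main__":
    asyncio.run(main())
```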

--------------------------------------------------------------------------------
/tests/test_crawlers/test_sitemap_crawler.py:
--------------------------------------------------------------------------------

```python
"""
Tests for the SitemapCrawler class.
"""
import pytest
from docs_scraper.crawlers import SitemapCrawler
from docs_scraper.utils import RequestHandler, HTMLParser

@pytest.mark.asyncio
async def test_sitemap_crawler_successful_crawl(mock_website, test_urls, aiohttp_session):
    """Test successful crawling of a sitemap."""
    sitemap_url = test_urls["sitemap_url"]
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser(base_url=test_urls["base_url"])
    crawler = SitemapCrawler(request_handler=request_handler, html_parser=html_parser)
    
    results = await crawler.crawl(sitemap_url)
    
    assert len(results) == 3  # Number of URLs in sample sitemap
    for result in results:
        assert result["success"] is True
        assert result["url"].startswith("https://example.com")
        assert "content" in result
        assert "title" in result["metadata"]
        assert "description" in result["metadata"]
        assert len(result["links"]) > 0
        assert result["status_code"] == 200
        assert result["error"] is None

@pytest.mark.asyncio
async def test_sitemap_crawler_invalid_sitemap_url(mock_website, aiohttp_session):
    """Test crawling with an invalid sitemap URL."""
    sitemap_url = "https://nonexistent.example.com/sitemap.xml"
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser(base_url="https://nonexistent.example.com")
    crawler = SitemapCrawler(request_handler=request_handler, html_parser=html_parser)
    
    results = await crawler.crawl(sitemap_url)
    
    assert len(results) == 1
    assert results[0]["success"] is False
    assert results[0]["url"] == sitemap_url
    assert results[0]["error"] is not None

@pytest.mark.asyncio
async def test_sitemap_crawler_invalid_xml(mock_website, aiohttp_session):
    """Test crawling with invalid XML content."""
    sitemap_url = "https://example.com/invalid-sitemap.xml"
    mock_website.get(sitemap_url, status=200, body="<invalid>xml</invalid>")
    
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser(base_url="https://example.com")
    crawler = SitemapCrawler(request_handler=request_handler, html_parser=html_parser)
    
    results = await crawler.crawl(sitemap_url)
    
    assert len(results) == 1
    assert results[0]["success"] is False
    assert results[0]["url"] == sitemap_url
    assert "Invalid sitemap format" in results[0]["error"]

@pytest.mark.asyncio
async def test_sitemap_crawler_concurrent_limit(mock_website, test_urls, aiohttp_session):
    """Test concurrent request limiting for sitemap crawling."""
    sitemap_url = test_urls["sitemap_url"]
    request_handler = RequestHandler(session=aiohttp_session, concurrent_limit=1)
    html_parser = HTMLParser(base_url=test_urls["base_url"])
    crawler = SitemapCrawler(request_handler=request_handler, html_parser=html_parser)
    
    import time
    start_time = time.time()
    
    results = await crawler.crawl(sitemap_url)
    
    end_time = time.time()
    elapsed_time = end_time - start_time
    
    assert len(results) == 3
    # With concurrent_limit=1, processing should take at least 0.3 seconds
    assert elapsed_time >= 0.3

@pytest.mark.asyncio
async def test_sitemap_crawler_rate_limiting(mock_website, test_urls, aiohttp_session):
    """Test rate limiting for sitemap crawling."""
    sitemap_url = test_urls["sitemap_url"]
    request_handler = RequestHandler(session=aiohttp_session, rate_limit=1)  # 1 request per second
    html_parser = HTMLParser(base_url=test_urls["base_url"])
    crawler = SitemapCrawler(request_handler=request_handler, html_parser=html_parser)
    
    import time
    start_time = time.time()
    
    results = await crawler.crawl(sitemap_url)
    
    end_time = time.time()
    elapsed_time = end_time - start_time
    
    assert len(results) == 3
    # With rate_limit=1, the sitemap request plus three page requests need roughly 3 seconds; assert a conservative lower bound
    assert elapsed_time >= 2.0

@pytest.mark.asyncio
async def test_sitemap_crawler_nested_sitemaps(mock_website, test_urls, aiohttp_session):
    """Test crawling nested sitemaps."""
    # Create a sitemap index
    sitemap_index = """<?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <sitemap>
            <loc>https://example.com/sitemap1.xml</loc>
        </sitemap>
        <sitemap>
            <loc>https://example.com/sitemap2.xml</loc>
        </sitemap>
    </sitemapindex>
    """
    
    # Create sub-sitemaps
    sitemap1 = """<?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <url>
            <loc>https://example.com/page1</loc>
        </url>
    </urlset>
    """
    
    sitemap2 = """<?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <url>
            <loc>https://example.com/page2</loc>
        </url>
    </urlset>
    """
    
    mock_website.get("https://example.com/sitemap-index.xml", status=200, body=sitemap_index)
    mock_website.get("https://example.com/sitemap1.xml", status=200, body=sitemap1)
    mock_website.get("https://example.com/sitemap2.xml", status=200, body=sitemap2)
    
    request_handler = RequestHandler(session=aiohttp_session)
    html_parser = HTMLParser(base_url="https://example.com")
    crawler = SitemapCrawler(request_handler=request_handler, html_parser=html_parser)
    
    results = await crawler.crawl("https://example.com/sitemap-index.xml")
    
    assert len(results) == 2  # Two pages from two sub-sitemaps
    urls = {result["url"] for result in results}
    assert "https://example.com/page1" in urls
    assert "https://example.com/page2" in urls 
```
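
These tests depend on the `mock_website`, `test_urls`, and `aiohttp_session` fixtures from `tests/conftest.py`, which is not reproduced in this section. Below is a rough sketch of what compatible fixtures could look like, assuming the `aioresponses` library (whose `mocked.get(url, status=..., body=...)` API matches the calls above); the repo's actual fixtures may differ.

```python
# Hypothetical conftest.py sketch; the real fixtures may differ.
import aiohttp
import pytest
import pytest_asyncio
from aioresponses import aioresponses

@pytest.fixture
def mock_website():
    # Intercepts aiohttp requests so tests never hit the network.
    # (The real fixture presumably also pre-registers the sample sitemap and pages.)
    with aioresponses() as mocked:
        yield mocked

@pytest.fixture
def test_urls():
    return {
        "base_url": "https://example.com",
        "sitemap_url": "https://example.com/sitemap.xml",
    }

@pytest_asyncio.fixture
async def aiohttp_session():
    async with aiohttp.ClientSession() as session:
        yield session
```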

--------------------------------------------------------------------------------
/tests/test_utils/test_html_parser.py:
--------------------------------------------------------------------------------

```python
"""
Tests for the HTMLParser class.
"""
import pytest
from bs4 import BeautifulSoup
from docs_scraper.utils import HTMLParser

@pytest.fixture
def html_parser():
    """Fixture for HTMLParser instance."""
    return HTMLParser()

@pytest.fixture
def sample_html():
    """Sample HTML content for testing."""
    return """
    <!DOCTYPE html>
    <html>
    <head>
        <title>Test Page</title>
        <meta name="description" content="Test description">
        <meta name="keywords" content="test, keywords">
        <meta property="og:title" content="OG Title">
        <meta property="og:description" content="OG Description">
    </head>
    <body>
        <nav class="menu">
            <ul>
                <li><a href="/page1">Page 1</a></li>
                <li>
                    <a href="/section1">Section 1</a>
                    <ul>
                        <li><a href="/section1/page1">Section 1.1</a></li>
                        <li><a href="/section1/page2">Section 1.2</a></li>
                    </ul>
                </li>
            </ul>
        </nav>
        <main>
            <h1>Welcome</h1>
            <p>Test content with a <a href="/test1">link</a> and another <a href="/test2">link</a>.</p>
            <div class="content">
                <p>More content</p>
                <a href="mailto:[email protected]">Email</a>
                <a href="tel:+1234567890">Phone</a>
                <a href="javascript:void(0)">JavaScript</a>
                <a href="#section">Hash</a>
                <a href="ftp://example.com">FTP</a>
            </div>
        </main>
    </body>
    </html>
    """

def test_parse_html(html_parser, sample_html):
    """Test HTML parsing."""
    soup = html_parser.parse_html(sample_html)
    assert isinstance(soup, BeautifulSoup)
    assert soup.title.string == "Test Page"

def test_extract_metadata(html_parser, sample_html):
    """Test metadata extraction."""
    soup = html_parser.parse_html(sample_html)
    metadata = html_parser.extract_metadata(soup)
    
    assert metadata["title"] == "Test Page"
    assert metadata["description"] == "Test description"
    assert metadata["keywords"] == "test, keywords"
    assert metadata["og:title"] == "OG Title"
    assert metadata["og:description"] == "OG Description"

def test_extract_links(html_parser, sample_html):
    """Test link extraction."""
    soup = html_parser.parse_html(sample_html)
    links = html_parser.extract_links(soup)
    
    # Should include normal internal links (relative paths that resolve to HTTP(S) URLs)
    assert "/page1" in links
    assert "/section1" in links
    assert "/section1/page1" in links
    assert "/section1/page2" in links
    assert "/test1" in links
    assert "/test2" in links
    
    # Should not include invalid or special links
    assert "mailto:[email protected]" not in links
    assert "tel:+1234567890" not in links
    assert "javascript:void(0)" not in links
    assert "#section" not in links
    assert "ftp://example.com" not in links

def test_extract_menu_links(html_parser, sample_html):
    """Test menu link extraction."""
    soup = html_parser.parse_html(sample_html)
    menu_links = html_parser.extract_menu_links(soup, "nav.menu")
    
    assert len(menu_links) == 4
    assert "/page1" in menu_links
    assert "/section1" in menu_links
    assert "/section1/page1" in menu_links
    assert "/section1/page2" in menu_links

def test_extract_menu_links_invalid_selector(html_parser, sample_html):
    """Test menu link extraction with invalid selector."""
    soup = html_parser.parse_html(sample_html)
    menu_links = html_parser.extract_menu_links(soup, "#nonexistent")
    
    assert len(menu_links) == 0

def test_extract_text_content(html_parser, sample_html):
    """Test text content extraction."""
    soup = html_parser.parse_html(sample_html)
    content = html_parser.extract_text_content(soup)
    
    assert "Welcome" in content
    assert "Test content" in content
    assert "More content" in content
    # Should not include navigation text
    assert "Section 1.1" not in content

def test_clean_html(html_parser):
    """Test HTML cleaning."""
    dirty_html = """
    <html>
    <body>
        <script>alert('test');</script>
        <style>body { color: red; }</style>
        <p>Test content</p>
        <!-- Comment -->
        <iframe src="test.html"></iframe>
    </body>
    </html>
    """
    
    clean_html = html_parser.clean_html(dirty_html)
    soup = html_parser.parse_html(clean_html)
    
    assert len(soup.find_all("script")) == 0
    assert len(soup.find_all("style")) == 0
    assert len(soup.find_all("iframe")) == 0
    assert "Test content" in soup.get_text()

def test_normalize_url(html_parser):
    """Test URL normalization."""
    base_url = "https://example.com/docs"
    test_cases = [
        ("/test", "https://example.com/test"),
        ("test", "https://example.com/docs/test"),
        ("../test", "https://example.com/test"),
        ("https://other.com/test", "https://other.com/test"),
        ("//other.com/test", "https://other.com/test"),
    ]
    
    for input_url, expected_url in test_cases:
        assert html_parser.normalize_url(input_url, base_url) == expected_url

def test_is_valid_link(html_parser):
    """Test link validation."""
    valid_links = [
        "https://example.com",
        "http://example.com",
        "/absolute/path",
        "relative/path",
        "../parent/path",
        "./current/path"
    ]
    
    invalid_links = [
        "mailto:[email protected]",
        "tel:+1234567890",
        "javascript:void(0)",
        "#hash",
        "ftp://example.com",
        ""
    ]
    
    for link in valid_links:
        assert html_parser.is_valid_link(link) is True
    
    for link in invalid_links:
        assert html_parser.is_valid_link(link) is False

def test_extract_structured_data(html_parser):
    """Test structured data extraction."""
    html = """
    <html>
    <head>
        <script type="application/ld+json">
        {
            "@context": "https://schema.org",
            "@type": "Article",
            "headline": "Test Article",
            "author": {
                "@type": "Person",
                "name": "John Doe"
            }
        }
        </script>
    </head>
    <body>
        <p>Test content</p>
    </body>
    </html>
    """
    
    soup = html_parser.parse_html(html)
    structured_data = html_parser.extract_structured_data(soup)
    
    assert len(structured_data) == 1
    assert structured_data[0]["@type"] == "Article"
    assert structured_data[0]["headline"] == "Test Article"
    assert structured_data[0]["author"]["name"] == "John Doe" 
```
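
A short sketch of the parser methods exercised above, chained together to turn raw HTML into clean text plus links. It assumes `HTMLParser(base_url=...)` as used elsewhere in the repo; the exact return shapes are inferred from the tests.

```python
from docs_scraper.utils import HTMLParser

parser = HTMLParser(base_url="https://example.com/docs")
raw_html = """
<html><head><title>Intro</title></head>
<body><script>track()</script><p>Hello <a href="/next">next page</a></p></body></html>
"""

# Drop scripts/styles/iframes first, then parse and extract
soup = parser.parse_html(parser.clean_html(raw_html))
metadata = parser.extract_metadata(soup)   # e.g. {"title": "Intro", ...}
links = parser.extract_links(soup)         # e.g. ["/next"]
text = parser.extract_text_content(soup)   # plain text without boilerplate
print(metadata.get("title"), links, text)
```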

--------------------------------------------------------------------------------
/src/docs_scraper/crawlers/single_url_crawler.py:
--------------------------------------------------------------------------------

```python
import os
import sys
import asyncio
import re
import argparse
from datetime import datetime
from termcolor import colored
from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig, CacheMode
from crawl4ai.markdown_generation_strategy import DefaultMarkdownGenerator
from crawl4ai.content_filter_strategy import PruningContentFilter
from ..utils import RequestHandler, HTMLParser
from typing import Dict, Any, Optional

class SingleURLCrawler:
    """A crawler that processes a single URL."""
    
    def __init__(self, request_handler: RequestHandler, html_parser: HTMLParser):
        """
        Initialize the crawler.
        
        Args:
            request_handler: Handler for making HTTP requests
            html_parser: Parser for processing HTML content
        """
        self.request_handler = request_handler
        self.html_parser = html_parser
    
    async def crawl(self, url: str) -> Dict[str, Any]:
        """
        Crawl a single URL and extract its content.
        
        Args:
            url: The URL to crawl
            
        Returns:
            Dict containing:
                - success: Whether the crawl was successful
                - url: The URL that was crawled
                - content: The extracted content (if successful)
                - metadata: Additional metadata about the page
                - links: Links found on the page
                - status_code: HTTP status code
                - error: Error message (if unsuccessful)
        """
        try:
            response = await self.request_handler.get(url)
            if not response["success"]:
                return {
                    "success": False,
                    "url": url,
                    "content": None,
                    "metadata": {},
                    "links": [],
                    "status_code": response.get("status"),
                    "error": response.get("error", "Unknown error")
                }
            
            html_content = response["content"]
            parsed_content = self.html_parser.parse_content(html_content)
            
            return {
                "success": True,
                "url": url,
                "content": parsed_content["text_content"],
                "metadata": {
                    "title": parsed_content["title"],
                    "description": parsed_content["description"]
                },
                "links": parsed_content["links"],
                "status_code": response["status"],
                "error": None
            }
            
        except Exception as e:
            return {
                "success": False,
                "url": url,
                "content": None,
                "metadata": {},
                "links": [],
                "status_code": None,
                "error": str(e)
            }

def get_filename_prefix(url: str) -> str:
    """
    Generate a filename prefix from a URL including path components.
    Examples:
    - https://docs.literalai.com/page -> literalai_docs_page
    - https://literalai.com/docs/page -> literalai_docs_page
    - https://api.example.com/path/to/page -> example_api_path_to_page
    
    Args:
        url (str): The URL to process
        
    Returns:
        str: Generated filename prefix
    """
    # Remove protocol and split URL parts
    clean_url = url.split('://')[1]
    url_parts = clean_url.split('/')
    
    # Get domain parts
    domain_parts = url_parts[0].split('.')
    
    # Extract main domain name (ignoring TLD)
    main_domain = domain_parts[-2]
    
    # Start building the prefix with domain
    prefix_parts = [main_domain]
    
    # Add subdomain if exists
    if len(domain_parts) > 2:
        subdomain = domain_parts[0]
        if subdomain != main_domain:
            prefix_parts.append(subdomain)
    
    # Add all path segments
    if len(url_parts) > 1:
        path_segments = [segment for segment in url_parts[1:] if segment]
        for segment in path_segments:
            # Clean up segment (remove special characters, convert to lowercase)
            clean_segment = re.sub(r'[^a-zA-Z0-9]', '', segment.lower())
            if clean_segment and clean_segment != main_domain:
                prefix_parts.append(clean_segment)
    
    # Join all parts with underscore
    return '_'.join(prefix_parts)

def process_markdown_content(content: str, url: str) -> str:
    """Process markdown content to start from first H1 and add URL as H2"""
    # Find the first H1 tag
    h1_match = re.search(r'^# .+$', content, re.MULTILINE)
    if not h1_match:
        # If no H1 found, return original content with URL as H1
        return f"# No Title Found\n\n## Source\n{url}\n\n{content}"
        
    # Get the content starting from the first H1
    content_from_h1 = content[h1_match.start():]
    
    # Remove "Was this page helpful?" section and everything after it
    helpful_patterns = [
        r'^#+\s*Was this page helpful\?.*$',  # Matches any heading level with this text
        r'^Was this page helpful\?.*$',       # Matches the text without heading
        r'^#+\s*Was this helpful\?.*$',       # Matches any heading level with shorter text
        r'^Was this helpful\?.*$'             # Matches shorter text without heading
    ]
    
    for pattern in helpful_patterns:
        parts = re.split(pattern, content_from_h1, flags=re.MULTILINE | re.IGNORECASE)
        if len(parts) > 1:
            content_from_h1 = parts[0].strip()
            break
    
    # Insert URL as H2 after the H1
    lines = content_from_h1.split('\n')
    h1_line = lines[0]
    rest_of_content = '\n'.join(lines[1:]).strip()
    
    return f"{h1_line}\n\n## Source\n{url}\n\n{rest_of_content}"

def save_markdown_content(content: str, url: str) -> str:
    """Save markdown content to a file"""
    try:
        # Generate filename prefix from URL
        filename_prefix = get_filename_prefix(url)
        
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = f"{filename_prefix}_{timestamp}.md"
        filepath = os.path.join("scraped_docs", filename)
        
        # Create scraped_docs directory if it doesn't exist
        os.makedirs("scraped_docs", exist_ok=True)
        
        processed_content = process_markdown_content(content, url)
        
        with open(filepath, "w", encoding="utf-8") as f:
            f.write(processed_content)
        
        print(colored(f"\n✓ Markdown content saved to: {filepath}", "green"))
        return filepath
    except Exception as e:
        print(colored(f"\n✗ Error saving markdown content: {str(e)}", "red"))
        return None

async def main():
    # Set up argument parser
    parser = argparse.ArgumentParser(description='Crawl a single URL and generate markdown documentation')
    parser.add_argument('url', type=str, help='Target documentation URL to crawl')
    args = parser.parse_args()

    try:
        print(colored("\n=== Starting Single URL Crawl ===", "cyan"))
        print(colored(f"\nCrawling URL: {args.url}", "yellow"))
        
        browser_config = BrowserConfig(headless=True, verbose=True)
        async with AsyncWebCrawler(config=browser_config) as crawler:
            crawler_config = CrawlerRunConfig(
                cache_mode=CacheMode.BYPASS,
                markdown_generator=DefaultMarkdownGenerator(
                    content_filter=PruningContentFilter(threshold=0.48, threshold_type="fixed", min_word_threshold=0)
                )
            )
            
            result = await crawler.arun(
                url=args.url,
                config=crawler_config
            )
            
            if result.success:
                print(colored("\n✓ Successfully crawled URL", "green"))
                print(colored(f"Content length: {len(result.markdown.raw_markdown)} characters", "cyan"))
                save_markdown_content(result.markdown.raw_markdown, args.url)
            else:
                print(colored(f"\n✗ Failed to crawl URL: {result.error_message}", "red"))
                
    except Exception as e:
        print(colored(f"\n✗ Error during crawl: {str(e)}", "red"))
        sys.exit(1)

if __name__ == "__main__":
    asyncio.run(main())
```
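
Besides the CLI entry point above, the class can be driven programmatically the same way `server.py` does, wiring in a `RequestHandler` and `HTMLParser`:

```python
import asyncio
from docs_scraper.crawlers import SingleURLCrawler
from docs_scraper.utils import RequestHandler, HTMLParser

async def scrape(url: str) -> dict:
    request_handler = RequestHandler(rate_limit=1.0)
    html_parser = HTMLParser(base_url=url)
    crawler = SingleURLCrawler(request_handler=request_handler, html_parser=html_parser)
    async with request_handler:  # make sure the HTTP session is open
        return await crawler.crawl(url)

result = asyncio.run(scrape("https://example.com/docs/intro"))
print(result["success"], result["metadata"], len(result["links"]))
```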

--------------------------------------------------------------------------------
/src/docs_scraper/server.py:
--------------------------------------------------------------------------------

```python
"""
MCP server implementation for web crawling and documentation scraping.
"""
import asyncio
import logging
from typing import List, Dict, Any, Optional
from pydantic import BaseModel, Field, HttpUrl
from mcp.server.fastmcp import FastMCP

# Import the crawlers with relative imports
# This helps prevent circular import issues
from .crawlers.single_url_crawler import SingleURLCrawler
from .crawlers.multi_url_crawler import MultiURLCrawler
from .crawlers.sitemap_crawler import SitemapCrawler
from .crawlers.menu_crawler import MenuCrawler

# Import utility classes
from .utils import RequestHandler, HTMLParser

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

# Create MCP server
mcp = FastMCP(
    name="DocsScraperMCP",
    version="0.1.0"
)

# Input validation models
class SingleUrlInput(BaseModel):
    url: HttpUrl = Field(..., description="Target URL to crawl")
    depth: int = Field(0, ge=0, description="How many levels deep to follow links")
    exclusion_patterns: Optional[List[str]] = Field(None, description="List of regex patterns for URLs to exclude")
    rate_limit: float = Field(1.0, gt=0, description="Minimum time between requests (seconds)")

class MultiUrlInput(BaseModel):
    urls: List[HttpUrl] = Field(..., min_items=1, description="List of URLs to crawl")
    concurrent_limit: int = Field(5, gt=0, description="Maximum number of concurrent requests")
    exclusion_patterns: Optional[List[str]] = Field(None, description="List of regex patterns for URLs to exclude")
    rate_limit: float = Field(1.0, gt=0, description="Minimum time between requests to the same domain (seconds)")

class SitemapInput(BaseModel):
    base_url: HttpUrl = Field(..., description="Base URL of the website")
    sitemap_url: Optional[HttpUrl] = Field(None, description="Optional explicit sitemap URL")
    concurrent_limit: int = Field(5, gt=0, description="Maximum number of concurrent requests")
    exclusion_patterns: Optional[List[str]] = Field(None, description="List of regex patterns for URLs to exclude")
    rate_limit: float = Field(1.0, gt=0, description="Minimum time between requests (seconds)")

class MenuInput(BaseModel):
    base_url: HttpUrl = Field(..., description="Base URL of the website")
    menu_selector: str = Field(..., min_length=1, description="CSS selector for the navigation menu element")
    concurrent_limit: int = Field(5, gt=0, description="Maximum number of concurrent requests")
    exclusion_patterns: Optional[List[str]] = Field(None, description="List of regex patterns for URLs to exclude")
    rate_limit: float = Field(1.0, gt=0, description="Minimum time between requests (seconds)")

@mcp.tool()
async def single_url_crawler(
    url: str,
    depth: int = 0,
    exclusion_patterns: Optional[List[str]] = None,
    rate_limit: float = 1.0
) -> Dict[str, Any]:
    """
    Crawl a single URL and optionally follow links up to a specified depth.
    
    Args:
        url: Target URL to crawl
        depth: How many levels deep to follow links (0 means only the target URL)
        exclusion_patterns: List of regex patterns for URLs to exclude
        rate_limit: Minimum time between requests (seconds)
        
    Returns:
        Dict containing crawled content and statistics
    """
    try:
        # Validate input
        input_data = SingleUrlInput(
            url=url,
            depth=depth,
            exclusion_patterns=exclusion_patterns,
            rate_limit=rate_limit
        )
        
        # Create required utility instances
        request_handler = RequestHandler(rate_limit=input_data.rate_limit)
        html_parser = HTMLParser(base_url=str(input_data.url))
        
        # Create the crawler with the proper parameters
        crawler = SingleURLCrawler(request_handler=request_handler, html_parser=html_parser)
        
        # Use request_handler as a context manager to ensure proper session initialization
        async with request_handler:
            # Call the crawl method with the URL
            return await crawler.crawl(str(input_data.url))
        
    except Exception as e:
        logger.error(f"Single URL crawler failed: {str(e)}")
        return {
            "success": False,
            "error": str(e),
            "content": None,
            "stats": {
                "urls_crawled": 0,
                "urls_failed": 1,
                "max_depth_reached": 0
            }
        }

@mcp.tool()
async def multi_url_crawler(
    urls: List[str],
    concurrent_limit: int = 5,
    exclusion_patterns: Optional[List[str]] = None,
    rate_limit: float = 1.0
) -> Dict[str, Any]:
    """
    Crawl multiple URLs in parallel with rate limiting.
    
    Args:
        urls: List of URLs to crawl
        concurrent_limit: Maximum number of concurrent requests
        exclusion_patterns: List of regex patterns for URLs to exclude
        rate_limit: Minimum time between requests to the same domain (seconds)
        
    Returns:
        Dict containing results for each URL and overall statistics
    """
    try:
        # Validate input
        input_data = MultiUrlInput(
            urls=urls,
            concurrent_limit=concurrent_limit,
            exclusion_patterns=exclusion_patterns,
            rate_limit=rate_limit
        )
        
        # Create the crawler with the proper parameters
        crawler = MultiURLCrawler(verbose=True)
        
        # Call the crawl method with the URLs
        url_list = [str(url) for url in input_data.urls]
        results = await crawler.crawl(url_list)
        
        # Return a standardized response format
        return {
            "success": True,
            "results": results,
            "stats": {
                "urls_crawled": len(results),
                "urls_succeeded": sum(1 for r in results if r["success"]),
                "urls_failed": sum(1 for r in results if not r["success"])
            }
        }
        
    except Exception as e:
        logger.error(f"Multi URL crawler failed: {str(e)}")
        return {
            "success": False,
            "error": str(e),
            "content": None,
            "stats": {
                "urls_crawled": 0,
                "urls_failed": len(urls),
                "concurrent_requests_max": 0
            }
        }

@mcp.tool()
async def sitemap_crawler(
    base_url: str,
    sitemap_url: Optional[str] = None,
    concurrent_limit: int = 5,
    exclusion_patterns: Optional[List[str]] = None,
    rate_limit: float = 1.0
) -> Dict[str, Any]:
    """
    Crawl a website using its sitemap.xml.
    
    Args:
        base_url: Base URL of the website
        sitemap_url: Optional explicit sitemap URL (if different from base_url/sitemap.xml)
        concurrent_limit: Maximum number of concurrent requests
        exclusion_patterns: List of regex patterns for URLs to exclude
        rate_limit: Minimum time between requests (seconds)
        
    Returns:
        Dict containing crawled pages and statistics
    """
    try:
        # Validate input
        input_data = SitemapInput(
            base_url=base_url,
            sitemap_url=sitemap_url,
            concurrent_limit=concurrent_limit,
            exclusion_patterns=exclusion_patterns,
            rate_limit=rate_limit
        )
        
        # Create required utility instances
        request_handler = RequestHandler(
            rate_limit=input_data.rate_limit,
            concurrent_limit=input_data.concurrent_limit
        )
        html_parser = HTMLParser(base_url=str(input_data.base_url))
        
        # Create the crawler with the proper parameters
        crawler = SitemapCrawler(
            request_handler=request_handler,
            html_parser=html_parser,
            verbose=True
        )
        
        # Determine the sitemap URL to use
        sitemap_url_to_use = str(input_data.sitemap_url) if input_data.sitemap_url else f"{str(input_data.base_url).rstrip('/')}/sitemap.xml"
        
        # Call the crawl method with the sitemap URL
        results = await crawler.crawl(sitemap_url_to_use)
        
        return {
            "success": True,
            "content": results,
            "stats": {
                "urls_crawled": len(results),
                "urls_succeeded": sum(1 for r in results if r["success"]),
                "urls_failed": sum(1 for r in results if not r["success"]),
                "sitemap_found": len(results) > 0
            }
        }
        
    except Exception as e:
        logger.error(f"Sitemap crawler failed: {str(e)}")
        return {
            "success": False,
            "error": str(e),
            "content": None,
            "stats": {
                "urls_crawled": 0,
                "urls_failed": 1,
                "sitemap_found": False
            }
        }

@mcp.tool()
async def menu_crawler(
    base_url: str,
    menu_selector: str,
    concurrent_limit: int = 5,
    exclusion_patterns: Optional[List[str]] = None,
    rate_limit: float = 1.0
) -> Dict[str, Any]:
    """
    Crawl a website by following its navigation menu structure.
    
    Args:
        base_url: Base URL of the website
        menu_selector: CSS selector for the navigation menu element
        concurrent_limit: Maximum number of concurrent requests
        exclusion_patterns: List of regex patterns for URLs to exclude
        rate_limit: Minimum time between requests (seconds)
        
    Returns:
        Dict containing menu structure and crawled content
    """
    try:
        # Validate input
        input_data = MenuInput(
            base_url=base_url,
            menu_selector=menu_selector,
            concurrent_limit=concurrent_limit,
            exclusion_patterns=exclusion_patterns,
            rate_limit=rate_limit
        )
        
        # Create the crawler with the proper parameters
        crawler = MenuCrawler(start_url=str(input_data.base_url))
        
        # Call the crawl method
        results = await crawler.crawl()
        
        return {
            "success": True,
            "content": results,
            "stats": {
                "urls_crawled": len(results.get("menu_links", [])),
                "urls_failed": 0,
                "menu_items_found": len(results.get("menu_structure", {}).get("items", []))
            }
        }
        
    except Exception as e:
        logger.error(f"Menu crawler failed: {str(e)}")
        return {
            "success": False,
            "error": str(e),
            "content": None,
            "stats": {
                "urls_crawled": 0,
                "urls_failed": 1,
                "menu_items_found": 0
            }
        }

def main():
    """Main entry point for the MCP server."""
    try:
        logger.info("Starting DocsScraperMCP server...")
        mcp.run()  # FastMCP's entry point is run()
    except Exception as e:
        logger.error(f"Server failed: {str(e)}")
        raise
    finally:
        logger.info("DocsScraperMCP server stopped.")

if __name__ == "__main__":
    main() 
```
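
A quick local smoke test of one tool, assuming `@mcp.tool()` leaves the decorated coroutine directly callable (true for the FastMCP releases this repo appears to target); the normal path is to run `main()` and connect an MCP client.

```python
# Local smoke test of one tool function, bypassing the MCP transport.
import asyncio
from docs_scraper.server import single_url_crawler

async def demo():
    result = await single_url_crawler("https://example.com/docs", rate_limit=1.0)
    print("success:", result["success"])
    print("stats:", result.get("stats"))  # present on failure responses

asyncio.run(demo())
```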

--------------------------------------------------------------------------------
/src/docs_scraper/crawlers/multi_url_crawler.py:
--------------------------------------------------------------------------------

```python
import os
import sys
import asyncio
import re
import json
import argparse
from typing import List, Optional
from datetime import datetime
from termcolor import colored
from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig, CacheMode
from crawl4ai.markdown_generation_strategy import DefaultMarkdownGenerator
from crawl4ai.content_filter_strategy import PruningContentFilter
from urllib.parse import urlparse

def load_urls_from_file(file_path: str) -> List[str]:
    """Load URLs from either a text file or JSON file"""
    try:
        # Create input_files directory if it doesn't exist
        input_dir = "input_files"
        os.makedirs(input_dir, exist_ok=True)
        
        # Check if file exists in current directory or input_files directory
        if os.path.exists(file_path):
            actual_path = file_path
        elif os.path.exists(os.path.join(input_dir, file_path)):
            actual_path = os.path.join(input_dir, file_path)
        else:
            print(colored(f"Error: File {file_path} not found", "red"))
            print(colored(f"Please place your URL files in either:", "yellow"))
            print(colored(f"1. The root directory ({os.getcwd()})", "yellow"))
            print(colored(f"2. The input_files directory ({os.path.join(os.getcwd(), input_dir)})", "yellow"))
            sys.exit(1)
            
        file_ext = os.path.splitext(actual_path)[1].lower()
        
        if file_ext == '.json':
            print(colored(f"Loading URLs from JSON file: {actual_path}", "cyan"))
            with open(actual_path, 'r', encoding='utf-8') as f:
                try:
                    data = json.load(f)
                    # Handle menu crawler output format
                    if isinstance(data, dict) and 'menu_links' in data:
                        urls = data['menu_links']
                    elif isinstance(data, dict) and 'urls' in data:
                        urls = data['urls']
                    elif isinstance(data, list):
                        urls = data
                    else:
                        print(colored("Error: Invalid JSON format. Expected 'menu_links' or 'urls' key, or list of URLs", "red"))
                        sys.exit(1)
                    print(colored(f"Successfully loaded {len(urls)} URLs from JSON file", "green"))
                    return urls
                except json.JSONDecodeError as e:
                    print(colored(f"Error: Invalid JSON file - {str(e)}", "red"))
                    sys.exit(1)
        else:
            print(colored(f"Loading URLs from text file: {actual_path}", "cyan"))
            with open(actual_path, 'r', encoding='utf-8') as f:
                urls = [line.strip() for line in f if line.strip()]
                print(colored(f"Successfully loaded {len(urls)} URLs from text file", "green"))
                return urls
                
    except Exception as e:
        print(colored(f"Error loading URLs from file: {str(e)}", "red"))
        sys.exit(1)

class MultiURLCrawler:
    def __init__(self, verbose: bool = True):
        self.browser_config = BrowserConfig(
            headless=True,
            verbose=True,
            viewport_width=800,
            viewport_height=600
        )
        
        self.crawler_config = CrawlerRunConfig(
            cache_mode=CacheMode.BYPASS,
            markdown_generator=DefaultMarkdownGenerator(
                content_filter=PruningContentFilter(
                    threshold=0.48,
                    threshold_type="fixed",
                    min_word_threshold=0
                )
            ),
        )
        
        self.verbose = verbose
        
    def process_markdown_content(self, content: str, url: str) -> str:
        """Process markdown content to start from first H1 and add URL as H2"""
        # Find the first H1 tag
        h1_match = re.search(r'^# .+$', content, re.MULTILINE)
        if not h1_match:
            # If no H1 found, return original content with URL as H1
            return f"# No Title Found\n\n## Source\n{url}\n\n{content}"
            
        # Get the content starting from the first H1
        content_from_h1 = content[h1_match.start():]
        
        # Remove "Was this page helpful?" section and everything after it
        helpful_patterns = [
            r'^#+\s*Was this page helpful\?.*$',  # Matches any heading level with this text
            r'^Was this page helpful\?.*$',       # Matches the text without heading
            r'^#+\s*Was this helpful\?.*$',       # Matches any heading level with shorter text
            r'^Was this helpful\?.*$'             # Matches shorter text without heading
        ]
        
        for pattern in helpful_patterns:
            parts = re.split(pattern, content_from_h1, flags=re.MULTILINE | re.IGNORECASE)
            if len(parts) > 1:
                content_from_h1 = parts[0].strip()
                break
        
        # Insert URL as H2 after the H1
        lines = content_from_h1.split('\n')
        h1_line = lines[0]
        rest_of_content = '\n'.join(lines[1:])
        
        return f"{h1_line}\n\n## Source\n{url}\n\n{rest_of_content}"
        
    def get_filename_prefix(self, url: str) -> str:
        """
        Generate a filename prefix from a URL including path components.
        Examples:
        - https://docs.literalai.com/page -> literalai_docs_page
        - https://literalai.com/docs/page -> literalai_docs_page
        - https://api.example.com/path/to/page -> example_api_path_to_page
        """
        try:
            # Parse the URL
            parsed = urlparse(url)
            
            # Split hostname and reverse it (e.g., 'docs.example.com' -> ['com', 'example', 'docs'])
            hostname_parts = parsed.hostname.split('.')
            hostname_parts.reverse()
            
            # Remove common TLDs and 'www'
            hostname_parts = [p for p in hostname_parts if p not in ('com', 'org', 'net', 'www')]
            
            # Get path components, removing empty strings
            path_parts = [p for p in parsed.path.split('/') if p]
            
            # Combine hostname and path parts
            all_parts = hostname_parts + path_parts
            
            # Clean up parts: lowercase, remove special chars, limit length
            cleaned_parts = []
            for part in all_parts:
                # Convert to lowercase and remove special characters
                cleaned = re.sub(r'[^a-zA-Z0-9]+', '_', part.lower())
                # Remove leading/trailing underscores
                cleaned = cleaned.strip('_')
                # Only add non-empty parts
                if cleaned:
                    cleaned_parts.append(cleaned)
            
            # Join parts with underscores
            return '_'.join(cleaned_parts)
        
        except Exception as e:
            print(colored(f"Error generating filename prefix: {str(e)}", "red"))
            return "default"

    def save_markdown_content(self, results: List[dict], filename_prefix: Optional[str] = None):
        """Save all markdown content to a single file"""
        try:
            # Use the first successful URL to generate the filename prefix if none provided
            if not filename_prefix and results:
                # Find first successful result
                first_url = next((result["url"] for result in results if result["success"]), None)
                if first_url:
                    filename_prefix = self.get_filename_prefix(first_url)
                else:
                    filename_prefix = "docs"  # Fallback if no successful results
            
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            filename = f"{filename_prefix}_{timestamp}.md"
            filepath = os.path.join("scraped_docs", filename)
            
            # Create scraped_docs directory if it doesn't exist
            os.makedirs("scraped_docs", exist_ok=True)
            
            with open(filepath, "w", encoding="utf-8") as f:
                for result in results:
                    if result["success"]:
                        processed_content = self.process_markdown_content(
                            result["markdown_content"],
                            result["url"]
                        )
                        f.write(processed_content)
                        f.write("\n\n---\n\n")
            
            if self.verbose:
                print(colored(f"\nMarkdown content saved to: {filepath}", "green"))
            return filepath
            
        except Exception as e:
            print(colored(f"\nError saving markdown content: {str(e)}", "red"))
            return None

    async def crawl(self, urls: List[str]) -> List[dict]:
        """
        Crawl multiple URLs sequentially using session reuse for optimal performance
        """
        total_urls = len(urls)
        if self.verbose:
            print("\n=== Starting Crawl ===")
            print(f"Total URLs to crawl: {total_urls}")

        results = []
        async with AsyncWebCrawler(config=self.browser_config) as crawler:
            session_id = "crawl_session"  # Reuse the same session for all URLs
            for idx, url in enumerate(urls, 1):
                try:
                    if self.verbose:
                        progress = (idx / total_urls) * 100
                        print(f"\nProgress: {idx}/{total_urls} ({progress:.1f}%)")
                        print(f"Crawling: {url}")
                    
                    result = await crawler.arun(
                        url=url,
                        config=self.crawler_config,
                        session_id=session_id,
                    )
                    
                    results.append({
                        "url": url,
                        "success": result.success,
                        "content_length": len(result.markdown.raw_markdown) if result.success else 0,
                        "markdown_content": result.markdown.raw_markdown if result.success else "",
                        "error": result.error_message if not result.success else None
                    })
                    
                    if self.verbose and result.success:
                        print(f"✓ Successfully crawled URL {idx}/{total_urls}")
                        print(f"Content length: {len(result.markdown.raw_markdown)} characters")
                except Exception as e:
                    results.append({
                        "url": url,
                        "success": False,
                        "content_length": 0,
                        "markdown_content": "",
                        "error": str(e)
                    })
                    if self.verbose:
                        print(f"✗ Error crawling URL {idx}/{total_urls}: {str(e)}")

        if self.verbose:
            successful = sum(1 for r in results if r["success"])
            print(f"\n=== Crawl Complete ===")
            print(f"Successfully crawled: {successful}/{total_urls} URLs")

        return results

async def main():
    parser = argparse.ArgumentParser(description='Crawl multiple URLs and generate markdown documentation')
    parser.add_argument('urls_file', type=str, help='Path to file containing URLs (either .txt or .json)')
    parser.add_argument('--output-prefix', type=str, help='Prefix for output markdown file (optional)')
    args = parser.parse_args()

    try:
        # Load URLs from file
        urls = load_urls_from_file(args.urls_file)
        
        if not urls:
            print(colored("Error: No URLs found in the input file", "red"))
            sys.exit(1)
            
        print(colored(f"Found {len(urls)} URLs to crawl", "green"))
        
        # Initialize and run crawler
        crawler = MultiURLCrawler(verbose=True)
        results = await crawler.crawl(urls)
        
        # Save results to markdown file - only pass output_prefix if explicitly set
        crawler.save_markdown_content(results, args.output_prefix if args.output_prefix else None)
        
    except Exception as e:
        print(colored(f"Error during crawling: {str(e)}", "red"))
        sys.exit(1)

if __name__ == "__main__":
    asyncio.run(main()) 
```
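
The same class can be used without the CLI; a minimal sketch that crawls a couple of URLs and writes the combined markdown file (URLs are placeholders):

```python
import asyncio
from docs_scraper.crawlers import MultiURLCrawler

async def demo():
    crawler = MultiURLCrawler(verbose=True)
    results = await crawler.crawl([
        "https://example.com/docs/intro",
        "https://example.com/docs/quickstart",
    ])
    # Writes scraped_docs/<prefix>_<timestamp>.md and returns the path
    return crawler.save_markdown_content(results)

print(asyncio.run(demo()))
```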

--------------------------------------------------------------------------------
/src/docs_scraper/crawlers/sitemap_crawler.py:
--------------------------------------------------------------------------------

```python
import os
import sys
import asyncio
import re
import xml.etree.ElementTree as ET
import argparse
from typing import List, Optional, Dict
from datetime import datetime
from termcolor import colored
from ..utils import RequestHandler, HTMLParser

class SitemapCrawler:
    def __init__(self, request_handler: Optional[RequestHandler] = None, html_parser: Optional[HTMLParser] = None, verbose: bool = True):
        """
        Initialize the sitemap crawler.
        
        Args:
            request_handler: Optional RequestHandler instance. If not provided, one will be created.
            html_parser: Optional HTMLParser instance. If not provided, one will be created.
            verbose: Whether to print progress messages
        """
        self.verbose = verbose
        self.request_handler = request_handler or RequestHandler(
            rate_limit=1.0,
            concurrent_limit=5,
            user_agent="DocsScraperBot/1.0",
            timeout=30
        )
        self._html_parser = html_parser

    async def fetch_sitemap(self, sitemap_url: str) -> List[str]:
        """
        Fetch and parse an XML sitemap to extract URLs.
        
        Args:
            sitemap_url (str): The URL of the XML sitemap
            
        Returns:
            List[str]: List of URLs found in the sitemap
        """
        if self.verbose:
            print(f"\nFetching sitemap from: {sitemap_url}")
            
        async with self.request_handler as handler:
            try:
                response = await handler.get(sitemap_url)
                if not response["success"]:
                    raise Exception(f"Failed to fetch sitemap: {response['error']}")
                
                content = response["content"]
                
                # Parse XML content
                root = ET.fromstring(content)
                
                # Handle both standard sitemaps and sitemap indexes
                urls = []
                
                # Remove XML namespace for easier parsing
                namespace = root.tag.split('}')[0] + '}' if '}' in root.tag else ''
                
                if root.tag == f"{namespace}sitemapindex":
                    # This is a sitemap index file
                    if self.verbose:
                        print("Found sitemap index, processing nested sitemaps...")
                    
                    for sitemap in root.findall(f".//{namespace}sitemap"):
                        loc = sitemap.find(f"{namespace}loc")
                        if loc is not None and loc.text:
                            nested_urls = await self.fetch_sitemap(loc.text)
                            urls.extend(nested_urls)
                else:
                    # This is a standard sitemap
                    for url in root.findall(f".//{namespace}url"):
                        loc = url.find(f"{namespace}loc")
                        if loc is not None and loc.text:
                            urls.append(loc.text)
                
                if self.verbose:
                    print(f"Found {len(urls)} URLs in sitemap")
                return urls
                
            except Exception as e:
                print(f"Error fetching sitemap: {str(e)}")
                return []

    def process_markdown_content(self, content: str, url: str) -> str:
        """Process markdown content to start from first H1 and add URL as H2"""
        # Find the first H1 tag
        h1_match = re.search(r'^# .+$', content, re.MULTILINE)
        if not h1_match:
            # If no H1 found, return original content with URL as H1
            return f"# No Title Found\n\n## Source\n{url}\n\n{content}"
            
        # Get the content starting from the first H1
        content_from_h1 = content[h1_match.start():]
        
        # Remove "Was this page helpful?" section and everything after it
        helpful_patterns = [
            r'^#+\s*Was this page helpful\?.*$',  # Matches any heading level with this text
            r'^Was this page helpful\?.*$',       # Matches the text without heading
            r'^#+\s*Was this helpful\?.*$',       # Matches any heading level with shorter text
            r'^Was this helpful\?.*$'             # Matches shorter text without heading
        ]
        
        for pattern in helpful_patterns:
            parts = re.split(pattern, content_from_h1, flags=re.MULTILINE | re.IGNORECASE)
            if len(parts) > 1:
                content_from_h1 = parts[0].strip()
                break
        
        # Insert URL as H2 after the H1
        lines = content_from_h1.split('\n')
        h1_line = lines[0]
        rest_of_content = '\n'.join(lines[1:]).strip()
        
        return f"{h1_line}\n\n## Source\n{url}\n\n{rest_of_content}"

    def save_markdown_content(self, results: List[dict], filename_prefix: str = "vercel_ai_docs"):
        """Save all markdown content to a single file"""
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = f"{filename_prefix}_{timestamp}.md"
        filepath = os.path.join("scraped_docs", filename)
        
        # Create scraped_docs directory if it doesn't exist
        os.makedirs("scraped_docs", exist_ok=True)
        
        with open(filepath, "w", encoding="utf-8") as f:
            for result in results:
                if result["success"]:
                    processed_content = self.process_markdown_content(
                        result["content"],
                        result["url"]
                    )
                    f.write(processed_content)
                    f.write("\n\n---\n\n")
        
        if self.verbose:
            print(f"\nMarkdown content saved to: {filepath}")
        return filepath

    async def crawl(self, sitemap_url: str, urls: Optional[List[str]] = None) -> List[dict]:
        """
        Crawl a sitemap URL and all URLs it contains.
        
        Args:
            sitemap_url: URL of the sitemap to crawl
            urls: Optional pre-filtered list of URLs; if omitted, the sitemap is fetched
            
        Returns:
            List of dictionaries containing crawl results
        """
        if self.verbose:
            print("\n=== Starting Crawl ===")
        
        # Fetch all URLs from the sitemap unless a pre-filtered list was provided
        if urls is None:
            urls = await self.fetch_sitemap(sitemap_url)
        
        if self.verbose:
            print(f"Total URLs to crawl: {len(urls)}")

        results = []
        async with self.request_handler as handler:
            for idx, url in enumerate(urls, 1):
                try:
                    if self.verbose:
                        progress = (idx / len(urls)) * 100
                        print(f"\nProgress: {idx}/{len(urls)} ({progress:.1f}%)")
                        print(f"Crawling: {url}")
                    
                    response = await handler.get(url)
                    html_parser = self._html_parser or HTMLParser(url)
                    
                    if response["success"]:
                        parsed_content = html_parser.parse_content(response["content"])
                        results.append({
                            "url": url,
                            "success": True,
                            "content": parsed_content["text_content"],
                            "metadata": {
                                "title": parsed_content["title"],
                                "description": parsed_content["description"]
                            },
                            "links": parsed_content["links"],
                            "status_code": response["status"],
                            "error": None
                        })
                        
                        if self.verbose:
                            print(f"✓ Successfully crawled URL {idx}/{len(urls)}")
                            print(f"Content length: {len(parsed_content['text_content'])} characters")
                    else:
                        results.append({
                            "url": url,
                            "success": False,
                            "content": "",
                            "metadata": {"title": None, "description": None},
                            "links": [],
                            "status_code": response.get("status"),
                            "error": response["error"]
                        })
                        if self.verbose:
                            print(f"✗ Error crawling URL {idx}/{len(urls)}: {response['error']}")
                            
                except Exception as e:
                    results.append({
                        "url": url,
                        "success": False,
                        "content": "",
                        "metadata": {"title": None, "description": None},
                        "links": [],
                        "status_code": None,
                        "error": str(e)
                    })
                    if self.verbose:
                        print(f"✗ Error crawling URL {idx}/{len(urls)}: {str(e)}")

        if self.verbose:
            successful = sum(1 for r in results if r["success"])
            print(f"\n=== Crawl Complete ===")
            print(f"Successfully crawled: {successful}/{len(urls)} URLs")

        return results

    def get_filename_prefix(self, url: str) -> str:
        """
        Generate a filename prefix from a sitemap URL.
        Examples:
        - https://docs.literalai.com/sitemap.xml -> literalai_documentation
        - https://literalai.com/docs/sitemap.xml -> literalai_docs
        - https://api.example.com/sitemap.xml -> example_api
        
        Args:
            url (str): The sitemap URL
            
        Returns:
            str: Generated filename prefix
        """
        # Remove protocol and split URL parts
        clean_url = url.split('://')[1]
        url_parts = clean_url.split('/')
        
        # Get domain parts
        domain_parts = url_parts[0].split('.')
        
        # Extract main domain name (ignoring TLD)
        main_domain = domain_parts[-2]
        
        # Determine the qualifier (subdomain or path segment)
        qualifier = None
        
        # First check subdomain
        if len(domain_parts) > 2:
            qualifier = domain_parts[0]
        # Then check path
        elif len(url_parts) > 2:
            # Get the first meaningful path segment
            for segment in url_parts[1:]:
                if segment and segment != 'sitemap.xml':
                    qualifier = segment
                    break
        
        # Build the prefix
        if qualifier:
            # Clean up qualifier (remove special characters, convert to lowercase)
            qualifier = re.sub(r'[^a-zA-Z0-9]', '', qualifier.lower())
            # Don't duplicate parts if they're the same
            if qualifier != main_domain:
                return f"{main_domain}_{qualifier}"
        
        return main_domain

async def main():
    # Set up argument parser
    parser = argparse.ArgumentParser(description='Crawl a sitemap and generate markdown documentation')
    parser.add_argument('sitemap_url', type=str, help='URL of the sitemap (e.g., https://docs.example.com/sitemap.xml)')
    parser.add_argument('--max-depth', type=int, default=10, help='Maximum sitemap recursion depth')
    parser.add_argument('--patterns', type=str, nargs='+', help='URL patterns to include (e.g., "/docs/*" "/guide/*")')
    args = parser.parse_args()

    try:
        print(colored(f"\nFetching sitemap: {args.sitemap_url}", "cyan"))
        
        # Initialize crawler
        crawler = SitemapCrawler(verbose=True)
        
        # Fetch URLs from sitemap
        urls = await crawler.fetch_sitemap(args.sitemap_url)
        
        if not urls:
            print(colored("No URLs found in sitemap", "red"))
            sys.exit(1)
            
        # Filter URLs by pattern if specified
        if args.patterns:
            print(colored("\nFiltering URLs by patterns:", "cyan"))
            for pattern in args.patterns:
                print(colored(f"  {pattern}", "yellow"))
            
            filtered_urls = []
            for url in urls:
                if any(pattern.replace('*', '') in url for pattern in args.patterns):
                    filtered_urls.append(url)
            
            print(colored(f"\nFound {len(filtered_urls)} URLs matching patterns", "green"))
            urls = filtered_urls
        
        # Crawl the (optionally filtered) URLs without re-fetching the sitemap
        results = await crawler.crawl(args.sitemap_url, urls=urls)
        
        # Save results to markdown file with dynamic name
        filename_prefix = crawler.get_filename_prefix(args.sitemap_url)
        crawler.save_markdown_content(results, filename_prefix)
        
    except Exception as e:
        print(colored(f"Error during crawling: {str(e)}", "red"))
        sys.exit(1)

if __name__ == "__main__":
    asyncio.run(main()) 
```

--------------------------------------------------------------------------------
/src/docs_scraper/crawlers/menu_crawler.py:
--------------------------------------------------------------------------------

```python
#!/usr/bin/env python3

import asyncio
from typing import List, Set
from termcolor import colored
from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig, CacheMode
from crawl4ai.extraction_strategy import JsonCssExtractionStrategy
from urllib.parse import urljoin, urlparse
import json
import os
import sys
import argparse
from datetime import datetime
import re

# Constants
BASE_URL = "https://developers.cloudflare.com/agents/"
INPUT_DIR = "input_files"  # Changed from OUTPUT_DIR
MENU_SELECTORS = [
    # Traditional documentation selectors
    "nav a",                                  # General navigation links
    "[role='navigation'] a",                  # Role-based navigation
    ".sidebar a",                             # Common sidebar class
    "[class*='nav'] a",                       # Classes containing 'nav'
    "[class*='menu'] a",                      # Classes containing 'menu'
    "aside a",                                # Side navigation
    ".toc a",                                 # Table of contents
    
    # Modern framework selectors (Mintlify, Docusaurus, etc)
    "[class*='sidebar'] [role='navigation'] [class*='group'] a",  # Navigation groups
    "[class*='sidebar'] [role='navigation'] [class*='item'] a",   # Navigation items
    "[class*='sidebar'] [role='navigation'] [class*='link'] a",   # Direct links
    "[class*='sidebar'] [role='navigation'] div[class*='text']",  # Text items
    "[class*='sidebar'] [role='navigation'] [class*='nav-item']", # Nav items
    
    # Additional common patterns
    "[class*='docs-'] a",                     # Documentation-specific links
    "[class*='navigation'] a",                # Navigation containers
    "[class*='toc'] a",                       # Table of contents variations
    ".docNavigation a",                       # Documentation navigation
    "[class*='menu-item'] a",                 # Menu items
    
    # Client-side rendered navigation
    "[class*='sidebar'] a[href]",             # Any link in sidebar
    "[class*='sidebar'] [role='link']",       # ARIA role links
    "[class*='sidebar'] [role='menuitem']",   # Menu items
    "[class*='sidebar'] [role='treeitem']",   # Tree navigation items
    "[class*='sidebar'] [onclick]",           # Elements with click handlers
    "[class*='sidebar'] [class*='link']",     # Elements with link classes
    "a[href^='/']",                           # Root-relative links
    "a[href^='./']",                          # Relative links
    "a[href^='../']"                          # Parent-relative links
]
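# MENU_SELECTORS is only a default; main() below replaces it wholesale when the
# --selectors command line option is supplied.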

# JavaScript to expand nested menus
EXPAND_MENUS_JS = """
(async () => {
    // Wait for client-side rendering to complete
    await new Promise(r => setTimeout(r, 2000));
    
    // Function to expand all menu items
    async function expandAllMenus() {
        // Combined selectors for expandable menu items
        const expandableSelectors = [
            // Previous selectors...
            // Additional selectors for client-side rendered menus
            '[class*="sidebar"] button',
            '[class*="sidebar"] [role="button"]',
            '[class*="sidebar"] [aria-controls]',
            '[class*="sidebar"] [aria-expanded]',
            '[class*="sidebar"] [data-state]',
            '[class*="sidebar"] [class*="expand"]',
            '[class*="sidebar"] [class*="toggle"]',
            '[class*="sidebar"] [class*="collapse"]'
        ];
        
        let expanded = 0;
        let lastExpanded = -1;
        let attempts = 0;
        const maxAttempts = 10;  // Increased attempts for client-side rendering
        
        while (expanded !== lastExpanded && attempts < maxAttempts) {
            lastExpanded = expanded;
            attempts++;
            
            for (const selector of expandableSelectors) {
                const elements = document.querySelectorAll(selector);
                for (const el of elements) {
                    try {
                        // Click the element
                        el.click();
                        
                        // Try multiple expansion methods
                        el.setAttribute('aria-expanded', 'true');
                        el.setAttribute('data-state', 'open');
                        el.classList.add('expanded', 'show', 'active');
                        el.classList.remove('collapsed', 'closed');
                        
                        // Handle parent groups - multiple patterns
                        ['[class*="group"]', '[class*="parent"]', '[class*="submenu"]'].forEach(parentSelector => {
                            let parent = el.closest(parentSelector);
                            if (parent) {
                                parent.setAttribute('data-state', 'open');
                                parent.setAttribute('aria-expanded', 'true');
                                parent.classList.add('expanded', 'show', 'active');
                            }
                        });
                        
                        expanded++;
                        await new Promise(r => setTimeout(r, 200));  // Increased delay between clicks
                    } catch (e) {
                        continue;
                    }
                }
            }
            
            // Wait longer between attempts for client-side rendering
            await new Promise(r => setTimeout(r, 500));
        }
        
        // After expansion, try to convert text items to links if needed
        const textSelectors = [
            '[class*="sidebar"] [role="navigation"] [class*="text"]',
            '[class*="menu-item"]',
            '[class*="nav-item"]',
            '[class*="sidebar"] [role="menuitem"]',
            '[class*="sidebar"] [role="treeitem"]'
        ];
        
        textSelectors.forEach(selector => {
            const textItems = document.querySelectorAll(selector);
            textItems.forEach(item => {
                if (!item.querySelector('a') && item.textContent && item.textContent.trim()) {
                    const text = item.textContent.trim();
                    // Only create link if it doesn't already exist
                    if (!Array.from(item.children).some(child => child.tagName === 'A')) {
                        const link = document.createElement('a');
                        link.href = '#' + text.toLowerCase().replace(/[^a-z0-9]+/g, '-');
                        link.textContent = text;
                        item.appendChild(link);
                    }
                }
            });
        });
        
        return expanded;
    }
    
    const expandedCount = await expandAllMenus();
    // Final wait to ensure all client-side updates are complete
    await new Promise(r => setTimeout(r, 1000));
    return expandedCount;
})();
"""

def get_filename_prefix(url: str) -> str:
    """
    Generate a filename prefix from a URL including path components.
    Examples:
    - https://docs.literalai.com/page -> literalai_docs_page
    - https://literalai.com/docs/page -> literalai_docs_page
    - https://api.example.com/path/to/page -> example_api_path_to_page
    
    Args:
        url (str): The URL to process
        
    Returns:
        str: A filename-safe string derived from the URL
    """
    try:
        # Parse the URL
        parsed = urlparse(url)
        
        # Split hostname and reverse it (e.g., 'docs.example.com' -> ['com', 'example', 'docs'])
        hostname_parts = parsed.hostname.split('.')
        hostname_parts.reverse()
        
        # Remove common TLDs and 'www'
        hostname_parts = [p for p in hostname_parts if p not in ('com', 'org', 'net', 'www')]
        
        # Get path components, removing empty strings
        path_parts = [p for p in parsed.path.split('/') if p]
        
        # Combine hostname and path parts
        all_parts = hostname_parts + path_parts
        
        # Clean up parts: lowercase, remove special chars, limit length
        cleaned_parts = []
        for part in all_parts:
            # Convert to lowercase and remove special characters
            cleaned = re.sub(r'[^a-zA-Z0-9]+', '_', part.lower())
            # Remove leading/trailing underscores
            cleaned = cleaned.strip('_')
            # Only add non-empty parts
            if cleaned:
                cleaned_parts.append(cleaned)
        
        # Join parts with underscores
        return '_'.join(cleaned_parts)
    
    except Exception as e:
        print(colored(f"Error generating filename prefix: {str(e)}", "red"))
        return "default"

class MenuCrawler:
    def __init__(self, start_url: str):
        self.start_url = start_url
        
        # Configure browser settings
        self.browser_config = BrowserConfig(
            headless=True,
            viewport_width=1920,
            viewport_height=1080,
            java_script_enabled=True  # Ensure JavaScript is enabled
        )
        
        # Create extraction strategy for menu links
        extraction_schema = {
            "name": "MenuLinks",
            "baseSelector": ", ".join(MENU_SELECTORS),
            "fields": [
                {
                    "name": "href",
                    "type": "attribute",
                    "attribute": "href"
                },
                {
                    "name": "text",
                    "type": "text"
                },
                {
                    "name": "onclick",
                    "type": "attribute",
                    "attribute": "onclick"
                },
                {
                    "name": "role",
                    "type": "attribute",
                    "attribute": "role"
                }
            ]
        }
        extraction_strategy = JsonCssExtractionStrategy(extraction_schema)
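        # With this schema, result.extracted_content is expected to arrive as a JSON array of
        # objects roughly shaped like:
        #   [{"href": "/docs/intro", "text": "Introduction", "onclick": null, "role": null}, ...]
        # (illustrative shape only; the exact payload depends on the crawl4ai version in use)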
        
        # Configure crawler settings with proper wait conditions
        self.crawler_config = CrawlerRunConfig(
            extraction_strategy=extraction_strategy,
            cache_mode=CacheMode.BYPASS,  # Don't use cache for fresh results
            verbose=True,  # Enable detailed logging
            wait_for_images=True,  # Ensure lazy-loaded content is captured
            js_code=[
                # Initial wait for client-side rendering
                "await new Promise(r => setTimeout(r, 2000));",
                EXPAND_MENUS_JS
            ],  # Add JavaScript to expand nested menus
            wait_for="""js:() => {
                // Wait for sidebar and its content to be present
                const sidebar = document.querySelector('[class*="sidebar"]');
                if (!sidebar) return false;
                
                // Check if we have navigation items
                const hasNavItems = sidebar.querySelectorAll('a').length > 0;
                if (hasNavItems) return true;
                
                // If no nav items yet, check for loading indicators
                const isLoading = document.querySelector('[class*="loading"]') !== null;
                return !isLoading;  // Return true if not loading anymore
            }""",
            session_id="menu_crawler",  # Use a session to maintain state
            js_only=False  # We want full page load first
        )
        
        # Create the input_files directory if it doesn't exist
        if not os.path.exists(INPUT_DIR):
            os.makedirs(INPUT_DIR)
            print(colored(f"Created input directory: {INPUT_DIR}", "green"))

    async def extract_all_menu_links(self) -> List[str]:
        """Extract all menu links from the main page, including nested menus."""
        try:
            print(colored(f"Crawling main page: {self.start_url}", "cyan"))
            print(colored("Expanding all nested menus...", "yellow"))
            
            async with AsyncWebCrawler(config=self.browser_config) as crawler:
                # Get page content using crawl4ai
                result = await crawler.arun(
                    url=self.start_url,
                    config=self.crawler_config
                )

                if not result or not result.success:
                    print(colored(f"Failed to get page data", "red"))
                    if result and result.error_message:
                        print(colored(f"Error: {result.error_message}", "red"))
                    return []

                links = set()
                
                # Parse the base domain from start_url
                base_domain = urlparse(self.start_url).netloc
                
                # Add the base URL first (without trailing slash for consistency)
                base_url = self.start_url.rstrip('/')
                links.add(base_url)
                print(colored(f"Added base URL: {base_url}", "green"))
                
                # Extract links from the result
                if hasattr(result, 'extracted_content') and result.extracted_content:
                    try:
                        menu_links = json.loads(result.extracted_content)
                        for link in menu_links:
                            href = link.get('href', '')
                            text = link.get('text', '').strip()
                            
                            # Skip empty hrefs
                            if not href:
                                continue
                                
                            # Convert relative URLs to absolute
                            absolute_url = urljoin(self.start_url, href)
                            parsed_url = urlparse(absolute_url)
                            
                            # Accept internal links (same domain) that aren't anchors
                            if (parsed_url.netloc == base_domain and 
                                not href.startswith('#') and 
                                '#' not in absolute_url):
                                
                                # Remove any trailing slashes for consistency
                                absolute_url = absolute_url.rstrip('/')
                                
                                links.add(absolute_url)
                                print(colored(f"Found link: {text} -> {absolute_url}", "green"))
                            else:
                                print(colored(f"Skipping external or anchor link: {text} -> {href}", "yellow"))
                                
                    except json.JSONDecodeError as e:
                        print(colored(f"Error parsing extracted content: {str(e)}", "red"))
                
                print(colored(f"\nFound {len(links)} unique menu links", "green"))
                return sorted(list(links))

        except Exception as e:
            print(colored(f"Error extracting menu links: {str(e)}", "red"))
            return []

    def save_results(self, results: dict) -> str:
        """Save crawling results to a JSON file in the input_files directory."""
        try:
            # Create input_files directory if it doesn't exist
            os.makedirs(INPUT_DIR, exist_ok=True)
            
            # Generate filename using the same pattern
            filename_prefix = get_filename_prefix(self.start_url)
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            filename = f"{filename_prefix}_menu_links_{timestamp}.json"
            filepath = os.path.join(INPUT_DIR, filename)
            
            with open(filepath, "w", encoding="utf-8") as f:
                json.dump(results, f, indent=2)
            
            print(colored(f"\n✓ Menu links saved to: {filepath}", "green"))
            print(colored("\nTo crawl these URLs with multi_url_crawler.py, run:", "cyan"))
            print(colored(f"python multi_url_crawler.py --urls {filename}", "yellow"))
            return filepath
            
        except Exception as e:
            print(colored(f"\n✗ Error saving menu links: {str(e)}", "red"))
            return None

    async def crawl(self):
        """Main crawling method."""
        try:
            # Extract all menu links from the main page
            menu_links = await self.extract_all_menu_links()

            # Save results
            results = {
                "start_url": self.start_url,
                "total_links_found": len(menu_links),
                "menu_links": menu_links
            }

            self.save_results(results)

            print(colored(f"\nCrawling completed!", "green"))
            print(colored(f"Total unique menu links found: {len(menu_links)}", "green"))

        except Exception as e:
            print(colored(f"Error during crawling: {str(e)}", "red"))

async def main():
    # Set up argument parser
    parser = argparse.ArgumentParser(description='Extract menu links from a documentation website')
    parser.add_argument('url', type=str, help='Documentation site URL to crawl')
    parser.add_argument('--selectors', type=str, nargs='+', help='Custom menu selectors (optional)')
    args = parser.parse_args()
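    # Example invocation (hypothetical URL; replace with the documentation site you want to map):
    #   python menu_crawler.py https://docs.example.com/ --selectors "nav a" ".sidebar a"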

    try:
        # Update menu selectors if custom ones provided
        if args.selectors:
            print(colored("Using custom menu selectors:", "cyan"))
            for selector in args.selectors:
                print(colored(f"  {selector}", "yellow"))
            global MENU_SELECTORS
            MENU_SELECTORS = args.selectors

        crawler = MenuCrawler(args.url)
        await crawler.crawl()
    except Exception as e:
        print(colored(f"Error in main: {str(e)}", "red"))
        sys.exit(1)

if __name__ == "__main__":
    print(colored("Starting documentation menu crawler...", "cyan"))
    asyncio.run(main()) 
```

--------------------------------------------------------------------------------
/.venv/Scripts/Activate.ps1:
--------------------------------------------------------------------------------

```
<#
.Synopsis
Activate a Python virtual environment for the current PowerShell session.

.Description
Pushes the python executable for a virtual environment to the front of the
$Env:PATH environment variable and sets the prompt to signify that you are
in a Python virtual environment. Makes use of the command line switches as
well as the `pyvenv.cfg` file values present in the virtual environment.

.Parameter VenvDir
Path to the directory that contains the virtual environment to activate. The
default value for this is the parent of the directory that the Activate.ps1
script is located within.

.Parameter Prompt
The prompt prefix to display when this virtual environment is activated. By
default, this prompt is the name of the virtual environment folder (VenvDir)
surrounded by parentheses and followed by a single space (ie. '(.venv) ').

.Example
Activate.ps1
Activates the Python virtual environment that contains the Activate.ps1 script.

.Example
Activate.ps1 -Verbose
Activates the Python virtual environment that contains the Activate.ps1 script,
and shows extra information about the activation as it executes.

.Example
Activate.ps1 -VenvDir C:\Users\MyUser\Common\.venv
Activates the Python virtual environment located in the specified location.

.Example
Activate.ps1 -Prompt "MyPython"
Activates the Python virtual environment that contains the Activate.ps1 script,
and prefixes the current prompt with the specified string (surrounded in
parentheses) while the virtual environment is active.

.Notes
On Windows, it may be required to enable this Activate.ps1 script by setting the
execution policy for the user. You can do this by issuing the following PowerShell
command:

PS C:\> Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

For more information on Execution Policies: 
https://go.microsoft.com/fwlink/?LinkID=135170

#>
Param(
    [Parameter(Mandatory = $false)]
    [String]
    $VenvDir,
    [Parameter(Mandatory = $false)]
    [String]
    $Prompt
)

<# Function declarations --------------------------------------------------- #>

<#
.Synopsis
Remove all shell session elements added by the Activate script, including the
addition of the virtual environment's Python executable from the beginning of
the PATH variable.

.Parameter NonDestructive
If present, do not remove this function from the global namespace for the
session.

#>
function global:deactivate ([switch]$NonDestructive) {
    # Revert to original values

    # The prior prompt:
    if (Test-Path -Path Function:_OLD_VIRTUAL_PROMPT) {
        Copy-Item -Path Function:_OLD_VIRTUAL_PROMPT -Destination Function:prompt
        Remove-Item -Path Function:_OLD_VIRTUAL_PROMPT
    }

    # The prior PYTHONHOME:
    if (Test-Path -Path Env:_OLD_VIRTUAL_PYTHONHOME) {
        Copy-Item -Path Env:_OLD_VIRTUAL_PYTHONHOME -Destination Env:PYTHONHOME
        Remove-Item -Path Env:_OLD_VIRTUAL_PYTHONHOME
    }

    # The prior PATH:
    if (Test-Path -Path Env:_OLD_VIRTUAL_PATH) {
        Copy-Item -Path Env:_OLD_VIRTUAL_PATH -Destination Env:PATH
        Remove-Item -Path Env:_OLD_VIRTUAL_PATH
    }

    # Just remove the VIRTUAL_ENV altogether:
    if (Test-Path -Path Env:VIRTUAL_ENV) {
        Remove-Item -Path env:VIRTUAL_ENV
    }

    # Just remove VIRTUAL_ENV_PROMPT altogether.
    if (Test-Path -Path Env:VIRTUAL_ENV_PROMPT) {
        Remove-Item -Path env:VIRTUAL_ENV_PROMPT
    }

    # Just remove the _PYTHON_VENV_PROMPT_PREFIX altogether:
    if (Get-Variable -Name "_PYTHON_VENV_PROMPT_PREFIX" -ErrorAction SilentlyContinue) {
        Remove-Variable -Name _PYTHON_VENV_PROMPT_PREFIX -Scope Global -Force
    }

    # Leave deactivate function in the global namespace if requested:
    if (-not $NonDestructive) {
        Remove-Item -Path function:deactivate
    }
}

<#
.Description
Get-PyVenvConfig parses the values from the pyvenv.cfg file located in the
given folder, and returns them in a map.

For each line in the pyvenv.cfg file, if that line can be parsed into exactly
two strings separated by `=` (with any amount of whitespace surrounding the =)
then it is considered a `key = value` line. The left hand string is the key,
the right hand is the value.

If the value starts with a `'` or a `"` then the first and last character is
stripped from the value before being captured.

.Parameter ConfigDir
Path to the directory that contains the `pyvenv.cfg` file.
#>
function Get-PyVenvConfig(
    [String]
    $ConfigDir
) {
    Write-Verbose "Given ConfigDir=$ConfigDir, obtain values in pyvenv.cfg"

    # Ensure the file exists, and issue a warning if it doesn't (but still allow the function to continue).
    $pyvenvConfigPath = Join-Path -Resolve -Path $ConfigDir -ChildPath 'pyvenv.cfg' -ErrorAction Continue

    # An empty map will be returned if no config file is found.
    $pyvenvConfig = @{ }
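    # Illustrative pyvenv.cfg contents the loop below would parse (values are examples only):
    #   home = C:\Python312
    #   version = 3.12.8
    #   prompt = '.venv'
    # yielding the keys 'home', 'version' and 'prompt', with surrounding quotes stripped from values.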

    if ($pyvenvConfigPath) {

        Write-Verbose "File exists, parse `key = value` lines"
        $pyvenvConfigContent = Get-Content -Path $pyvenvConfigPath

        $pyvenvConfigContent | ForEach-Object {
            $keyval = $PSItem -split "\s*=\s*", 2
            if ($keyval[0] -and $keyval[1]) {
                $val = $keyval[1]

                # Remove extraneous quotations around a string value.
                if ("'""".Contains($val.Substring(0, 1))) {
                    $val = $val.Substring(1, $val.Length - 2)
                }

                $pyvenvConfig[$keyval[0]] = $val
                Write-Verbose "Adding Key: '$($keyval[0])'='$val'"
            }
        }
    }
    return $pyvenvConfig
}


<# Begin Activate script --------------------------------------------------- #>

# Determine the containing directory of this script
$VenvExecPath = Split-Path -Parent $MyInvocation.MyCommand.Definition
$VenvExecDir = Get-Item -Path $VenvExecPath

Write-Verbose "Activation script is located in path: '$VenvExecPath'"
Write-Verbose "VenvExecDir Fullname: '$($VenvExecDir.FullName)"
Write-Verbose "VenvExecDir Name: '$($VenvExecDir.Name)"

# Set values required in priority: CmdLine, ConfigFile, Default
# First, get the location of the virtual environment, it might not be
# VenvExecDir if specified on the command line.
if ($VenvDir) {
    Write-Verbose "VenvDir given as parameter, using '$VenvDir' to determine values"
}
else {
    Write-Verbose "VenvDir not given as a parameter, using parent directory name as VenvDir."
    $VenvDir = $VenvExecDir.Parent.FullName.TrimEnd("\\/")
    Write-Verbose "VenvDir=$VenvDir"
}

# Next, read the `pyvenv.cfg` file to determine any required value such
# as `prompt`.
$pyvenvCfg = Get-PyVenvConfig -ConfigDir $VenvDir

# Next, set the prompt from the command line, or the config file, or
# just use the name of the virtual environment folder.
if ($Prompt) {
    Write-Verbose "Prompt specified as argument, using '$Prompt'"
}
else {
    Write-Verbose "Prompt not specified as argument to script, checking pyvenv.cfg value"
    if ($pyvenvCfg -and $pyvenvCfg['prompt']) {
        Write-Verbose "  Setting based on value in pyvenv.cfg='$($pyvenvCfg['prompt'])'"
        $Prompt = $pyvenvCfg['prompt'];
    }
    else {
        Write-Verbose "  Setting prompt based on parent's directory's name. (Is the directory name passed to venv module when creating the virtual environment)"
        Write-Verbose "  Got leaf-name of $VenvDir='$(Split-Path -Path $venvDir -Leaf)'"
        $Prompt = Split-Path -Path $venvDir -Leaf
    }
}

Write-Verbose "Prompt = '$Prompt'"
Write-Verbose "VenvDir='$VenvDir'"

# Deactivate any currently active virtual environment, but leave the
# deactivate function in place.
deactivate -nondestructive

# Now set the environment variable VIRTUAL_ENV, used by many tools to determine
# that there is an activated venv.
$env:VIRTUAL_ENV = $VenvDir

if (-not $Env:VIRTUAL_ENV_DISABLE_PROMPT) {

    Write-Verbose "Setting prompt to '$Prompt'"

    # Set the prompt to include the env name
    # Make sure _OLD_VIRTUAL_PROMPT is global
    function global:_OLD_VIRTUAL_PROMPT { "" }
    Copy-Item -Path function:prompt -Destination function:_OLD_VIRTUAL_PROMPT
    New-Variable -Name _PYTHON_VENV_PROMPT_PREFIX -Description "Python virtual environment prompt prefix" -Scope Global -Option ReadOnly -Visibility Public -Value $Prompt

    function global:prompt {
        Write-Host -NoNewline -ForegroundColor Green "($_PYTHON_VENV_PROMPT_PREFIX) "
        _OLD_VIRTUAL_PROMPT
    }
    $env:VIRTUAL_ENV_PROMPT = $Prompt
}

# Clear PYTHONHOME
if (Test-Path -Path Env:PYTHONHOME) {
    Copy-Item -Path Env:PYTHONHOME -Destination Env:_OLD_VIRTUAL_PYTHONHOME
    Remove-Item -Path Env:PYTHONHOME
}

# Add the venv to the PATH
Copy-Item -Path Env:PATH -Destination Env:_OLD_VIRTUAL_PATH
$Env:PATH = "$VenvExecDir$([System.IO.Path]::PathSeparator)$Env:PATH"

# SIG # Begin signature block
# SIG # End signature block

```