# Directory Structure ``` ├── .dockerignore ├── .env.local ├── .gitignore ├── .python-version ├── dockerfile ├── LICENSE ├── makefile ├── pyproject.toml ├── README.md └── src ├── __init__.py ├── api_service │ ├── __init__.py │ ├── os_api.py │ └── protocols.py ├── config_docs │ ├── ogcapi-features-1.yaml │ ├── ogcapi-features-2.yaml │ └── ogcapi-features-3.yaml ├── http_client_test.py ├── mcp_service │ ├── __init__.py │ ├── guardrails.py │ ├── os_service.py │ ├── prompts.py │ ├── protocols.py │ ├── resources.py │ └── routing_service.py ├── middleware │ ├── __init__.py │ ├── http_middleware.py │ └── stdio_middleware.py ├── models.py ├── prompt_templates │ ├── __init__.py │ └── prompt_templates.py ├── server.py ├── stdio_client_test.py ├── utils │ ├── __init__.py │ └── logging_config.py └── workflow_generator ├── __init__.py └── workflow_planner.py ``` # Files -------------------------------------------------------------------------------- /.python-version: -------------------------------------------------------------------------------- ``` 1 | 3.11 2 | ``` -------------------------------------------------------------------------------- /.env.local: -------------------------------------------------------------------------------- ``` 1 | OS_API_KEY=xxxxxxx 2 | STDIO_KEY=xxxxxxx 3 | BEARER_TOKENS=xxxxxxx 4 | ``` -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- ``` 1 | # Python-generated files 2 | __pycache__/ 3 | *.py[oc] 4 | build/ 5 | dist/ 6 | wheels/ 7 | *.egg-info 8 | 9 | # Virtual environments 10 | .venv 11 | .env 12 | .DS_Store 13 | .ruff_cache 14 | uv.lock 15 | pyrightconfig.json 16 | *.log 17 | .pytest_cache/ 18 | .mypy_cache/ ``` -------------------------------------------------------------------------------- /.dockerignore: -------------------------------------------------------------------------------- ``` 1 | # Python artifacts 2 | 
__pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | *.so 6 | .Python 7 | build/ 8 | develop-eggs/ 9 | dist/ 10 | downloads/ 11 | eggs/ 12 | .eggs/ 13 | lib/ 14 | lib64/ 15 | parts/ 16 | sdist/ 17 | var/ 18 | wheels/ 19 | *.egg-info/ 20 | .installed.cfg 21 | *.egg 22 | 23 | # Virtual environments 24 | .venv/ 25 | venv/ 26 | ENV/ 27 | env/ 28 | .env 29 | 30 | # Testing and type checking 31 | .pytest_cache/ 32 | .mypy_cache/ 33 | .coverage 34 | htmlcov/ 35 | .tox/ 36 | .hypothesis/ 37 | *.cover 38 | .coverage.* 39 | 40 | # Development tools 41 | .ruff_cache/ 42 | pyrightconfig.json 43 | .vscode/ 44 | .idea/ 45 | *.swp 46 | *.swo 47 | *~ 48 | 49 | # OS files 50 | .DS_Store 51 | .DS_Store? 52 | ._* 53 | .Spotlight-V100 54 | .Trashes 55 | ehthumbs.db 56 | Thumbs.db 57 | 58 | # Git 59 | .git/ 60 | .gitignore 61 | .gitattributes 62 | 63 | # Documentation 64 | *.md 65 | docs/ 66 | LICENSE 67 | 68 | # Build files 69 | Makefile 70 | makefile 71 | dockerfile 72 | Dockerfile 73 | 74 | # Logs 75 | *.log 76 | logs/ 77 | 78 | # Package manager 79 | uv.lock 80 | poetry.lock 81 | Pipfile.lock 82 | 83 | # Test files 84 | *_test.py 85 | test_*.py 86 | tests/ ``` -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- ```markdown 1 | # Ordnance Survey MCP Server 2 | 3 | An MCP server for accessing UK geospatial data through Ordnance Survey APIs. 4 | 5 | ## What it does 6 | 7 | Provides LLM access to the Ordnance Survey's Data Hub APIs. 8 | 9 | Ask simple questions such as "find me all cinemas in Leeds City Centre", or use the prompt templates for more complex, specific use cases relating to street works, planning, etc. 10 | 11 | This MCP server enforces a two-step workflow plan to ensure that the user gets the best results possible. 12 | 13 | ## Quick Start 14 | 15 | ### 1. 
Get an OS API Key 16 | 17 | Register at [OS Data Hub](https://osdatahub.os.uk/) to get your free API key and set up a project. 18 | 19 | ### 2. Run with Docker and add to your Claude Desktop config (easiest) 20 | 21 | ```bash 22 | # Clone the repository 23 | git clone https://github.com/your-username/os-mcp-server.git 24 | cd os-mcp-server 25 | ``` 26 | 27 | Then build the Docker image: 28 | 29 | ```bash 30 | docker build -t os-mcp-server . 31 | ``` 32 | 33 | Add the following to your Claude Desktop config: 34 | 35 | ```json 36 | { 37 | "mcpServers": { 38 | "os-mcp-server": { 39 | "command": "docker", 40 | "args": [ 41 | "run", 42 | "--rm", 43 | "-i", 44 | "-e", 45 | "OS_API_KEY=your_api_key_here", 46 | "-e", 47 | "STDIO_KEY=any_value", 48 | "os-mcp-server" 49 | ] 50 | } 51 | } 52 | } 53 | ``` 54 | 55 | Open Claude Desktop and you should now see all available tools, resources, and prompts. 56 | 57 | ## Requirements 58 | 59 | - Python 3.11+ 60 | - OS API Key from [OS Data Hub](https://osdatahub.os.uk/) 61 | - A STDIO_KEY env var, which can be any value for now while auth is improved 62 | 63 | ## License 64 | 65 | MIT License. This is a personal project; it is not affiliated with or endorsed by Ordnance Survey, and it is not a commercial product. It is actively being worked on, so expect breaking changes and bugs. 
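
## Running without Docker

If you have cloned the repository and installed the dependencies locally, a Claude Desktop entry along these lines should also work. The paths below are illustrative and must be adjusted to your machine; setting `PYTHONPATH` to the `src/` directory mirrors what the Dockerfile does, since the server imports modules from there:

```json
{
  "mcpServers": {
    "os-mcp-server": {
      "command": "python",
      "args": ["/path/to/os-mcp-server/src/server.py", "--transport", "stdio"],
      "env": {
        "OS_API_KEY": "your_api_key_here",
        "STDIO_KEY": "any_value",
        "PYTHONPATH": "/path/to/os-mcp-server/src"
      }
    }
  }
}
```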
66 | ``` -------------------------------------------------------------------------------- /src/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /src/api_service/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /src/mcp_service/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /src/middleware/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /src/prompt_templates/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /src/utils/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /src/workflow_generator/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- ```toml 1 | [project] 2 | name = "os-mcp" 3 | version = "0.1.5" 4 | description = "A Python MCP server that provides access to Ordnance Survey's DataHub APIs." 
5 | readme = "README.md" 6 | requires-python = ">=3.11" 7 | dependencies = [ 8 | "aiohttp>=3.11.18", 9 | "anthropic>=0.51.0", 10 | "fastapi>=0.116.1", 11 | "mcp>=1.12.0", 12 | "starlette>=0.47.1", 13 | "uvicorn>=0.35.0", 14 | ] 15 | ``` -------------------------------------------------------------------------------- /dockerfile: -------------------------------------------------------------------------------- ```dockerfile 1 | FROM python:3.11-slim 2 | 3 | WORKDIR /app 4 | 5 | ENV PYTHONPATH=/app/src \ 6 | PYTHONUNBUFFERED=1 \ 7 | PIP_NO_CACHE_DIR=1 \ 8 | PIP_DISABLE_PIP_VERSION_CHECK=1 9 | 10 | 11 | RUN apt-get update && apt-get install -y \ 12 | gcc \ 13 | && rm -rf /var/lib/apt/lists/* \ 14 | && apt-get clean 15 | 16 | COPY pyproject.toml ./ 17 | RUN pip install --no-cache-dir . 18 | 19 | COPY src/ ./src/ 20 | 21 | EXPOSE 8000 22 | 23 | CMD ["python", "src/server.py", "--transport", "stdio"] 24 | ``` -------------------------------------------------------------------------------- /src/workflow_generator/workflow_planner.py: -------------------------------------------------------------------------------- ```python 1 | from typing import Dict, Any, Optional, List 2 | from models import OpenAPISpecification 3 | from utils.logging_config import get_logger 4 | 5 | logger = get_logger(__name__) 6 | 7 | 8 | class WorkflowPlanner: 9 | """Context provider for LLM workflow planning""" 10 | 11 | def __init__( 12 | self, 13 | openapi_spec: Optional[OpenAPISpecification], 14 | basic_collections_info: Optional[Dict[str, Any]] = None, 15 | ): 16 | self.spec = openapi_spec 17 | self.basic_collections_info = basic_collections_info or {} 18 | self.detailed_collections_cache = {} 19 | 20 | def get_basic_context(self) -> Dict[str, Any]: 21 | """Get basic context for LLM to plan its workflow - no detailed queryables""" 22 | return { 23 | "available_collections": self.basic_collections_info, 24 | "openapi_spec": self.spec, 25 | } 26 | 27 | def get_detailed_context(self, 
collection_ids: List[str]) -> Dict[str, Any]: 28 | """Get detailed context for specific collections mentioned in the plan""" 29 | detailed_collections = { 30 | coll_id: self.detailed_collections_cache.get(coll_id) 31 | for coll_id in collection_ids 32 | if coll_id in self.detailed_collections_cache 33 | } 34 | 35 | return { 36 | "available_collections": detailed_collections, 37 | "openapi_spec": self.spec, 38 | } 39 | ``` -------------------------------------------------------------------------------- /src/api_service/protocols.py: -------------------------------------------------------------------------------- ```python 1 | from typing import Protocol, Dict, List, Any, Optional, runtime_checkable 2 | from models import ( 3 | OpenAPISpecification, 4 | CollectionsCache, 5 | CollectionQueryables, 6 | ) 7 | 8 | 9 | @runtime_checkable 10 | class APIClient(Protocol): 11 | """Protocol for API clients""" 12 | 13 | async def initialise(self): 14 | """Initialise the aiohttp session if not already created""" 15 | ... 16 | 17 | async def close(self): 18 | """Close the aiohttp session""" 19 | ... 20 | 21 | async def get_api_key(self) -> str: 22 | """Get the API key""" 23 | ... 24 | 25 | async def make_request( 26 | self, 27 | endpoint: str, 28 | params: Optional[Dict[str, Any]] = None, 29 | path_params: Optional[List[str]] = None, 30 | ) -> Dict[str, Any]: 31 | """Make a request to an API endpoint""" 32 | ... 33 | 34 | async def make_request_no_auth( 35 | self, 36 | url: str, 37 | params: Optional[Dict[str, Any]] = None, 38 | max_retries: int = 2, 39 | ) -> str: 40 | """Make a request without authentication""" 41 | ... 42 | 43 | async def cache_openapi_spec(self) -> OpenAPISpecification: 44 | """Cache the OpenAPI spec""" 45 | ... 46 | 47 | async def cache_collections(self) -> CollectionsCache: 48 | """Cache the collections data""" 49 | ... 
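
    # Illustrative usage (comment-only sketch, not part of the protocol): a
    # conforming client would typically be driven like this from async code.
    # `OSAPIClient` is an assumed concrete implementation name, not something
    # defined here.
    #
    #     client: APIClient = OSAPIClient()
    #     await client.initialise()
    #     try:
    #         spec = await client.cache_openapi_spec()
    #         collections = await client.cache_collections()
    #     finally:
    #         await client.close()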
50 | 51 | async def fetch_collections_queryables( 52 | self, collection_ids: List[str] 53 | ) -> Dict[str, CollectionQueryables]: 54 | """Fetch detailed queryables for specific collections only""" 55 | ... 56 | ``` -------------------------------------------------------------------------------- /src/utils/logging_config.py: -------------------------------------------------------------------------------- ```python 1 | import logging 2 | import sys 3 | import os 4 | import re 5 | 6 | 7 | class APIKeySanitisingFilter(logging.Filter): 8 | """Filter to sanitise API keys from log messages""" 9 | 10 | def __init__(self): 11 | super().__init__() 12 | self.patterns = [ 13 | r"[?&]key=[^&\s]*", 14 | r"[?&]api_key=[^&\s]*", 15 | r"[?&]apikey=[^&\s]*", 16 | r"[?&]token=[^&\s]*", 17 | ] 18 | 19 | def filter(self, record: logging.LogRecord) -> bool: 20 | """Sanitise the log record message""" 21 | if hasattr(record, "msg") and isinstance(record.msg, str): 22 | record.msg = self._sanitise_text(record.msg) 23 | 24 | if hasattr(record, "args") and record.args: 25 | sanitised_args = [] 26 | for arg in record.args: 27 | if isinstance(arg, str): 28 | sanitised_args.append(self._sanitise_text(arg)) 29 | else: 30 | sanitised_args.append(arg) 31 | record.args = tuple(sanitised_args) 32 | 33 | return True 34 | 35 | def _sanitise_text(self, text: str) -> str: 36 | """Remove API keys from text""" 37 | sanitised = text 38 | for pattern in self.patterns: 39 | sanitised = re.sub(pattern, "", sanitised, flags=re.IGNORECASE) 40 | 41 | sanitised = re.sub(r"[?&]$", "", sanitised) 42 | sanitised = re.sub(r"&{2,}", "&", sanitised) 43 | sanitised = re.sub(r"\?&", "?", sanitised) 44 | 45 | return sanitised 46 | 47 | 48 | def configure_logging(debug: bool = False) -> logging.Logger: 49 | """ 50 | Configure logging for the entire application. 
51 | 52 | Args: 53 | debug: Whether to enable debug logging or not 54 | """ 55 | log_level = ( 56 | logging.DEBUG if (debug or os.environ.get("DEBUG") == "1") else logging.INFO 57 | ) 58 | 59 | formatter = logging.Formatter( 60 | "%(asctime)s - %(name)s - %(levelname)s - %(message)s" 61 | ) 62 | 63 | root_logger = logging.getLogger() 64 | root_logger.setLevel(log_level) 65 | 66 | for handler in root_logger.handlers[:]: 67 | root_logger.removeHandler(handler) 68 | 69 | console_handler = logging.StreamHandler(sys.stderr) 70 | console_handler.setFormatter(formatter) 71 | console_handler.setLevel(log_level) 72 | 73 | api_key_filter = APIKeySanitisingFilter() 74 | console_handler.addFilter(api_key_filter) 75 | 76 | root_logger.addHandler(console_handler) 77 | 78 | logging.getLogger("uvicorn").propagate = False 79 | 80 | return root_logger 81 | 82 | 83 | def get_logger(name: str) -> logging.Logger: 84 | """ 85 | Get a logger for a module. 86 | 87 | Args: 88 | name: Module name (usually __name__) 89 | 90 | Returns: 91 | Logger instance 92 | """ 93 | return logging.getLogger(name) 94 | ``` -------------------------------------------------------------------------------- /src/prompt_templates/prompt_templates.py: -------------------------------------------------------------------------------- ```python 1 | PROMPT_TEMPLATES = { 2 | "usrn_breakdown": ( 3 | "Break down USRN {usrn} into its component road links for routing analysis. " 4 | "Step 1: GET /collections/trn-ntwk-street-1/items?filter=usrn='{usrn}' to get street details. " 5 | "Step 2: Use Street geometry bbox to query Road Links: GET /collections/trn-ntwk-roadlink-4/items?bbox=[street_bbox] " 6 | "Step 3: Filter Road Links by Street reference using properties.street_ref matching street feature ID. " 7 | "Step 4: For each Road Link: GET /collections/trn-ntwk-roadnode-1/items?filter=roadlink_ref='roadlink_id' " 8 | "Step 5: Set crs=EPSG:27700 for British National Grid coordinates. 
" 9 | "Return: Complete breakdown of USRN into constituent Road Links with node connections." 10 | ), 11 | "restriction_matching_analysis": ( 12 | "Perform comprehensive traffic restriction matching for road network in bbox {bbox} with SPECIFIC STREET IDENTIFICATION. " 13 | "Step 1: Build routing network: get_routing_data(bbox='{bbox}', limit={limit}, build_network=True) " 14 | "Step 2: Extract restriction data from build_status.restrictions array. " 15 | "Step 3: For each restriction, match to road links using restrictionnetworkreference: " 16 | " - networkreferenceid = Road Link UUID from trn-ntwk-roadlink-4 " 17 | " - roadlinkdirection = 'In Direction' (with geometry) or 'In Opposite Direction' (against geometry) " 18 | " - roadlinksequence = order for multi-link restrictions (turns) " 19 | "Step 4: IMMEDIATELY lookup street names: Use get_bulk_features(collection_id='trn-ntwk-roadlink-4', identifiers=[road_link_uuids]) to resolve UUIDs to actual street names (name1_text, roadclassification, roadclassificationnumber) " 20 | "Step 5: Analyze restriction types by actual street names: " 21 | " - One Way: Apply directional constraint to specific named road " 22 | " - Turn Restriction: Block movement between named streets (from street A to street B) " 23 | " - Vehicle Restrictions: Apply dimension/weight limits to specific named roads " 24 | " - Access Restrictions: Apply vehicle type constraints to specific named streets " 25 | "Step 6: Check exemptions array (e.g., 'Pedal Cycles' exempt from one-way) for each named street " 26 | "Step 7: Group results by street names and road classifications (A Roads, B Roads, Local Roads) " 27 | "Return: Complete restriction mapping showing ACTUAL STREET NAMES with their specific restrictions, directions, exemptions, and road classifications. Present results as 'Street Name (Road Class)' rather than UUIDs." 
28 | ), 29 | } 30 | ``` -------------------------------------------------------------------------------- /src/mcp_service/resources.py: -------------------------------------------------------------------------------- ```python 1 | """MCP Resources for OS NGD documentation""" 2 | 3 | import json 4 | import time 5 | from models import NGDAPIEndpoint 6 | from utils.logging_config import get_logger 7 | 8 | logger = get_logger(__name__) 9 | 10 | 11 | # TODO: Do this for 12 | class OSDocumentationResources: 13 | """Handles registration of OS NGD documentation resources""" 14 | 15 | def __init__(self, mcp_service, api_client): 16 | self.mcp = mcp_service 17 | self.api_client = api_client 18 | 19 | def register_all(self) -> None: 20 | """Register all documentation resources""" 21 | self._register_transport_network_resources() 22 | # Future: self._register_land_resources() 23 | # Future: self._register_building_resources() 24 | 25 | def _register_transport_network_resources(self) -> None: 26 | """Register transport network documentation resources""" 27 | 28 | @self.mcp.resource("os-docs://street") 29 | async def street_docs() -> str: 30 | return await self._fetch_doc_resource( 31 | "street", NGDAPIEndpoint.MARKDOWN_STREET.value 32 | ) 33 | 34 | @self.mcp.resource("os-docs://road") 35 | async def road_docs() -> str: 36 | return await self._fetch_doc_resource( 37 | "road", NGDAPIEndpoint.MARKDOWN_ROAD.value 38 | ) 39 | 40 | @self.mcp.resource("os-docs://tram-on-road") 41 | async def tram_on_road_docs() -> str: 42 | return await self._fetch_doc_resource( 43 | "tram-on-road", NGDAPIEndpoint.TRAM_ON_ROAD.value 44 | ) 45 | 46 | @self.mcp.resource("os-docs://road-node") 47 | async def road_node_docs() -> str: 48 | return await self._fetch_doc_resource( 49 | "road-node", NGDAPIEndpoint.ROAD_NODE.value 50 | ) 51 | 52 | @self.mcp.resource("os-docs://road-link") 53 | async def road_link_docs() -> str: 54 | return await self._fetch_doc_resource( 55 | "road-link", 
NGDAPIEndpoint.ROAD_LINK.value 56 | ) 57 | 58 | @self.mcp.resource("os-docs://road-junction") 59 | async def road_junction_docs() -> str: 60 | return await self._fetch_doc_resource( 61 | "road-junction", NGDAPIEndpoint.ROAD_JUNCTION.value 62 | ) 63 | 64 | async def _fetch_doc_resource(self, feature_type: str, url: str) -> str: 65 | """Generic method to fetch documentation resources""" 66 | try: 67 | content = await self.api_client.make_request_no_auth(url) 68 | 69 | return json.dumps( 70 | { 71 | "feature_type": feature_type, 72 | "content": content, 73 | "content_type": "markdown", 74 | "source_url": url, 75 | "timestamp": time.time(), 76 | } 77 | ) 78 | 79 | except Exception as e: 80 | logger.error(f"Error fetching {feature_type} documentation: {e}") 81 | return json.dumps({"error": str(e), "feature_type": feature_type}) 82 | ``` -------------------------------------------------------------------------------- /src/mcp_service/prompts.py: -------------------------------------------------------------------------------- ```python 1 | from typing import List 2 | from mcp.types import PromptMessage, TextContent 3 | from prompt_templates.prompt_templates import PROMPT_TEMPLATES 4 | from utils.logging_config import get_logger 5 | 6 | logger = get_logger(__name__) 7 | 8 | 9 | class OSWorkflowPrompts: 10 | """Handles registration of OS NGD workflow prompts""" 11 | 12 | def __init__(self, mcp_service): 13 | self.mcp = mcp_service 14 | 15 | def register_all(self) -> None: 16 | """Register all workflow prompts""" 17 | self._register_analysis_prompts() 18 | self._register_general_prompts() 19 | 20 | def _register_analysis_prompts(self) -> None: 21 | """Register analysis workflow prompts""" 22 | 23 | @self.mcp.prompt() 24 | def usrn_breakdown_analysis(usrn: str) -> List[PromptMessage]: 25 | """Generate a step-by-step USRN breakdown workflow""" 26 | template = PROMPT_TEMPLATES["usrn_breakdown"].format(usrn=usrn) 27 | 28 | return [ 29 | PromptMessage( 30 | role="user", 31 | 
content=TextContent( 32 | type="text", 33 | text=f"As an expert in OS NGD API workflows and transport network analysis, {template}", 34 | ), 35 | ) 36 | ] 37 | 38 | def _register_general_prompts(self) -> None: 39 | """Register general OS NGD guidance prompts""" 40 | 41 | @self.mcp.prompt() 42 | def collection_query_guidance( 43 | collection_id: str, query_type: str = "features" 44 | ) -> List[PromptMessage]: 45 | """Generate guidance for querying OS NGD collections""" 46 | return [ 47 | PromptMessage( 48 | role="user", 49 | content=TextContent( 50 | type="text", 51 | text=f"As an OS NGD API expert, guide me through querying the '{collection_id}' collection for {query_type}. " 52 | f"Include: 1) Available filters, 2) Best practices for bbox queries, " 53 | f"3) CRS considerations, 4) Example queries with proper syntax.", 54 | ), 55 | ) 56 | ] 57 | 58 | @self.mcp.prompt() 59 | def workflow_planning( 60 | user_request: str, data_theme: str = "transport" 61 | ) -> List[PromptMessage]: 62 | """Generate a workflow plan for complex OS NGD queries""" 63 | return [ 64 | PromptMessage( 65 | role="user", 66 | content=TextContent( 67 | type="text", 68 | text=f"As a geospatial workflow planner, create a detailed workflow plan for: '{user_request}'. " 69 | f"Focus on {data_theme} theme data. 
Include: " 70 | f"1) Collection selection rationale, " 71 | f"2) Query sequence with dependencies, " 72 | f"3) Filter strategies, " 73 | f"4) Error handling considerations.", 74 | ), 75 | ) 76 | ] 77 | ``` -------------------------------------------------------------------------------- /src/config_docs/ogcapi-features-2.yaml: -------------------------------------------------------------------------------- ```yaml 1 | openapi: 3.0.3 2 | info: 3 | title: "Building Blocks specified in OGC API - Features - Part 2: Coordinate Reference Systems by Reference" 4 | description: |- 5 | Common components used in the 6 | [OGC standard "OGC API - Features - Part 2: Coordinate Reference Systems by Reference"](http://docs.opengeospatial.org/is/18-058/18-058.html). 7 | 8 | OGC API - Features - Part 2: Coordinate Reference Systems by Reference 1.0 is an OGC Standard. 9 | Copyright (c) 2020 Open Geospatial Consortium. 10 | To obtain additional rights of use, visit http://www.opengeospatial.org/legal/ . 11 | 12 | This document is also available on 13 | [OGC](http://schemas.opengis.net/ogcapi/features/part2/1.0/openapi/ogcapi-features-2.yaml). 
14 | version: '1.0.0' 15 | contact: 16 | name: Clemens Portele 17 | email: [email protected] 18 | license: 19 | name: OGC License 20 | url: 'http://www.opengeospatial.org/legal/' 21 | components: 22 | parameters: 23 | bbox-crs: 24 | name: bbox-crs 25 | in: query 26 | required: false 27 | schema: 28 | type: string 29 | format: uri 30 | style: form 31 | explode: false 32 | crs: 33 | name: crs 34 | in: query 35 | required: false 36 | schema: 37 | type: string 38 | format: uri 39 | style: form 40 | explode: false 41 | schemas: 42 | collection: 43 | allOf: 44 | - $ref: http://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/collection.yaml 45 | - $ref: '#/components/schemas/collectionExtensionCrs' 46 | collectionExtensionCrs: 47 | type: object 48 | properties: 49 | storageCrs: 50 | description: the CRS identifier, from the list of supported CRS identifiers, that may be used to retrieve features from a collection without the need to apply a CRS transformation 51 | type: string 52 | format: uri 53 | storageCrsCoordinateEpoch: 54 | description: point in time at which coordinates in the spatial feature collection are referenced to the dynamic coordinate reference system in `storageCrs`, that may be used to retrieve features from a collection without the need to apply a change of coordinate epoch. 
It is expressed as a decimal year in the Gregorian calendar 55 | type: number 56 | example: '2017-03-25 in the Gregorian calendar is epoch 2017.23' 57 | collections: 58 | allOf: 59 | - $ref: http://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/collections.yaml 60 | - $ref: '#/components/schemas/collectionsExtensionCrs' 61 | collectionsExtensionCrs: 62 | type: object 63 | properties: 64 | crs: 65 | description: a global list of CRS identifiers that are supported by spatial feature collections offered by the service 66 | type: array 67 | items: 68 | type: string 69 | format: uri 70 | headers: 71 | Content-Crs: 72 | description: a URI, in angular brackets, identifying the coordinate reference system used in the content / payload 73 | schema: 74 | type: string 75 | example: '<http://www.opengis.net/def/crs/EPSG/0/3395>' 76 | ``` -------------------------------------------------------------------------------- /src/mcp_service/protocols.py: -------------------------------------------------------------------------------- ```python 1 | from typing import Protocol, Optional, Callable, List, runtime_checkable, Any 2 | 3 | 4 | @runtime_checkable 5 | class MCPService(Protocol): 6 | """Protocol for MCP services""" 7 | 8 | def tool(self) -> Callable[..., Any]: 9 | """Register a function as an MCP tool""" 10 | ... 11 | 12 | def resource( 13 | self, 14 | uri: str, 15 | *, 16 | name: str | None = None, 17 | title: str | None = None, 18 | description: str | None = None, 19 | mime_type: str | None = None, 20 | ) -> Callable[[Any], Any]: 21 | """Register a function as an MCP resource""" 22 | ... 23 | 24 | def run(self) -> None: 25 | """Run the MCP service""" 26 | ... 27 | 28 | 29 | @runtime_checkable 30 | class FeatureService(Protocol): 31 | """Protocol for OS NGD feature services""" 32 | 33 | def hello_world(self, name: str) -> str: 34 | """Test connection to the service""" 35 | ... 
36 | 37 | def check_api_key(self) -> str: 38 | """Check if API key is available""" 39 | ... 40 | 41 | async def list_collections(self) -> str: 42 | """List all available feature collections""" 43 | ... 44 | 45 | async def get_single_collection(self, collection_id: str) -> str: 46 | """Get detailed information about a specific collection""" 47 | ... 48 | 49 | async def get_single_collection_queryables(self, collection_id: str) -> str: 50 | """Get queryable properties for a collection""" 51 | ... 52 | 53 | # TODO: Need to make sure the full list of parameters is supported 54 | # TODO: Supporting cql-text is clunky and need to figure out how to support this better 55 | async def search_features( 56 | self, 57 | collection_id: str, 58 | bbox: Optional[str] = None, 59 | crs: Optional[str] = None, 60 | limit: int = 10, 61 | offset: int = 0, 62 | filter: Optional[str] = None, 63 | filter_lang: Optional[str] = "cql-text", 64 | query_attr: Optional[str] = None, 65 | query_attr_value: Optional[str] = None, 66 | ) -> str: 67 | """Search for features in a collection with full CQL filter support""" 68 | ... 69 | 70 | async def get_feature( 71 | self, collection_id: str, feature_id: str, crs: Optional[str] = None 72 | ) -> str: 73 | """Get a specific feature by ID""" 74 | ... 75 | 76 | async def get_linked_identifiers( 77 | self, identifier_type: str, identifier: str, feature_type: Optional[str] = None 78 | ) -> str: 79 | """Get linked identifiers for a specified identifier""" 80 | ... 81 | 82 | async def get_bulk_features( 83 | self, 84 | collection_id: str, 85 | identifiers: List[str], 86 | query_by_attr: Optional[str] = None, 87 | ) -> str: 88 | """Get multiple features in a single call""" 89 | ... 90 | 91 | async def get_bulk_linked_features( 92 | self, 93 | identifier_type: str, 94 | identifiers: List[str], 95 | feature_type: Optional[str] = None, 96 | ) -> str: 97 | """Get linked features for multiple identifiers""" 98 | ... 
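
    # Illustrative call sequence (comment-only sketch; concrete class names are
    # assumptions): a typical two-step workflow lists collections first, then
    # searches a chosen collection with a bbox and CQL filter.
    #
    #     service: FeatureService = OSFeatureService(...)  # hypothetical impl
    #     await service.list_collections()
    #     await service.search_features(
    #         collection_id="trn-ntwk-roadlink-4",
    #         bbox="-1.57,53.79,-1.54,53.81",
    #         filter="roadclassification = 'A Road'",
    #         filter_lang="cql-text",
    #     )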
99 | 100 | async def fetch_detailed_collections(self, collection_ids: List[str]) -> str: 101 | """Get detailed information about specific collections for workflow planning""" 102 | ... 103 | ``` -------------------------------------------------------------------------------- /src/middleware/stdio_middleware.py: -------------------------------------------------------------------------------- ```python 1 | import json 2 | import time 3 | import asyncio 4 | from collections import deque 5 | from typing import Callable, TypeVar, Any, Union, cast 6 | from functools import wraps 7 | from utils.logging_config import get_logger 8 | 9 | logger = get_logger(__name__) 10 | F = TypeVar("F", bound=Callable[..., Any]) 11 | 12 | 13 | class StdioRateLimiter: 14 | """STDIO-specific rate limiting""" 15 | 16 | def __init__(self, requests_per_minute: int = 10, window_seconds: int = 60): 17 | self.requests_per_minute = requests_per_minute 18 | self.window_seconds = window_seconds 19 | self.request_timestamps = deque() 20 | 21 | def check_rate_limit(self) -> bool: 22 | """Check rate limit for STDIO client""" 23 | current_time = time.time() 24 | 25 | while ( 26 | self.request_timestamps 27 | and current_time - self.request_timestamps[0] >= self.window_seconds 28 | ): 29 | self.request_timestamps.popleft() 30 | 31 | if len(self.request_timestamps) >= self.requests_per_minute: 32 | logger.warning("STDIO rate limit exceeded") 33 | return False 34 | 35 | self.request_timestamps.append(current_time) 36 | return True 37 | 38 | 39 | class StdioMiddleware: 40 | """STDIO authentication and rate limiting""" 41 | 42 | def __init__(self, requests_per_minute: int = 20): 43 | self.authenticated = False 44 | self.client_id = "anonymous" 45 | self.rate_limiter = StdioRateLimiter(requests_per_minute=requests_per_minute) 46 | 47 | def require_auth_and_rate_limit(self, func: F) -> F: 48 | """Decorator for auth and rate limiting""" 49 | 50 | @wraps(func) 51 | async def async_wrapper(*args: Any, **kwargs: 
Any) -> Union[str, Any]: 52 | if not self.authenticated: 53 | logger.error( 54 | json.dumps({"error": "Authentication required", "code": 401}) 55 | ) 56 | return json.dumps({"error": "Authentication required", "code": 401}) 57 | 58 | if not self.rate_limiter.check_rate_limit(): 59 | logger.error(json.dumps({"error": "Rate limited", "code": 429})) 60 | return json.dumps({"error": "Rate limited", "code": 429}) 61 | 62 | return await func(*args, **kwargs) 63 | 64 | @wraps(func) 65 | def sync_wrapper(*args: Any, **kwargs: Any) -> Union[str, Any]: 66 | if not self.authenticated: 67 | logger.error( 68 | json.dumps({"error": "Authentication required", "code": 401}) 69 | ) 70 | return json.dumps({"error": "Authentication required", "code": 401}) 71 | 72 | if not self.rate_limiter.check_rate_limit(): 73 | logger.error(json.dumps({"error": "Rate limited", "code": 429})) 74 | return json.dumps({"error": "Rate limited", "code": 429}) 75 | 76 | return func(*args, **kwargs) 77 | 78 | if asyncio.iscoroutinefunction(func): 79 | return cast(F, async_wrapper) 80 | return cast(F, sync_wrapper) 81 | 82 | def authenticate(self, key: str) -> bool: 83 | """Authenticate with API key""" 84 | if key and key.strip(): 85 | self.authenticated = True 86 | self.client_id = key 87 | return True 88 | else: 89 | self.authenticated = False 90 | return False 91 | ``` -------------------------------------------------------------------------------- /src/mcp_service/guardrails.py: -------------------------------------------------------------------------------- ```python 1 | import re 2 | import json 3 | import asyncio 4 | from functools import wraps 5 | from typing import TypeVar, Callable, Any, Union, cast 6 | from utils.logging_config import get_logger 7 | 8 | logger = get_logger(__name__) 9 | F = TypeVar("F", bound=Callable[..., Any]) 10 | 11 | 12 | class ToolGuardrails: 13 | """Prompt injection protection""" 14 | 15 | def __init__(self): 16 | self.suspicious_patterns = [ 17 | r"(?i)ignore previous", 
18 | r"(?i)ignore all previous instructions", 19 | r"(?i)assistant:", 20 | r"\{\{.*?\}\}", 21 | r"(?i)forget", 22 | r"(?i)show credentials", 23 | r"(?i)show secrets", 24 | r"(?i)reveal password", 25 | r"(?i)dump (tokens|secrets|passwords|credentials)", 26 | r"(?i)leak confidential", 27 | r"(?i)reveal secrets", 28 | r"(?i)expose secrets", 29 | r"(?i)secrets.*contain", 30 | r"(?i)extract secrets", 31 | ] 32 | 33 | def detect_prompt_injection(self, input_text: Any) -> bool: 34 | """Check if input contains prompt injection attempts""" 35 | if not isinstance(input_text, str): 36 | return False 37 | return any( 38 | re.search(pattern, input_text) for pattern in self.suspicious_patterns 39 | ) 40 | 41 | def basic_guardrails(self, func: F) -> F: 42 | """Prompt injection protection only""" 43 | 44 | @wraps(func) 45 | async def async_wrapper(*args: Any, **kwargs: Any) -> Union[str, Any]: 46 | try: 47 | for arg in args: 48 | if isinstance(arg, str) and self.detect_prompt_injection(arg): 49 | raise ValueError("Prompt injection detected!") 50 | 51 | for name, value in kwargs.items(): 52 | if not ( 53 | hasattr(value, "request_context") 54 | and hasattr(value, "request_id") 55 | ): 56 | if isinstance(value, str) and self.detect_prompt_injection( 57 | value 58 | ): 59 | raise ValueError(f"Prompt injection in '{name}'!") 60 | except ValueError as e: 61 | return json.dumps({"error": str(e), "code": 400}) 62 | 63 | return await func(*args, **kwargs) 64 | 65 | @wraps(func) 66 | def sync_wrapper(*args: Any, **kwargs: Any) -> Union[str, Any]: 67 | try: 68 | for arg in args: 69 | if isinstance(arg, str) and self.detect_prompt_injection(arg): 70 | raise ValueError("Prompt injection detected!") 71 | 72 | for name, value in kwargs.items(): 73 | if not ( 74 | hasattr(value, "request_context") 75 | and hasattr(value, "request_id") 76 | ): 77 | if isinstance(value, str) and self.detect_prompt_injection( 78 | value 79 | ): 80 | raise ValueError(f"Prompt injection in '{name}'!") 81 | except 
ValueError as e: 82 | return json.dumps({"error": str(e), "code": 400}) 83 | 84 | return func(*args, **kwargs) 85 | 86 | if asyncio.iscoroutinefunction(func): 87 | return cast(F, async_wrapper) 88 | return cast(F, sync_wrapper) 89 | ``` -------------------------------------------------------------------------------- /src/models.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | Types for the OS NGD API MCP Server 3 | """ 4 | 5 | from enum import Enum 6 | from pydantic import BaseModel 7 | from typing import Any, List, Dict 8 | 9 | 10 | class NGDAPIEndpoint(Enum): 11 | """ 12 | Enum for the OS API Endpoints following OGC API Features standard 13 | """ 14 | 15 | # NGD Features Endpoints 16 | NGD_FEATURES_BASE_PATH = "https://api.os.uk/features/ngd/ofa/v1/{}" 17 | COLLECTIONS = NGD_FEATURES_BASE_PATH.format("collections") 18 | COLLECTION_INFO = NGD_FEATURES_BASE_PATH.format("collections/{}") 19 | COLLECTION_SCHEMA = NGD_FEATURES_BASE_PATH.format("collections/{}/schema") 20 | COLLECTION_FEATURES = NGD_FEATURES_BASE_PATH.format("collections/{}/items") 21 | COLLECTION_FEATURE_BY_ID = NGD_FEATURES_BASE_PATH.format("collections/{}/items/{}") 22 | COLLECTION_QUERYABLES = NGD_FEATURES_BASE_PATH.format("collections/{}/queryables") 23 | 24 | # OpenAPI Specification Endpoint 25 | OPENAPI_SPEC = NGD_FEATURES_BASE_PATH.format("api") 26 | 27 | # Linked Identifiers Endpoints 28 | LINKED_IDENTIFIERS_BASE_PATH = "https://api.os.uk/search/links/v1/{}" 29 | LINKED_IDENTIFIERS = LINKED_IDENTIFIERS_BASE_PATH.format("identifierTypes/{}/{}") 30 | 31 | # Markdown Resources 32 | MARKDOWN_BASE_PATH = "https://docs.os.uk/osngd/data-structure/{}" 33 | MARKDOWN_STREET = MARKDOWN_BASE_PATH.format("transport/transport-network/street.md") 34 | MARKDOWN_ROAD = MARKDOWN_BASE_PATH.format("transport/transport-network/road.md") 35 | TRAM_ON_ROAD = MARKDOWN_BASE_PATH.format( 36 | "transport/transport-network/tram-on-road.md" 37 | ) 38 | ROAD_NODE = 
MARKDOWN_BASE_PATH.format("transport/transport-network/road-node.md") 39 | ROAD_LINK = MARKDOWN_BASE_PATH.format("transport/transport-network/road-link.md") 40 | ROAD_JUNCTION = MARKDOWN_BASE_PATH.format( 41 | "transport/transport-network/road-junction.md" 42 | ) 43 | 44 | # Places API Endpoints 45 | # TODO: Add these back in when I get access to the Places API from OS 46 | # PLACES_BASE_PATH = "https://api.os.uk/search/places/v1/{}" 47 | # PLACES_UPRN = PLACES_BASE_PATH.format("uprn") 48 | # POST_CODE = PLACES_BASE_PATH.format("postcode") 49 | 50 | 51 | class OpenAPISpecification(BaseModel): 52 | """Parsed OpenAPI specification optimized for LLM context""" 53 | 54 | title: str 55 | version: str 56 | base_url: str 57 | endpoints: Dict[str, str] 58 | collection_ids: List[str] 59 | supported_crs: Dict[str, Any] 60 | crs_guide: Dict[str, str] 61 | 62 | 63 | class WorkflowStep(BaseModel): 64 | """A single step in a workflow plan""" 65 | 66 | step_number: int 67 | description: str 68 | api_endpoint: str 69 | parameters: Dict[str, Any] 70 | dependencies: List[int] = [] 71 | 72 | 73 | class WorkflowPlan(BaseModel): 74 | """Generated workflow plan for user requests""" 75 | 76 | user_request: str 77 | steps: List[WorkflowStep] 78 | reasoning: str 79 | estimated_complexity: str = "simple" 80 | 81 | 82 | class Collection(BaseModel): 83 | """Represents a feature collection from the OS NGD API""" 84 | 85 | id: str 86 | title: str 87 | description: str = "" 88 | links: List[Dict[str, Any]] = [] 89 | extent: Dict[str, Any] = {} 90 | itemType: str = "feature" 91 | 92 | 93 | class CollectionsCache(BaseModel): 94 | """Cached collections data with filtering applied""" 95 | 96 | collections: List[Collection] 97 | raw_response: Dict[str, Any] 98 | 99 | 100 | class CollectionQueryables(BaseModel): 101 | """Queryables information for a collection""" 102 | 103 | id: str 104 | title: str 105 | description: str 106 | all_queryables: Dict[str, Any] 107 | enum_queryables: Dict[str, Any] 108 | 
has_enum_filters: bool 109 | total_queryables: int 110 | enum_count: int 111 | 112 | 113 | class WorkflowContextCache(BaseModel): 114 | """Cached workflow context data""" 115 | 116 | collections_info: Dict[str, CollectionQueryables] 117 | openapi_spec: OpenAPISpecification 118 | cached_at: float 119 | ``` -------------------------------------------------------------------------------- /src/server.py: -------------------------------------------------------------------------------- ```python 1 | import argparse 2 | import os 3 | import uvicorn 4 | from typing import Any 5 | from utils.logging_config import configure_logging 6 | 7 | from api_service.os_api import OSAPIClient 8 | from mcp_service.os_service import OSDataHubService 9 | from mcp.server.fastmcp import FastMCP 10 | from middleware.stdio_middleware import StdioMiddleware 11 | from middleware.http_middleware import HTTPMiddleware 12 | from starlette.middleware import Middleware 13 | from starlette.middleware.cors import CORSMiddleware 14 | from starlette.routing import Route 15 | from starlette.responses import JSONResponse 16 | 17 | logger = configure_logging() 18 | 19 | 20 | def main(): 21 | """Main entry point""" 22 | parser = argparse.ArgumentParser(description="OS DataHub API MCP Server") 23 | parser.add_argument( 24 | "--transport", 25 | choices=["stdio", "streamable-http"], 26 | default="stdio", 27 | help="Transport protocol to use (stdio or streamable-http)", 28 | ) 29 | parser.add_argument( 30 | "--host", default="0.0.0.0", help="Host to bind to (default: 0.0.0.0)" 31 | ) 32 | parser.add_argument( 33 | "--port", type=int, default=8000, help="Port to bind to (default: 8000)" 34 | ) 35 | parser.add_argument("--debug", action="store_true", help="Enable debug mode") 36 | args = parser.parse_args() 37 | 38 | configure_logging(debug=args.debug) 39 | 40 | logger.info( 41 | f"OS DataHub API MCP Server starting with {args.transport} transport..." 
42 | ) 43 | 44 | api_client = OSAPIClient() 45 | 46 | match args.transport: 47 | case "stdio": 48 | logger.info("Starting with stdio transport") 49 | 50 | mcp = FastMCP( 51 | "os-ngd-api", 52 | debug=args.debug, 53 | log_level="DEBUG" if args.debug else "INFO", 54 | ) 55 | 56 | stdio_auth = StdioMiddleware() 57 | 58 | service = OSDataHubService(api_client, mcp, stdio_middleware=stdio_auth) 59 | 60 | stdio_api_key = os.environ.get("STDIO_KEY") 61 | if not stdio_api_key or not stdio_auth.authenticate(stdio_api_key): 62 | logger.error("Authentication failed") 63 | return 64 | 65 | service.run() 66 | 67 | case "streamable-http": 68 | logger.info(f"Starting Streamable HTTP server on {args.host}:{args.port}") 69 | 70 | mcp = FastMCP( 71 | "os-ngd-api", 72 | host=args.host, 73 | port=args.port, 74 | debug=args.debug, 75 | json_response=True, 76 | stateless_http=False, 77 | log_level="DEBUG" if args.debug else "INFO", 78 | ) 79 | 80 | OSDataHubService(api_client, mcp) 81 | 82 | async def auth_discovery(_: Any) -> JSONResponse: 83 | """Return authentication methods.""" 84 | return JSONResponse( 85 | content={"authMethods": [{"type": "http", "scheme": "bearer"}]} 86 | ) 87 | 88 | app = mcp.streamable_http_app() 89 | 90 | app.routes.append( 91 | Route( 92 | "/.well-known/mcp-auth", 93 | endpoint=auth_discovery, 94 | methods=["GET"], 95 | ) 96 | ) 97 | 98 | app.user_middleware.extend( 99 | [ 100 | Middleware( 101 | CORSMiddleware, 102 | allow_origins=["*"], 103 | allow_credentials=True, 104 | allow_methods=["GET", "POST", "OPTIONS"], 105 | allow_headers=["*"], 106 | expose_headers=["*"], 107 | ), 108 | Middleware(HTTPMiddleware), 109 | ] 110 | ) 111 | 112 | uvicorn.run( 113 | app, 114 | host=args.host, 115 | port=args.port, 116 | log_level="debug" if args.debug else "info", 117 | ) 118 | 119 | case _: 120 | logger.error(f"Unknown transport: {args.transport}") 121 | return 122 | 123 | 124 | if __name__ == "__main__": 125 | main() 126 | ``` 
--------------------------------------------------------------------------------
/src/config_docs/ogcapi-features-3.yaml:
--------------------------------------------------------------------------------

```yaml
1 | openapi: 3.0.3
2 | info:
3 |   title: "Building Blocks specified in OGC API - Features - Part 3: Filtering"
4 |   description: |-
5 |     Common components used in the
6 |     [OGC standard "OGC API - Features - Part 3: Filtering"](https://docs.ogc.org/is/19-079r2/19-079r2.html).
7 | 
8 |     OGC API - Features - Part 3: Filtering 1.0 is an OGC Standard.
9 |     Copyright (c) 2024 Open Geospatial Consortium.
10 |     To obtain additional rights of use, visit https://www.ogc.org/legal/ .
11 | 
12 |     This document is also available in the
13 |     [OGC Schema Repository](https://schemas.opengis.net/ogcapi/features/part3/1.0/openapi/ogcapi-features-3.yaml).
14 |   version: '1.0.0'
15 |   contact:
16 |     name: Clemens Portele
17 |     email: [email protected]
18 |   license:
19 |     name: OGC License
20 |     url: 'https://www.ogc.org/legal/'
21 | paths:
22 |   /collections/{collectionId}/queryables:
23 |     get:
24 |       summary: Get the list of supported queryables for a collection
25 |       description: |-
26 |         This operation returns the list of supported queryables of a collection. Queryables are
27 |         the properties that may be used to construct a filter expression on items in the collection.
28 |         The response is a JSON Schema of an object where each property is a queryable.
29 |       operationId: getQueryables
30 |       parameters:
31 |         - $ref: 'https://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/parameters/collectionId.yaml'
32 |       responses:
33 |         '200':
34 |           description: The queryable properties of the collection.
35 |           content:
36 |             application/schema+json:
37 |               schema:
38 |                 type: object
39 |   /functions:
40 |     get:
41 |       summary: Get the list of supported functions
42 |       description: |-
43 |         This operation returns the list of custom functions supported in CQL2 expressions.
44 | operationId: getFunctions 45 | responses: 46 | '200': 47 | description: The list of custom functions supported in CQL2 expressions. 48 | content: 49 | application/json: 50 | schema: 51 | $ref: '#/components/schemas/functions' 52 | components: 53 | parameters: 54 | filter: 55 | name: filter 56 | in: query 57 | required: false 58 | schema: 59 | type: string 60 | style: form 61 | explode: false 62 | filter-lang: 63 | name: filter-lang 64 | in: query 65 | required: false 66 | schema: 67 | type: string 68 | enum: 69 | - 'cql2-text' 70 | - 'cql2-json' 71 | default: 'cql2-text' 72 | style: form 73 | filter-crs: 74 | name: filter-crs 75 | in: query 76 | required: false 77 | schema: 78 | type: string 79 | format: uri-reference 80 | style: form 81 | explode: false 82 | schemas: 83 | functions: 84 | type: object 85 | required: 86 | - functions 87 | properties: 88 | functions: 89 | type: array 90 | items: 91 | type: object 92 | required: 93 | - name 94 | - returns 95 | properties: 96 | name: 97 | type: string 98 | description: 99 | type: string 100 | metadataUrl: 101 | type: string 102 | format: uri-reference 103 | arguments: 104 | type: array 105 | items: 106 | type: object 107 | required: 108 | - type 109 | properties: 110 | title: 111 | type: string 112 | description: 113 | type: string 114 | type: 115 | type: array 116 | items: 117 | type: string 118 | enum: 119 | - string 120 | - number 121 | - integer 122 | - datetime 123 | - geometry 124 | - boolean 125 | returns: 126 | type: array 127 | items: 128 | type: string 129 | enum: 130 | - string 131 | - number 132 | - integer 133 | - datetime 134 | - geometry 135 | - boolean 136 | ``` -------------------------------------------------------------------------------- /src/middleware/http_middleware.py: -------------------------------------------------------------------------------- ```python 1 | import os 2 | import time 3 | from collections import defaultdict, deque 4 | from typing import List, Callable, Awaitable 5 | from 
starlette.middleware.base import BaseHTTPMiddleware 6 | from starlette.requests import Request 7 | from starlette.responses import Response, JSONResponse 8 | from utils.logging_config import get_logger 9 | 10 | logger = get_logger(__name__) 11 | 12 | 13 | class RateLimiter: 14 | """HTTP-layer rate limiting""" 15 | 16 | def __init__(self, requests_per_minute: int = 10, window_seconds: int = 60): 17 | self.requests_per_minute = requests_per_minute 18 | self.window_seconds = window_seconds 19 | self.request_timestamps = defaultdict(lambda: deque()) 20 | 21 | def check_rate_limit(self, client_id: str) -> bool: 22 | """Check if client has exceeded rate limit""" 23 | current_time = time.time() 24 | timestamps = self.request_timestamps[client_id] 25 | 26 | while timestamps and current_time - timestamps[0] >= self.window_seconds: 27 | timestamps.popleft() 28 | 29 | if len(timestamps) >= self.requests_per_minute: 30 | logger.warning(f"HTTP rate limit exceeded for client {client_id}") 31 | return False 32 | 33 | timestamps.append(current_time) 34 | return True 35 | 36 | 37 | def get_valid_bearer_tokens() -> List[str]: 38 | """Get valid bearer tokens from environment variable.""" 39 | try: 40 | tokens = os.environ.get("BEARER_TOKENS", "").split(",") 41 | valid_tokens = [t.strip() for t in tokens if t.strip()] 42 | 43 | if not valid_tokens: 44 | logger.warning( 45 | "No BEARER_TOKENS configured, all authentication will be rejected" 46 | ) 47 | return [] 48 | 49 | return valid_tokens 50 | except Exception as e: 51 | logger.error(f"Error getting valid tokens: {e}") 52 | return [] 53 | 54 | 55 | async def verify_bearer_token(token: str) -> bool: 56 | """Verify bearer token is valid.""" 57 | try: 58 | valid_tokens = get_valid_bearer_tokens() 59 | if not valid_tokens or not token: 60 | return False 61 | return token in valid_tokens 62 | except Exception as e: 63 | logger.error(f"Error validating token: {e}") 64 | return False 65 | 66 | 67 | class HTTPMiddleware(BaseHTTPMiddleware): 
68 | def __init__(self, app, requests_per_minute: int = 10): 69 | super().__init__(app) 70 | self.rate_limiter = RateLimiter(requests_per_minute=requests_per_minute) 71 | 72 | async def dispatch( 73 | self, request: Request, call_next: Callable[[Request], Awaitable[Response]] 74 | ) -> Response: 75 | if request.url.path == "/.well-known/mcp-auth" or request.method == "OPTIONS": 76 | return await call_next(request) 77 | 78 | session_id = request.headers.get("mcp-session-id") 79 | if not session_id: 80 | client_ip = request.client.host if request.client else "unknown" 81 | session_id = f"ip-{client_ip}" 82 | 83 | if not self.rate_limiter.check_rate_limit(session_id): 84 | return JSONResponse( 85 | status_code=429, 86 | content={"detail": "Too many requests. Please try again later."}, 87 | headers={"Retry-After": "60"}, 88 | ) 89 | 90 | origin = request.headers.get("origin", "") 91 | if origin and not self._is_valid_origin(origin, request): 92 | client_ip = request.client.host if request.client else "unknown" 93 | logger.warning( 94 | f"Blocked request with suspicious origin from {client_ip}, Origin: {origin}" 95 | ) 96 | return JSONResponse(status_code=403, content={"detail": "Invalid origin"}) 97 | 98 | user_agent = request.headers.get("user-agent", "") 99 | if self._is_browser_plugin(user_agent, request): 100 | client_ip = request.client.host if request.client else "unknown" 101 | logger.warning( 102 | f"Blocked browser plugin access from {client_ip}, User-Agent: {user_agent}" 103 | ) 104 | return JSONResponse( 105 | status_code=403, 106 | content={"detail": "Browser plugin access is not allowed"}, 107 | ) 108 | 109 | auth_header = request.headers.get("Authorization") 110 | if auth_header and auth_header.startswith("Bearer "): 111 | token = auth_header.replace("Bearer ", "") 112 | if await verify_bearer_token(token): 113 | request.state.token = token 114 | return await call_next(request) 115 | else: 116 | logger.warning( 117 | f"Invalid bearer token attempt from 
{request.client.host if request.client else 'unknown'}" 118 | ) 119 | else: 120 | logger.warning( 121 | f"Missing or invalid Authorization header from {request.client.host if request.client else 'unknown'}" 122 | ) 123 | 124 | return JSONResponse( 125 | status_code=401, 126 | content={"detail": "Authentication required"}, 127 | headers={"WWW-Authenticate": "Bearer"}, 128 | ) 129 | 130 | def _is_valid_origin(self, origin: str, request: Request) -> bool: 131 | """Validate Origin header to prevent DNS rebinding attacks.""" 132 | valid_local_origins = ["http://localhost:", "http://127.0.0.1:"] 133 | valid_domains = os.environ.get("ALLOWED_ORIGINS", "").split(",") 134 | valid_domains = [d.strip() for d in valid_domains if d.strip()] 135 | 136 | for valid_origin in valid_local_origins: 137 | if origin.startswith(valid_origin): 138 | return True 139 | 140 | for domain in valid_domains: 141 | if ( 142 | origin == domain 143 | or origin.startswith(f"https://{domain}") 144 | or origin.startswith(f"http://{domain}") 145 | ): 146 | return True 147 | 148 | return False 149 | 150 | def _is_browser_plugin(self, user_agent: str, request: Request) -> bool: 151 | """Check if request is from a browser plugin.""" 152 | plugin_patterns = [ 153 | "Chrome-Extension", 154 | "Mozilla/5.0 (compatible; Extension)", 155 | "Browser-Extension", 156 | ] 157 | 158 | for pattern in plugin_patterns: 159 | if pattern.lower() in user_agent.lower(): 160 | return True 161 | 162 | origin = request.headers.get("origin", "") 163 | if origin and ( 164 | origin.startswith("chrome-extension://") 165 | or origin.startswith("moz-extension://") 166 | or origin.startswith("safari-extension://") 167 | ): 168 | return True 169 | 170 | return False 171 | ``` -------------------------------------------------------------------------------- /src/stdio_client_test.py: -------------------------------------------------------------------------------- ```python 1 | import asyncio 2 | import subprocess 3 | import sys 4 | 
import os 5 | import time 6 | from mcp import ClientSession 7 | from mcp.client.stdio import stdio_client 8 | from mcp.client.stdio import StdioServerParameters 9 | from mcp.types import TextContent 10 | 11 | 12 | def extract_text_from_result(result) -> str: 13 | """Safely extract text from MCP tool result""" 14 | if not result.content: 15 | return "No content" 16 | 17 | for item in result.content: 18 | if isinstance(item, TextContent): 19 | return item.text 20 | 21 | content_types = [type(item).__name__ for item in result.content] 22 | return f"Non-text content: {', '.join(content_types)}" 23 | 24 | 25 | async def test_stdio_rate_limiting(): 26 | """Test STDIO rate limiting - should block after 1 request per minute""" 27 | env = os.environ.copy() 28 | env["STDIO_KEY"] = "test-stdio-key" 29 | env["OS_API_KEY"] = os.environ.get("OS_API_KEY", "dummy-key-for-testing") 30 | env["PYTHONPATH"] = "src" 31 | 32 | print("Starting STDIO server subprocess...") 33 | 34 | server_process = subprocess.Popen( 35 | [sys.executable, "src/server.py", "--transport", "stdio", "--debug"], 36 | stdin=subprocess.PIPE, 37 | stdout=subprocess.PIPE, 38 | stderr=subprocess.PIPE, 39 | env=env, 40 | text=True, 41 | bufsize=0, 42 | ) 43 | 44 | try: 45 | print("Connecting to STDIO server...") 46 | 47 | server_params = StdioServerParameters( 48 | command=sys.executable, 49 | args=["src/server.py", "--transport", "stdio", "--debug"], 50 | env=env, 51 | ) 52 | async with stdio_client(server_params) as (read_stream, write_stream): 53 | async with ClientSession(read_stream, write_stream) as session: 54 | print("Initializing session...") 55 | await session.initialize() 56 | 57 | print("Listing available tools...") 58 | tools = await session.list_tools() 59 | print(f" Found {len(tools.tools)} tools") 60 | 61 | print("\nRAPID FIRE TEST (should hit rate limit)") 62 | print("-" * 50) 63 | 64 | results = [] 65 | for i in range(3): 66 | try: 67 | start_time = time.time() 68 | print(f"Request #{i + 1}: ", 
end="", flush=True) 69 | 70 | result = await session.call_tool( 71 | "hello_world", {"name": f"TestUser{i + 1}"} 72 | ) 73 | elapsed = time.time() - start_time 74 | 75 | result_text = extract_text_from_result(result) 76 | 77 | if "rate limit" in result_text.lower() or "429" in result_text: 78 | print(f"BLOCKED ({elapsed:.1f}s) - {result_text}") 79 | results.append(("BLOCKED", elapsed, result_text)) 80 | else: 81 | print(f"SUCCESS ({elapsed:.1f}s) - {result_text}") 82 | results.append(("SUCCESS", elapsed, result_text)) 83 | 84 | except Exception as e: 85 | elapsed = time.time() - start_time 86 | print(f"ERROR ({elapsed:.1f}s) - {e}") 87 | results.append(("ERROR", elapsed, str(e))) 88 | 89 | if i < 2: 90 | await asyncio.sleep(0.1) 91 | 92 | print("\nRESULTS ANALYSIS") 93 | print("-" * 50) 94 | 95 | success_count = sum(1 for r in results if r[0] == "SUCCESS") 96 | blocked_count = sum(1 for r in results if r[0] == "BLOCKED") 97 | error_count = sum(1 for r in results if r[0] == "ERROR") 98 | 99 | print(f"Successful requests: {success_count}") 100 | print(f"Rate limited requests: {blocked_count}") 101 | print(f"Error requests: {error_count}") 102 | 103 | print("\nVISUAL TIMELINE:") 104 | timeline = "" 105 | for i, (status, elapsed, _) in enumerate(results): 106 | if status == "SUCCESS": 107 | timeline += "OK" 108 | elif status == "BLOCKED": 109 | timeline += "XX" 110 | else: 111 | timeline += "ER" 112 | 113 | if i < len(results) - 1: 114 | timeline += "--" 115 | 116 | print(f" {timeline}") 117 | print(" Request: 1 2 3") 118 | 119 | print("\nTEST VERDICT") 120 | print("-" * 50) 121 | 122 | if success_count == 1 and blocked_count >= 1: 123 | print("PASS: Rate limiting works correctly!") 124 | print(" First request succeeded") 125 | print(" Subsequent requests blocked") 126 | elif success_count > 1: 127 | print("FAIL: Rate limiting too permissive!") 128 | print(f" {success_count} requests succeeded (expected 1)") 129 | else: 130 | print("FAIL: No requests succeeded!") 131 | 
print(" Check authentication or server issues") 132 | 133 | print("\nWAITING FOR RATE LIMIT RESET...") 134 | print(" (Testing if limit resets after window)") 135 | print(" Waiting 10 seconds...") 136 | 137 | for countdown in range(10, 0, -1): 138 | print(f" {countdown}s remaining...", end="\r", flush=True) 139 | await asyncio.sleep(1) 140 | 141 | print("\nTesting after rate limit window...") 142 | try: 143 | result = await session.call_tool( 144 | "hello_world", {"name": "AfterWait"} 145 | ) 146 | result_text = extract_text_from_result(result) 147 | 148 | if "rate limit" not in result_text.lower(): 149 | print("SUCCESS: Rate limit properly reset!") 150 | else: 151 | print("Rate limit still active (might need longer wait)") 152 | 153 | except Exception as e: 154 | print(f"Error after wait: {e}") 155 | 156 | except Exception as e: 157 | print(f"Test failed with error: {e}") 158 | 159 | finally: 160 | print("\nCleaning up server process...") 161 | server_process.terminate() 162 | try: 163 | await asyncio.wait_for( 164 | asyncio.create_task(asyncio.to_thread(server_process.wait)), timeout=5.0 165 | ) 166 | print("Server terminated gracefully") 167 | except asyncio.TimeoutError: 168 | print("Force killing server...") 169 | server_process.kill() 170 | server_process.wait() 171 | 172 | 173 | if __name__ == "__main__": 174 | print("STDIO Rate Limit Test Suite") 175 | print("Testing if STDIO middleware properly blocks > 1 request/minute") 176 | print() 177 | 178 | if not os.environ.get("OS_API_KEY"): 179 | print("Warning: OS_API_KEY not set, using dummy key for testing") 180 | print(" (This is OK for rate limit testing)") 181 | print() 182 | 183 | asyncio.run(test_stdio_rate_limiting()) 184 | ``` -------------------------------------------------------------------------------- /src/http_client_test.py: -------------------------------------------------------------------------------- ```python 1 | import asyncio 2 | import logging 3 | from mcp.client.streamable_http import 
streamablehttp_client 4 | from mcp import ClientSession 5 | from mcp.types import TextContent 6 | 7 | logging.basicConfig(level=logging.WARNING) 8 | logger = logging.getLogger(__name__) 9 | 10 | 11 | def extract_text_from_result(result) -> str: 12 | """Safely extract text from MCP tool result""" 13 | if not result.content: 14 | return "No content" 15 | 16 | for item in result.content: 17 | if isinstance(item, TextContent): 18 | return item.text 19 | 20 | content_types = [type(item).__name__ for item in result.content] 21 | return f"Non-text content: {', '.join(content_types)}" 22 | 23 | 24 | async def test_usrn_search(session_name: str, usrn_values: list): 25 | """Test USRN searches with different values""" 26 | headers = {"Authorization": "Bearer dev-token"} 27 | results = [] 28 | session_id = None 29 | 30 | print(f"\n{session_name} - Testing USRN searches") 31 | print("-" * 40) 32 | 33 | try: 34 | async with streamablehttp_client( 35 | "http://127.0.0.1:8000/mcp", headers=headers 36 | ) as ( 37 | read_stream, 38 | write_stream, 39 | get_session_id, 40 | ): 41 | async with ClientSession(read_stream, write_stream) as session: 42 | await session.initialize() 43 | session_id = get_session_id() 44 | print(f"{session_name} Session ID: {session_id}") 45 | 46 | for i, usrn in enumerate(usrn_values): 47 | try: 48 | result = await session.call_tool( 49 | "search_features", 50 | { 51 | "collection_id": "trn-ntwk-street-1", 52 | "query_attr": "usrn", 53 | "query_attr_value": str(usrn), 54 | "limit": 5, 55 | }, 56 | ) 57 | result_text = extract_text_from_result(result) 58 | print(f" USRN {usrn}: SUCCESS - Found data") 59 | print(result_text) 60 | results.append(("SUCCESS", usrn, result_text[:100] + "...")) 61 | await asyncio.sleep(0.1) 62 | 63 | except Exception as e: 64 | if "429" in str(e) or "Too Many Requests" in str(e): 65 | print(f" USRN {usrn}: BLOCKED - Rate limited") 66 | results.append(("BLOCKED", usrn, str(e))) 67 | else: 68 | print(f" USRN {usrn}: ERROR - {e}") 69 
| results.append(("ERROR", usrn, str(e))) 70 | 71 | except Exception as e: 72 | print(f"{session_name}: Connection error - {str(e)[:100]}") 73 | if len(results) == 0: 74 | results = [("ERROR", "connection", str(e))] * len(usrn_values) 75 | 76 | return results, session_id 77 | 78 | 79 | async def test_usrn_calls(): 80 | """Test calling USRN searches with different values""" 81 | print("USRN Search Test - Two Different USRNs") 82 | print("=" * 50) 83 | 84 | # Different USRN values to test 85 | usrn_values_1 = ["24501091", "24502114"] 86 | usrn_values_2 = ["24502114", "24501091"] 87 | 88 | results_a, session_id_a = await test_usrn_search("SESSION-A", usrn_values_1) 89 | await asyncio.sleep(0.5) 90 | results_b, session_id_b = await test_usrn_search("SESSION-B", usrn_values_2) 91 | 92 | # Analyze results 93 | print("\nRESULTS SUMMARY") 94 | print("=" * 30) 95 | 96 | success_a = len([r for r in results_a if r[0] == "SUCCESS"]) 97 | blocked_a = len([r for r in results_a if r[0] == "BLOCKED"]) 98 | error_a = len([r for r in results_a if r[0] == "ERROR"]) 99 | 100 | success_b = len([r for r in results_b if r[0] == "SUCCESS"]) 101 | blocked_b = len([r for r in results_b if r[0] == "BLOCKED"]) 102 | error_b = len([r for r in results_b if r[0] == "ERROR"]) 103 | 104 | print(f"SESSION-A: {success_a} success, {blocked_a} blocked, {error_a} errors") 105 | print(f"SESSION-B: {success_b} success, {blocked_b} blocked, {error_b} errors") 106 | print(f"Total: {success_a + success_b} success, {blocked_a + blocked_b} blocked") 107 | 108 | if session_id_a and session_id_b: 109 | print(f"Different session IDs: {session_id_a != session_id_b}") 110 | else: 111 | print("Could not compare session IDs") 112 | 113 | # Show detailed results 114 | print("\nDETAILED RESULTS:") 115 | print("SESSION-A:") 116 | for status, usrn, details in results_a: 117 | print(f" USRN {usrn}: {status}") 118 | 119 | print("SESSION-B:") 120 | for status, usrn, details in results_b: 121 | print(f" USRN {usrn}: 
{status}") 122 | 123 | 124 | async def test_session_safe(session_name: str, requests: int = 3): 125 | """Test a single session with safer error handling""" 126 | headers = {"Authorization": "Bearer dev-token"} 127 | results = [] 128 | session_id = None 129 | 130 | print(f"\n{session_name} - Testing {requests} requests") 131 | print("-" * 40) 132 | 133 | try: 134 | async with streamablehttp_client( 135 | "http://127.0.0.1:8000/mcp", headers=headers 136 | ) as ( 137 | read_stream, 138 | write_stream, 139 | get_session_id, 140 | ): 141 | async with ClientSession(read_stream, write_stream) as session: 142 | await session.initialize() 143 | session_id = get_session_id() 144 | print(f"{session_name} Session ID: {session_id}") 145 | 146 | for i in range(requests): 147 | try: 148 | result = await session.call_tool( 149 | "hello_world", {"name": f"{session_name}-User{i + 1}"} 150 | ) 151 | result_text = extract_text_from_result(result) 152 | print(f" Request {i + 1}: SUCCESS - {result_text}") 153 | results.append("SUCCESS") 154 | await asyncio.sleep(0.1) 155 | 156 | except Exception as e: 157 | if "429" in str(e) or "Too Many Requests" in str(e): 158 | print(f" Request {i + 1}: BLOCKED - Rate limited") 159 | results.append("BLOCKED") 160 | else: 161 | print(f" Request {i + 1}: ERROR - {e}") 162 | results.append("ERROR") 163 | 164 | for j in range(i + 1, requests): 165 | print(f" Request {j + 1}: BLOCKED - Session rate limited") 166 | results.append("BLOCKED") 167 | break 168 | 169 | except Exception as e: 170 | print(f"{session_name}: Connection error - {str(e)[:100]}") 171 | if len(results) == 0: 172 | results = ["ERROR"] * requests 173 | elif len(results) < requests: 174 | remaining = requests - len(results) 175 | results.extend(["BLOCKED"] * remaining) 176 | 177 | return results, session_id 178 | 179 | 180 | async def test_two_sessions(): 181 | """Test rate limiting across two sessions""" 182 | print("HTTP Rate Limit Test - Two Sessions") 183 | print("=" * 50) 184 | 185 
| results_a, session_id_a = await test_session_safe("SESSION-A", 20) 186 | await asyncio.sleep(0.5) 187 | results_b, session_id_b = await test_session_safe("SESSION-B", 20) 188 | 189 | # Analyze results 190 | print("\nRESULTS SUMMARY") 191 | print("=" * 30) 192 | 193 | success_a = results_a.count("SUCCESS") 194 | blocked_a = results_a.count("BLOCKED") 195 | error_a = results_a.count("ERROR") 196 | 197 | success_b = results_b.count("SUCCESS") 198 | blocked_b = results_b.count("BLOCKED") 199 | error_b = results_b.count("ERROR") 200 | 201 | print(f"SESSION-A: {success_a} success, {blocked_a} blocked, {error_a} errors") 202 | print(f"SESSION-B: {success_b} success, {blocked_b} blocked, {error_b} errors") 203 | print(f"Total: {success_a + success_b} success, {blocked_a + blocked_b} blocked") 204 | 205 | if session_id_a and session_id_b: 206 | print(f"Different session IDs: {session_id_a != session_id_b}") 207 | else: 208 | print("Could not compare session IDs") 209 | 210 | total_success = success_a + success_b 211 | total_blocked = blocked_a + blocked_b 212 | 213 | print("\nRATE LIMITING ASSESSMENT:") 214 | if total_success >= 2 and total_blocked >= 2: 215 | print("✅ PASS: Rate limiting is working") 216 | print(f" - {total_success} requests succeeded") 217 | print(f" - {total_blocked} requests were rate limited") 218 | print(" - Each session got limited after ~2 requests") 219 | elif total_success == 0: 220 | print("❌ FAIL: No requests succeeded (check server/auth)") 221 | else: 222 | print("⚠️ UNCLEAR: Unexpected pattern") 223 | print(" Check server logs for actual behavior") 224 | 225 | 226 | if __name__ == "__main__": 227 | # Run the USRN test by default 228 | asyncio.run(test_usrn_calls()) 229 | 230 | # Uncomment the line below to run the original test instead 231 | # asyncio.run(test_two_sessions()) 232 | ``` -------------------------------------------------------------------------------- /src/mcp_service/routing_service.py: 
-------------------------------------------------------------------------------- ```python 1 | from typing import Dict, List, Set, Optional, Any 2 | from dataclasses import dataclass 3 | from utils.logging_config import get_logger 4 | 5 | logger = get_logger(__name__) 6 | 7 | 8 | @dataclass 9 | class RouteNode: 10 | """Represents a routing node (intersection)""" 11 | 12 | id: int 13 | node_identifier: str 14 | connected_edges: Set[int] 15 | 16 | 17 | @dataclass 18 | class RouteEdge: 19 | """Represents a routing edge (road segment)""" 20 | 21 | id: int 22 | road_id: str 23 | road_name: Optional[str] 24 | source_node_id: int 25 | target_node_id: int 26 | cost: float 27 | reverse_cost: float 28 | geometry: Optional[Dict[str, Any]] 29 | 30 | 31 | class InMemoryRoutingNetwork: 32 | """In-memory routing network built from OS NGD data""" 33 | 34 | def __init__(self): 35 | self.nodes: Dict[int, RouteNode] = {} 36 | self.edges: Dict[int, RouteEdge] = {} 37 | self.node_lookup: Dict[str, int] = {} 38 | self.is_built = False 39 | 40 | def add_node(self, node_identifier: str) -> int: 41 | """Add a node and return its internal ID""" 42 | if node_identifier in self.node_lookup: 43 | return self.node_lookup[node_identifier] 44 | 45 | node_id = len(self.nodes) + 1 46 | self.nodes[node_id] = RouteNode( 47 | id=node_id, 48 | node_identifier=node_identifier, 49 | connected_edges=set(), 50 | ) 51 | self.node_lookup[node_identifier] = node_id 52 | return node_id 53 | 54 | def add_edge(self, road_data: Dict[str, Any]) -> None: 55 | """Add a road link as an edge""" 56 | properties = road_data.get("properties", {}) 57 | 58 | start_node = properties.get("startnode", "") 59 | end_node = properties.get("endnode", "") 60 | 61 | if not start_node or not end_node: 62 | logger.warning(f"Road link {properties.get('id')} missing node data") 63 | return 64 | 65 | source_id = self.add_node(start_node) 66 | target_id = self.add_node(end_node) 67 | 68 | roadlink_id = None 69 | road_track_refs = 
properties.get("roadtrackorpathreference", []) 70 | if road_track_refs and len(road_track_refs) > 0: 71 | roadlink_id = road_track_refs[0].get("roadlinkid", "") 72 | 73 | edge_id = len(self.edges) + 1 74 | cost = properties.get("geometry_length", 100.0) 75 | 76 | edge = RouteEdge( 77 | id=edge_id, 78 | road_id=roadlink_id or "NONE", 79 | road_name=properties.get("name1_text"), 80 | source_node_id=source_id, 81 | target_node_id=target_id, 82 | cost=cost, 83 | reverse_cost=cost, 84 | geometry=road_data.get("geometry"), 85 | ) 86 | 87 | self.edges[edge_id] = edge 88 | 89 | self.nodes[source_id].connected_edges.add(edge_id) 90 | self.nodes[target_id].connected_edges.add(edge_id) 91 | 92 | def get_connected_edges(self, node_id: int) -> List[RouteEdge]: 93 | """Get all edges connected to a node""" 94 | if node_id not in self.nodes: 95 | return [] 96 | 97 | return [self.edges[edge_id] for edge_id in self.nodes[node_id].connected_edges] 98 | 99 | def get_summary(self) -> Dict[str, Any]: 100 | """Get network summary statistics""" 101 | return { 102 | "total_nodes": len(self.nodes), 103 | "total_edges": len(self.edges), 104 | "is_built": self.is_built, 105 | "sample_nodes": [ 106 | { 107 | "id": node.id, 108 | "identifier": node.node_identifier, 109 | "connected_edges": len(node.connected_edges), 110 | } 111 | for node in list(self.nodes.values())[:5] 112 | ], 113 | } 114 | 115 | def get_all_nodes(self) -> List[Dict[str, Any]]: 116 | """Get all nodes as a flat list""" 117 | return [ 118 | { 119 | "id": node.id, 120 | "node_identifier": node.node_identifier, 121 | "connected_edge_count": len(node.connected_edges), 122 | "connected_edge_ids": list(node.connected_edges), 123 | } 124 | for node in self.nodes.values() 125 | ] 126 | 127 | def get_all_edges(self) -> List[Dict[str, Any]]: 128 | """Get all edges as a flat list""" 129 | return [ 130 | { 131 | "id": edge.id, 132 | "road_id": edge.road_id, 133 | "road_name": edge.road_name, 134 | "source_node_id": edge.source_node_id, 
135 | "target_node_id": edge.target_node_id, 136 | "cost": edge.cost, 137 | "reverse_cost": edge.reverse_cost, 138 | "geometry": edge.geometry, 139 | } 140 | for edge in self.edges.values() 141 | ] 142 | 143 | 144 | class OSRoutingService: 145 | """Service to build and query routing networks from OS NGD data""" 146 | 147 | def __init__(self, api_client): 148 | self.api_client = api_client 149 | self.network = InMemoryRoutingNetwork() 150 | self.raw_restrictions: List[Dict[str, Any]] = [] 151 | 152 | async def _fetch_restriction_data( 153 | self, bbox: Optional[str] = None, limit: int = 100 154 | ) -> List[Dict[str, Any]]: 155 | """Fetch raw restriction data""" 156 | try: 157 | logger.debug("Fetching restriction data...") 158 | params = { 159 | "limit": min(limit, 100), 160 | "crs": "http://www.opengis.net/def/crs/EPSG/0/4326", 161 | } 162 | 163 | if bbox: 164 | params["bbox"] = bbox 165 | 166 | restriction_data = await self.api_client.make_request( 167 | "COLLECTION_FEATURES", 168 | params=params, 169 | path_params=["trn-rami-restriction-1"], 170 | ) 171 | 172 | features = restriction_data.get("features", []) 173 | logger.debug(f"Fetched {len(features)} restriction features") 174 | 175 | return features 176 | 177 | except Exception as e: 178 | logger.error(f"Error fetching restriction data: {e}") 179 | return [] 180 | 181 | async def build_routing_network( 182 | self, 183 | bbox: Optional[str] = None, 184 | limit: int = 1000, 185 | include_restrictions: bool = True, 186 | ) -> Dict[str, Any]: 187 | """Build the routing network from OS NGD road links with optional restriction data""" 188 | try: 189 | logger.debug("Building routing network from OS NGD data...") 190 | 191 | if include_restrictions: 192 | self.raw_restrictions = await self._fetch_restriction_data(bbox, limit) 193 | 194 | params = { 195 | "limit": min(limit, 100), 196 | "crs": "http://www.opengis.net/def/crs/EPSG/0/4326", 197 | } 198 | 199 | if bbox: 200 | params["bbox"] = bbox 201 | 202 | 
road_links_data = await self.api_client.make_request( 203 | "COLLECTION_FEATURES", 204 | params=params, 205 | path_params=["trn-ntwk-roadlink-4"], 206 | ) 207 | 208 | features = road_links_data.get("features", []) 209 | logger.debug(f"Processing {len(features)} road links...") 210 | 211 | for feature in features: 212 | self.network.add_edge(feature) 213 | 214 | self.network.is_built = True 215 | 216 | summary = self.network.get_summary() 217 | logger.debug( 218 | f"Network built: {summary['total_nodes']} nodes, {summary['total_edges']} edges" 219 | ) 220 | 221 | return { 222 | "status": "success", 223 | "message": f"Built routing network with {summary['total_nodes']} nodes and {summary['total_edges']} edges", 224 | "network_summary": summary, 225 | "restrictions": self.raw_restrictions if include_restrictions else [], 226 | "restriction_count": len(self.raw_restrictions) 227 | if include_restrictions 228 | else 0, 229 | } 230 | 231 | except Exception as e: 232 | logger.error(f"Error building routing network: {e}") 233 | return {"status": "error", "error": str(e)} 234 | 235 | def get_network_info(self) -> Dict[str, Any]: 236 | """Get current network information""" 237 | return {"status": "success", "network": self.network.get_summary()} 238 | 239 | def get_flat_nodes(self) -> Dict[str, Any]: 240 | """Get flat list of all nodes""" 241 | if not self.network.is_built: 242 | return { 243 | "status": "error", 244 | "error": "Routing network not built. Call build_routing_network first.", 245 | } 246 | 247 | return {"status": "success", "nodes": self.network.get_all_nodes()} 248 | 249 | def get_flat_edges(self) -> Dict[str, Any]: 250 | """Get flat list of all edges with connections""" 251 | if not self.network.is_built: 252 | return { 253 | "status": "error", 254 | "error": "Routing network not built. 
Call build_routing_network first.",
255 |             }
256 | 
257 |         return {"status": "success", "edges": self.network.get_all_edges()}
258 | 
259 |     def get_routing_tables(self) -> Dict[str, Any]:
260 |         """Get both nodes and edges as flat tables"""
261 |         if not self.network.is_built:
262 |             return {
263 |                 "status": "error",
264 |                 "error": "Routing network not built. Call build_routing_network first.",
265 |             }
266 | 
267 |         return {
268 |             "status": "success",
269 |             "nodes": self.network.get_all_nodes(),
270 |             "edges": self.network.get_all_edges(),
271 |             "summary": self.network.get_summary(),
272 |             "restrictions": self.raw_restrictions,
273 |         }
274 | 
```
--------------------------------------------------------------------------------
/src/api_service/os_api.py:
--------------------------------------------------------------------------------
```python
1 | import os
2 | import aiohttp
3 | import asyncio
4 | import re
5 | import concurrent.futures
6 | import threading
7 | 
8 | from typing import Dict, List, Any, Optional
9 | from models import (
10 |     NGDAPIEndpoint,
11 |     OpenAPISpecification,
12 |     Collection,
13 |     CollectionsCache,
14 |     CollectionQueryables,
15 | )
16 | from api_service.protocols import APIClient
17 | from utils.logging_config import get_logger
18 | 
19 | logger = get_logger(__name__)
20 | 
21 | 
22 | class OSAPIClient(APIClient):
23 |     """Implementation of an OS API client"""
24 | 
25 |     user_agent = "os-ngd-mcp-server/1.0"
26 | 
27 |     def __init__(self, api_key: Optional[str] = None):
28 |         """
29 |         Initialise the OS API client
30 | 
31 |         Args:
32 |             api_key: Optional API key, if not provided will use OS_API_KEY env var
33 |         """
34 |         self.api_key = api_key
35 |         self.session = None
36 |         self.last_request_time = 0
37 |         # TODO: Confirm whether the API enforces rate limiting; delay added as a precaution
38 |         self.request_delay = 0.7
39 |         self._cached_openapi_spec: Optional[OpenAPISpecification] = None
40 |         self._cached_collections: Optional[CollectionsCache] = None
41 | 
42 
| # Private helper methods 43 | def _sanitise_api_key(self, text: Any) -> str: 44 | """Remove API keys from any text (URLs, error messages, etc.)""" 45 | if not isinstance(text, str): 46 | return text 47 | 48 | patterns = [ 49 | r"[?&]key=[^&\s]*", 50 | r"[?&]api_key=[^&\s]*", 51 | r"[?&]apikey=[^&\s]*", 52 | r"[?&]token=[^&\s]*", 53 | ] 54 | 55 | sanitized = text 56 | for pattern in patterns: 57 | sanitized = re.sub(pattern, "", sanitized, flags=re.IGNORECASE) 58 | 59 | sanitized = re.sub(r"[?&]$", "", sanitized) 60 | sanitized = re.sub(r"&{2,}", "&", sanitized) 61 | sanitized = re.sub(r"\?&", "?", sanitized) 62 | 63 | return sanitized 64 | 65 | def _sanitise_response(self, data: Any) -> Any: 66 | """Remove API keys from response data recursively""" 67 | if isinstance(data, dict): 68 | sanitized_dict = {} 69 | for key, value in data.items(): 70 | if isinstance(value, str) and any( 71 | url_indicator in key.lower() 72 | for url_indicator in ["href", "url", "link", "uri"] 73 | ): 74 | sanitized_dict[key] = self._sanitise_api_key(value) 75 | elif isinstance(value, (dict, list)): 76 | sanitized_dict[key] = self._sanitise_response(value) 77 | else: 78 | sanitized_dict[key] = value 79 | return sanitized_dict 80 | elif isinstance(data, list): 81 | return [self._sanitise_response(item) for item in data] 82 | elif isinstance(data, str): 83 | if any( 84 | indicator in data 85 | for indicator in [ 86 | "http://", 87 | "https://", 88 | "key=", 89 | "api_key=", 90 | "apikey=", 91 | "token=", 92 | ] 93 | ): 94 | return self._sanitise_api_key(data) 95 | 96 | return data 97 | 98 | def _filter_latest_collections( 99 | self, collections: List[Dict[str, Any]] 100 | ) -> List[Collection]: 101 | """ 102 | Filter collections to keep only the latest version of each collection type. 103 | For collections with IDs like 'trn-ntwk-roadlink-1', 'trn-ntwk-roadlink-2', 'trn-ntwk-roadlink-3', 104 | only keep the one with the highest number. 
105 | 106 | Args: 107 | collections: Raw collections from API 108 | 109 | Returns: 110 | Filtered list of Collection objects 111 | """ 112 | latest_versions: Dict[str, Dict[str, Any]] = {} 113 | 114 | for col in collections: 115 | col_id = col.get("id", "") 116 | 117 | match = re.match(r"^(.+?)-(\d+)$", col_id) 118 | 119 | if match: 120 | base_name = match.group(1) 121 | version_num = int(match.group(2)) 122 | 123 | if ( 124 | base_name not in latest_versions 125 | or version_num > latest_versions[base_name]["version"] 126 | ): 127 | latest_versions[base_name] = {"version": version_num, "data": col} 128 | else: 129 | latest_versions[col_id] = {"version": 0, "data": col} 130 | 131 | filtered_collections = [] 132 | for item in latest_versions.values(): 133 | col_data = item["data"] 134 | filtered_collections.append( 135 | Collection( 136 | id=col_data.get("id", ""), 137 | title=col_data.get("title", ""), 138 | description=col_data.get("description", ""), 139 | links=col_data.get("links", []), 140 | extent=col_data.get("extent", {}), 141 | itemType=col_data.get("itemType", "feature"), 142 | ) 143 | ) 144 | 145 | return filtered_collections 146 | 147 | def _parse_openapi_spec_for_llm( 148 | self, spec_data: dict, collection_ids: List[str] 149 | ) -> dict: 150 | """Parse OpenAPI spec to extract only essential information for LLM context""" 151 | supported_crs = { 152 | "input": [], 153 | "output": [], 154 | "default": "http://www.opengis.net/def/crs/OGC/1.3/CRS84", 155 | } 156 | 157 | parsed = { 158 | "title": spec_data.get("info", {}).get("title", ""), 159 | "version": spec_data.get("info", {}).get("version", ""), 160 | "base_url": spec_data.get("servers", [{}])[0].get("url", ""), 161 | "endpoints": {}, 162 | "collection_ids": collection_ids, 163 | "supported_crs": supported_crs, 164 | } 165 | 166 | paths = spec_data.get("paths", {}) 167 | for path, methods in paths.items(): 168 | for method, details in methods.items(): 169 | if method == "get" and "parameters" in 
details: 170 | for param in details["parameters"]: 171 | param_name = param.get("name", "") 172 | 173 | if param_name == "collectionId" and "schema" in param: 174 | enum_values = param["schema"].get("enum", []) 175 | if enum_values: 176 | parsed["collection_ids"] = enum_values 177 | 178 | elif ( 179 | param_name in ["bbox-crs", "filter-crs"] 180 | and "schema" in param 181 | ): 182 | crs_values = param["schema"].get("enum", []) 183 | if crs_values and not supported_crs["input"]: 184 | supported_crs["input"] = crs_values 185 | 186 | elif param_name == "crs" and "schema" in param: 187 | crs_values = param["schema"].get("enum", []) 188 | if crs_values and not supported_crs["output"]: 189 | supported_crs["output"] = crs_values 190 | 191 | endpoint_patterns = { 192 | "/collections": "List all collections", 193 | "/collections/{collectionId}": "Get collection info", 194 | "/collections/{collectionId}/schema": "Get collection schema", 195 | "/collections/{collectionId}/queryables": "Get collection queryables", 196 | "/collections/{collectionId}/items": "Search features in collection", 197 | "/collections/{collectionId}/items/{featureId}": "Get specific feature", 198 | } 199 | parsed["endpoints"] = endpoint_patterns 200 | parsed["crs_guide"] = { 201 | "WGS84": "http://www.opengis.net/def/crs/OGC/1.3/CRS84 (default, longitude/latitude)", 202 | "British_National_Grid": "http://www.opengis.net/def/crs/EPSG/0/27700 (UK Ordnance Survey)", 203 | "WGS84_latlon": "http://www.opengis.net/def/crs/EPSG/0/4326 (latitude/longitude)", 204 | "Web_Mercator": "http://www.opengis.net/def/crs/EPSG/0/3857 (Web mapping)", 205 | } 206 | 207 | return parsed 208 | 209 | # Private async methods 210 | async def _get_open_api_spec(self) -> OpenAPISpecification: 211 | """Get the OpenAPI spec for the OS NGD API""" 212 | try: 213 | response = await self.make_request("OPENAPI_SPEC", params={"f": "json"}) 214 | 215 | # Sanitize the raw response before processing 216 | sanitized_response = 
self._sanitise_response(response) 217 | 218 | collections_cache = await self.cache_collections() 219 | filtered_collection_ids = [col.id for col in collections_cache.collections] 220 | 221 | parsed_spec = self._parse_openapi_spec_for_llm( 222 | sanitized_response, filtered_collection_ids 223 | ) 224 | 225 | spec = OpenAPISpecification( 226 | title=parsed_spec["title"], 227 | version=parsed_spec["version"], 228 | base_url=parsed_spec["base_url"], 229 | endpoints=parsed_spec["endpoints"], 230 | collection_ids=filtered_collection_ids, 231 | supported_crs=parsed_spec["supported_crs"], 232 | crs_guide=parsed_spec["crs_guide"], 233 | ) 234 | return spec 235 | except Exception as e: 236 | raise ValueError(f"Failed to get OpenAPI spec: {e}") 237 | 238 | async def cache_openapi_spec(self) -> OpenAPISpecification: 239 | """ 240 | Cache the OpenAPI spec. 241 | 242 | Returns: 243 | The cached OpenAPI spec 244 | """ 245 | if self._cached_openapi_spec is None: 246 | logger.debug("Caching OpenAPI spec for LLM context...") 247 | try: 248 | self._cached_openapi_spec = await self._get_open_api_spec() 249 | logger.debug("OpenAPI spec successfully cached") 250 | except Exception as e: 251 | raise ValueError(f"Failed to cache OpenAPI spec: {e}") 252 | return self._cached_openapi_spec 253 | 254 | async def _get_collections(self) -> CollectionsCache: 255 | """Get all collections from the OS NGD API""" 256 | try: 257 | response = await self.make_request("COLLECTIONS") 258 | collections_list = response.get("collections", []) 259 | filtered = self._filter_latest_collections(collections_list) 260 | logger.debug(f"Filtered collections: {len(filtered)} collections") 261 | return CollectionsCache(collections=filtered, raw_response=response) 262 | except Exception as e: 263 | sanitized_error = self._sanitise_api_key(str(e)) 264 | logger.error(f"Error getting collections: {sanitized_error}") 265 | raise ValueError(f"Failed to get collections: {sanitized_error}") 266 | 267 | async def 
cache_collections(self) -> CollectionsCache: 268 | """ 269 | Cache the collections data with filtering applied. 270 | 271 | Returns: 272 | The cached collections 273 | """ 274 | if self._cached_collections is None: 275 | logger.debug("Caching collections for LLM context...") 276 | try: 277 | self._cached_collections = await self._get_collections() 278 | logger.debug( 279 | f"Collections successfully cached - {len(self._cached_collections.collections)} collections after filtering" 280 | ) 281 | except Exception as e: 282 | sanitized_error = self._sanitise_api_key(str(e)) 283 | raise ValueError(f"Failed to cache collections: {sanitized_error}") 284 | return self._cached_collections 285 | 286 | async def fetch_collections_queryables( 287 | self, collection_ids: List[str] 288 | ) -> Dict[str, CollectionQueryables]: 289 | """Fetch detailed queryables for specific collections only""" 290 | if not collection_ids: 291 | return {} 292 | 293 | logger.debug(f"Fetching queryables for specific collections: {collection_ids}") 294 | 295 | collections_cache = await self.cache_collections() 296 | collections_map = {coll.id: coll for coll in collections_cache.collections} 297 | 298 | tasks = [ 299 | self.make_request("COLLECTION_QUERYABLES", path_params=[collection_id]) 300 | for collection_id in collection_ids 301 | if collection_id in collections_map 302 | ] 303 | 304 | if not tasks: 305 | return {} 306 | 307 | raw_queryables = await asyncio.gather(*tasks, return_exceptions=True) 308 | 309 | def process_single_collection_queryables(collection_id, queryables_data): 310 | collection = collections_map[collection_id] 311 | logger.debug( 312 | f"Processing collection {collection.id} in thread {threading.current_thread().name}" 313 | ) 314 | 315 | if isinstance(queryables_data, Exception): 316 | logger.warning( 317 | f"Failed to fetch queryables for {collection.id}: {queryables_data}" 318 | ) 319 | return ( 320 | collection.id, 321 | CollectionQueryables( 322 | id=collection.id, 323 | 
title=collection.title, 324 | description=collection.description, 325 | all_queryables={}, 326 | enum_queryables={}, 327 | has_enum_filters=False, 328 | total_queryables=0, 329 | enum_count=0, 330 | ), 331 | ) 332 | 333 | all_queryables = {} 334 | enum_queryables = {} 335 | properties = queryables_data.get("properties", {}) 336 | 337 | for prop_name, prop_details in properties.items(): 338 | prop_type = prop_details.get("type", ["string"]) 339 | if isinstance(prop_type, list): 340 | main_type = prop_type[0] if prop_type else "string" 341 | is_nullable = "null" in prop_type 342 | else: 343 | main_type = prop_type 344 | is_nullable = False 345 | 346 | all_queryables[prop_name] = { 347 | "type": main_type, 348 | "nullable": is_nullable, 349 | "max_length": prop_details.get("maxLength"), 350 | "format": prop_details.get("format"), 351 | "pattern": prop_details.get("pattern"), 352 | "minimum": prop_details.get("minimum"), 353 | "maximum": prop_details.get("maximum"), 354 | "is_enum": prop_details.get("enumeration", False), 355 | } 356 | 357 | if prop_details.get("enumeration") and "enum" in prop_details: 358 | enum_queryables[prop_name] = { 359 | "values": prop_details["enum"], 360 | "type": main_type, 361 | "nullable": is_nullable, 362 | "max_length": prop_details.get("maxLength"), 363 | } 364 | all_queryables[prop_name]["enum_values"] = prop_details["enum"] 365 | 366 | all_queryables[prop_name] = { 367 | k: v for k, v in all_queryables[prop_name].items() if v is not None 368 | } 369 | 370 | return ( 371 | collection.id, 372 | CollectionQueryables( 373 | id=collection.id, 374 | title=collection.title, 375 | description=collection.description, 376 | all_queryables=all_queryables, 377 | enum_queryables=enum_queryables, 378 | has_enum_filters=len(enum_queryables) > 0, 379 | total_queryables=len(all_queryables), 380 | enum_count=len(enum_queryables), 381 | ), 382 | ) 383 | 384 | with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor: 385 | 
collection_data_pairs = list(zip(collection_ids, raw_queryables)) 386 | processed = await asyncio.get_event_loop().run_in_executor( 387 | executor, 388 | lambda: [ 389 | process_single_collection_queryables(coll_id, data) 390 | for coll_id, data in collection_data_pairs 391 | if coll_id in collections_map 392 | ], 393 | ) 394 | 395 | return {coll_id: queryables for coll_id, queryables in processed} 396 | 397 | # Public async methods 398 | async def initialise(self): 399 | """Initialise the aiohttp session if not already created""" 400 | if self.session is None: 401 | self.session = aiohttp.ClientSession( 402 | connector=aiohttp.TCPConnector( 403 | force_close=True, 404 | limit=1, # TODO: Strict limit to only 1 connection - may need to revisit this 405 | ) 406 | ) 407 | 408 | async def close(self): 409 | """Close the session when done""" 410 | if self.session: 411 | await self.session.close() 412 | self.session = None 413 | self._cached_openapi_spec = None 414 | self._cached_collections = None 415 | 416 | async def get_api_key(self) -> str: 417 | """Get the OS API key from environment variable or init param.""" 418 | if self.api_key: 419 | return self.api_key 420 | 421 | api_key = os.environ.get("OS_API_KEY") 422 | if not api_key: 423 | raise ValueError("OS_API_KEY environment variable is not set") 424 | return api_key 425 | 426 | async def make_request( 427 | self, 428 | endpoint: str, 429 | params: Optional[Dict[str, Any]] = None, 430 | path_params: Optional[List[str]] = None, 431 | max_retries: int = 2, 432 | ) -> Dict[str, Any]: 433 | """ 434 | Make a request to the OS NGD API with proper error handling. 
435 | 436 | Args: 437 | endpoint: Enum endpoint to use 438 | params: Additional query parameters 439 | path_params: Parameters to format into the URL path 440 | max_retries: Maximum number of retries for transient errors 441 | 442 | Returns: 443 | JSON response as dictionary 444 | """ 445 | await self.initialise() 446 | 447 | if self.session is None: 448 | raise ValueError("Session not initialised") 449 | 450 | current_time = asyncio.get_event_loop().time() 451 | elapsed = current_time - self.last_request_time 452 | if elapsed < self.request_delay: 453 | await asyncio.sleep(self.request_delay - elapsed) 454 | 455 | try: 456 | endpoint_value = NGDAPIEndpoint[endpoint].value 457 | except KeyError: 458 | raise ValueError(f"Invalid endpoint: {endpoint}") 459 | 460 | if path_params: 461 | endpoint_value = endpoint_value.format(*path_params) 462 | 463 | api_key = await self.get_api_key() 464 | request_params = params or {} 465 | request_params["key"] = api_key 466 | 467 | headers = {"User-Agent": self.user_agent, "Accept": "application/json"} 468 | 469 | client_ip = getattr(self.session, "_source_address", None) 470 | client_info = f" from {client_ip}" if client_ip else "" 471 | 472 | sanitized_url = self._sanitise_api_key(endpoint_value) 473 | logger.info(f"Requesting URL: {sanitized_url}{client_info}") 474 | 475 | for attempt in range(1, max_retries + 1): 476 | try: 477 | self.last_request_time = asyncio.get_event_loop().time() 478 | 479 | timeout = aiohttp.ClientTimeout(total=30.0) 480 | async with self.session.get( 481 | endpoint_value, 482 | params=request_params, 483 | headers=headers, 484 | timeout=timeout, 485 | ) as response: 486 | if response.status >= 400: 487 | error_text = await response.text() 488 | sanitized_error = self._sanitise_api_key(error_text) 489 | error_message = ( 490 | f"HTTP Error: {response.status} - {sanitized_error}" 491 | ) 492 | logger.error(f"Error: {error_message}") 493 | raise ValueError(error_message) 494 | 495 | response_data = await 
response.json() 496 | 497 | return self._sanitise_response(response_data) 498 | except (aiohttp.ClientError, asyncio.TimeoutError) as e: 499 | if attempt == max_retries: 500 | sanitized_exception = self._sanitise_api_key(str(e)) 501 | error_message = f"Request failed after {max_retries} attempts: {sanitized_exception}" 502 | logger.error(f"Error: {error_message}") 503 | raise ValueError(error_message) 504 | else: 505 | await asyncio.sleep(0.7) 506 | except Exception as e: 507 | sanitized_exception = self._sanitise_api_key(str(e)) 508 | error_message = f"Request failed: {sanitized_exception}" 509 | logger.error(f"Error: {error_message}") 510 | raise ValueError(error_message) 511 | raise RuntimeError( 512 | "Unreachable: make_request exited retry loop without returning or raising" 513 | ) 514 | 515 | async def make_request_no_auth( 516 | self, 517 | url: str, 518 | params: Optional[Dict[str, Any]] = None, 519 | max_retries: int = 2, 520 | ) -> str: 521 | """ 522 | Make a request without authentication (for public endpoints like documentation). 
523 | 524 | Args: 525 | url: Full URL to request 526 | params: Additional query parameters 527 | max_retries: Maximum number of retries for transient errors 528 | 529 | Returns: 530 | Response text (not JSON parsed) 531 | """ 532 | await self.initialise() 533 | 534 | if self.session is None: 535 | raise ValueError("Session not initialised") 536 | 537 | current_time = asyncio.get_event_loop().time() 538 | elapsed = current_time - self.last_request_time 539 | if elapsed < self.request_delay: 540 | await asyncio.sleep(self.request_delay - elapsed) 541 | 542 | request_params = params or {} 543 | headers = {"User-Agent": self.user_agent} 544 | 545 | logger.info(f"Requesting URL (no auth): {url}") 546 | 547 | for attempt in range(1, max_retries + 1): 548 | try: 549 | self.last_request_time = asyncio.get_event_loop().time() 550 | 551 | timeout = aiohttp.ClientTimeout(total=30.0) 552 | async with self.session.get( 553 | url, 554 | params=request_params, 555 | headers=headers, 556 | timeout=timeout, 557 | ) as response: 558 | if response.status >= 400: 559 | error_text = await response.text() 560 | error_message = f"HTTP Error: {response.status} - {error_text}" 561 | logger.error(f"Error: {error_message}") 562 | raise ValueError(error_message) 563 | 564 | return await response.text() 565 | 566 | except (aiohttp.ClientError, asyncio.TimeoutError) as e: 567 | if attempt == max_retries: 568 | error_message = ( 569 | f"Request failed after {max_retries} attempts: {str(e)}" 570 | ) 571 | logger.error(f"Error: {error_message}") 572 | raise ValueError(error_message) 573 | else: 574 | await asyncio.sleep(0.7) 575 | except Exception as e: 576 | error_message = f"Request failed: {str(e)}" 577 | logger.error(f"Error: {error_message}") 578 | raise ValueError(error_message) 579 | 580 | raise RuntimeError( 581 | "Unreachable: make_request_no_auth exited retry loop without returning or raising" 582 | ) 583 | ``` 
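The version-filtering rule in `_filter_latest_collections` above keeps only the highest-numbered variant of each collection id (so `trn-ntwk-roadlink-4` supersedes `trn-ntwk-roadlink-1` through `-3`), while unversioned ids pass through unchanged. A minimal standalone sketch of that rule, reusing the same `^(.+?)-(\d+)$` regex — the function name here is illustrative and is not part of the client:

```python
import re
from typing import Dict, List, Tuple


def latest_collection_ids(collection_ids: List[str]) -> List[str]:
    """Keep only the highest version of each 'base-N' collection id,
    mirroring the rule in OSAPIClient._filter_latest_collections."""
    latest: Dict[str, Tuple[int, str]] = {}  # base name -> (version, full id)
    for col_id in collection_ids:
        match = re.match(r"^(.+?)-(\d+)$", col_id)
        if match:
            base, version = match.group(1), int(match.group(2))
            # Keep the highest version seen for this base name
            if base not in latest or version > latest[base][0]:
                latest[base] = (version, col_id)
        else:
            # Unversioned ids pass through unchanged
            latest[col_id] = (0, col_id)
    return [col_id for _, col_id in latest.values()]


print(latest_collection_ids(
    ["trn-ntwk-roadlink-1", "trn-ntwk-roadlink-4", "trn-ntwk-roadlink-2", "bld-fts-building"]
))
# → ['trn-ntwk-roadlink-4', 'bld-fts-building']
```

The lazy `(.+?)` combined with the anchored `-(\d+)$` means the id is split at the final hyphen that precedes a trailing all-digit run, so hyphens inside the base name (`trn-ntwk-roadlink`) are left intact.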
-------------------------------------------------------------------------------- /src/mcp_service/os_service.py: -------------------------------------------------------------------------------- ```python 1 | import json 2 | import asyncio 3 | import functools 4 | import re 5 | 6 | from typing import Optional, List, Dict, Any, Union, Callable 7 | from api_service.protocols import APIClient 8 | from prompt_templates.prompt_templates import PROMPT_TEMPLATES 9 | from mcp_service.protocols import MCPService, FeatureService 10 | from mcp_service.guardrails import ToolGuardrails 11 | from workflow_generator.workflow_planner import WorkflowPlanner 12 | from utils.logging_config import get_logger 13 | from mcp_service.resources import OSDocumentationResources 14 | from mcp_service.prompts import OSWorkflowPrompts 15 | from mcp_service.routing_service import OSRoutingService 16 | 17 | logger = get_logger(__name__) 18 | 19 | 20 | class OSDataHubService(FeatureService): 21 | """Implementation of the OS NGD API service with MCP""" 22 | 23 | def __init__( 24 | self, api_client: APIClient, mcp_service: MCPService, stdio_middleware=None 25 | ): 26 | """ 27 | Initialise the OS NGD service 28 | 29 | Args: 30 | api_client: API client implementation 31 | mcp_service: MCP service implementation 32 | stdio_middleware: Optional STDIO middleware for rate limiting 33 | """ 34 | self.api_client = api_client 35 | self.mcp = mcp_service 36 | self.stdio_middleware = stdio_middleware 37 | self.workflow_planner: Optional[WorkflowPlanner] = None 38 | self.guardrails = ToolGuardrails() 39 | self.routing_service = OSRoutingService(api_client) 40 | self.register_tools() 41 | self.register_resources() 42 | self.register_prompts() 43 | 44 | # Register all the resources, tools, and prompts 45 | def register_resources(self) -> None: 46 | """Register all MCP resources""" 47 | doc_resources = OSDocumentationResources(self.mcp, self.api_client) 48 | doc_resources.register_all() 49 | 50 | def 
register_tools(self) -> None: 51 | """Register all MCP tools with guardrails and middleware""" 52 | 53 | def apply_middleware(func: Callable) -> Callable: 54 | wrapped = self.guardrails.basic_guardrails(func) 55 | wrapped = self._require_workflow_context(wrapped) 56 | if self.stdio_middleware: 57 | wrapped = self.stdio_middleware.require_auth_and_rate_limit(wrapped) 58 | return wrapped 59 | 60 | # Apply middleware to ALL tools 61 | self.get_workflow_context = self.mcp.tool()( 62 | apply_middleware(self.get_workflow_context) 63 | ) 64 | self.hello_world = self.mcp.tool()(apply_middleware(self.hello_world)) 65 | self.check_api_key = self.mcp.tool()(apply_middleware(self.check_api_key)) 66 | self.list_collections = self.mcp.tool()(apply_middleware(self.list_collections)) 67 | self.get_single_collection = self.mcp.tool()( 68 | apply_middleware(self.get_single_collection) 69 | ) 70 | self.get_single_collection_queryables = self.mcp.tool()( 71 | apply_middleware(self.get_single_collection_queryables) 72 | ) 73 | self.search_features = self.mcp.tool()(apply_middleware(self.search_features)) 74 | self.get_feature = self.mcp.tool()(apply_middleware(self.get_feature)) 75 | self.get_linked_identifiers = self.mcp.tool()( 76 | apply_middleware(self.get_linked_identifiers) 77 | ) 78 | self.get_bulk_features = self.mcp.tool()( 79 | apply_middleware(self.get_bulk_features) 80 | ) 81 | self.get_bulk_linked_features = self.mcp.tool()( 82 | apply_middleware(self.get_bulk_linked_features) 83 | ) 84 | self.get_prompt_templates = self.mcp.tool()( 85 | apply_middleware(self.get_prompt_templates) 86 | ) 87 | self.fetch_detailed_collections = self.mcp.tool()( 88 | apply_middleware(self.fetch_detailed_collections) 89 | ) 90 | self.get_routing_data = self.mcp.tool()(apply_middleware(self.get_routing_data)) 91 | 92 | def register_prompts(self) -> None: 93 | """Register all MCP prompts""" 94 | workflow_prompts = OSWorkflowPrompts(self.mcp) 95 | workflow_prompts.register_all() 96 | 97 | # Run 
the MCP service 98 | def run(self) -> None: 99 | """Run the MCP service""" 100 | try: 101 | self.mcp.run() 102 | finally: 103 | try: 104 | try: 105 | loop = asyncio.get_running_loop() 106 | loop.create_task(self._cleanup()) 107 | except RuntimeError: 108 | asyncio.run(self._cleanup()) 109 | except Exception as e: 110 | logger.error(f"Error during cleanup: {e}") 111 | 112 | async def _cleanup(self): 113 | """Async cleanup method""" 114 | try: 115 | if hasattr(self, "api_client") and self.api_client: 116 | await self.api_client.close() 117 | logger.debug("API client closed successfully") 118 | except Exception as e: 119 | logger.error(f"Error closing API client: {e}") 120 | 121 | # Get the workflow context from the cached API client data 122 | # TODO: Lots of work to do here to reduce the size of the context and make it more readable for the LLM but not sacrificing the information 123 | async def get_workflow_context(self) -> str: 124 | """Get basic workflow context - no detailed queryables yet""" 125 | try: 126 | if self.workflow_planner is None: 127 | collections_cache = await self.api_client.cache_collections() 128 | basic_collections_info = { 129 | coll.id: { 130 | "id": coll.id, 131 | "title": coll.title, 132 | "description": coll.description, 133 | # No queryables here - will be fetched on-demand 134 | } 135 | for coll in collections_cache.collections 136 | } 137 | 138 | self.workflow_planner = WorkflowPlanner( 139 | await self.api_client.cache_openapi_spec(), basic_collections_info 140 | ) 141 | 142 | context = self.workflow_planner.get_basic_context() 143 | return json.dumps( 144 | { 145 | "CRITICAL_COLLECTION_LIST": sorted( 146 | context["available_collections"].keys() 147 | ), 148 | "MANDATORY_PLANNING_REQUIREMENT": { 149 | "CRITICAL": "You MUST follow the 2-step planning process:", 150 | "step_1": "Explain your complete plan listing which specific collections you will use and why", 151 | "step_2": "Call 
fetch_detailed_collections('collection-id-1,collection-id-2') to get queryables for those collections BEFORE making search calls", 152 | "required_explanation": { 153 | "1": "Which collections you will use and why", 154 | "2": "What you expect to find in those collections", 155 | "3": "What your search strategy will be", 156 | }, 157 | "workflow_enforcement": "Do not proceed with search_features until you have fetched detailed queryables", 158 | "example_planning": "I will use 'lus-fts-site-1' for finding cinemas. Let me fetch its detailed queryables first...", 159 | }, 160 | "available_collections": context[ 161 | "available_collections" 162 | ], # Basic info only - no queryables yet - this is to reduce the size of the context for the LLM 163 | "openapi_spec": context["openapi_spec"].model_dump() 164 | if context["openapi_spec"] 165 | else None, 166 | "TWO_STEP_WORKFLOW": { 167 | "step_1": "Plan with basic collection info (no detailed queryables available yet)", 168 | "step_2": "Use fetch_detailed_collections() to get queryables for your chosen collections", 169 | "step_3": "Execute search_features with proper filters using the fetched queryables", 170 | }, 171 | "AVAILABLE_TOOLS": { 172 | "fetch_detailed_collections": "Get detailed queryables for specific collections: fetch_detailed_collections('lus-fts-site-1,trn-ntwk-street-1')", 173 | "search_features": "Search features (requires detailed queryables first)", 174 | }, 175 | "QUICK_FILTERING_GUIDE": { 176 | "primary_tool": "search_features", 177 | "key_parameter": "filter", 178 | "enum_fields": "Use exact values from collection's enum_queryables (fetch these first!)", 179 | "simple_fields": "Use direct values (e.g., usrn = 12345678)", 180 | }, 181 | "COMMON_EXAMPLES": { 182 | "workflow_example": "1) Explain plan → 2) fetch_detailed_collections('lus-fts-site-1') → 3) search_features with proper filter", 183 | "cinema_search": "After fetching queryables: search_features(collection_id='lus-fts-site-1', 
filter=\"oslandusetertiarygroup = 'Cinema'\")",
184 |                 },
185 |                 "CRITICAL_RULES": {
186 |                     "1": "ALWAYS explain your plan first",
187 |                     "2": "ALWAYS call fetch_detailed_collections() before search_features",
188 |                     "3": "Use exact enum values from the fetched enum_queryables",
189 |                     "4": "Quote string values in single quotes",
190 |                 },
191 |             }
192 |         )
193 | 
194 |         except Exception as e:
195 |             logger.error(f"Error getting workflow context: {e}")
196 |             return json.dumps(
197 |                 {"error": str(e), "instruction": "Proceed with available tools"}
198 |             )
199 | 
200 |     def _require_workflow_context(self, func: Callable) -> Callable:
201 |         # Functions that don't need workflow context
202 |         skip_functions = {"get_workflow_context", "hello_world", "check_api_key"}
203 | 
204 |         @functools.wraps(func)
205 |         async def wrapper(*args: Any, **kwargs: Any) -> Any:
206 |             if self.workflow_planner is None:
207 |                 if func.__name__ in skip_functions:
208 |                     return await func(*args, **kwargs)
209 |                 else:
210 |                     return json.dumps(
211 |                         {
212 |                             "error": "WORKFLOW CONTEXT REQUIRED",
213 |                             "blocked_tool": func.__name__,
214 |                             "required_action": "You must call 'get_workflow_context' first",
215 |                             "message": "No tools are available until you get the workflow context. Please call get_workflow_context() now.",
216 |                         }
217 |                     )
218 |             return await func(*args, **kwargs)
219 | 
220 |         return wrapper
221 | 
222 |     # TODO: This is a bit of a hack - we need to improve the error handling and retry logic
223 |     # TODO: Could we actually spawn a separate AI agent to handle the retry logic and return the result to the main agent?
224 | def _add_retry_context(self, response_data: dict, tool_name: str) -> dict: 225 | """Add retry guidance to tool responses""" 226 | if "error" in response_data: 227 | response_data["retry_guidance"] = { 228 | "tool": tool_name, 229 | "MANDATORY_INSTRUCTION 1": "Review the error message and try again with corrected parameters", 230 | "MANDATORY_INSTRUCTION 2": "YOU MUST call get_workflow_context() if you need to see available options again", 231 | } 232 | return response_data 233 | 234 | # All the tools 235 | async def hello_world(self, name: str) -> str: 236 | """Simple hello world tool for testing""" 237 | return f"Hello, {name}! 👋" 238 | 239 | async def check_api_key(self) -> str: 240 | """Check if the OS API key is available.""" 241 | try: 242 | await self.api_client.get_api_key() 243 | return json.dumps({"status": "success", "message": "OS_API_KEY is set!"}) 244 | except ValueError as e: 245 | return json.dumps({"status": "error", "message": str(e)}) 246 | 247 | async def list_collections( 248 | self, 249 | ) -> str: 250 | """ 251 | List all available feature collections in the OS NGD API. 252 | 253 | Returns: 254 | JSON string with collection info (id, title only) 255 | """ 256 | try: 257 | data = await self.api_client.make_request("COLLECTIONS") 258 | 259 | if not data or "collections" not in data: 260 | return json.dumps({"error": "No collections found"}) 261 | 262 | collections = [ 263 | {"id": col.get("id"), "title": col.get("title")} 264 | for col in data.get("collections", []) 265 | ] 266 | 267 | return json.dumps({"collections": collections}) 268 | except Exception as e: 269 | error_response = {"error": str(e)} 270 | return json.dumps( 271 | self._add_retry_context(error_response, "list_collections") 272 | ) 273 | 274 | async def get_single_collection( 275 | self, 276 | collection_id: str, 277 | ) -> str: 278 | """ 279 | Get detailed information about a specific collection. 
280 | 
281 |         Args:
282 |             collection_id: The collection ID
283 | 
284 |         Returns:
285 |             JSON string with collection information
286 |         """
287 |         try:
288 |             data = await self.api_client.make_request(
289 |                 "COLLECTION_INFO", path_params=[collection_id]
290 |             )
291 | 
292 |             return json.dumps(data)
293 |         except Exception as e:
294 |             error_response = {"error": str(e)}
295 |             return json.dumps(
296 |                 self._add_retry_context(error_response, "get_single_collection")
297 |             )
298 | 
299 |     async def get_single_collection_queryables(
300 |         self,
301 |         collection_id: str,
302 |     ) -> str:
303 |         """
304 |         Get the list of queryable properties for a collection.
305 | 
306 |         Args:
307 |             collection_id: The collection ID
308 | 
309 |         Returns:
310 |             JSON string with queryable properties
311 |         """
312 |         try:
313 |             data = await self.api_client.make_request(
314 |                 "COLLECTION_QUERYABLES", path_params=[collection_id]
315 |             )
316 | 
317 |             return json.dumps(data)
318 |         except Exception as e:
319 |             error_response = {"error": str(e)}
320 |             return json.dumps(
321 |                 self._add_retry_context(error_response, "get_single_collection_queryables")
322 |             )
323 | 
324 |     async def search_features(
325 |         self,
326 |         collection_id: str,
327 |         bbox: Optional[str] = None,
328 |         crs: Optional[str] = None,
329 |         limit: int = 10,
330 |         offset: int = 0,
331 |         filter: Optional[str] = None,
332 |         filter_lang: Optional[str] = "cql-text",
333 |         query_attr: Optional[str] = None,
334 |         query_attr_value: Optional[str] = None,
335 |     ) -> str:
336 |         """Search for features in a collection with full CQL2 filter support."""
337 |         try:
338 |             params: Dict[str, Union[str, int]] = {}
339 | 
340 |             if limit:
341 |                 params["limit"] = min(limit, 100)
342 |             if offset:
343 |                 params["offset"] = max(0, offset)
344 |             if bbox:
345 |                 params["bbox"] = bbox
346 |             if crs:
347 |                 params["crs"] = crs
348 |             if filter:
349 |                 if len(filter) > 1000:
350 |                     raise ValueError("Filter too long")
351 |                 dangerous_patterns = [
352 |                     r";\s*--",
353 |                     r";\s*/\*",
354 |
r"\bUNION\b", 355 | r"\bSELECT\b", 356 | r"\bINSERT\b", 357 | r"\bUPDATE\b", 358 | r"\bDELETE\b", 359 | r"\bDROP\b", 360 | r"\bCREATE\b", 361 | r"\bALTER\b", 362 | r"\bTRUNCATE\b", 363 | r"\bEXEC\b", 364 | r"\bEXECUTE\b", 365 | r"\bSP_\b", 366 | r"\bXP_\b", 367 | r"<script\b", 368 | r"javascript:", 369 | r"vbscript:", 370 | r"onload\s*=", 371 | r"onerror\s*=", 372 | r"onclick\s*=", 373 | r"\beval\s*\(", 374 | r"document\.", 375 | r"window\.", 376 | r"location\.", 377 | r"cookie", 378 | r"innerHTML", 379 | r"outerHTML", 380 | r"alert\s*\(", 381 | r"confirm\s*\(", 382 | r"prompt\s*\(", 383 | r"setTimeout\s*\(", 384 | r"setInterval\s*\(", 385 | r"Function\s*\(", 386 | r"constructor", 387 | r"prototype", 388 | r"__proto__", 389 | r"process\.", 390 | r"require\s*\(", 391 | r"import\s+", 392 | r"from\s+.*import", 393 | r"\.\./", 394 | r"file://", 395 | r"ftp://", 396 | r"data:", 397 | r"blob:", 398 | r"\\x[0-9a-fA-F]{2}", 399 | r"%[0-9a-fA-F]{2}", 400 | r"&#x[0-9a-fA-F]+;", 401 | r"&[a-zA-Z]+;", 402 | r"\$\{", 403 | r"#\{", 404 | r"<%", 405 | r"%>", 406 | r"{{", 407 | r"}}", 408 | r"\\\w+", 409 | r"\0", 410 | r"\r\n", 411 | r"\n\r", 412 | ] 413 | 414 | for pattern in dangerous_patterns: 415 | if re.search(pattern, filter, re.IGNORECASE): 416 | raise ValueError("Invalid filter content") 417 | 418 | if filter.count("'") % 2 != 0: 419 | raise ValueError("Unmatched quotes in filter") 420 | 421 | params["filter"] = filter.strip() 422 | if filter_lang: 423 | params["filter-lang"] = filter_lang 424 | 425 | elif query_attr and query_attr_value: 426 | if not re.match(r"^[a-zA-Z_][a-zA-Z0-9_]*$", query_attr): 427 | raise ValueError("Invalid field name") 428 | 429 | escaped_value = str(query_attr_value).replace("'", "''") 430 | params["filter"] = f"{query_attr} = '{escaped_value}'" 431 | if filter_lang: 432 | params["filter-lang"] = filter_lang 433 | 434 | if self.workflow_planner: 435 | valid_collections = set( 436 | self.workflow_planner.basic_collections_info.keys() 437 | ) 438 
| if collection_id not in valid_collections: 439 | return json.dumps( 440 | { 441 | "error": f"Invalid collection '{collection_id}'. Valid collections: {sorted(valid_collections)[:10]}...", 442 | "suggestion": "Call get_workflow_context() to see all available collections", 443 | } 444 | ) 445 | 446 | data = await self.api_client.make_request( 447 | "COLLECTION_FEATURES", params=params, path_params=[collection_id] 448 | ) 449 | 450 | return json.dumps(data) 451 | except ValueError as ve: 452 | error_response = {"error": f"Invalid input: {str(ve)}"} 453 | return json.dumps( 454 | self._add_retry_context(error_response, "search_features") 455 | ) 456 | except Exception as e: 457 | error_response = {"error": str(e)} 458 | return json.dumps( 459 | self._add_retry_context(error_response, "search_features") 460 | ) 461 | 462 | async def get_feature( 463 | self, 464 | collection_id: str, 465 | feature_id: str, 466 | crs: Optional[str] = None, 467 | ) -> str: 468 | """ 469 | Get a specific feature by ID. 470 | 471 | Args: 472 | collection_id: The collection ID 473 | feature_id: The feature ID 474 | crs: Coordinate reference system for the response 475 | 476 | Returns: 477 | JSON string with feature data 478 | """ 479 | try: 480 | params: Dict[str, str] = {} 481 | if crs: 482 | params["crs"] = crs 483 | 484 | data = await self.api_client.make_request( 485 | "COLLECTION_FEATURE_BY_ID", 486 | params=params, 487 | path_params=[collection_id, feature_id], 488 | ) 489 | 490 | return json.dumps(data) 491 | except Exception as e: 492 | error_response = {"error": f"Error getting feature: {str(e)}"} 493 | return json.dumps(self._add_retry_context(error_response, "get_feature")) 494 | 495 | async def get_linked_identifiers( 496 | self, 497 | identifier_type: str, 498 | identifier: str, 499 | feature_type: Optional[str] = None, 500 | ) -> str: 501 | """ 502 | Get linked identifiers for a specified identifier. 
503 | 504 | Args: 505 | identifier_type: The type of identifier (e.g., 'TOID', 'UPRN') 506 | identifier: The identifier value 507 | feature_type: Optional feature type to filter results 508 | 509 | Returns: 510 | JSON string with linked identifiers or filtered results 511 | """ 512 | try: 513 | data = await self.api_client.make_request( 514 | "LINKED_IDENTIFIERS", path_params=[identifier_type, identifier] 515 | ) 516 | 517 | if feature_type: 518 | # Filter results by feature type 519 | filtered_results = [] 520 | for item in data.get("results", []): 521 | if item.get("featureType") == feature_type: 522 | filtered_results.append(item) 523 | return json.dumps({"results": filtered_results}) 524 | 525 | return json.dumps(data) 526 | except Exception as e: 527 | error_response = {"error": str(e)} 528 | return json.dumps( 529 | self._add_retry_context(error_response, "get_linked_identifiers") 530 | ) 531 | 532 | async def get_bulk_features( 533 | self, 534 | collection_id: str, 535 | identifiers: List[str], 536 | query_by_attr: Optional[str] = None, 537 | ) -> str: 538 | """ 539 | Get multiple features in a single call. 
540 | 541 | Args: 542 | collection_id: The collection ID 543 | identifiers: List of feature identifiers 544 | query_by_attr: Attribute to query by (if not provided, assumes feature IDs) 545 | 546 | Returns: 547 | JSON string with features data 548 | """ 549 | try: 550 | tasks: List[Any] = [] 551 | for identifier in identifiers: 552 | if query_by_attr: 553 | task = self.search_features( 554 | collection_id=collection_id, 555 | query_attr=query_by_attr, 556 | query_attr_value=identifier, 557 | limit=1, 558 | ) 559 | else: 560 | task = self.get_feature(collection_id, identifier) 561 | 562 | tasks.append(task) 563 | 564 | results = await asyncio.gather(*tasks) 565 | 566 | parsed_results = [json.loads(result) for result in results] 567 | 568 | return json.dumps({"results": parsed_results}) 569 | except Exception as e: 570 | error_response = {"error": str(e)} 571 | return json.dumps( 572 | self._add_retry_context(error_response, "get_bulk_features") 573 | ) 574 | 575 | async def get_bulk_linked_features( 576 | self, 577 | identifier_type: str, 578 | identifiers: List[str], 579 | feature_type: Optional[str] = None, 580 | ) -> str: 581 | """ 582 | Get linked features for multiple identifiers in a single call. 
583 | 584 | Args: 585 | identifier_type: The type of identifier (e.g., 'TOID', 'UPRN') 586 | identifiers: List of identifier values 587 | feature_type: Optional feature type to filter results 588 | 589 | Returns: 590 | JSON string with linked features data 591 | """ 592 | try: 593 | tasks = [ 594 | self.get_linked_identifiers(identifier_type, identifier, feature_type) 595 | for identifier in identifiers 596 | ] 597 | 598 | results = await asyncio.gather(*tasks) 599 | 600 | parsed_results = [json.loads(result) for result in results] 601 | 602 | return json.dumps({"results": parsed_results}) 603 | except Exception as e: 604 | error_response = {"error": str(e)} 605 | return json.dumps( 606 | self._add_retry_context(error_response, "get_bulk_linked_features") 607 | ) 608 | 609 | async def get_prompt_templates( 610 | self, 611 | category: Optional[str] = None, 612 | ) -> str: 613 | """ 614 | Get standard prompt templates for interacting with this service. 615 | 616 | Args: 617 | category: Optional category of templates to return 618 | (general, collections, features, linked_identifiers) 619 | 620 | Returns: 621 | JSON string containing prompt templates 622 | """ 623 | if category and category in PROMPT_TEMPLATES: 624 | return json.dumps({category: PROMPT_TEMPLATES[category]}) 625 | 626 | return json.dumps(PROMPT_TEMPLATES) 627 | 628 | async def fetch_detailed_collections(self, collection_ids: str) -> str: 629 | """ 630 | Fetch detailed queryables for specific collections mentioned in LLM workflow plan. 631 | 632 | This is mainly to reduce the size of the context for the LLM. 633 | 634 | Only fetch what you really need. 635 | 636 | Args: 637 | collection_ids: Comma-separated list of collection IDs (e.g., "lus-fts-site-1,trn-ntwk-street-1") 638 | 639 | Returns: 640 | JSON string with detailed queryables for the specified collections 641 | """ 642 | try: 643 | if not self.workflow_planner: 644 | return json.dumps( 645 | { 646 | "error": "Workflow planner not initialized. 
Call get_workflow_context() first." 647 | } 648 | ) 649 | 650 | requested_collections = [cid.strip() for cid in collection_ids.split(",")] 651 | 652 | valid_collections = set(self.workflow_planner.basic_collections_info.keys()) 653 | invalid_collections = [ 654 | cid for cid in requested_collections if cid not in valid_collections 655 | ] 656 | 657 | if invalid_collections: 658 | return json.dumps( 659 | { 660 | "error": f"Invalid collection IDs: {invalid_collections}", 661 | "valid_collections": sorted(valid_collections), 662 | } 663 | ) 664 | 665 | cached_collections = [ 666 | cid 667 | for cid in requested_collections 668 | if cid in self.workflow_planner.detailed_collections_cache 669 | ] 670 | 671 | collections_to_fetch = [ 672 | cid 673 | for cid in requested_collections 674 | if cid not in self.workflow_planner.detailed_collections_cache 675 | ] 676 | 677 | if collections_to_fetch: 678 | logger.info(f"Fetching detailed queryables for: {collections_to_fetch}") 679 | detailed_queryables = ( 680 | await self.api_client.fetch_collections_queryables( 681 | collections_to_fetch 682 | ) 683 | ) 684 | 685 | for coll_id, queryables in detailed_queryables.items(): 686 | self.workflow_planner.detailed_collections_cache[coll_id] = { 687 | "id": queryables.id, 688 | "title": queryables.title, 689 | "description": queryables.description, 690 | "all_queryables": queryables.all_queryables, 691 | "enum_queryables": queryables.enum_queryables, 692 | "has_enum_filters": queryables.has_enum_filters, 693 | "total_queryables": queryables.total_queryables, 694 | "enum_count": queryables.enum_count, 695 | } 696 | 697 | context = self.workflow_planner.get_detailed_context(requested_collections) 698 | 699 | return json.dumps( 700 | { 701 | "success": True, 702 | "collections_processed": requested_collections, 703 | "collections_fetched_from_api": collections_to_fetch, 704 | "collections_from_cache": cached_collections, 705 | "detailed_collections": context["available_collections"], 
706 |                     "message": f"Detailed queryables now available for: {', '.join(requested_collections)}",
707 |                 }
708 |             )
709 | 
710 |         except Exception as e:
711 |             logger.error(f"Error fetching detailed collections: {e}")
712 |             return json.dumps(
713 |                 {"error": str(e), "suggestion": "Check collection IDs and try again"}
714 |             )
715 | 
716 |     async def get_routing_data(
717 |         self,
718 |         bbox: Optional[str] = None,
719 |         limit: int = 100,
720 |         include_nodes: bool = True,
721 |         include_edges: bool = True,
722 |         build_network: bool = True,
723 |     ) -> str:
724 |         """
725 |         Get routing data - builds network and returns nodes/edges as flat tables.
726 | 
727 |         Args:
728 |             bbox: Optional bounding box (format: "minx,miny,maxx,maxy")
729 |             limit: Maximum number of road links to process (default: 100)
730 |             include_nodes: Whether to include nodes in response (default: True)
731 |             include_edges: Whether to include edges in response (default: True)
732 |             build_network: Whether to build network first (default: True)
733 | 
734 |         Returns:
735 |             JSON string with routing network data
736 |         """
737 |         try:
738 |             result = {}
739 | 
740 |             if build_network:
741 |                 build_result = await self.routing_service.build_routing_network(
742 |                     bbox, limit
743 |                 )
744 |                 result["build_status"] = build_result
745 | 
746 |                 if build_result.get("status") != "success":
747 |                     return json.dumps(result)
748 | 
749 |             if include_nodes:
750 |                 nodes_result = self.routing_service.get_flat_nodes()
751 |                 result["nodes"] = nodes_result.get("nodes", [])
752 | 
753 |             if include_edges:
754 |                 edges_result = self.routing_service.get_flat_edges()
755 |                 result["edges"] = edges_result.get("edges", [])
756 | 
757 |             summary = self.routing_service.get_network_info()
758 |             result["summary"] = summary.get("network", {})
759 |             result["status"] = "success"
760 | 
761 |             return json.dumps(result)
762 |         except Exception as e:
763 |             return json.dumps({"error": str(e)})
764 | 
```
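The `_require_workflow_context` wrapper in `os_service.py` gates every tool behind a one-time context fetch. A minimal standalone sketch of that gating pattern (the `Gate` class and its attribute names are illustrative, not part of the module):

```python
import asyncio
import functools
import json


class Gate:
    """Illustrative sketch of the context-gating decorator in os_service.py."""

    # Tools allowed through before the workflow context exists
    SKIP = {"get_workflow_context", "hello_world", "check_api_key"}

    def __init__(self) -> None:
        self.context_ready = False  # stands in for workflow_planner being initialised

    def require_context(self, func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            if not self.context_ready and func.__name__ not in self.SKIP:
                # Block the call and name the gated tool, as the real wrapper does
                return json.dumps(
                    {"error": "WORKFLOW CONTEXT REQUIRED", "blocked_tool": func.__name__}
                )
            return await func(*args, **kwargs)

        return wrapper


gate = Gate()


@gate.require_context
async def search_features() -> str:
    return json.dumps({"results": []})


blocked = asyncio.run(search_features())  # gated: context not fetched yet
gate.context_ready = True
allowed = asyncio.run(search_features())  # passes through once context exists
```

The real implementation keys the gate on `self.workflow_planner is None`, so fetching the context once unlocks every subsequent tool call.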
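The filter sanitisation inside `search_features` can be exercised in isolation. A minimal sketch of the same three checks (length cap, dangerous-pattern scan, balanced single quotes) using a small subset of the pattern list; `validate_cql_filter` is a hypothetical helper, not a function in the module:

```python
import re

# Subset of the dangerous_patterns list screened by search_features;
# the full list in os_service.py is considerably longer.
DANGEROUS_PATTERNS = [
    r";\s*--", r"\bUNION\b", r"\bSELECT\b", r"\bDROP\b",
    r"<script\b", r"\$\{", r"\.\./",
]


def validate_cql_filter(filter_text: str) -> str:
    """Apply the same checks search_features performs on its filter argument."""
    if len(filter_text) > 1000:
        raise ValueError("Filter too long")
    for pattern in DANGEROUS_PATTERNS:
        if re.search(pattern, filter_text, re.IGNORECASE):
            raise ValueError("Invalid filter content")
    if filter_text.count("'") % 2 != 0:
        raise ValueError("Unmatched quotes in filter")
    return filter_text.strip()


ok = validate_cql_filter("oslandusetertiarygroup = 'Cinema'")  # accepted

try:
    validate_cql_filter("usrn = 1; DROP TABLE features")  # SQL keyword rejected
    rejected = False
except ValueError:
    rejected = True
```

Note this is a deny-list, so it rejects some legitimate CQL2 text too (the real code pairs it with the `query_attr` path, which escapes values by doubling single quotes instead).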
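`get_bulk_features` and `get_bulk_linked_features` both use the same fan-out shape: build one coroutine per identifier, run them concurrently with `asyncio.gather`, then parse each JSON result. A sketch of that shape with a stubbed fetch (`fetch_one`/`fetch_bulk` are hypothetical stand-ins for the API-backed methods):

```python
import asyncio
import json


async def fetch_one(collection_id: str, feature_id: str) -> str:
    """Stand-in for get_feature; the real method calls the OS NGD API."""
    await asyncio.sleep(0)  # simulate awaiting I/O
    return json.dumps({"collection": collection_id, "id": feature_id})


async def fetch_bulk(collection_id: str, identifiers: list[str]) -> dict:
    # One task per identifier, gathered concurrently - same shape as get_bulk_features
    tasks = [fetch_one(collection_id, fid) for fid in identifiers]
    results = await asyncio.gather(*tasks)
    return {"results": [json.loads(r) for r in results]}


bulk = asyncio.run(fetch_bulk("lus-fts-site-1", ["A1", "A2"]))
```

`asyncio.gather` preserves input order, so `results[i]` always corresponds to `identifiers[i]`, which is what lets the bulk tools return a flat list without tagging each entry.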
-------------------------------------------------------------------------------- /src/config_docs/ogcapi-features-1.yaml: -------------------------------------------------------------------------------- ```yaml 1 | openapi: 3.0.2 2 | info: 3 | title: "Building Blocks specified in OGC API - Features - Part 1: Core" 4 | description: |- 5 | Common components used in the 6 | [OGC standard "OGC API - Features - Part 1: Core"](http://docs.opengeospatial.org/is/17-069r3/17-069r3.html). 7 | 8 | OGC API - Features - Part 1: Core 1.0 is an OGC Standard. 9 | Copyright (c) 2019 Open Geospatial Consortium. 10 | To obtain additional rights of use, visit http://www.opengeospatial.org/legal/ . 11 | 12 | This document is also available on 13 | [OGC](http://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/ogcapi-features-1.yaml). 14 | version: '1.0.0' 15 | contact: 16 | name: Clemens Portele 17 | email: [email protected] 18 | license: 19 | name: OGC License 20 | url: 'http://www.opengeospatial.org/legal/' 21 | components: 22 | parameters: 23 | bbox: 24 | name: bbox 25 | in: query 26 | description: |- 27 | Only features that have a geometry that intersects the bounding box are selected. 28 | The bounding box is provided as four or six numbers, depending on whether the 29 | coordinate reference system includes a vertical axis (height or depth): 30 | 31 | * Lower left corner, coordinate axis 1 32 | * Lower left corner, coordinate axis 2 33 | * Minimum value, coordinate axis 3 (optional) 34 | * Upper right corner, coordinate axis 1 35 | * Upper right corner, coordinate axis 2 36 | * Maximum value, coordinate axis 3 (optional) 37 | 38 | If the value consists of four numbers, the coordinate reference system is 39 | WGS 84 longitude/latitude (http://www.opengis.net/def/crs/OGC/1.3/CRS84) 40 | unless a different coordinate reference system is specified in the parameter `bbox-crs`. 
41 | 42 | If the value consists of six numbers, the coordinate reference system is WGS 84 43 | longitude/latitude/ellipsoidal height (http://www.opengis.net/def/crs/OGC/0/CRS84h) 44 | unless a different coordinate reference system is specified in the parameter `bbox-crs`. 45 | 46 | The query parameter `bbox-crs` is specified in OGC API - Features - Part 2: Coordinate 47 | Reference Systems by Reference. 48 | 49 | For WGS 84 longitude/latitude the values are in most cases the sequence of 50 | minimum longitude, minimum latitude, maximum longitude and maximum latitude. 51 | However, in cases where the box spans the antimeridian the first value 52 | (west-most box edge) is larger than the third value (east-most box edge). 53 | 54 | If the vertical axis is included, the third and the sixth number are 55 | the bottom and the top of the 3-dimensional bounding box. 56 | 57 | If a feature has multiple spatial geometry properties, it is the decision of the 58 | server whether only a single spatial geometry property is used to determine 59 | the extent or all relevant geometries. 60 | required: false 61 | schema: 62 | type: array 63 | oneOf: 64 | - minItems: 4 65 | maxItems: 4 66 | - minItems: 6 67 | maxItems: 6 68 | items: 69 | type: number 70 | style: form 71 | explode: false 72 | collectionId: 73 | name: collectionId 74 | in: path 75 | description: local identifier of a collection 76 | required: true 77 | schema: 78 | type: string 79 | datetime: 80 | name: datetime 81 | in: query 82 | description: |- 83 | Either a date-time or an interval. Date and time expressions adhere to RFC 3339. 84 | Intervals may be bounded or half-bounded (double-dots at start or end). 85 | 86 | Examples: 87 | 88 | * A date-time: "2018-02-12T23:20:50Z" 89 | * A bounded interval: "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z" 90 | * Half-bounded intervals: "2018-02-12T00:00:00Z/.." 
or "../2018-03-18T12:31:12Z" 91 | 92 | Only features that have a temporal property that intersects the value of 93 | `datetime` are selected. 94 | 95 | If a feature has multiple temporal properties, it is the decision of the 96 | server whether only a single temporal property is used to determine 97 | the extent or all relevant temporal properties. 98 | required: false 99 | schema: 100 | type: string 101 | style: form 102 | explode: false 103 | featureId: 104 | name: featureId 105 | in: path 106 | description: local identifier of a feature 107 | required: true 108 | schema: 109 | type: string 110 | limit: 111 | name: limit 112 | in: query 113 | description: |- 114 | The optional limit parameter limits the number of items that are presented in the response document. 115 | 116 | Only items are counted that are on the first level of the collection in the response document. 117 | Nested objects contained within the explicitly requested items shall not be counted. 118 | 119 | Minimum = 1. Maximum = 10000. Default = 10. 120 | required: false 121 | schema: 122 | type: integer 123 | minimum: 1 124 | maximum: 10000 125 | default: 10 126 | style: form 127 | explode: false 128 | schemas: 129 | collection: 130 | type: object 131 | required: 132 | - id 133 | - links 134 | properties: 135 | id: 136 | description: identifier of the collection used, for example, in URIs 137 | type: string 138 | example: address 139 | title: 140 | description: human readable title of the collection 141 | type: string 142 | example: address 143 | description: 144 | description: a description of the features in the collection 145 | type: string 146 | example: An address. 
147 | links: 148 | type: array 149 | items: 150 | $ref: "#/components/schemas/link" 151 | example: 152 | - href: http://data.example.com/buildings 153 | rel: item 154 | - href: http://example.com/concepts/buildings.html 155 | rel: describedby 156 | type: text/html 157 | extent: 158 | $ref: "#/components/schemas/extent" 159 | itemType: 160 | description: indicator about the type of the items in the collection (the default value is 'feature'). 161 | type: string 162 | default: feature 163 | crs: 164 | description: the list of coordinate reference systems supported by the service 165 | type: array 166 | items: 167 | type: string 168 | default: 169 | - http://www.opengis.net/def/crs/OGC/1.3/CRS84 170 | example: 171 | - http://www.opengis.net/def/crs/OGC/1.3/CRS84 172 | - http://www.opengis.net/def/crs/EPSG/0/4326 173 | collections: 174 | type: object 175 | required: 176 | - links 177 | - collections 178 | properties: 179 | links: 180 | type: array 181 | items: 182 | $ref: "#/components/schemas/link" 183 | collections: 184 | type: array 185 | items: 186 | $ref: "#/components/schemas/collection" 187 | confClasses: 188 | type: object 189 | required: 190 | - conformsTo 191 | properties: 192 | conformsTo: 193 | type: array 194 | items: 195 | type: string 196 | exception: 197 | type: object 198 | description: |- 199 | Information about the exception: an error code plus an optional description. 200 | required: 201 | - code 202 | properties: 203 | code: 204 | type: string 205 | description: 206 | type: string 207 | extent: 208 | type: object 209 | description: |- 210 | The extent of the features in the collection. In the Core only spatial and temporal 211 | extents are specified. Extensions may add additional members to represent other 212 | extents, for example, thermal or pressure ranges. 213 | properties: 214 | spatial: 215 | description: |- 216 | The spatial extent of the features in the collection. 
217 | type: object 218 | properties: 219 | bbox: 220 | description: |- 221 | One or more bounding boxes that describe the spatial extent of the dataset. 222 | In the Core only a single bounding box is supported. Extensions may support 223 | additional areas. If multiple areas are provided, the union of the bounding 224 | boxes describes the spatial extent. 225 | type: array 226 | minItems: 1 227 | items: 228 | description: |- 229 | Each bounding box is provided as four or six numbers, depending on 230 | whether the coordinate reference system includes a vertical axis 231 | (height or depth): 232 | 233 | * Lower left corner, coordinate axis 1 234 | * Lower left corner, coordinate axis 2 235 | * Minimum value, coordinate axis 3 (optional) 236 | * Upper right corner, coordinate axis 1 237 | * Upper right corner, coordinate axis 2 238 | * Maximum value, coordinate axis 3 (optional) 239 | 240 | The coordinate reference system of the values is WGS 84 longitude/latitude 241 | (http://www.opengis.net/def/crs/OGC/1.3/CRS84) unless a different coordinate 242 | reference system is specified in `crs`. 243 | 244 | For WGS 84 longitude/latitude the values are in most cases the sequence of 245 | minimum longitude, minimum latitude, maximum longitude and maximum latitude. 246 | However, in cases where the box spans the antimeridian the first value 247 | (west-most box edge) is larger than the third value (east-most box edge). 248 | 249 | If the vertical axis is included, the third and the sixth number are 250 | the bottom and the top of the 3-dimensional bounding box. 251 | 252 | If a feature has multiple spatial geometry properties, it is the decision of the 253 | server whether only a single spatial geometry property is used to determine 254 | the extent or all relevant geometries. 
255 | type: array 256 | oneOf: 257 | - minItems: 4 258 | maxItems: 4 259 | - minItems: 6 260 | maxItems: 6 261 | items: 262 | type: number 263 | example: 264 | - -180 265 | - -90 266 | - 180 267 | - 90 268 | crs: 269 | description: |- 270 | Coordinate reference system of the coordinates in the spatial extent 271 | (property `bbox`). The default reference system is WGS 84 longitude/latitude. 272 | In the Core this is the only supported coordinate reference system. 273 | Extensions may support additional coordinate reference systems and add 274 | additional enum values. 275 | type: string 276 | enum: 277 | - 'http://www.opengis.net/def/crs/OGC/1.3/CRS84' 278 | default: 'http://www.opengis.net/def/crs/OGC/1.3/CRS84' 279 | temporal: 280 | description: |- 281 | The temporal extent of the features in the collection. 282 | type: object 283 | properties: 284 | interval: 285 | description: |- 286 | One or more time intervals that describe the temporal extent of the dataset. 287 | The value `null` is supported and indicates an unbounded interval end. 288 | In the Core only a single time interval is supported. Extensions may support 289 | multiple intervals. If multiple intervals are provided, the union of the 290 | intervals describes the temporal extent. 291 | type: array 292 | minItems: 1 293 | items: 294 | description: |- 295 | Begin and end times of the time interval. The timestamps are in the 296 | temporal coordinate reference system specified in `trs`. By default 297 | this is the Gregorian calendar. 298 | type: array 299 | minItems: 2 300 | maxItems: 2 301 | items: 302 | type: string 303 | format: date-time 304 | nullable: true 305 | example: 306 | - '2011-11-11T12:22:11Z' 307 | - null 308 | trs: 309 | description: |- 310 | Coordinate reference system of the coordinates in the temporal extent 311 | (property `interval`). The default reference system is the Gregorian calendar. 312 | In the Core this is the only supported temporal coordinate reference system. 
                Extensions may support additional temporal coordinate reference systems and add
                additional enum values.
              type: string
              enum:
                - 'http://www.opengis.net/def/uom/ISO-8601/0/Gregorian'
              default: 'http://www.opengis.net/def/uom/ISO-8601/0/Gregorian'
    featureCollectionGeoJSON:
      type: object
      required:
        - type
        - features
      properties:
        type:
          type: string
          enum:
            - FeatureCollection
        features:
          type: array
          items:
            $ref: "#/components/schemas/featureGeoJSON"
        links:
          type: array
          items:
            $ref: "#/components/schemas/link"
        timeStamp:
          $ref: "#/components/schemas/timeStamp"
        numberMatched:
          $ref: "#/components/schemas/numberMatched"
        numberReturned:
          $ref: "#/components/schemas/numberReturned"
    featureGeoJSON:
      type: object
      required:
        - type
        - geometry
        - properties
      properties:
        type:
          type: string
          enum:
            - Feature
        geometry:
          $ref: "#/components/schemas/geometryGeoJSON"
        properties:
          type: object
          nullable: true
        id:
          oneOf:
            - type: string
            - type: integer
        links:
          type: array
          items:
            $ref: "#/components/schemas/link"
    geometryGeoJSON:
      oneOf:
        - $ref: "#/components/schemas/pointGeoJSON"
        - $ref: "#/components/schemas/multipointGeoJSON"
        - $ref: "#/components/schemas/linestringGeoJSON"
        - $ref: "#/components/schemas/multilinestringGeoJSON"
        - $ref: "#/components/schemas/polygonGeoJSON"
        - $ref: "#/components/schemas/multipolygonGeoJSON"
        - $ref: "#/components/schemas/geometrycollectionGeoJSON"
    geometrycollectionGeoJSON:
      type: object
      required:
        - type
        - geometries
      properties:
        type:
          type: string
          enum:
            - GeometryCollection
        geometries:
          type: array
          items:
            $ref:
"#/components/schemas/geometryGeoJSON" 390 | landingPage: 391 | type: object 392 | required: 393 | - links 394 | properties: 395 | title: 396 | type: string 397 | example: Buildings in Bonn 398 | description: 399 | type: string 400 | example: Access to data about buildings in the city of Bonn via a Web API that conforms to the OGC API Features specification. 401 | links: 402 | type: array 403 | items: 404 | $ref: "#/components/schemas/link" 405 | linestringGeoJSON: 406 | type: object 407 | required: 408 | - type 409 | - coordinates 410 | properties: 411 | type: 412 | type: string 413 | enum: 414 | - LineString 415 | coordinates: 416 | type: array 417 | minItems: 2 418 | items: 419 | type: array 420 | minItems: 2 421 | items: 422 | type: number 423 | link: 424 | type: object 425 | required: 426 | - href 427 | properties: 428 | href: 429 | type: string 430 | example: http://data.example.com/buildings/123 431 | rel: 432 | type: string 433 | example: alternate 434 | type: 435 | type: string 436 | example: application/geo+json 437 | hreflang: 438 | type: string 439 | example: en 440 | title: 441 | type: string 442 | example: Trierer Strasse 70, 53115 Bonn 443 | length: 444 | type: integer 445 | multilinestringGeoJSON: 446 | type: object 447 | required: 448 | - type 449 | - coordinates 450 | properties: 451 | type: 452 | type: string 453 | enum: 454 | - MultiLineString 455 | coordinates: 456 | type: array 457 | items: 458 | type: array 459 | minItems: 2 460 | items: 461 | type: array 462 | minItems: 2 463 | items: 464 | type: number 465 | multipointGeoJSON: 466 | type: object 467 | required: 468 | - type 469 | - coordinates 470 | properties: 471 | type: 472 | type: string 473 | enum: 474 | - MultiPoint 475 | coordinates: 476 | type: array 477 | items: 478 | type: array 479 | minItems: 2 480 | items: 481 | type: number 482 | multipolygonGeoJSON: 483 | type: object 484 | required: 485 | - type 486 | - coordinates 487 | properties: 488 | type: 489 | type: string 490 | enum: 
            - MultiPolygon
        coordinates:
          type: array
          items:
            type: array
            items:
              type: array
              minItems: 4
              items:
                type: array
                minItems: 2
                items:
                  type: number
    numberMatched:
      description: |-
        The number of features of the feature type that match the selection
        parameters like `bbox`.
      type: integer
      minimum: 0
      example: 127
    numberReturned:
      description: |-
        The number of features in the feature collection.

        A server may omit this information in a response, if the information
        about the number of features is not known or difficult to compute.

        If the value is provided, the value shall be identical to the number
        of items in the "features" array.
      type: integer
      minimum: 0
      example: 10
    pointGeoJSON:
      type: object
      required:
        - type
        - coordinates
      properties:
        type:
          type: string
          enum:
            - Point
        coordinates:
          type: array
          minItems: 2
          items:
            type: number
    polygonGeoJSON:
      type: object
      required:
        - type
        - coordinates
      properties:
        type:
          type: string
          enum:
            - Polygon
        coordinates:
          type: array
          items:
            type: array
            minItems: 4
            items:
              type: array
              minItems: 2
              items:
                type: number
    timeStamp:
      description: This property indicates the time and date when the response was generated.
      type: string
      format: date-time
      example: '2017-08-17T08:05:32Z'
  responses:
    LandingPage:
      description: |-
        The landing page provides links to the API definition
        (link relations `service-desc` and `service-doc`),
        the Conformance declaration (path `/conformance`,
        link relation `conformance`), and the Feature
        Collections (path `/collections`, link relation
        `data`).
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/landingPage'
          example:
            title: Buildings in Bonn
            description: Access to data about buildings in the city of Bonn via a Web API that conforms to the OGC API Features specification.
            links:
              - href: 'http://data.example.org/'
                rel: self
                type: application/json
                title: this document
              - href: 'http://data.example.org/api'
                rel: service-desc
                type: application/vnd.oai.openapi+json;version=3.0
                title: the API definition
              - href: 'http://data.example.org/api.html'
                rel: service-doc
                type: text/html
                title: the API documentation
              - href: 'http://data.example.org/conformance'
                rel: conformance
                type: application/json
                title: OGC API conformance classes implemented by this server
              - href: 'http://data.example.org/collections'
                rel: data
                type: application/json
                title: Information about the feature collections
        text/html:
          schema:
            type: string
    ConformanceDeclaration:
      description: |-
        The URIs of all conformance classes supported by the server.

        To support "generic" clients that want to access multiple
        OGC API Features implementations - and not "just" a specific
        API / server, the server declares the conformance
        classes it implements and conforms to.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/confClasses'
          example:
            conformsTo:
              - 'http://www.opengis.net/spec/ogcapi-features-1/1.0/conf/core'
              - 'http://www.opengis.net/spec/ogcapi-features-1/1.0/conf/oas30'
              - 'http://www.opengis.net/spec/ogcapi-features-1/1.0/conf/html'
              - 'http://www.opengis.net/spec/ogcapi-features-1/1.0/conf/geojson'
        text/html:
          schema:
            type: string
    Collections:
      description: |-
        The feature collections shared by this API.

        The dataset is organized as one or more feature collections. This resource
        provides information about and access to the collections.

        The response contains the list of collections. For each collection, a link
        to the items in the collection (path `/collections/{collectionId}/items`,
        link relation `items`) is provided, as well as key information about the
        collection. This information includes:

        * A local identifier for the collection that is unique for the dataset;
        * A list of coordinate reference systems (CRS) in which geometries may be returned by the server. The first CRS is the default coordinate reference system (the default is always WGS 84 with axis order longitude/latitude);
        * An optional title and description for the collection;
        * An optional extent that can be used to provide an indication of the spatial and temporal extent of the collection - typically derived from the data;
        * An optional indicator about the type of the items in the collection (the default value, if the indicator is not provided, is 'feature').
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/collections'
          example:
            links:
              - href: 'http://data.example.org/collections.json'
                rel: self
                type: application/json
                title: this document
              - href: 'http://data.example.org/collections.html'
                rel: alternate
                type: text/html
                title: this document as HTML
              - href: 'http://schemas.example.org/1.0/buildings.xsd'
                rel: describedby
                type: application/xml
                title: GML application schema for Acme Corporation building data
              - href: 'http://download.example.org/buildings.gpkg'
                rel: enclosure
                type: application/geopackage+sqlite3
                title: Bulk download (GeoPackage)
                length: 472546
            collections:
              - id: buildings
                title: Buildings
                description: Buildings in the city of Bonn.
                extent:
                  spatial:
                    bbox:
                      - - 7.01
                        - 50.63
                        - 7.22
                        - 50.78
                  temporal:
                    interval:
                      - - '2010-02-15T12:34:56Z'
                        - null
                links:
                  - href: 'http://data.example.org/collections/buildings/items'
                    rel: items
                    type: application/geo+json
                    title: Buildings
                  - href: 'http://data.example.org/collections/buildings/items.html'
                    rel: items
                    type: text/html
                    title: Buildings
                  - href: 'https://creativecommons.org/publicdomain/zero/1.0/'
                    rel: license
                    type: text/html
                    title: CC0-1.0
                  - href: 'https://creativecommons.org/publicdomain/zero/1.0/rdf'
                    rel: license
                    type: application/rdf+xml
                    title: CC0-1.0
        text/html:
          schema:
            type: string
    Collection:
      description: |-
        Information about the feature collection with id `collectionId`.

        The response contains a link to the items in the collection
        (path `/collections/{collectionId}/items`, link relation `items`)
        as well as key information about the collection. This information
        includes:

        * A local identifier for the collection that is unique for the dataset;
        * A list of coordinate reference systems (CRS) in which geometries may be returned by the server. The first CRS is the default coordinate reference system (the default is always WGS 84 with axis order longitude/latitude);
        * An optional title and description for the collection;
        * An optional extent that can be used to provide an indication of the spatial and temporal extent of the collection - typically derived from the data;
        * An optional indicator about the type of the items in the collection (the default value, if the indicator is not provided, is 'feature').
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/collection'
          example:
            id: buildings
            title: Buildings
            description: Buildings in the city of Bonn.
            extent:
              spatial:
                bbox:
                  - - 7.01
                    - 50.63
                    - 7.22
                    - 50.78
              temporal:
                interval:
                  - - '2010-02-15T12:34:56Z'
                    - null
            links:
              - href: 'http://data.example.org/collections/buildings/items'
                rel: items
                type: application/geo+json
                title: Buildings
              - href: 'http://data.example.org/collections/buildings/items.html'
                rel: items
                type: text/html
                title: Buildings
              - href: 'https://creativecommons.org/publicdomain/zero/1.0/'
                rel: license
                type: text/html
                title: CC0-1.0
              - href: 'https://creativecommons.org/publicdomain/zero/1.0/rdf'
                rel: license
                type: application/rdf+xml
                title: CC0-1.0
        text/html:
          schema:
            type: string
    Features:
      description: |-
        The response is a document consisting of features in the collection.
        The features included in the response are determined by the server
        based on the query parameters of the request. To support access to
        larger collections without overloading the client, the API supports
        paged access with links to the next page, if more features are selected
        than the page size.

        The `bbox` and `datetime` parameters can be used to select only a
        subset of the features in the collection (the features that are in the
        bounding box or time interval). The `bbox` parameter matches all features
        in the collection that are not associated with a location, too. The
        `datetime` parameter matches all features in the collection that are
        not associated with a time stamp or interval, too.

        The `limit` parameter may be used to control the subset of the
        selected features that should be returned in the response, the page size.
        Each page may include information about the number of selected and
        returned features (`numberMatched` and `numberReturned`) as well as
        links to support paging (link relation `next`).
      content:
        application/geo+json:
          schema:
            $ref: '#/components/schemas/featureCollectionGeoJSON'
          example:
            type: FeatureCollection
            links:
              - href: 'http://data.example.com/collections/buildings/items.json'
                rel: self
                type: application/geo+json
                title: this document
              - href: 'http://data.example.com/collections/buildings/items.html'
                rel: alternate
                type: text/html
                title: this document as HTML
              - href: 'http://data.example.com/collections/buildings/items.json&offset=10&limit=2'
                rel: next
                type: application/geo+json
                title: next page
            timeStamp: '2018-04-03T14:52:23Z'
            numberMatched: 123
            numberReturned: 2
            features:
              - type: Feature
                id: '123'
                geometry:
                  type: Polygon
                  coordinates:
                    - ...
                properties:
                  function: residential
                  floors: '2'
                  lastUpdate: '2015-08-01T12:34:56Z'
              - type: Feature
                id: '132'
                geometry:
                  type: Polygon
                  coordinates:
                    - ...
                properties:
                  function: public use
                  floors: '10'
                  lastUpdate: '2013-12-03T10:15:37Z'
        text/html:
          schema:
            type: string
    Feature:
      description: |-
        Fetch the feature with id `featureId` in the feature collection
        with id `collectionId`.
      content:
        application/geo+json:
          schema:
            $ref: '#/components/schemas/featureGeoJSON'
          example:
            type: Feature
            links:
              - href: 'http://data.example.com/id/building/123'
                rel: canonical
                title: canonical URI of the building
              - href: 'http://data.example.com/collections/buildings/items/123.json'
                rel: self
                type: application/geo+json
                title: this document
              - href: 'http://data.example.com/collections/buildings/items/123.html'
                rel: alternate
                type: text/html
                title: this document as HTML
              - href: 'http://data.example.com/collections/buildings'
                rel: collection
                type: application/geo+json
                title: the collection document
            id: '123'
            geometry:
              type: Polygon
              coordinates:
                - ...
            properties:
              function: residential
              floors: '2'
              lastUpdate: '2015-08-01T12:34:56Z'
        text/html:
          schema:
            type: string
    InvalidParameter:
      description: |-
        A query parameter has an invalid value.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/exception'
        text/html:
          schema:
            type: string
    NotFound:
      description: |-
        The requested resource does not exist on the server. For example, a path parameter had an incorrect value.
    NotAcceptable:
      description: |-
        Content negotiation failed. For example, the `Accept` header submitted in the request did not support any of the media types supported by the server for the requested resource.
    ServerError:
      description: |-
        A server error occurred.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/exception'
        text/html:
          schema:
            type: string
```
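The `extent.spatial.bbox` schema above allows each bounding box to carry four or six numbers, with the vertical axis interleaved as the third and sixth values, and notes that a box spanning the antimeridian has its west-most edge larger than its east-most edge. As a rough sketch of how a client of this API might normalize that shape (the `SpatialExtent` type and `parse_bbox`/`crosses_antimeridian` helpers are illustrative, not part of this repository):

```python
from typing import NamedTuple, Optional


class SpatialExtent(NamedTuple):
    """A bbox from the `extent.spatial.bbox` schema, normalized to named fields."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float
    min_z: Optional[float] = None  # only present for 6-number boxes
    max_z: Optional[float] = None


def parse_bbox(bbox: list[float]) -> SpatialExtent:
    """Accept the 4- or 6-number form described by the schema.

    4 numbers: [min_lon, min_lat, max_lon, max_lat]
    6 numbers: the vertical axis appears as elements 3 and 6.
    """
    if len(bbox) == 4:
        return SpatialExtent(bbox[0], bbox[1], bbox[2], bbox[3])
    if len(bbox) == 6:
        return SpatialExtent(bbox[0], bbox[1], bbox[3], bbox[4], bbox[2], bbox[5])
    raise ValueError(f"bbox must have 4 or 6 numbers, got {len(bbox)}")


def crosses_antimeridian(extent: SpatialExtent) -> bool:
    # Per the schema text, a box spanning the antimeridian has its
    # west-most edge (first value) larger than its east-most edge (third).
    return extent.min_lon > extent.max_lon
```

For the Bonn example in the `Collections` response, `parse_bbox([7.01, 50.63, 7.22, 50.78])` yields a plain WGS 84 box with no vertical axis, while a box such as `[170, -10, -170, 10]` is flagged as antimeridian-crossing rather than rejected as inverted.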