# Directory Structure ``` ├── .dockerignore ├── .env.local ├── .gitignore ├── .python-version ├── dockerfile ├── LICENSE ├── makefile ├── pyproject.toml ├── README.md └── src ├── __init__.py ├── api_service │ ├── __init__.py │ ├── os_api.py │ └── protocols.py ├── config_docs │ ├── ogcapi-features-1.yaml │ ├── ogcapi-features-2.yaml │ └── ogcapi-features-3.yaml ├── http_client_test.py ├── mcp_service │ ├── __init__.py │ ├── guardrails.py │ ├── os_service.py │ ├── prompts.py │ ├── protocols.py │ ├── resources.py │ └── routing_service.py ├── middleware │ ├── __init__.py │ ├── http_middleware.py │ └── stdio_middleware.py ├── models.py ├── prompt_templates │ ├── __init__.py │ └── prompt_templates.py ├── server.py ├── stdio_client_test.py ├── utils │ ├── __init__.py │ └── logging_config.py └── workflow_generator ├── __init__.py └── workflow_planner.py ``` # Files -------------------------------------------------------------------------------- /.python-version: -------------------------------------------------------------------------------- ``` 1 | 3.11 2 | ``` -------------------------------------------------------------------------------- /.env.local: -------------------------------------------------------------------------------- ``` 1 | OS_API_KEY=xxxxxxx 2 | STDIO_KEY=xxxxxxx 3 | BEARER_TOKENS=xxxxxxx 4 | ``` -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- ``` 1 | # Python-generated files 2 | __pycache__/ 3 | *.py[oc] 4 | build/ 5 | dist/ 6 | wheels/ 7 | *.egg-info 8 | 9 | # Virtual environments 10 | .venv 11 | .env 12 | .DS_Store 13 | .ruff_cache 14 | uv.lock 15 | pyrightconfig.json 16 | *.log 17 | .pytest_cache/ 18 | .mypy_cache/ ``` -------------------------------------------------------------------------------- /.dockerignore: -------------------------------------------------------------------------------- ``` 1 | # Python artifacts 2 | 
__pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | *.so 6 | .Python 7 | build/ 8 | develop-eggs/ 9 | dist/ 10 | downloads/ 11 | eggs/ 12 | .eggs/ 13 | lib/ 14 | lib64/ 15 | parts/ 16 | sdist/ 17 | var/ 18 | wheels/ 19 | *.egg-info/ 20 | .installed.cfg 21 | *.egg 22 | 23 | # Virtual environments 24 | .venv/ 25 | venv/ 26 | ENV/ 27 | env/ 28 | .env 29 | 30 | # Testing and type checking 31 | .pytest_cache/ 32 | .mypy_cache/ 33 | .coverage 34 | htmlcov/ 35 | .tox/ 36 | .hypothesis/ 37 | *.cover 38 | .coverage.* 39 | 40 | # Development tools 41 | .ruff_cache/ 42 | pyrightconfig.json 43 | .vscode/ 44 | .idea/ 45 | *.swp 46 | *.swo 47 | *~ 48 | 49 | # OS files 50 | .DS_Store 51 | .DS_Store? 52 | ._* 53 | .Spotlight-V100 54 | .Trashes 55 | ehthumbs.db 56 | Thumbs.db 57 | 58 | # Git 59 | .git/ 60 | .gitignore 61 | .gitattributes 62 | 63 | # Documentation 64 | *.md 65 | docs/ 66 | LICENSE 67 | 68 | # Build files 69 | Makefile 70 | makefile 71 | dockerfile 72 | Dockerfile 73 | 74 | # Logs 75 | *.log 76 | logs/ 77 | 78 | # Package manager 79 | uv.lock 80 | poetry.lock 81 | Pipfile.lock 82 | 83 | # Test files 84 | *_test.py 85 | test_*.py 86 | tests/ ``` -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- ```markdown 1 | # Ordnance Survey MCP Server 2 | 3 | An MCP server for accessing UK geospatial data through Ordnance Survey APIs. 4 | 5 | ## What it does 6 | 7 | Provides LLM access to the Ordnance Survey's Data Hub APIs. 8 | 9 | Ask simple questions such as "find me all cinemas in Leeds City Centre", or use the prompt templates for more complex, specific use cases relating to street works, planning, etc. 10 | 11 | This MCP server enforces a two-step workflow plan to ensure that the user gets the best results possible. 12 | 13 | ## Quick Start 14 | 15 | ### 1. 
Get an OS API Key 16 | 17 | Register at [OS Data Hub](https://osdatahub.os.uk/) to get your free API key and set up a project. 18 | 19 | ### 2. Run with Docker and add to your Claude Desktop config (easiest) 20 | 21 | ```bash 22 | # Clone the repository 23 | git clone https://github.com/your-username/os-mcp-server.git 24 | cd os-mcp-server 25 | ``` 26 | 27 | Then build the Docker image: 28 | 29 | ```bash 30 | docker build -t os-mcp-server . 31 | ``` 32 | 33 | Add the following to your Claude Desktop config: 34 | 35 | ```json 36 | { 37 | "mcpServers": { 38 | "os-mcp-server": { 39 | "command": "docker", 40 | "args": [ 41 | "run", 42 | "--rm", 43 | "-i", 44 | "-e", 45 | "OS_API_KEY=your_api_key_here", 46 | "-e", 47 | "STDIO_KEY=any_value", 48 | "os-mcp-server" 49 | ] 50 | } 51 | } 52 | } 53 | ``` 54 | 55 | Open Claude Desktop and you should now see all available tools, resources, and prompts. 56 | 57 | ## Requirements 58 | 59 | - Python 3.11+ 60 | - OS API Key from [OS Data Hub](https://osdatahub.os.uk/) 61 | - A STDIO_KEY env var, which can be any value for now while auth is improved 62 | 63 | ## License 64 | 65 | MIT License. This is a personal project; it is not affiliated with or endorsed by Ordnance Survey, and it is not a commercial product. It is actively being worked on, so expect breaking changes and bugs. 
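
## Running without Docker

If you have cloned the repository and installed the dependencies locally, a Claude Desktop entry along these lines should also work. The paths below are illustrative and must be adjusted to your machine; setting `PYTHONPATH` to the `src/` directory mirrors what the Dockerfile does, since the server imports modules from there:

```json
{
  "mcpServers": {
    "os-mcp-server": {
      "command": "python",
      "args": ["/path/to/os-mcp-server/src/server.py", "--transport", "stdio"],
      "env": {
        "OS_API_KEY": "your_api_key_here",
        "STDIO_KEY": "any_value",
        "PYTHONPATH": "/path/to/os-mcp-server/src"
      }
    }
  }
}
```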
66 | ``` -------------------------------------------------------------------------------- /src/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /src/api_service/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /src/mcp_service/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /src/middleware/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /src/prompt_templates/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /src/utils/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /src/workflow_generator/__init__.py: -------------------------------------------------------------------------------- ```python 1 | ``` -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- ```toml 1 | [project] 2 | name = "os-mcp" 3 | version = "0.1.5" 4 | description = "A Python MCP server that provides access to Ordnance Survey's DataHub APIs." 
5 | readme = "README.md" 6 | requires-python = ">=3.11" 7 | dependencies = [ 8 | "aiohttp>=3.11.18", 9 | "anthropic>=0.51.0", 10 | "fastapi>=0.116.1", 11 | "mcp>=1.12.0", 12 | "starlette>=0.47.1", 13 | "uvicorn>=0.35.0", 14 | ] 15 | ``` -------------------------------------------------------------------------------- /dockerfile: -------------------------------------------------------------------------------- ```dockerfile 1 | FROM python:3.11-slim 2 | 3 | WORKDIR /app 4 | 5 | ENV PYTHONPATH=/app/src \ 6 | PYTHONUNBUFFERED=1 \ 7 | PIP_NO_CACHE_DIR=1 \ 8 | PIP_DISABLE_PIP_VERSION_CHECK=1 9 | 10 | 11 | RUN apt-get update && apt-get install -y \ 12 | gcc \ 13 | && rm -rf /var/lib/apt/lists/* \ 14 | && apt-get clean 15 | 16 | COPY pyproject.toml ./ 17 | RUN pip install --no-cache-dir . 18 | 19 | COPY src/ ./src/ 20 | 21 | EXPOSE 8000 22 | 23 | CMD ["python", "src/server.py", "--transport", "stdio"] 24 | ``` -------------------------------------------------------------------------------- /src/workflow_generator/workflow_planner.py: -------------------------------------------------------------------------------- ```python 1 | from typing import Dict, Any, Optional, List 2 | from models import OpenAPISpecification 3 | from utils.logging_config import get_logger 4 | 5 | logger = get_logger(__name__) 6 | 7 | 8 | class WorkflowPlanner: 9 | """Context provider for LLM workflow planning""" 10 | 11 | def __init__( 12 | self, 13 | openapi_spec: Optional[OpenAPISpecification], 14 | basic_collections_info: Optional[Dict[str, Any]] = None, 15 | ): 16 | self.spec = openapi_spec 17 | self.basic_collections_info = basic_collections_info or {} 18 | self.detailed_collections_cache = {} 19 | 20 | def get_basic_context(self) -> Dict[str, Any]: 21 | """Get basic context for LLM to plan its workflow - no detailed queryables""" 22 | return { 23 | "available_collections": self.basic_collections_info, 24 | "openapi_spec": self.spec, 25 | } 26 | 27 | def get_detailed_context(self, 
collection_ids: List[str]) -> Dict[str, Any]: 28 | """Get detailed context for specific collections mentioned in the plan""" 29 | detailed_collections = { 30 | coll_id: self.detailed_collections_cache.get(coll_id) 31 | for coll_id in collection_ids 32 | if coll_id in self.detailed_collections_cache 33 | } 34 | 35 | return { 36 | "available_collections": detailed_collections, 37 | "openapi_spec": self.spec, 38 | } 39 | ``` -------------------------------------------------------------------------------- /src/api_service/protocols.py: -------------------------------------------------------------------------------- ```python 1 | from typing import Protocol, Dict, List, Any, Optional, runtime_checkable 2 | from models import ( 3 | OpenAPISpecification, 4 | CollectionsCache, 5 | CollectionQueryables, 6 | ) 7 | 8 | 9 | @runtime_checkable 10 | class APIClient(Protocol): 11 | """Protocol for API clients""" 12 | 13 | async def initialise(self): 14 | """Initialise the aiohttp session if not already created""" 15 | ... 16 | 17 | async def close(self): 18 | """Close the aiohttp session""" 19 | ... 20 | 21 | async def get_api_key(self) -> str: 22 | """Get the API key""" 23 | ... 24 | 25 | async def make_request( 26 | self, 27 | endpoint: str, 28 | params: Optional[Dict[str, Any]] = None, 29 | path_params: Optional[List[str]] = None, 30 | ) -> Dict[str, Any]: 31 | """Make a request to an API endpoint""" 32 | ... 33 | 34 | async def make_request_no_auth( 35 | self, 36 | url: str, 37 | params: Optional[Dict[str, Any]] = None, 38 | max_retries: int = 2, 39 | ) -> str: 40 | """Make a request without authentication""" 41 | ... 42 | 43 | async def cache_openapi_spec(self) -> OpenAPISpecification: 44 | """Cache the OpenAPI spec""" 45 | ... 46 | 47 | async def cache_collections(self) -> CollectionsCache: 48 | """Cache the collections data""" 49 | ... 
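
    # Illustrative usage (comment-only sketch, not part of the protocol): a
    # conforming client would typically be driven like this from async code.
    # `OSAPIClient` is an assumed concrete implementation name, not something
    # defined here.
    #
    #     client: APIClient = OSAPIClient()
    #     await client.initialise()
    #     try:
    #         spec = await client.cache_openapi_spec()
    #         collections = await client.cache_collections()
    #     finally:
    #         await client.close()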
50 | 51 | async def fetch_collections_queryables( 52 | self, collection_ids: List[str] 53 | ) -> Dict[str, CollectionQueryables]: 54 | """Fetch detailed queryables for specific collections only""" 55 | ... 56 | ``` -------------------------------------------------------------------------------- /src/utils/logging_config.py: -------------------------------------------------------------------------------- ```python 1 | import logging 2 | import sys 3 | import os 4 | import re 5 | 6 | 7 | class APIKeySanitisingFilter(logging.Filter): 8 | """Filter to sanitise API keys from log messages""" 9 | 10 | def __init__(self): 11 | super().__init__() 12 | self.patterns = [ 13 | r"[?&]key=[^&\s]*", 14 | r"[?&]api_key=[^&\s]*", 15 | r"[?&]apikey=[^&\s]*", 16 | r"[?&]token=[^&\s]*", 17 | ] 18 | 19 | def filter(self, record: logging.LogRecord) -> bool: 20 | """Sanitise the log record message""" 21 | if hasattr(record, "msg") and isinstance(record.msg, str): 22 | record.msg = self._sanitise_text(record.msg) 23 | 24 | if hasattr(record, "args") and record.args: 25 | sanitised_args = [] 26 | for arg in record.args: 27 | if isinstance(arg, str): 28 | sanitised_args.append(self._sanitise_text(arg)) 29 | else: 30 | sanitised_args.append(arg) 31 | record.args = tuple(sanitised_args) 32 | 33 | return True 34 | 35 | def _sanitise_text(self, text: str) -> str: 36 | """Remove API keys from text""" 37 | sanitised = text 38 | for pattern in self.patterns: 39 | sanitised = re.sub(pattern, "", sanitised, flags=re.IGNORECASE) 40 | 41 | sanitised = re.sub(r"[?&]$", "", sanitised) 42 | sanitised = re.sub(r"&{2,}", "&", sanitised) 43 | sanitised = re.sub(r"\?&", "?", sanitised) 44 | 45 | return sanitised 46 | 47 | 48 | def configure_logging(debug: bool = False) -> logging.Logger: 49 | """ 50 | Configure logging for the entire application. 
51 | 52 | Args: 53 | debug: Whether to enable debug logging or not 54 | """ 55 | log_level = ( 56 | logging.DEBUG if (debug or os.environ.get("DEBUG") == "1") else logging.INFO 57 | ) 58 | 59 | formatter = logging.Formatter( 60 | "%(asctime)s - %(name)s - %(levelname)s - %(message)s" 61 | ) 62 | 63 | root_logger = logging.getLogger() 64 | root_logger.setLevel(log_level) 65 | 66 | for handler in root_logger.handlers[:]: 67 | root_logger.removeHandler(handler) 68 | 69 | console_handler = logging.StreamHandler(sys.stderr) 70 | console_handler.setFormatter(formatter) 71 | console_handler.setLevel(log_level) 72 | 73 | api_key_filter = APIKeySanitisingFilter() 74 | console_handler.addFilter(api_key_filter) 75 | 76 | root_logger.addHandler(console_handler) 77 | 78 | logging.getLogger("uvicorn").propagate = False 79 | 80 | return root_logger 81 | 82 | 83 | def get_logger(name: str) -> logging.Logger: 84 | """ 85 | Get a logger for a module. 86 | 87 | Args: 88 | name: Module name (usually __name__) 89 | 90 | Returns: 91 | Logger instance 92 | """ 93 | return logging.getLogger(name) 94 | ``` -------------------------------------------------------------------------------- /src/prompt_templates/prompt_templates.py: -------------------------------------------------------------------------------- ```python 1 | PROMPT_TEMPLATES = { 2 | "usrn_breakdown": ( 3 | "Break down USRN {usrn} into its component road links for routing analysis. " 4 | "Step 1: GET /collections/trn-ntwk-street-1/items?filter=usrn='{usrn}' to get street details. " 5 | "Step 2: Use Street geometry bbox to query Road Links: GET /collections/trn-ntwk-roadlink-4/items?bbox=[street_bbox] " 6 | "Step 3: Filter Road Links by Street reference using properties.street_ref matching street feature ID. " 7 | "Step 4: For each Road Link: GET /collections/trn-ntwk-roadnode-1/items?filter=roadlink_ref='roadlink_id' " 8 | "Step 5: Set crs=EPSG:27700 for British National Grid coordinates. 
" 9 | "Return: Complete breakdown of USRN into constituent Road Links with node connections." 10 | ), 11 | "restriction_matching_analysis": ( 12 | "Perform comprehensive traffic restriction matching for road network in bbox {bbox} with SPECIFIC STREET IDENTIFICATION. " 13 | "Step 1: Build routing network: get_routing_data(bbox='{bbox}', limit={limit}, build_network=True) " 14 | "Step 2: Extract restriction data from build_status.restrictions array. " 15 | "Step 3: For each restriction, match to road links using restrictionnetworkreference: " 16 | " - networkreferenceid = Road Link UUID from trn-ntwk-roadlink-4 " 17 | " - roadlinkdirection = 'In Direction' (with geometry) or 'In Opposite Direction' (against geometry) " 18 | " - roadlinksequence = order for multi-link restrictions (turns) " 19 | "Step 4: IMMEDIATELY lookup street names: Use get_bulk_features(collection_id='trn-ntwk-roadlink-4', identifiers=[road_link_uuids]) to resolve UUIDs to actual street names (name1_text, roadclassification, roadclassificationnumber) " 20 | "Step 5: Analyze restriction types by actual street names: " 21 | " - One Way: Apply directional constraint to specific named road " 22 | " - Turn Restriction: Block movement between named streets (from street A to street B) " 23 | " - Vehicle Restrictions: Apply dimension/weight limits to specific named roads " 24 | " - Access Restrictions: Apply vehicle type constraints to specific named streets " 25 | "Step 6: Check exemptions array (e.g., 'Pedal Cycles' exempt from one-way) for each named street " 26 | "Step 7: Group results by street names and road classifications (A Roads, B Roads, Local Roads) " 27 | "Return: Complete restriction mapping showing ACTUAL STREET NAMES with their specific restrictions, directions, exemptions, and road classifications. Present results as 'Street Name (Road Class)' rather than UUIDs." 
28 | ), 29 | } 30 | ``` -------------------------------------------------------------------------------- /src/mcp_service/resources.py: -------------------------------------------------------------------------------- ```python 1 | """MCP Resources for OS NGD documentation""" 2 | 3 | import json 4 | import time 5 | from models import NGDAPIEndpoint 6 | from utils.logging_config import get_logger 7 | 8 | logger = get_logger(__name__) 9 | 10 | 11 | # TODO: Do this for 12 | class OSDocumentationResources: 13 | """Handles registration of OS NGD documentation resources""" 14 | 15 | def __init__(self, mcp_service, api_client): 16 | self.mcp = mcp_service 17 | self.api_client = api_client 18 | 19 | def register_all(self) -> None: 20 | """Register all documentation resources""" 21 | self._register_transport_network_resources() 22 | # Future: self._register_land_resources() 23 | # Future: self._register_building_resources() 24 | 25 | def _register_transport_network_resources(self) -> None: 26 | """Register transport network documentation resources""" 27 | 28 | @self.mcp.resource("os-docs://street") 29 | async def street_docs() -> str: 30 | return await self._fetch_doc_resource( 31 | "street", NGDAPIEndpoint.MARKDOWN_STREET.value 32 | ) 33 | 34 | @self.mcp.resource("os-docs://road") 35 | async def road_docs() -> str: 36 | return await self._fetch_doc_resource( 37 | "road", NGDAPIEndpoint.MARKDOWN_ROAD.value 38 | ) 39 | 40 | @self.mcp.resource("os-docs://tram-on-road") 41 | async def tram_on_road_docs() -> str: 42 | return await self._fetch_doc_resource( 43 | "tram-on-road", NGDAPIEndpoint.TRAM_ON_ROAD.value 44 | ) 45 | 46 | @self.mcp.resource("os-docs://road-node") 47 | async def road_node_docs() -> str: 48 | return await self._fetch_doc_resource( 49 | "road-node", NGDAPIEndpoint.ROAD_NODE.value 50 | ) 51 | 52 | @self.mcp.resource("os-docs://road-link") 53 | async def road_link_docs() -> str: 54 | return await self._fetch_doc_resource( 55 | "road-link", 
NGDAPIEndpoint.ROAD_LINK.value 56 | ) 57 | 58 | @self.mcp.resource("os-docs://road-junction") 59 | async def road_junction_docs() -> str: 60 | return await self._fetch_doc_resource( 61 | "road-junction", NGDAPIEndpoint.ROAD_JUNCTION.value 62 | ) 63 | 64 | async def _fetch_doc_resource(self, feature_type: str, url: str) -> str: 65 | """Generic method to fetch documentation resources""" 66 | try: 67 | content = await self.api_client.make_request_no_auth(url) 68 | 69 | return json.dumps( 70 | { 71 | "feature_type": feature_type, 72 | "content": content, 73 | "content_type": "markdown", 74 | "source_url": url, 75 | "timestamp": time.time(), 76 | } 77 | ) 78 | 79 | except Exception as e: 80 | logger.error(f"Error fetching {feature_type} documentation: {e}") 81 | return json.dumps({"error": str(e), "feature_type": feature_type}) 82 | ``` -------------------------------------------------------------------------------- /src/mcp_service/prompts.py: -------------------------------------------------------------------------------- ```python 1 | from typing import List 2 | from mcp.types import PromptMessage, TextContent 3 | from prompt_templates.prompt_templates import PROMPT_TEMPLATES 4 | from utils.logging_config import get_logger 5 | 6 | logger = get_logger(__name__) 7 | 8 | 9 | class OSWorkflowPrompts: 10 | """Handles registration of OS NGD workflow prompts""" 11 | 12 | def __init__(self, mcp_service): 13 | self.mcp = mcp_service 14 | 15 | def register_all(self) -> None: 16 | """Register all workflow prompts""" 17 | self._register_analysis_prompts() 18 | self._register_general_prompts() 19 | 20 | def _register_analysis_prompts(self) -> None: 21 | """Register analysis workflow prompts""" 22 | 23 | @self.mcp.prompt() 24 | def usrn_breakdown_analysis(usrn: str) -> List[PromptMessage]: 25 | """Generate a step-by-step USRN breakdown workflow""" 26 | template = PROMPT_TEMPLATES["usrn_breakdown"].format(usrn=usrn) 27 | 28 | return [ 29 | PromptMessage( 30 | role="user", 31 | 
content=TextContent( 32 | type="text", 33 | text=f"As an expert in OS NGD API workflows and transport network analysis, {template}", 34 | ), 35 | ) 36 | ] 37 | 38 | def _register_general_prompts(self) -> None: 39 | """Register general OS NGD guidance prompts""" 40 | 41 | @self.mcp.prompt() 42 | def collection_query_guidance( 43 | collection_id: str, query_type: str = "features" 44 | ) -> List[PromptMessage]: 45 | """Generate guidance for querying OS NGD collections""" 46 | return [ 47 | PromptMessage( 48 | role="user", 49 | content=TextContent( 50 | type="text", 51 | text=f"As an OS NGD API expert, guide me through querying the '{collection_id}' collection for {query_type}. " 52 | f"Include: 1) Available filters, 2) Best practices for bbox queries, " 53 | f"3) CRS considerations, 4) Example queries with proper syntax.", 54 | ), 55 | ) 56 | ] 57 | 58 | @self.mcp.prompt() 59 | def workflow_planning( 60 | user_request: str, data_theme: str = "transport" 61 | ) -> List[PromptMessage]: 62 | """Generate a workflow plan for complex OS NGD queries""" 63 | return [ 64 | PromptMessage( 65 | role="user", 66 | content=TextContent( 67 | type="text", 68 | text=f"As a geospatial workflow planner, create a detailed workflow plan for: '{user_request}'. " 69 | f"Focus on {data_theme} theme data. 
Include: " 70 | f"1) Collection selection rationale, " 71 | f"2) Query sequence with dependencies, " 72 | f"3) Filter strategies, " 73 | f"4) Error handling considerations.", 74 | ), 75 | ) 76 | ] 77 | ``` -------------------------------------------------------------------------------- /src/config_docs/ogcapi-features-2.yaml: -------------------------------------------------------------------------------- ```yaml 1 | openapi: 3.0.3 2 | info: 3 | title: "Building Blocks specified in OGC API - Features - Part 2: Coordinate Reference Systems by Reference" 4 | description: |- 5 | Common components used in the 6 | [OGC standard "OGC API - Features - Part 2: Coordinate Reference Systems by Reference"](http://docs.opengeospatial.org/is/18-058/18-058.html). 7 | 8 | OGC API - Features - Part 2: Coordinate Reference Systems by Reference 1.0 is an OGC Standard. 9 | Copyright (c) 2020 Open Geospatial Consortium. 10 | To obtain additional rights of use, visit http://www.opengeospatial.org/legal/ . 11 | 12 | This document is also available on 13 | [OGC](http://schemas.opengis.net/ogcapi/features/part2/1.0/openapi/ogcapi-features-2.yaml). 
14 | version: '1.0.0' 15 | contact: 16 | name: Clemens Portele 17 | email: [email protected] 18 | license: 19 | name: OGC License 20 | url: 'http://www.opengeospatial.org/legal/' 21 | components: 22 | parameters: 23 | bbox-crs: 24 | name: bbox-crs 25 | in: query 26 | required: false 27 | schema: 28 | type: string 29 | format: uri 30 | style: form 31 | explode: false 32 | crs: 33 | name: crs 34 | in: query 35 | required: false 36 | schema: 37 | type: string 38 | format: uri 39 | style: form 40 | explode: false 41 | schemas: 42 | collection: 43 | allOf: 44 | - $ref: http://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/collection.yaml 45 | - $ref: '#/components/schemas/collectionExtensionCrs' 46 | collectionExtensionCrs: 47 | type: object 48 | properties: 49 | storageCrs: 50 | description: the CRS identifier, from the list of supported CRS identifiers, that may be used to retrieve features from a collection without the need to apply a CRS transformation 51 | type: string 52 | format: uri 53 | storageCrsCoordinateEpoch: 54 | description: point in time at which coordinates in the spatial feature collection are referenced to the dynamic coordinate reference system in `storageCrs`, that may be used to retrieve features from a collection without the need to apply a change of coordinate epoch. 
It is expressed as a decimal year in the Gregorian calendar 55 | type: number 56 | example: '2017-03-25 in the Gregorian calendar is epoch 2017.23' 57 | collections: 58 | allOf: 59 | - $ref: http://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/collections.yaml 60 | - $ref: '#/components/schemas/collectionsExtensionCrs' 61 | collectionsExtensionCrs: 62 | type: object 63 | properties: 64 | crs: 65 | description: a global list of CRS identifiers that are supported by spatial feature collections offered by the service 66 | type: array 67 | items: 68 | type: string 69 | format: uri 70 | headers: 71 | Content-Crs: 72 | description: a URI, in angular brackets, identifying the coordinate reference system used in the content / payload 73 | schema: 74 | type: string 75 | example: '<http://www.opengis.net/def/crs/EPSG/0/3395>' 76 | ``` -------------------------------------------------------------------------------- /src/mcp_service/protocols.py: -------------------------------------------------------------------------------- ```python 1 | from typing import Protocol, Optional, Callable, List, runtime_checkable, Any 2 | 3 | 4 | @runtime_checkable 5 | class MCPService(Protocol): 6 | """Protocol for MCP services""" 7 | 8 | def tool(self) -> Callable[..., Any]: 9 | """Register a function as an MCP tool""" 10 | ... 11 | 12 | def resource( 13 | self, 14 | uri: str, 15 | *, 16 | name: str | None = None, 17 | title: str | None = None, 18 | description: str | None = None, 19 | mime_type: str | None = None, 20 | ) -> Callable[[Any], Any]: 21 | """Register a function as an MCP resource""" 22 | ... 23 | 24 | def run(self) -> None: 25 | """Run the MCP service""" 26 | ... 27 | 28 | 29 | @runtime_checkable 30 | class FeatureService(Protocol): 31 | """Protocol for OS NGD feature services""" 32 | 33 | def hello_world(self, name: str) -> str: 34 | """Test connection to the service""" 35 | ... 
36 | 37 | def check_api_key(self) -> str: 38 | """Check if API key is available""" 39 | ... 40 | 41 | async def list_collections(self) -> str: 42 | """List all available feature collections""" 43 | ... 44 | 45 | async def get_single_collection(self, collection_id: str) -> str: 46 | """Get detailed information about a specific collection""" 47 | ... 48 | 49 | async def get_single_collection_queryables(self, collection_id: str) -> str: 50 | """Get queryable properties for a collection""" 51 | ... 52 | 53 | # TODO: Need to make sure the full list of parameters is supported 54 | # TODO: Supporting cql-text is clunky and need to figure out how to support this better 55 | async def search_features( 56 | self, 57 | collection_id: str, 58 | bbox: Optional[str] = None, 59 | crs: Optional[str] = None, 60 | limit: int = 10, 61 | offset: int = 0, 62 | filter: Optional[str] = None, 63 | filter_lang: Optional[str] = "cql-text", 64 | query_attr: Optional[str] = None, 65 | query_attr_value: Optional[str] = None, 66 | ) -> str: 67 | """Search for features in a collection with full CQL filter support""" 68 | ... 69 | 70 | async def get_feature( 71 | self, collection_id: str, feature_id: str, crs: Optional[str] = None 72 | ) -> str: 73 | """Get a specific feature by ID""" 74 | ... 75 | 76 | async def get_linked_identifiers( 77 | self, identifier_type: str, identifier: str, feature_type: Optional[str] = None 78 | ) -> str: 79 | """Get linked identifiers for a specified identifier""" 80 | ... 81 | 82 | async def get_bulk_features( 83 | self, 84 | collection_id: str, 85 | identifiers: List[str], 86 | query_by_attr: Optional[str] = None, 87 | ) -> str: 88 | """Get multiple features in a single call""" 89 | ... 90 | 91 | async def get_bulk_linked_features( 92 | self, 93 | identifier_type: str, 94 | identifiers: List[str], 95 | feature_type: Optional[str] = None, 96 | ) -> str: 97 | """Get linked features for multiple identifiers""" 98 | ... 
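
    # Illustrative call sequence (comment-only sketch; concrete class names are
    # assumptions): a typical two-step workflow lists collections first, then
    # searches a chosen collection with a bbox and CQL filter.
    #
    #     service: FeatureService = OSFeatureService(...)  # hypothetical impl
    #     await service.list_collections()
    #     await service.search_features(
    #         collection_id="trn-ntwk-roadlink-4",
    #         bbox="-1.57,53.79,-1.54,53.81",
    #         filter="roadclassification = 'A Road'",
    #         filter_lang="cql-text",
    #     )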
99 | 100 | async def fetch_detailed_collections(self, collection_ids: List[str]) -> str: 101 | """Get detailed information about specific collections for workflow planning""" 102 | ... 103 | ``` -------------------------------------------------------------------------------- /src/middleware/stdio_middleware.py: -------------------------------------------------------------------------------- ```python 1 | import json 2 | import time 3 | import asyncio 4 | from collections import deque 5 | from typing import Callable, TypeVar, Any, Union, cast 6 | from functools import wraps 7 | from utils.logging_config import get_logger 8 | 9 | logger = get_logger(__name__) 10 | F = TypeVar("F", bound=Callable[..., Any]) 11 | 12 | 13 | class StdioRateLimiter: 14 | """STDIO-specific rate limiting""" 15 | 16 | def __init__(self, requests_per_minute: int = 10, window_seconds: int = 60): 17 | self.requests_per_minute = requests_per_minute 18 | self.window_seconds = window_seconds 19 | self.request_timestamps = deque() 20 | 21 | def check_rate_limit(self) -> bool: 22 | """Check rate limit for STDIO client""" 23 | current_time = time.time() 24 | 25 | while ( 26 | self.request_timestamps 27 | and current_time - self.request_timestamps[0] >= self.window_seconds 28 | ): 29 | self.request_timestamps.popleft() 30 | 31 | if len(self.request_timestamps) >= self.requests_per_minute: 32 | logger.warning("STDIO rate limit exceeded") 33 | return False 34 | 35 | self.request_timestamps.append(current_time) 36 | return True 37 | 38 | 39 | class StdioMiddleware: 40 | """STDIO authentication and rate limiting""" 41 | 42 | def __init__(self, requests_per_minute: int = 20): 43 | self.authenticated = False 44 | self.client_id = "anonymous" 45 | self.rate_limiter = StdioRateLimiter(requests_per_minute=requests_per_minute) 46 | 47 | def require_auth_and_rate_limit(self, func: F) -> F: 48 | """Decorator for auth and rate limiting""" 49 | 50 | @wraps(func) 51 | async def async_wrapper(*args: Any, **kwargs: 
Any) -> Union[str, Any]: 52 | if not self.authenticated: 53 | logger.error( 54 | json.dumps({"error": "Authentication required", "code": 401}) 55 | ) 56 | return json.dumps({"error": "Authentication required", "code": 401}) 57 | 58 | if not self.rate_limiter.check_rate_limit(): 59 | logger.error(json.dumps({"error": "Rate limited", "code": 429})) 60 | return json.dumps({"error": "Rate limited", "code": 429}) 61 | 62 | return await func(*args, **kwargs) 63 | 64 | @wraps(func) 65 | def sync_wrapper(*args: Any, **kwargs: Any) -> Union[str, Any]: 66 | if not self.authenticated: 67 | logger.error( 68 | json.dumps({"error": "Authentication required", "code": 401}) 69 | ) 70 | return json.dumps({"error": "Authentication required", "code": 401}) 71 | 72 | if not self.rate_limiter.check_rate_limit(): 73 | logger.error(json.dumps({"error": "Rate limited", "code": 429})) 74 | return json.dumps({"error": "Rate limited", "code": 429}) 75 | 76 | return func(*args, **kwargs) 77 | 78 | if asyncio.iscoroutinefunction(func): 79 | return cast(F, async_wrapper) 80 | return cast(F, sync_wrapper) 81 | 82 | def authenticate(self, key: str) -> bool: 83 | """Authenticate with API key""" 84 | if key and key.strip(): 85 | self.authenticated = True 86 | self.client_id = key 87 | return True 88 | else: 89 | self.authenticated = False 90 | return False 91 | ``` -------------------------------------------------------------------------------- /src/mcp_service/guardrails.py: -------------------------------------------------------------------------------- ```python 1 | import re 2 | import json 3 | import asyncio 4 | from functools import wraps 5 | from typing import TypeVar, Callable, Any, Union, cast 6 | from utils.logging_config import get_logger 7 | 8 | logger = get_logger(__name__) 9 | F = TypeVar("F", bound=Callable[..., Any]) 10 | 11 | 12 | class ToolGuardrails: 13 | """Prompt injection protection""" 14 | 15 | def __init__(self): 16 | self.suspicious_patterns = [ 17 | r"(?i)ignore previous", 
18 | r"(?i)ignore all previous instructions", 19 | r"(?i)assistant:", 20 | r"\{\{.*?\}\}", 21 | r"(?i)forget", 22 | r"(?i)show credentials", 23 | r"(?i)show secrets", 24 | r"(?i)reveal password", 25 | r"(?i)dump (tokens|secrets|passwords|credentials)", 26 | r"(?i)leak confidential", 27 | r"(?i)reveal secrets", 28 | r"(?i)expose secrets", 29 | r"(?i)secrets.*contain", 30 | r"(?i)extract secrets", 31 | ] 32 | 33 | def detect_prompt_injection(self, input_text: Any) -> bool: 34 | """Check if input contains prompt injection attempts""" 35 | if not isinstance(input_text, str): 36 | return False 37 | return any( 38 | re.search(pattern, input_text) for pattern in self.suspicious_patterns 39 | ) 40 | 41 | def basic_guardrails(self, func: F) -> F: 42 | """Prompt injection protection only""" 43 | 44 | @wraps(func) 45 | async def async_wrapper(*args: Any, **kwargs: Any) -> Union[str, Any]: 46 | try: 47 | for arg in args: 48 | if isinstance(arg, str) and self.detect_prompt_injection(arg): 49 | raise ValueError("Prompt injection detected!") 50 | 51 | for name, value in kwargs.items(): 52 | if not ( 53 | hasattr(value, "request_context") 54 | and hasattr(value, "request_id") 55 | ): 56 | if isinstance(value, str) and self.detect_prompt_injection( 57 | value 58 | ): 59 | raise ValueError(f"Prompt injection in '{name}'!") 60 | except ValueError as e: 61 | return json.dumps({"error": str(e), "code": 400}) 62 | 63 | return await func(*args, **kwargs) 64 | 65 | @wraps(func) 66 | def sync_wrapper(*args: Any, **kwargs: Any) -> Union[str, Any]: 67 | try: 68 | for arg in args: 69 | if isinstance(arg, str) and self.detect_prompt_injection(arg): 70 | raise ValueError("Prompt injection detected!") 71 | 72 | for name, value in kwargs.items(): 73 | if not ( 74 | hasattr(value, "request_context") 75 | and hasattr(value, "request_id") 76 | ): 77 | if isinstance(value, str) and self.detect_prompt_injection( 78 | value 79 | ): 80 | raise ValueError(f"Prompt injection in '{name}'!") 81 | except 
ValueError as e: 82 | return json.dumps({"error": str(e), "code": 400}) 83 | 84 | return func(*args, **kwargs) 85 | 86 | if asyncio.iscoroutinefunction(func): 87 | return cast(F, async_wrapper) 88 | return cast(F, sync_wrapper) 89 | ``` -------------------------------------------------------------------------------- /src/models.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | Types for the OS NGD API MCP Server 3 | """ 4 | 5 | from enum import Enum 6 | from pydantic import BaseModel 7 | from typing import Any, List, Dict 8 | 9 | 10 | class NGDAPIEndpoint(Enum): 11 | """ 12 | Enum for the OS API Endpoints following OGC API Features standard 13 | """ 14 | 15 | # NGD Features Endpoints 16 | NGD_FEATURES_BASE_PATH = "https://api.os.uk/features/ngd/ofa/v1/{}" 17 | COLLECTIONS = NGD_FEATURES_BASE_PATH.format("collections") 18 | COLLECTION_INFO = NGD_FEATURES_BASE_PATH.format("collections/{}") 19 | COLLECTION_SCHEMA = NGD_FEATURES_BASE_PATH.format("collections/{}/schema") 20 | COLLECTION_FEATURES = NGD_FEATURES_BASE_PATH.format("collections/{}/items") 21 | COLLECTION_FEATURE_BY_ID = NGD_FEATURES_BASE_PATH.format("collections/{}/items/{}") 22 | COLLECTION_QUERYABLES = NGD_FEATURES_BASE_PATH.format("collections/{}/queryables") 23 | 24 | # OpenAPI Specification Endpoint 25 | OPENAPI_SPEC = NGD_FEATURES_BASE_PATH.format("api") 26 | 27 | # Linked Identifiers Endpoints 28 | LINKED_IDENTIFIERS_BASE_PATH = "https://api.os.uk/search/links/v1/{}" 29 | LINKED_IDENTIFIERS = LINKED_IDENTIFIERS_BASE_PATH.format("identifierTypes/{}/{}") 30 | 31 | # Markdown Resources 32 | MARKDOWN_BASE_PATH = "https://docs.os.uk/osngd/data-structure/{}" 33 | MARKDOWN_STREET = MARKDOWN_BASE_PATH.format("transport/transport-network/street.md") 34 | MARKDOWN_ROAD = MARKDOWN_BASE_PATH.format("transport/transport-network/road.md") 35 | TRAM_ON_ROAD = MARKDOWN_BASE_PATH.format( 36 | "transport/transport-network/tram-on-road.md" 37 | ) 38 | ROAD_NODE = 
MARKDOWN_BASE_PATH.format("transport/transport-network/road-node.md") 39 | ROAD_LINK = MARKDOWN_BASE_PATH.format("transport/transport-network/road-link.md") 40 | ROAD_JUNCTION = MARKDOWN_BASE_PATH.format( 41 | "transport/transport-network/road-junction.md" 42 | ) 43 | 44 | # Places API Endpoints 45 | # TODO: Add these back in when I get access to the Places API from OS 46 | # PLACES_BASE_PATH = "https://api.os.uk/search/places/v1/{}" 47 | # PLACES_UPRN = PLACES_BASE_PATH.format("uprn") 48 | # POST_CODE = PLACES_BASE_PATH.format("postcode") 49 | 50 | 51 | class OpenAPISpecification(BaseModel): 52 | """Parsed OpenAPI specification optimized for LLM context""" 53 | 54 | title: str 55 | version: str 56 | base_url: str 57 | endpoints: Dict[str, str] 58 | collection_ids: List[str] 59 | supported_crs: Dict[str, Any] 60 | crs_guide: Dict[str, str] 61 | 62 | 63 | class WorkflowStep(BaseModel): 64 | """A single step in a workflow plan""" 65 | 66 | step_number: int 67 | description: str 68 | api_endpoint: str 69 | parameters: Dict[str, Any] 70 | dependencies: List[int] = [] 71 | 72 | 73 | class WorkflowPlan(BaseModel): 74 | """Generated workflow plan for user requests""" 75 | 76 | user_request: str 77 | steps: List[WorkflowStep] 78 | reasoning: str 79 | estimated_complexity: str = "simple" 80 | 81 | 82 | class Collection(BaseModel): 83 | """Represents a feature collection from the OS NGD API""" 84 | 85 | id: str 86 | title: str 87 | description: str = "" 88 | links: List[Dict[str, Any]] = [] 89 | extent: Dict[str, Any] = {} 90 | itemType: str = "feature" 91 | 92 | 93 | class CollectionsCache(BaseModel): 94 | """Cached collections data with filtering applied""" 95 | 96 | collections: List[Collection] 97 | raw_response: Dict[str, Any] 98 | 99 | 100 | class CollectionQueryables(BaseModel): 101 | """Queryables information for a collection""" 102 | 103 | id: str 104 | title: str 105 | description: str 106 | all_queryables: Dict[str, Any] 107 | enum_queryables: Dict[str, Any] 108 | 
has_enum_filters: bool 109 | total_queryables: int 110 | enum_count: int 111 | 112 | 113 | class WorkflowContextCache(BaseModel): 114 | """Cached workflow context data""" 115 | 116 | collections_info: Dict[str, CollectionQueryables] 117 | openapi_spec: OpenAPISpecification 118 | cached_at: float 119 | ``` -------------------------------------------------------------------------------- /src/server.py: -------------------------------------------------------------------------------- ```python 1 | import argparse 2 | import os 3 | import uvicorn 4 | from typing import Any 5 | from utils.logging_config import configure_logging 6 | 7 | from api_service.os_api import OSAPIClient 8 | from mcp_service.os_service import OSDataHubService 9 | from mcp.server.fastmcp import FastMCP 10 | from middleware.stdio_middleware import StdioMiddleware 11 | from middleware.http_middleware import HTTPMiddleware 12 | from starlette.middleware import Middleware 13 | from starlette.middleware.cors import CORSMiddleware 14 | from starlette.routing import Route 15 | from starlette.responses import JSONResponse 16 | 17 | logger = configure_logging() 18 | 19 | 20 | def main(): 21 | """Main entry point""" 22 | parser = argparse.ArgumentParser(description="OS DataHub API MCP Server") 23 | parser.add_argument( 24 | "--transport", 25 | choices=["stdio", "streamable-http"], 26 | default="stdio", 27 | help="Transport protocol to use (stdio or streamable-http)", 28 | ) 29 | parser.add_argument( 30 | "--host", default="0.0.0.0", help="Host to bind to (default: 0.0.0.0)" 31 | ) 32 | parser.add_argument( 33 | "--port", type=int, default=8000, help="Port to bind to (default: 8000)" 34 | ) 35 | parser.add_argument("--debug", action="store_true", help="Enable debug mode") 36 | args = parser.parse_args() 37 | 38 | configure_logging(debug=args.debug) 39 | 40 | logger.info( 41 | f"OS DataHub API MCP Server starting with {args.transport} transport..." 
42 | ) 43 | 44 | api_client = OSAPIClient() 45 | 46 | match args.transport: 47 | case "stdio": 48 | logger.info("Starting with stdio transport") 49 | 50 | mcp = FastMCP( 51 | "os-ngd-api", 52 | debug=args.debug, 53 | log_level="DEBUG" if args.debug else "INFO", 54 | ) 55 | 56 | stdio_auth = StdioMiddleware() 57 | 58 | service = OSDataHubService(api_client, mcp, stdio_middleware=stdio_auth) 59 | 60 | stdio_api_key = os.environ.get("STDIO_KEY") 61 | if not stdio_api_key or not stdio_auth.authenticate(stdio_api_key): 62 | logger.error("Authentication failed") 63 | return 64 | 65 | service.run() 66 | 67 | case "streamable-http": 68 | logger.info(f"Starting Streamable HTTP server on {args.host}:{args.port}") 69 | 70 | mcp = FastMCP( 71 | "os-ngd-api", 72 | host=args.host, 73 | port=args.port, 74 | debug=args.debug, 75 | json_response=True, 76 | stateless_http=False, 77 | log_level="DEBUG" if args.debug else "INFO", 78 | ) 79 | 80 | OSDataHubService(api_client, mcp) 81 | 82 | async def auth_discovery(_: Any) -> JSONResponse: 83 | """Return authentication methods.""" 84 | return JSONResponse( 85 | content={"authMethods": [{"type": "http", "scheme": "bearer"}]} 86 | ) 87 | 88 | app = mcp.streamable_http_app() 89 | 90 | app.routes.append( 91 | Route( 92 | "/.well-known/mcp-auth", 93 | endpoint=auth_discovery, 94 | methods=["GET"], 95 | ) 96 | ) 97 | 98 | app.user_middleware.extend( 99 | [ 100 | Middleware( 101 | CORSMiddleware, 102 | allow_origins=["*"], 103 | allow_credentials=True, 104 | allow_methods=["GET", "POST", "OPTIONS"], 105 | allow_headers=["*"], 106 | expose_headers=["*"], 107 | ), 108 | Middleware(HTTPMiddleware), 109 | ] 110 | ) 111 | 112 | uvicorn.run( 113 | app, 114 | host=args.host, 115 | port=args.port, 116 | log_level="debug" if args.debug else "info", 117 | ) 118 | 119 | case _: 120 | logger.error(f"Unknown transport: {args.transport}") 121 | return 122 | 123 | 124 | if __name__ == "__main__": 125 | main() 126 | ``` 
--------------------------------------------------------------------------------
/src/config_docs/ogcapi-features-3.yaml:
--------------------------------------------------------------------------------

```yaml
1 | openapi: 3.0.3
2 | info:
3 |   title: "Building Blocks specified in OGC API - Features - Part 3: Filtering"
4 |   description: |-
5 |     Common components used in the
6 |     [OGC standard "OGC API - Features - Part 3: Filtering"](https://docs.ogc.org/is/19-079r2/19-079r2.html).
7 | 
8 |     OGC API - Features - Part 3: Filtering 1.0 is an OGC Standard.
9 |     Copyright (c) 2024 Open Geospatial Consortium.
10 |     To obtain additional rights of use, visit https://www.ogc.org/legal/ .
11 | 
12 |     This document is also available in the
13 |     [OGC Schema Repository](https://schemas.opengis.net/ogcapi/features/part3/1.0/openapi/ogcapi-features-3.yaml).
14 |   version: '1.0.0'
15 |   contact:
16 |     name: Clemens Portele
17 |     email: [email protected]
18 |   license:
19 |     name: OGC License
20 |     url: 'https://www.ogc.org/legal/'
21 | paths:
22 |   /collections/{collectionId}/queryables:
23 |     get:
24 |       summary: Get the list of supported queryables for a collection
25 |       description: |-
26 |         This operation returns the list of supported queryables of a collection. Queryables are
27 |         the properties that may be used to construct a filter expression on items in the collection.
28 |         The response is a JSON Schema of an object where each property is a queryable.
29 |       operationId: getQueryables
30 |       parameters:
31 |         - $ref: 'https://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/parameters/collectionId.yaml'
32 |       responses:
33 |         '200':
34 |           description: The queryable properties of the collection.
35 |           content:
36 |             application/schema+json:
37 |               schema:
38 |                 type: object
39 |   /functions:
40 |     get:
41 |       summary: Get the list of supported functions
42 |       description: |-
43 |         This operation returns the list of custom functions supported in CQL2 expressions.
44 | operationId: getFunctions 45 | responses: 46 | '200': 47 | description: The list of custom functions supported in CQL2 expressions. 48 | content: 49 | application/json: 50 | schema: 51 | $ref: '#/components/schemas/functions' 52 | components: 53 | parameters: 54 | filter: 55 | name: filter 56 | in: query 57 | required: false 58 | schema: 59 | type: string 60 | style: form 61 | explode: false 62 | filter-lang: 63 | name: filter-lang 64 | in: query 65 | required: false 66 | schema: 67 | type: string 68 | enum: 69 | - 'cql2-text' 70 | - 'cql2-json' 71 | default: 'cql2-text' 72 | style: form 73 | filter-crs: 74 | name: filter-crs 75 | in: query 76 | required: false 77 | schema: 78 | type: string 79 | format: uri-reference 80 | style: form 81 | explode: false 82 | schemas: 83 | functions: 84 | type: object 85 | required: 86 | - functions 87 | properties: 88 | functions: 89 | type: array 90 | items: 91 | type: object 92 | required: 93 | - name 94 | - returns 95 | properties: 96 | name: 97 | type: string 98 | description: 99 | type: string 100 | metadataUrl: 101 | type: string 102 | format: uri-reference 103 | arguments: 104 | type: array 105 | items: 106 | type: object 107 | required: 108 | - type 109 | properties: 110 | title: 111 | type: string 112 | description: 113 | type: string 114 | type: 115 | type: array 116 | items: 117 | type: string 118 | enum: 119 | - string 120 | - number 121 | - integer 122 | - datetime 123 | - geometry 124 | - boolean 125 | returns: 126 | type: array 127 | items: 128 | type: string 129 | enum: 130 | - string 131 | - number 132 | - integer 133 | - datetime 134 | - geometry 135 | - boolean 136 | ``` -------------------------------------------------------------------------------- /src/middleware/http_middleware.py: -------------------------------------------------------------------------------- ```python 1 | import os 2 | import time 3 | from collections import defaultdict, deque 4 | from typing import List, Callable, Awaitable 5 | from 
starlette.middleware.base import BaseHTTPMiddleware 6 | from starlette.requests import Request 7 | from starlette.responses import Response, JSONResponse 8 | from utils.logging_config import get_logger 9 | 10 | logger = get_logger(__name__) 11 | 12 | 13 | class RateLimiter: 14 | """HTTP-layer rate limiting""" 15 | 16 | def __init__(self, requests_per_minute: int = 10, window_seconds: int = 60): 17 | self.requests_per_minute = requests_per_minute 18 | self.window_seconds = window_seconds 19 | self.request_timestamps = defaultdict(lambda: deque()) 20 | 21 | def check_rate_limit(self, client_id: str) -> bool: 22 | """Check if client has exceeded rate limit""" 23 | current_time = time.time() 24 | timestamps = self.request_timestamps[client_id] 25 | 26 | while timestamps and current_time - timestamps[0] >= self.window_seconds: 27 | timestamps.popleft() 28 | 29 | if len(timestamps) >= self.requests_per_minute: 30 | logger.warning(f"HTTP rate limit exceeded for client {client_id}") 31 | return False 32 | 33 | timestamps.append(current_time) 34 | return True 35 | 36 | 37 | def get_valid_bearer_tokens() -> List[str]: 38 | """Get valid bearer tokens from environment variable.""" 39 | try: 40 | tokens = os.environ.get("BEARER_TOKENS", "").split(",") 41 | valid_tokens = [t.strip() for t in tokens if t.strip()] 42 | 43 | if not valid_tokens: 44 | logger.warning( 45 | "No BEARER_TOKENS configured, all authentication will be rejected" 46 | ) 47 | return [] 48 | 49 | return valid_tokens 50 | except Exception as e: 51 | logger.error(f"Error getting valid tokens: {e}") 52 | return [] 53 | 54 | 55 | async def verify_bearer_token(token: str) -> bool: 56 | """Verify bearer token is valid.""" 57 | try: 58 | valid_tokens = get_valid_bearer_tokens() 59 | if not valid_tokens or not token: 60 | return False 61 | return token in valid_tokens 62 | except Exception as e: 63 | logger.error(f"Error validating token: {e}") 64 | return False 65 | 66 | 67 | class HTTPMiddleware(BaseHTTPMiddleware): 
68 | def __init__(self, app, requests_per_minute: int = 10): 69 | super().__init__(app) 70 | self.rate_limiter = RateLimiter(requests_per_minute=requests_per_minute) 71 | 72 | async def dispatch( 73 | self, request: Request, call_next: Callable[[Request], Awaitable[Response]] 74 | ) -> Response: 75 | if request.url.path == "/.well-known/mcp-auth" or request.method == "OPTIONS": 76 | return await call_next(request) 77 | 78 | session_id = request.headers.get("mcp-session-id") 79 | if not session_id: 80 | client_ip = request.client.host if request.client else "unknown" 81 | session_id = f"ip-{client_ip}" 82 | 83 | if not self.rate_limiter.check_rate_limit(session_id): 84 | return JSONResponse( 85 | status_code=429, 86 | content={"detail": "Too many requests. Please try again later."}, 87 | headers={"Retry-After": "60"}, 88 | ) 89 | 90 | origin = request.headers.get("origin", "") 91 | if origin and not self._is_valid_origin(origin, request): 92 | client_ip = request.client.host if request.client else "unknown" 93 | logger.warning( 94 | f"Blocked request with suspicious origin from {client_ip}, Origin: {origin}" 95 | ) 96 | return JSONResponse(status_code=403, content={"detail": "Invalid origin"}) 97 | 98 | user_agent = request.headers.get("user-agent", "") 99 | if self._is_browser_plugin(user_agent, request): 100 | client_ip = request.client.host if request.client else "unknown" 101 | logger.warning( 102 | f"Blocked browser plugin access from {client_ip}, User-Agent: {user_agent}" 103 | ) 104 | return JSONResponse( 105 | status_code=403, 106 | content={"detail": "Browser plugin access is not allowed"}, 107 | ) 108 | 109 | auth_header = request.headers.get("Authorization") 110 | if auth_header and auth_header.startswith("Bearer "): 111 | token = auth_header.replace("Bearer ", "") 112 | if await verify_bearer_token(token): 113 | request.state.token = token 114 | return await call_next(request) 115 | else: 116 | logger.warning( 117 | f"Invalid bearer token attempt from 
{request.client.host if request.client else 'unknown'}" 118 | ) 119 | else: 120 | logger.warning( 121 | f"Missing or invalid Authorization header from {request.client.host if request.client else 'unknown'}" 122 | ) 123 | 124 | return JSONResponse( 125 | status_code=401, 126 | content={"detail": "Authentication required"}, 127 | headers={"WWW-Authenticate": "Bearer"}, 128 | ) 129 | 130 | def _is_valid_origin(self, origin: str, request: Request) -> bool: 131 | """Validate Origin header to prevent DNS rebinding attacks.""" 132 | valid_local_origins = ["http://localhost:", "http://127.0.0.1:"] 133 | valid_domains = os.environ.get("ALLOWED_ORIGINS", "").split(",") 134 | valid_domains = [d.strip() for d in valid_domains if d.strip()] 135 | 136 | for valid_origin in valid_local_origins: 137 | if origin.startswith(valid_origin): 138 | return True 139 | 140 | for domain in valid_domains: 141 | if ( 142 | origin == domain 143 | or origin.startswith(f"https://{domain}") 144 | or origin.startswith(f"http://{domain}") 145 | ): 146 | return True 147 | 148 | return False 149 | 150 | def _is_browser_plugin(self, user_agent: str, request: Request) -> bool: 151 | """Check if request is from a browser plugin.""" 152 | plugin_patterns = [ 153 | "Chrome-Extension", 154 | "Mozilla/5.0 (compatible; Extension)", 155 | "Browser-Extension", 156 | ] 157 | 158 | for pattern in plugin_patterns: 159 | if pattern.lower() in user_agent.lower(): 160 | return True 161 | 162 | origin = request.headers.get("origin", "") 163 | if origin and ( 164 | origin.startswith("chrome-extension://") 165 | or origin.startswith("moz-extension://") 166 | or origin.startswith("safari-extension://") 167 | ): 168 | return True 169 | 170 | return False 171 | ``` -------------------------------------------------------------------------------- /src/stdio_client_test.py: -------------------------------------------------------------------------------- ```python 1 | import asyncio 2 | import subprocess 3 | import sys 4 | 
import os 5 | import time 6 | from mcp import ClientSession 7 | from mcp.client.stdio import stdio_client 8 | from mcp.client.stdio import StdioServerParameters 9 | from mcp.types import TextContent 10 | 11 | 12 | def extract_text_from_result(result) -> str: 13 | """Safely extract text from MCP tool result""" 14 | if not result.content: 15 | return "No content" 16 | 17 | for item in result.content: 18 | if isinstance(item, TextContent): 19 | return item.text 20 | 21 | content_types = [type(item).__name__ for item in result.content] 22 | return f"Non-text content: {', '.join(content_types)}" 23 | 24 | 25 | async def test_stdio_rate_limiting(): 26 | """Test STDIO rate limiting - should block after 1 request per minute""" 27 | env = os.environ.copy() 28 | env["STDIO_KEY"] = "test-stdio-key" 29 | env["OS_API_KEY"] = os.environ.get("OS_API_KEY", "dummy-key-for-testing") 30 | env["PYTHONPATH"] = "src" 31 | 32 | print("Starting STDIO server subprocess...") 33 | 34 | server_process = subprocess.Popen( 35 | [sys.executable, "src/server.py", "--transport", "stdio", "--debug"], 36 | stdin=subprocess.PIPE, 37 | stdout=subprocess.PIPE, 38 | stderr=subprocess.PIPE, 39 | env=env, 40 | text=True, 41 | bufsize=0, 42 | ) 43 | 44 | try: 45 | print("Connecting to STDIO server...") 46 | 47 | server_params = StdioServerParameters( 48 | command=sys.executable, 49 | args=["src/server.py", "--transport", "stdio", "--debug"], 50 | env=env, 51 | ) 52 | async with stdio_client(server_params) as (read_stream, write_stream): 53 | async with ClientSession(read_stream, write_stream) as session: 54 | print("Initializing session...") 55 | await session.initialize() 56 | 57 | print("Listing available tools...") 58 | tools = await session.list_tools() 59 | print(f" Found {len(tools.tools)} tools") 60 | 61 | print("\nRAPID FIRE TEST (should hit rate limit)") 62 | print("-" * 50) 63 | 64 | results = [] 65 | for i in range(3): 66 | try: 67 | start_time = time.time() 68 | print(f"Request #{i + 1}: ", 
end="", flush=True) 69 | 70 | result = await session.call_tool( 71 | "hello_world", {"name": f"TestUser{i + 1}"} 72 | ) 73 | elapsed = time.time() - start_time 74 | 75 | result_text = extract_text_from_result(result) 76 | 77 | if "rate limit" in result_text.lower() or "429" in result_text: 78 | print(f"BLOCKED ({elapsed:.1f}s) - {result_text}") 79 | results.append(("BLOCKED", elapsed, result_text)) 80 | else: 81 | print(f"SUCCESS ({elapsed:.1f}s) - {result_text}") 82 | results.append(("SUCCESS", elapsed, result_text)) 83 | 84 | except Exception as e: 85 | elapsed = time.time() - start_time 86 | print(f"ERROR ({elapsed:.1f}s) - {e}") 87 | results.append(("ERROR", elapsed, str(e))) 88 | 89 | if i < 2: 90 | await asyncio.sleep(0.1) 91 | 92 | print("\nRESULTS ANALYSIS") 93 | print("-" * 50) 94 | 95 | success_count = sum(1 for r in results if r[0] == "SUCCESS") 96 | blocked_count = sum(1 for r in results if r[0] == "BLOCKED") 97 | error_count = sum(1 for r in results if r[0] == "ERROR") 98 | 99 | print(f"Successful requests: {success_count}") 100 | print(f"Rate limited requests: {blocked_count}") 101 | print(f"Error requests: {error_count}") 102 | 103 | print("\nVISUAL TIMELINE:") 104 | timeline = "" 105 | for i, (status, elapsed, _) in enumerate(results): 106 | if status == "SUCCESS": 107 | timeline += "OK" 108 | elif status == "BLOCKED": 109 | timeline += "XX" 110 | else: 111 | timeline += "ER" 112 | 113 | if i < len(results) - 1: 114 | timeline += "--" 115 | 116 | print(f" {timeline}") 117 | print(" Request: 1 2 3") 118 | 119 | print("\nTEST VERDICT") 120 | print("-" * 50) 121 | 122 | if success_count == 1 and blocked_count >= 1: 123 | print("PASS: Rate limiting works correctly!") 124 | print(" First request succeeded") 125 | print(" Subsequent requests blocked") 126 | elif success_count > 1: 127 | print("FAIL: Rate limiting too permissive!") 128 | print(f" {success_count} requests succeeded (expected 1)") 129 | else: 130 | print("FAIL: No requests succeeded!") 131 | 
print(" Check authentication or server issues") 132 | 133 | print("\nWAITING FOR RATE LIMIT RESET...") 134 | print(" (Testing if limit resets after window)") 135 | print(" Waiting 10 seconds...") 136 | 137 | for countdown in range(10, 0, -1): 138 | print(f" {countdown}s remaining...", end="\r", flush=True) 139 | await asyncio.sleep(1) 140 | 141 | print("\nTesting after rate limit window...") 142 | try: 143 | result = await session.call_tool( 144 | "hello_world", {"name": "AfterWait"} 145 | ) 146 | result_text = extract_text_from_result(result) 147 | 148 | if "rate limit" not in result_text.lower(): 149 | print("SUCCESS: Rate limit properly reset!") 150 | else: 151 | print("Rate limit still active (might need longer wait)") 152 | 153 | except Exception as e: 154 | print(f"Error after wait: {e}") 155 | 156 | except Exception as e: 157 | print(f"Test failed with error: {e}") 158 | 159 | finally: 160 | print("\nCleaning up server process...") 161 | server_process.terminate() 162 | try: 163 | await asyncio.wait_for( 164 | asyncio.create_task(asyncio.to_thread(server_process.wait)), timeout=5.0 165 | ) 166 | print("Server terminated gracefully") 167 | except asyncio.TimeoutError: 168 | print("Force killing server...") 169 | server_process.kill() 170 | server_process.wait() 171 | 172 | 173 | if __name__ == "__main__": 174 | print("STDIO Rate Limit Test Suite") 175 | print("Testing if STDIO middleware properly blocks > 1 request/minute") 176 | print() 177 | 178 | if not os.environ.get("OS_API_KEY"): 179 | print("Warning: OS_API_KEY not set, using dummy key for testing") 180 | print(" (This is OK for rate limit testing)") 181 | print() 182 | 183 | asyncio.run(test_stdio_rate_limiting()) 184 | ``` -------------------------------------------------------------------------------- /src/http_client_test.py: -------------------------------------------------------------------------------- ```python 1 | import asyncio 2 | import logging 3 | from mcp.client.streamable_http import 
streamablehttp_client 4 | from mcp import ClientSession 5 | from mcp.types import TextContent 6 | 7 | logging.basicConfig(level=logging.WARNING) 8 | logger = logging.getLogger(__name__) 9 | 10 | 11 | def extract_text_from_result(result) -> str: 12 | """Safely extract text from MCP tool result""" 13 | if not result.content: 14 | return "No content" 15 | 16 | for item in result.content: 17 | if isinstance(item, TextContent): 18 | return item.text 19 | 20 | content_types = [type(item).__name__ for item in result.content] 21 | return f"Non-text content: {', '.join(content_types)}" 22 | 23 | 24 | async def test_usrn_search(session_name: str, usrn_values: list): 25 | """Test USRN searches with different values""" 26 | headers = {"Authorization": "Bearer dev-token"} 27 | results = [] 28 | session_id = None 29 | 30 | print(f"\n{session_name} - Testing USRN searches") 31 | print("-" * 40) 32 | 33 | try: 34 | async with streamablehttp_client( 35 | "http://127.0.0.1:8000/mcp", headers=headers 36 | ) as ( 37 | read_stream, 38 | write_stream, 39 | get_session_id, 40 | ): 41 | async with ClientSession(read_stream, write_stream) as session: 42 | await session.initialize() 43 | session_id = get_session_id() 44 | print(f"{session_name} Session ID: {session_id}") 45 | 46 | for i, usrn in enumerate(usrn_values): 47 | try: 48 | result = await session.call_tool( 49 | "search_features", 50 | { 51 | "collection_id": "trn-ntwk-street-1", 52 | "query_attr": "usrn", 53 | "query_attr_value": str(usrn), 54 | "limit": 5, 55 | }, 56 | ) 57 | result_text = extract_text_from_result(result) 58 | print(f" USRN {usrn}: SUCCESS - Found data") 59 | print(result_text) 60 | results.append(("SUCCESS", usrn, result_text[:100] + "...")) 61 | await asyncio.sleep(0.1) 62 | 63 | except Exception as e: 64 | if "429" in str(e) or "Too Many Requests" in str(e): 65 | print(f" USRN {usrn}: BLOCKED - Rate limited") 66 | results.append(("BLOCKED", usrn, str(e))) 67 | else: 68 | print(f" USRN {usrn}: ERROR - {e}") 69 
| results.append(("ERROR", usrn, str(e))) 70 | 71 | except Exception as e: 72 | print(f"{session_name}: Connection error - {str(e)[:100]}") 73 | if len(results) == 0: 74 | results = [("ERROR", "connection", str(e))] * len(usrn_values) 75 | 76 | return results, session_id 77 | 78 | 79 | async def test_usrn_calls(): 80 | """Test calling USRN searches with different values""" 81 | print("USRN Search Test - Two Different USRNs") 82 | print("=" * 50) 83 | 84 | # Different USRN values to test 85 | usrn_values_1 = ["24501091", "24502114"] 86 | usrn_values_2 = ["24502114", "24501091"] 87 | 88 | results_a, session_id_a = await test_usrn_search("SESSION-A", usrn_values_1) 89 | await asyncio.sleep(0.5) 90 | results_b, session_id_b = await test_usrn_search("SESSION-B", usrn_values_2) 91 | 92 | # Analyze results 93 | print("\nRESULTS SUMMARY") 94 | print("=" * 30) 95 | 96 | success_a = len([r for r in results_a if r[0] == "SUCCESS"]) 97 | blocked_a = len([r for r in results_a if r[0] == "BLOCKED"]) 98 | error_a = len([r for r in results_a if r[0] == "ERROR"]) 99 | 100 | success_b = len([r for r in results_b if r[0] == "SUCCESS"]) 101 | blocked_b = len([r for r in results_b if r[0] == "BLOCKED"]) 102 | error_b = len([r for r in results_b if r[0] == "ERROR"]) 103 | 104 | print(f"SESSION-A: {success_a} success, {blocked_a} blocked, {error_a} errors") 105 | print(f"SESSION-B: {success_b} success, {blocked_b} blocked, {error_b} errors") 106 | print(f"Total: {success_a + success_b} success, {blocked_a + blocked_b} blocked") 107 | 108 | if session_id_a and session_id_b: 109 | print(f"Different session IDs: {session_id_a != session_id_b}") 110 | else: 111 | print("Could not compare session IDs") 112 | 113 | # Show detailed results 114 | print("\nDETAILED RESULTS:") 115 | print("SESSION-A:") 116 | for status, usrn, details in results_a: 117 | print(f" USRN {usrn}: {status}") 118 | 119 | print("SESSION-B:") 120 | for status, usrn, details in results_b: 121 | print(f" USRN {usrn}: 
{status}") 122 | 123 | 124 | async def test_session_safe(session_name: str, requests: int = 3): 125 | """Test a single session with safer error handling""" 126 | headers = {"Authorization": "Bearer dev-token"} 127 | results = [] 128 | session_id = None 129 | 130 | print(f"\n{session_name} - Testing {requests} requests") 131 | print("-" * 40) 132 | 133 | try: 134 | async with streamablehttp_client( 135 | "http://127.0.0.1:8000/mcp", headers=headers 136 | ) as ( 137 | read_stream, 138 | write_stream, 139 | get_session_id, 140 | ): 141 | async with ClientSession(read_stream, write_stream) as session: 142 | await session.initialize() 143 | session_id = get_session_id() 144 | print(f"{session_name} Session ID: {session_id}") 145 | 146 | for i in range(requests): 147 | try: 148 | result = await session.call_tool( 149 | "hello_world", {"name": f"{session_name}-User{i + 1}"} 150 | ) 151 | result_text = extract_text_from_result(result) 152 | print(f" Request {i + 1}: SUCCESS - {result_text}") 153 | results.append("SUCCESS") 154 | await asyncio.sleep(0.1) 155 | 156 | except Exception as e: 157 | if "429" in str(e) or "Too Many Requests" in str(e): 158 | print(f" Request {i + 1}: BLOCKED - Rate limited") 159 | results.append("BLOCKED") 160 | else: 161 | print(f" Request {i + 1}: ERROR - {e}") 162 | results.append("ERROR") 163 | 164 | for j in range(i + 1, requests): 165 | print(f" Request {j + 1}: BLOCKED - Session rate limited") 166 | results.append("BLOCKED") 167 | break 168 | 169 | except Exception as e: 170 | print(f"{session_name}: Connection error - {str(e)[:100]}") 171 | if len(results) == 0: 172 | results = ["ERROR"] * requests 173 | elif len(results) < requests: 174 | remaining = requests - len(results) 175 | results.extend(["BLOCKED"] * remaining) 176 | 177 | return results, session_id 178 | 179 | 180 | async def test_two_sessions(): 181 | """Test rate limiting across two sessions""" 182 | print("HTTP Rate Limit Test - Two Sessions") 183 | print("=" * 50) 184 | 185 
| results_a, session_id_a = await test_session_safe("SESSION-A", 20) 186 | await asyncio.sleep(0.5) 187 | results_b, session_id_b = await test_session_safe("SESSION-B", 20) 188 | 189 | # Analyze results 190 | print("\nRESULTS SUMMARY") 191 | print("=" * 30) 192 | 193 | success_a = results_a.count("SUCCESS") 194 | blocked_a = results_a.count("BLOCKED") 195 | error_a = results_a.count("ERROR") 196 | 197 | success_b = results_b.count("SUCCESS") 198 | blocked_b = results_b.count("BLOCKED") 199 | error_b = results_b.count("ERROR") 200 | 201 | print(f"SESSION-A: {success_a} success, {blocked_a} blocked, {error_a} errors") 202 | print(f"SESSION-B: {success_b} success, {blocked_b} blocked, {error_b} errors") 203 | print(f"Total: {success_a + success_b} success, {blocked_a + blocked_b} blocked") 204 | 205 | if session_id_a and session_id_b: 206 | print(f"Different session IDs: {session_id_a != session_id_b}") 207 | else: 208 | print("Could not compare session IDs") 209 | 210 | total_success = success_a + success_b 211 | total_blocked = blocked_a + blocked_b 212 | 213 | print("\nRATE LIMITING ASSESSMENT:") 214 | if total_success >= 2 and total_blocked >= 2: 215 | print("✅ PASS: Rate limiting is working") 216 | print(f" - {total_success} requests succeeded") 217 | print(f" - {total_blocked} requests were rate limited") 218 | print(" - Each session got limited after ~2 requests") 219 | elif total_success == 0: 220 | print("❌ FAIL: No requests succeeded (check server/auth)") 221 | else: 222 | print("⚠️ UNCLEAR: Unexpected pattern") 223 | print(" Check server logs for actual behavior") 224 | 225 | 226 | if __name__ == "__main__": 227 | # Run the USRN test by default 228 | asyncio.run(test_usrn_calls()) 229 | 230 | # Uncomment the line below to run the original test instead 231 | # asyncio.run(test_two_sessions()) 232 | ``` -------------------------------------------------------------------------------- /src/mcp_service/routing_service.py: 
-------------------------------------------------------------------------------- ```python 1 | from typing import Dict, List, Set, Optional, Any 2 | from dataclasses import dataclass 3 | from utils.logging_config import get_logger 4 | 5 | logger = get_logger(__name__) 6 | 7 | 8 | @dataclass 9 | class RouteNode: 10 | """Represents a routing node (intersection)""" 11 | 12 | id: int 13 | node_identifier: str 14 | connected_edges: Set[int] 15 | 16 | 17 | @dataclass 18 | class RouteEdge: 19 | """Represents a routing edge (road segment)""" 20 | 21 | id: int 22 | road_id: str 23 | road_name: Optional[str] 24 | source_node_id: int 25 | target_node_id: int 26 | cost: float 27 | reverse_cost: float 28 | geometry: Optional[Dict[str, Any]] 29 | 30 | 31 | class InMemoryRoutingNetwork: 32 | """In-memory routing network built from OS NGD data""" 33 | 34 | def __init__(self): 35 | self.nodes: Dict[int, RouteNode] = {} 36 | self.edges: Dict[int, RouteEdge] = {} 37 | self.node_lookup: Dict[str, int] = {} 38 | self.is_built = False 39 | 40 | def add_node(self, node_identifier: str) -> int: 41 | """Add a node and return its internal ID""" 42 | if node_identifier in self.node_lookup: 43 | return self.node_lookup[node_identifier] 44 | 45 | node_id = len(self.nodes) + 1 46 | self.nodes[node_id] = RouteNode( 47 | id=node_id, 48 | node_identifier=node_identifier, 49 | connected_edges=set(), 50 | ) 51 | self.node_lookup[node_identifier] = node_id 52 | return node_id 53 | 54 | def add_edge(self, road_data: Dict[str, Any]) -> None: 55 | """Add a road link as an edge""" 56 | properties = road_data.get("properties", {}) 57 | 58 | start_node = properties.get("startnode", "") 59 | end_node = properties.get("endnode", "") 60 | 61 | if not start_node or not end_node: 62 | logger.warning(f"Road link {properties.get('id')} missing node data") 63 | return 64 | 65 | source_id = self.add_node(start_node) 66 | target_id = self.add_node(end_node) 67 | 68 | roadlink_id = None 69 | road_track_refs = 
properties.get("roadtrackorpathreference", []) 70 | if road_track_refs and len(road_track_refs) > 0: 71 | roadlink_id = road_track_refs[0].get("roadlinkid", "") 72 | 73 | edge_id = len(self.edges) + 1 74 | cost = properties.get("geometry_length", 100.0) 75 | 76 | edge = RouteEdge( 77 | id=edge_id, 78 | road_id=roadlink_id or "NONE", 79 | road_name=properties.get("name1_text"), 80 | source_node_id=source_id, 81 | target_node_id=target_id, 82 | cost=cost, 83 | reverse_cost=cost, 84 | geometry=road_data.get("geometry"), 85 | ) 86 | 87 | self.edges[edge_id] = edge 88 | 89 | self.nodes[source_id].connected_edges.add(edge_id) 90 | self.nodes[target_id].connected_edges.add(edge_id) 91 | 92 | def get_connected_edges(self, node_id: int) -> List[RouteEdge]: 93 | """Get all edges connected to a node""" 94 | if node_id not in self.nodes: 95 | return [] 96 | 97 | return [self.edges[edge_id] for edge_id in self.nodes[node_id].connected_edges] 98 | 99 | def get_summary(self) -> Dict[str, Any]: 100 | """Get network summary statistics""" 101 | return { 102 | "total_nodes": len(self.nodes), 103 | "total_edges": len(self.edges), 104 | "is_built": self.is_built, 105 | "sample_nodes": [ 106 | { 107 | "id": node.id, 108 | "identifier": node.node_identifier, 109 | "connected_edges": len(node.connected_edges), 110 | } 111 | for node in list(self.nodes.values())[:5] 112 | ], 113 | } 114 | 115 | def get_all_nodes(self) -> List[Dict[str, Any]]: 116 | """Get all nodes as a flat list""" 117 | return [ 118 | { 119 | "id": node.id, 120 | "node_identifier": node.node_identifier, 121 | "connected_edge_count": len(node.connected_edges), 122 | "connected_edge_ids": list(node.connected_edges), 123 | } 124 | for node in self.nodes.values() 125 | ] 126 | 127 | def get_all_edges(self) -> List[Dict[str, Any]]: 128 | """Get all edges as a flat list""" 129 | return [ 130 | { 131 | "id": edge.id, 132 | "road_id": edge.road_id, 133 | "road_name": edge.road_name, 134 | "source_node_id": edge.source_node_id, 
135 | "target_node_id": edge.target_node_id, 136 | "cost": edge.cost, 137 | "reverse_cost": edge.reverse_cost, 138 | "geometry": edge.geometry, 139 | } 140 | for edge in self.edges.values() 141 | ] 142 | 143 | 144 | class OSRoutingService: 145 | """Service to build and query routing networks from OS NGD data""" 146 | 147 | def __init__(self, api_client): 148 | self.api_client = api_client 149 | self.network = InMemoryRoutingNetwork() 150 | self.raw_restrictions: List[Dict[str, Any]] = [] 151 | 152 | async def _fetch_restriction_data( 153 | self, bbox: Optional[str] = None, limit: int = 100 154 | ) -> List[Dict[str, Any]]: 155 | """Fetch raw restriction data""" 156 | try: 157 | logger.debug("Fetching restriction data...") 158 | params = { 159 | "limit": min(limit, 100), 160 | "crs": "http://www.opengis.net/def/crs/EPSG/0/4326", 161 | } 162 | 163 | if bbox: 164 | params["bbox"] = bbox 165 | 166 | restriction_data = await self.api_client.make_request( 167 | "COLLECTION_FEATURES", 168 | params=params, 169 | path_params=["trn-rami-restriction-1"], 170 | ) 171 | 172 | features = restriction_data.get("features", []) 173 | logger.debug(f"Fetched {len(features)} restriction features") 174 | 175 | return features 176 | 177 | except Exception as e: 178 | logger.error(f"Error fetching restriction data: {e}") 179 | return [] 180 | 181 | async def build_routing_network( 182 | self, 183 | bbox: Optional[str] = None, 184 | limit: int = 1000, 185 | include_restrictions: bool = True, 186 | ) -> Dict[str, Any]: 187 | """Build the routing network from OS NGD road links with optional restriction data""" 188 | try: 189 | logger.debug("Building routing network from OS NGD data...") 190 | 191 | if include_restrictions: 192 | self.raw_restrictions = await self._fetch_restriction_data(bbox, limit) 193 | 194 | params = { 195 | "limit": min(limit, 100), 196 | "crs": "http://www.opengis.net/def/crs/EPSG/0/4326", 197 | } 198 | 199 | if bbox: 200 | params["bbox"] = bbox 201 | 202 | 
road_links_data = await self.api_client.make_request( 203 | "COLLECTION_FEATURES", 204 | params=params, 205 | path_params=["trn-ntwk-roadlink-4"], 206 | ) 207 | 208 | features = road_links_data.get("features", []) 209 | logger.debug(f"Processing {len(features)} road links...") 210 | 211 | for feature in features: 212 | self.network.add_edge(feature) 213 | 214 | self.network.is_built = True 215 | 216 | summary = self.network.get_summary() 217 | logger.debug( 218 | f"Network built: {summary['total_nodes']} nodes, {summary['total_edges']} edges" 219 | ) 220 | 221 | return { 222 | "status": "success", 223 | "message": f"Built routing network with {summary['total_nodes']} nodes and {summary['total_edges']} edges", 224 | "network_summary": summary, 225 | "restrictions": self.raw_restrictions if include_restrictions else [], 226 | "restriction_count": len(self.raw_restrictions) 227 | if include_restrictions 228 | else 0, 229 | } 230 | 231 | except Exception as e: 232 | logger.error(f"Error building routing network: {e}") 233 | return {"status": "error", "error": str(e)} 234 | 235 | def get_network_info(self) -> Dict[str, Any]: 236 | """Get current network information""" 237 | return {"status": "success", "network": self.network.get_summary()} 238 | 239 | def get_flat_nodes(self) -> Dict[str, Any]: 240 | """Get flat list of all nodes""" 241 | if not self.network.is_built: 242 | return { 243 | "status": "error", 244 | "error": "Routing network not built. Call build_routing_network first.", 245 | } 246 | 247 | return {"status": "success", "nodes": self.network.get_all_nodes()} 248 | 249 | def get_flat_edges(self) -> Dict[str, Any]: 250 | """Get flat list of all edges with connections""" 251 | if not self.network.is_built: 252 | return { 253 | "status": "error", 254 | "error": "Routing network not built. 
Call build_routing_network first.",
255 |             }
256 | 
257 |         return {"status": "success", "edges": self.network.get_all_edges()}
258 | 
259 |     def get_routing_tables(self) -> Dict[str, Any]:
260 |         """Get both nodes and edges as flat tables"""
261 |         if not self.network.is_built:
262 |             return {
263 |                 "status": "error",
264 |                 "error": "Routing network not built. Call build_routing_network first.",
265 |             }
266 | 
267 |         return {
268 |             "status": "success",
269 |             "nodes": self.network.get_all_nodes(),
270 |             "edges": self.network.get_all_edges(),
271 |             "summary": self.network.get_summary(),
272 |             "restrictions": self.raw_restrictions,
273 |         }
274 | 
```
--------------------------------------------------------------------------------
/src/api_service/os_api.py:
--------------------------------------------------------------------------------
```python
1 | import os
2 | import aiohttp
3 | import asyncio
4 | import re
5 | import concurrent.futures
6 | import threading
7 | 
8 | from typing import Dict, List, Any, Optional
9 | from models import (
10 |     NGDAPIEndpoint,
11 |     OpenAPISpecification,
12 |     Collection,
13 |     CollectionsCache,
14 |     CollectionQueryables,
15 | )
16 | from api_service.protocols import APIClient
17 | from utils.logging_config import get_logger
18 | 
19 | logger = get_logger(__name__)
20 | 
21 | 
22 | class OSAPIClient(APIClient):
23 |     """Implementation of an OS API client"""
24 | 
25 |     user_agent = "os-ngd-mcp-server/1.0"
26 | 
27 |     def __init__(self, api_key: Optional[str] = None):
28 |         """
29 |         Initialise the OS API client
30 | 
31 |         Args:
32 |             api_key: Optional API key, if not provided will use OS_API_KEY env var
33 |         """
34 |         self.api_key = api_key
35 |         self.session = None
36 |         self.last_request_time = 0
37 |         # TODO: Confirm whether the API enforces rate limiting; delay added as a precaution
38 |         self.request_delay = 0.7
39 |         self._cached_openapi_spec: Optional[OpenAPISpecification] = None
40 |         self._cached_collections: Optional[CollectionsCache] = None
41 | 
42 
| # Private helper methods 43 | def _sanitise_api_key(self, text: Any) -> str: 44 | """Remove API keys from any text (URLs, error messages, etc.)""" 45 | if not isinstance(text, str): 46 | return text 47 | 48 | patterns = [ 49 | r"[?&]key=[^&\s]*", 50 | r"[?&]api_key=[^&\s]*", 51 | r"[?&]apikey=[^&\s]*", 52 | r"[?&]token=[^&\s]*", 53 | ] 54 | 55 | sanitized = text 56 | for pattern in patterns: 57 | sanitized = re.sub(pattern, "", sanitized, flags=re.IGNORECASE) 58 | 59 | sanitized = re.sub(r"[?&]$", "", sanitized) 60 | sanitized = re.sub(r"&{2,}", "&", sanitized) 61 | sanitized = re.sub(r"\?&", "?", sanitized) 62 | 63 | return sanitized 64 | 65 | def _sanitise_response(self, data: Any) -> Any: 66 | """Remove API keys from response data recursively""" 67 | if isinstance(data, dict): 68 | sanitized_dict = {} 69 | for key, value in data.items(): 70 | if isinstance(value, str) and any( 71 | url_indicator in key.lower() 72 | for url_indicator in ["href", "url", "link", "uri"] 73 | ): 74 | sanitized_dict[key] = self._sanitise_api_key(value) 75 | elif isinstance(value, (dict, list)): 76 | sanitized_dict[key] = self._sanitise_response(value) 77 | else: 78 | sanitized_dict[key] = value 79 | return sanitized_dict 80 | elif isinstance(data, list): 81 | return [self._sanitise_response(item) for item in data] 82 | elif isinstance(data, str): 83 | if any( 84 | indicator in data 85 | for indicator in [ 86 | "http://", 87 | "https://", 88 | "key=", 89 | "api_key=", 90 | "apikey=", 91 | "token=", 92 | ] 93 | ): 94 | return self._sanitise_api_key(data) 95 | 96 | return data 97 | 98 | def _filter_latest_collections( 99 | self, collections: List[Dict[str, Any]] 100 | ) -> List[Collection]: 101 | """ 102 | Filter collections to keep only the latest version of each collection type. 103 | For collections with IDs like 'trn-ntwk-roadlink-1', 'trn-ntwk-roadlink-2', 'trn-ntwk-roadlink-3', 104 | only keep the one with the highest number. 
105 | 106 | Args: 107 | collections: Raw collections from API 108 | 109 | Returns: 110 | Filtered list of Collection objects 111 | """ 112 | latest_versions: Dict[str, Dict[str, Any]] = {} 113 | 114 | for col in collections: 115 | col_id = col.get("id", "") 116 | 117 | match = re.match(r"^(.+?)-(\d+)$", col_id) 118 | 119 | if match: 120 | base_name = match.group(1) 121 | version_num = int(match.group(2)) 122 | 123 | if ( 124 | base_name not in latest_versions 125 | or version_num > latest_versions[base_name]["version"] 126 | ): 127 | latest_versions[base_name] = {"version": version_num, "data": col} 128 | else: 129 | latest_versions[col_id] = {"version": 0, "data": col} 130 | 131 | filtered_collections = [] 132 | for item in latest_versions.values(): 133 | col_data = item["data"] 134 | filtered_collections.append( 135 | Collection( 136 | id=col_data.get("id", ""), 137 | title=col_data.get("title", ""), 138 | description=col_data.get("description", ""), 139 | links=col_data.get("links", []), 140 | extent=col_data.get("extent", {}), 141 | itemType=col_data.get("itemType", "feature"), 142 | ) 143 | ) 144 | 145 | return filtered_collections 146 | 147 | def _parse_openapi_spec_for_llm( 148 | self, spec_data: dict, collection_ids: List[str] 149 | ) -> dict: 150 | """Parse OpenAPI spec to extract only essential information for LLM context""" 151 | supported_crs = { 152 | "input": [], 153 | "output": [], 154 | "default": "http://www.opengis.net/def/crs/OGC/1.3/CRS84", 155 | } 156 | 157 | parsed = { 158 | "title": spec_data.get("info", {}).get("title", ""), 159 | "version": spec_data.get("info", {}).get("version", ""), 160 | "base_url": spec_data.get("servers", [{}])[0].get("url", ""), 161 | "endpoints": {}, 162 | "collection_ids": collection_ids, 163 | "supported_crs": supported_crs, 164 | } 165 | 166 | paths = spec_data.get("paths", {}) 167 | for path, methods in paths.items(): 168 | for method, details in methods.items(): 169 | if method == "get" and "parameters" in 
details: 170 | for param in details["parameters"]: 171 | param_name = param.get("name", "") 172 | 173 | if param_name == "collectionId" and "schema" in param: 174 | enum_values = param["schema"].get("enum", []) 175 | if enum_values: 176 | parsed["collection_ids"] = enum_values 177 | 178 | elif ( 179 | param_name in ["bbox-crs", "filter-crs"] 180 | and "schema" in param 181 | ): 182 | crs_values = param["schema"].get("enum", []) 183 | if crs_values and not supported_crs["input"]: 184 | supported_crs["input"] = crs_values 185 | 186 | elif param_name == "crs" and "schema" in param: 187 | crs_values = param["schema"].get("enum", []) 188 | if crs_values and not supported_crs["output"]: 189 | supported_crs["output"] = crs_values 190 | 191 | endpoint_patterns = { 192 | "/collections": "List all collections", 193 | "/collections/{collectionId}": "Get collection info", 194 | "/collections/{collectionId}/schema": "Get collection schema", 195 | "/collections/{collectionId}/queryables": "Get collection queryables", 196 | "/collections/{collectionId}/items": "Search features in collection", 197 | "/collections/{collectionId}/items/{featureId}": "Get specific feature", 198 | } 199 | parsed["endpoints"] = endpoint_patterns 200 | parsed["crs_guide"] = { 201 | "WGS84": "http://www.opengis.net/def/crs/OGC/1.3/CRS84 (default, longitude/latitude)", 202 | "British_National_Grid": "http://www.opengis.net/def/crs/EPSG/0/27700 (UK Ordnance Survey)", 203 | "WGS84_latlon": "http://www.opengis.net/def/crs/EPSG/0/4326 (latitude/longitude)", 204 | "Web_Mercator": "http://www.opengis.net/def/crs/EPSG/0/3857 (Web mapping)", 205 | } 206 | 207 | return parsed 208 | 209 | # Private async methods 210 | async def _get_open_api_spec(self) -> OpenAPISpecification: 211 | """Get the OpenAPI spec for the OS NGD API""" 212 | try: 213 | response = await self.make_request("OPENAPI_SPEC", params={"f": "json"}) 214 | 215 | # Sanitize the raw response before processing 216 | sanitized_response = 
self._sanitise_response(response) 217 | 218 | collections_cache = await self.cache_collections() 219 | filtered_collection_ids = [col.id for col in collections_cache.collections] 220 | 221 | parsed_spec = self._parse_openapi_spec_for_llm( 222 | sanitized_response, filtered_collection_ids 223 | ) 224 | 225 | spec = OpenAPISpecification( 226 | title=parsed_spec["title"], 227 | version=parsed_spec["version"], 228 | base_url=parsed_spec["base_url"], 229 | endpoints=parsed_spec["endpoints"], 230 | collection_ids=filtered_collection_ids, 231 | supported_crs=parsed_spec["supported_crs"], 232 | crs_guide=parsed_spec["crs_guide"], 233 | ) 234 | return spec 235 | except Exception as e: 236 | raise ValueError(f"Failed to get OpenAPI spec: {e}") 237 | 238 | async def cache_openapi_spec(self) -> OpenAPISpecification: 239 | """ 240 | Cache the OpenAPI spec. 241 | 242 | Returns: 243 | The cached OpenAPI spec 244 | """ 245 | if self._cached_openapi_spec is None: 246 | logger.debug("Caching OpenAPI spec for LLM context...") 247 | try: 248 | self._cached_openapi_spec = await self._get_open_api_spec() 249 | logger.debug("OpenAPI spec successfully cached") 250 | except Exception as e: 251 | raise ValueError(f"Failed to cache OpenAPI spec: {e}") 252 | return self._cached_openapi_spec 253 | 254 | async def _get_collections(self) -> CollectionsCache: 255 | """Get all collections from the OS NGD API""" 256 | try: 257 | response = await self.make_request("COLLECTIONS") 258 | collections_list = response.get("collections", []) 259 | filtered = self._filter_latest_collections(collections_list) 260 | logger.debug(f"Filtered collections: {len(filtered)} collections") 261 | return CollectionsCache(collections=filtered, raw_response=response) 262 | except Exception as e: 263 | sanitized_error = self._sanitise_api_key(str(e)) 264 | logger.error(f"Error getting collections: {sanitized_error}") 265 | raise ValueError(f"Failed to get collections: {sanitized_error}") 266 | 267 | async def 
cache_collections(self) -> CollectionsCache: 268 | """ 269 | Cache the collections data with filtering applied. 270 | 271 | Returns: 272 | The cached collections 273 | """ 274 | if self._cached_collections is None: 275 | logger.debug("Caching collections for LLM context...") 276 | try: 277 | self._cached_collections = await self._get_collections() 278 | logger.debug( 279 | f"Collections successfully cached - {len(self._cached_collections.collections)} collections after filtering" 280 | ) 281 | except Exception as e: 282 | sanitized_error = self._sanitise_api_key(str(e)) 283 | raise ValueError(f"Failed to cache collections: {sanitized_error}") 284 | return self._cached_collections 285 | 286 | async def fetch_collections_queryables( 287 | self, collection_ids: List[str] 288 | ) -> Dict[str, CollectionQueryables]: 289 | """Fetch detailed queryables for specific collections only""" 290 | if not collection_ids: 291 | return {} 292 | 293 | logger.debug(f"Fetching queryables for specific collections: {collection_ids}") 294 | 295 | collections_cache = await self.cache_collections() 296 | collections_map = {coll.id: coll for coll in collections_cache.collections} 297 | 298 | tasks = [ 299 | self.make_request("COLLECTION_QUERYABLES", path_params=[collection_id]) 300 | for collection_id in collection_ids 301 | if collection_id in collections_map 302 | ] 303 | 304 | if not tasks: 305 | return {} 306 | 307 | raw_queryables = await asyncio.gather(*tasks, return_exceptions=True) 308 | 309 | def process_single_collection_queryables(collection_id, queryables_data): 310 | collection = collections_map[collection_id] 311 | logger.debug( 312 | f"Processing collection {collection.id} in thread {threading.current_thread().name}" 313 | ) 314 | 315 | if isinstance(queryables_data, Exception): 316 | logger.warning( 317 | f"Failed to fetch queryables for {collection.id}: {queryables_data}" 318 | ) 319 | return ( 320 | collection.id, 321 | CollectionQueryables( 322 | id=collection.id, 323 | 
title=collection.title, 324 | description=collection.description, 325 | all_queryables={}, 326 | enum_queryables={}, 327 | has_enum_filters=False, 328 | total_queryables=0, 329 | enum_count=0, 330 | ), 331 | ) 332 | 333 | all_queryables = {} 334 | enum_queryables = {} 335 | properties = queryables_data.get("properties", {}) 336 | 337 | for prop_name, prop_details in properties.items(): 338 | prop_type = prop_details.get("type", ["string"]) 339 | if isinstance(prop_type, list): 340 | main_type = prop_type[0] if prop_type else "string" 341 | is_nullable = "null" in prop_type 342 | else: 343 | main_type = prop_type 344 | is_nullable = False 345 | 346 | all_queryables[prop_name] = { 347 | "type": main_type, 348 | "nullable": is_nullable, 349 | "max_length": prop_details.get("maxLength"), 350 | "format": prop_details.get("format"), 351 | "pattern": prop_details.get("pattern"), 352 | "minimum": prop_details.get("minimum"), 353 | "maximum": prop_details.get("maximum"), 354 | "is_enum": prop_details.get("enumeration", False), 355 | } 356 | 357 | if prop_details.get("enumeration") and "enum" in prop_details: 358 | enum_queryables[prop_name] = { 359 | "values": prop_details["enum"], 360 | "type": main_type, 361 | "nullable": is_nullable, 362 | "max_length": prop_details.get("maxLength"), 363 | } 364 | all_queryables[prop_name]["enum_values"] = prop_details["enum"] 365 | 366 | all_queryables[prop_name] = { 367 | k: v for k, v in all_queryables[prop_name].items() if v is not None 368 | } 369 | 370 | return ( 371 | collection.id, 372 | CollectionQueryables( 373 | id=collection.id, 374 | title=collection.title, 375 | description=collection.description, 376 | all_queryables=all_queryables, 377 | enum_queryables=enum_queryables, 378 | has_enum_filters=len(enum_queryables) > 0, 379 | total_queryables=len(all_queryables), 380 | enum_count=len(enum_queryables), 381 | ), 382 | ) 383 | 384 | with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor: 385 | 
collection_data_pairs = list(zip(collection_ids, raw_queryables)) 386 | processed = await asyncio.get_event_loop().run_in_executor( 387 | executor, 388 | lambda: [ 389 | process_single_collection_queryables(coll_id, data) 390 | for coll_id, data in collection_data_pairs 391 | if coll_id in collections_map 392 | ], 393 | ) 394 | 395 | return {coll_id: queryables for coll_id, queryables in processed} 396 | 397 | # Public async methods 398 | async def initialise(self): 399 | """Initialise the aiohttp session if not already created""" 400 | if self.session is None: 401 | self.session = aiohttp.ClientSession( 402 | connector=aiohttp.TCPConnector( 403 | force_close=True, 404 | limit=1, # TODO: Strict limit to only 1 connection - may need to revisit this 405 | ) 406 | ) 407 | 408 | async def close(self): 409 | """Close the session when done""" 410 | if self.session: 411 | await self.session.close() 412 | self.session = None 413 | self._cached_openapi_spec = None 414 | self._cached_collections = None 415 | 416 | async def get_api_key(self) -> str: 417 | """Get the OS API key from environment variable or init param.""" 418 | if self.api_key: 419 | return self.api_key 420 | 421 | api_key = os.environ.get("OS_API_KEY") 422 | if not api_key: 423 | raise ValueError("OS_API_KEY environment variable is not set") 424 | return api_key 425 | 426 | async def make_request( 427 | self, 428 | endpoint: str, 429 | params: Optional[Dict[str, Any]] = None, 430 | path_params: Optional[List[str]] = None, 431 | max_retries: int = 2, 432 | ) -> Dict[str, Any]: 433 | """ 434 | Make a request to the OS NGD API with proper error handling. 
435 | 436 | Args: 437 | endpoint: Enum endpoint to use 438 | params: Additional query parameters 439 | path_params: Parameters to format into the URL path 440 | max_retries: Maximum number of retries for transient errors 441 | 442 | Returns: 443 | JSON response as dictionary 444 | """ 445 | await self.initialise() 446 | 447 | if self.session is None: 448 | raise ValueError("Session not initialised") 449 | 450 | current_time = asyncio.get_event_loop().time() 451 | elapsed = current_time - self.last_request_time 452 | if elapsed < self.request_delay: 453 | await asyncio.sleep(self.request_delay - elapsed) 454 | 455 | try: 456 | endpoint_value = NGDAPIEndpoint[endpoint].value 457 | except KeyError: 458 | raise ValueError(f"Invalid endpoint: {endpoint}") 459 | 460 | if path_params: 461 | endpoint_value = endpoint_value.format(*path_params) 462 | 463 | api_key = await self.get_api_key() 464 | request_params = params or {} 465 | request_params["key"] = api_key 466 | 467 | headers = {"User-Agent": self.user_agent, "Accept": "application/json"} 468 | 469 | client_ip = getattr(self.session, "_source_address", None) 470 | client_info = f" from {client_ip}" if client_ip else "" 471 | 472 | sanitized_url = self._sanitise_api_key(endpoint_value) 473 | logger.info(f"Requesting URL: {sanitized_url}{client_info}") 474 | 475 | for attempt in range(1, max_retries + 1): 476 | try: 477 | self.last_request_time = asyncio.get_event_loop().time() 478 | 479 | timeout = aiohttp.ClientTimeout(total=30.0) 480 | async with self.session.get( 481 | endpoint_value, 482 | params=request_params, 483 | headers=headers, 484 | timeout=timeout, 485 | ) as response: 486 | if response.status >= 400: 487 | error_text = await response.text() 488 | sanitized_error = self._sanitise_api_key(error_text) 489 | error_message = ( 490 | f"HTTP Error: {response.status} - {sanitized_error}" 491 | ) 492 | logger.error(f"Error: {error_message}") 493 | raise ValueError(error_message) 494 | 495 | response_data = await 
response.json() 496 | 497 | return self._sanitise_response(response_data) 498 | except (aiohttp.ClientError, asyncio.TimeoutError) as e: 499 | if attempt == max_retries: 500 | sanitized_exception = self._sanitise_api_key(str(e)) 501 | error_message = f"Request failed after {max_retries} attempts: {sanitized_exception}" 502 | logger.error(f"Error: {error_message}") 503 | raise ValueError(error_message) 504 | else: 505 | await asyncio.sleep(0.7) 506 | except Exception as e: 507 | sanitized_exception = self._sanitise_api_key(str(e)) 508 | error_message = f"Request failed: {sanitized_exception}" 509 | logger.error(f"Error: {error_message}") 510 | raise ValueError(error_message) 511 | raise RuntimeError( 512 | "Unreachable: make_request exited retry loop without returning or raising" 513 | ) 514 | 515 | async def make_request_no_auth( 516 | self, 517 | url: str, 518 | params: Optional[Dict[str, Any]] = None, 519 | max_retries: int = 2, 520 | ) -> str: 521 | """ 522 | Make a request without authentication (for public endpoints like documentation). 
523 | 524 | Args: 525 | url: Full URL to request 526 | params: Additional query parameters 527 | max_retries: Maximum number of retries for transient errors 528 | 529 | Returns: 530 | Response text (not JSON parsed) 531 | """ 532 | await self.initialise() 533 | 534 | if self.session is None: 535 | raise ValueError("Session not initialised") 536 | 537 | current_time = asyncio.get_event_loop().time() 538 | elapsed = current_time - self.last_request_time 539 | if elapsed < self.request_delay: 540 | await asyncio.sleep(self.request_delay - elapsed) 541 | 542 | request_params = params or {} 543 | headers = {"User-Agent": self.user_agent} 544 | 545 | logger.info(f"Requesting URL (no auth): {url}") 546 | 547 | for attempt in range(1, max_retries + 1): 548 | try: 549 | self.last_request_time = asyncio.get_event_loop().time() 550 | 551 | timeout = aiohttp.ClientTimeout(total=30.0) 552 | async with self.session.get( 553 | url, 554 | params=request_params, 555 | headers=headers, 556 | timeout=timeout, 557 | ) as response: 558 | if response.status >= 400: 559 | error_text = await response.text() 560 | error_message = f"HTTP Error: {response.status} - {error_text}" 561 | logger.error(f"Error: {error_message}") 562 | raise ValueError(error_message) 563 | 564 | return await response.text() 565 | 566 | except (aiohttp.ClientError, asyncio.TimeoutError) as e: 567 | if attempt == max_retries: 568 | error_message = ( 569 | f"Request failed after {max_retries} attempts: {str(e)}" 570 | ) 571 | logger.error(f"Error: {error_message}") 572 | raise ValueError(error_message) 573 | else: 574 | await asyncio.sleep(0.7) 575 | except Exception as e: 576 | error_message = f"Request failed: {str(e)}" 577 | logger.error(f"Error: {error_message}") 578 | raise ValueError(error_message) 579 | 580 | raise RuntimeError( 581 | "Unreachable: make_request_no_auth exited retry loop without returning or raising" 582 | ) 583 | ``` 
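The version-filtering rule in `_filter_latest_collections` above keeps only the highest-numbered variant of each collection id (so `trn-ntwk-roadlink-4` supersedes `trn-ntwk-roadlink-1` through `-3`), while unversioned ids pass through unchanged. A minimal standalone sketch of that rule, reusing the same `^(.+?)-(\d+)$` regex — the function name here is illustrative and is not part of the client:

```python
import re
from typing import Dict, List, Tuple


def latest_collection_ids(collection_ids: List[str]) -> List[str]:
    """Keep only the highest version of each 'base-N' collection id,
    mirroring the rule in OSAPIClient._filter_latest_collections."""
    latest: Dict[str, Tuple[int, str]] = {}  # base name -> (version, full id)
    for col_id in collection_ids:
        match = re.match(r"^(.+?)-(\d+)$", col_id)
        if match:
            base, version = match.group(1), int(match.group(2))
            # Keep the highest version seen for this base name
            if base not in latest or version > latest[base][0]:
                latest[base] = (version, col_id)
        else:
            # Unversioned ids pass through unchanged
            latest[col_id] = (0, col_id)
    return [col_id for _, col_id in latest.values()]


print(latest_collection_ids(
    ["trn-ntwk-roadlink-1", "trn-ntwk-roadlink-4", "trn-ntwk-roadlink-2", "bld-fts-building"]
))
# → ['trn-ntwk-roadlink-4', 'bld-fts-building']
```

The lazy `(.+?)` combined with the anchored `-(\d+)$` means the id is split at the final hyphen that precedes a trailing all-digit run, so hyphens inside the base name (`trn-ntwk-roadlink`) are left intact.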
-------------------------------------------------------------------------------- /src/mcp_service/os_service.py: -------------------------------------------------------------------------------- ```python 1 | import json 2 | import asyncio 3 | import functools 4 | import re 5 | 6 | from typing import Optional, List, Dict, Any, Union, Callable 7 | from api_service.protocols import APIClient 8 | from prompt_templates.prompt_templates import PROMPT_TEMPLATES 9 | from mcp_service.protocols import MCPService, FeatureService 10 | from mcp_service.guardrails import ToolGuardrails 11 | from workflow_generator.workflow_planner import WorkflowPlanner 12 | from utils.logging_config import get_logger 13 | from mcp_service.resources import OSDocumentationResources 14 | from mcp_service.prompts import OSWorkflowPrompts 15 | from mcp_service.routing_service import OSRoutingService 16 | 17 | logger = get_logger(__name__) 18 | 19 | 20 | class OSDataHubService(FeatureService): 21 | """Implementation of the OS NGD API service with MCP""" 22 | 23 | def __init__( 24 | self, api_client: APIClient, mcp_service: MCPService, stdio_middleware=None 25 | ): 26 | """ 27 | Initialise the OS NGD service 28 | 29 | Args: 30 | api_client: API client implementation 31 | mcp_service: MCP service implementation 32 | stdio_middleware: Optional STDIO middleware for rate limiting 33 | """ 34 | self.api_client = api_client 35 | self.mcp = mcp_service 36 | self.stdio_middleware = stdio_middleware 37 | self.workflow_planner: Optional[WorkflowPlanner] = None 38 | self.guardrails = ToolGuardrails() 39 | self.routing_service = OSRoutingService(api_client) 40 | self.register_tools() 41 | self.register_resources() 42 | self.register_prompts() 43 | 44 | # Register all the resources, tools, and prompts 45 | def register_resources(self) -> None: 46 | """Register all MCP resources""" 47 | doc_resources = OSDocumentationResources(self.mcp, self.api_client) 48 | doc_resources.register_all() 49 | 50 | def 
register_tools(self) -> None: 51 | """Register all MCP tools with guardrails and middleware""" 52 | 53 | def apply_middleware(func: Callable) -> Callable: 54 | wrapped = self.guardrails.basic_guardrails(func) 55 | wrapped = self._require_workflow_context(wrapped) 56 | if self.stdio_middleware: 57 | wrapped = self.stdio_middleware.require_auth_and_rate_limit(wrapped) 58 | return wrapped 59 | 60 | # Apply middleware to ALL tools 61 | self.get_workflow_context = self.mcp.tool()( 62 | apply_middleware(self.get_workflow_context) 63 | ) 64 | self.hello_world = self.mcp.tool()(apply_middleware(self.hello_world)) 65 | self.check_api_key = self.mcp.tool()(apply_middleware(self.check_api_key)) 66 | self.list_collections = self.mcp.tool()(apply_middleware(self.list_collections)) 67 | self.get_single_collection = self.mcp.tool()( 68 | apply_middleware(self.get_single_collection) 69 | ) 70 | self.get_single_collection_queryables = self.mcp.tool()( 71 | apply_middleware(self.get_single_collection_queryables) 72 | ) 73 | self.search_features = self.mcp.tool()(apply_middleware(self.search_features)) 74 | self.get_feature = self.mcp.tool()(apply_middleware(self.get_feature)) 75 | self.get_linked_identifiers = self.mcp.tool()( 76 | apply_middleware(self.get_linked_identifiers) 77 | ) 78 | self.get_bulk_features = self.mcp.tool()( 79 | apply_middleware(self.get_bulk_features) 80 | ) 81 | self.get_bulk_linked_features = self.mcp.tool()( 82 | apply_middleware(self.get_bulk_linked_features) 83 | ) 84 | self.get_prompt_templates = self.mcp.tool()( 85 | apply_middleware(self.get_prompt_templates) 86 | ) 87 | self.fetch_detailed_collections = self.mcp.tool()( 88 | apply_middleware(self.fetch_detailed_collections) 89 | ) 90 | self.get_routing_data = self.mcp.tool()(apply_middleware(self.get_routing_data)) 91 | 92 | def register_prompts(self) -> None: 93 | """Register all MCP prompts""" 94 | workflow_prompts = OSWorkflowPrompts(self.mcp) 95 | workflow_prompts.register_all() 96 | 97 | # Run 
the MCP service 98 | def run(self) -> None: 99 | """Run the MCP service""" 100 | try: 101 | self.mcp.run() 102 | finally: 103 | try: 104 | try: 105 | loop = asyncio.get_running_loop() 106 | loop.create_task(self._cleanup()) 107 | except RuntimeError: 108 | asyncio.run(self._cleanup()) 109 | except Exception as e: 110 | logger.error(f"Error during cleanup: {e}") 111 | 112 | async def _cleanup(self): 113 | """Async cleanup method""" 114 | try: 115 | if hasattr(self, "api_client") and self.api_client: 116 | await self.api_client.close() 117 | logger.debug("API client closed successfully") 118 | except Exception as e: 119 | logger.error(f"Error closing API client: {e}") 120 | 121 | # Get the workflow context from the cached API client data 122 | # TODO: Lots of work to do here to reduce the size of the context and make it more readable for the LLM but not sacrificing the information 123 | async def get_workflow_context(self) -> str: 124 | """Get basic workflow context - no detailed queryables yet""" 125 | try: 126 | if self.workflow_planner is None: 127 | collections_cache = await self.api_client.cache_collections() 128 | basic_collections_info = { 129 | coll.id: { 130 | "id": coll.id, 131 | "title": coll.title, 132 | "description": coll.description, 133 | # No queryables here - will be fetched on-demand 134 | } 135 | for coll in collections_cache.collections 136 | } 137 | 138 | self.workflow_planner = WorkflowPlanner( 139 | await self.api_client.cache_openapi_spec(), basic_collections_info 140 | ) 141 | 142 | context = self.workflow_planner.get_basic_context() 143 | return json.dumps( 144 | { 145 | "CRITICAL_COLLECTION_LIST": sorted( 146 | context["available_collections"].keys() 147 | ), 148 | "MANDATORY_PLANNING_REQUIREMENT": { 149 | "CRITICAL": "You MUST follow the 2-step planning process:", 150 | "step_1": "Explain your complete plan listing which specific collections you will use and why", 151 | "step_2": "Call 
fetch_detailed_collections('collection-id-1,collection-id-2') to get queryables for those collections BEFORE making search calls", 152 | "required_explanation": { 153 | "1": "Which collections you will use and why", 154 | "2": "What you expect to find in those collections", 155 | "3": "What your search strategy will be", 156 | }, 157 | "workflow_enforcement": "Do not proceed with search_features until you have fetched detailed queryables", 158 | "example_planning": "I will use 'lus-fts-site-1' for finding cinemas. Let me fetch its detailed queryables first...", 159 | }, 160 | "available_collections": context[ 161 | "available_collections" 162 | ], # Basic info only - no queryables yet - this is to reduce the size of the context for the LLM 163 | "openapi_spec": context["openapi_spec"].model_dump() 164 | if context["openapi_spec"] 165 | else None, 166 | "TWO_STEP_WORKFLOW": { 167 | "step_1": "Plan with basic collection info (no detailed queryables available yet)", 168 | "step_2": "Use fetch_detailed_collections() to get queryables for your chosen collections", 169 | "step_3": "Execute search_features with proper filters using the fetched queryables", 170 | }, 171 | "AVAILABLE_TOOLS": { 172 | "fetch_detailed_collections": "Get detailed queryables for specific collections: fetch_detailed_collections('lus-fts-site-1,trn-ntwk-street-1')", 173 | "search_features": "Search features (requires detailed queryables first)", 174 | }, 175 | "QUICK_FILTERING_GUIDE": { 176 | "primary_tool": "search_features", 177 | "key_parameter": "filter", 178 | "enum_fields": "Use exact values from collection's enum_queryables (fetch these first!)", 179 | "simple_fields": "Use direct values (e.g., usrn = 12345678)", 180 | }, 181 | "COMMON_EXAMPLES": { 182 | "workflow_example": "1) Explain plan → 2) fetch_detailed_collections('lus-fts-site-1') → 3) search_features with proper filter", 183 | "cinema_search": "After fetching queryables: search_features(collection_id='lus-fts-site-1', 
filter=\"oslandusetertiarygroup = 'Cinema'\")",
184 |                 },
185 |                 "CRITICAL_RULES": {
186 |                     "1": "ALWAYS explain your plan first",
187 |                     "2": "ALWAYS call fetch_detailed_collections() before search_features",
188 |                     "3": "Use exact enum values from the fetched enum_queryables",
189 |                     "4": "Quote string values in single quotes",
190 |                 },
191 |             }
192 |         )
193 | 
194 |         except Exception as e:
195 |             logger.error(f"Error getting workflow context: {e}")
196 |             return json.dumps(
197 |                 {"error": str(e), "instruction": "Proceed with available tools"}
198 |             )
199 | 
200 |     def _require_workflow_context(self, func: Callable) -> Callable:
201 |         # Functions that don't need workflow context
202 |         skip_functions = {"get_workflow_context", "hello_world", "check_api_key"}
203 | 
204 |         @functools.wraps(func)
205 |         async def wrapper(*args: Any, **kwargs: Any) -> Any:
206 |             if self.workflow_planner is None:
207 |                 if func.__name__ in skip_functions:
208 |                     return await func(*args, **kwargs)
209 |                 else:
210 |                     return json.dumps(
211 |                         {
212 |                             "error": "WORKFLOW CONTEXT REQUIRED",
213 |                             "blocked_tool": func.__name__,
214 |                             "required_action": "You must call 'get_workflow_context' first",
215 |                             "message": "No tools are available until you get the workflow context. Please call get_workflow_context() now.",
216 |                         }
217 |                     )
218 |             return await func(*args, **kwargs)
219 | 
220 |         return wrapper
221 | 
222 |     # TODO: This is a bit of a hack - we need to improve the error handling and retry logic
223 |     # TODO: Could we actually spawn a separate AI agent to handle the retry logic and return the result to the main agent?
224 | def _add_retry_context(self, response_data: dict, tool_name: str) -> dict: 225 | """Add retry guidance to tool responses""" 226 | if "error" in response_data: 227 | response_data["retry_guidance"] = { 228 | "tool": tool_name, 229 | "MANDATORY_INSTRUCTION 1": "Review the error message and try again with corrected parameters", 230 | "MANDATORY_INSTRUCTION 2": "YOU MUST call get_workflow_context() if you need to see available options again", 231 | } 232 | return response_data 233 | 234 | # All the tools 235 | async def hello_world(self, name: str) -> str: 236 | """Simple hello world tool for testing""" 237 | return f"Hello, {name}! 👋" 238 | 239 | async def check_api_key(self) -> str: 240 | """Check if the OS API key is available.""" 241 | try: 242 | await self.api_client.get_api_key() 243 | return json.dumps({"status": "success", "message": "OS_API_KEY is set!"}) 244 | except ValueError as e: 245 | return json.dumps({"status": "error", "message": str(e)}) 246 | 247 | async def list_collections( 248 | self, 249 | ) -> str: 250 | """ 251 | List all available feature collections in the OS NGD API. 252 | 253 | Returns: 254 | JSON string with collection info (id, title only) 255 | """ 256 | try: 257 | data = await self.api_client.make_request("COLLECTIONS") 258 | 259 | if not data or "collections" not in data: 260 | return json.dumps({"error": "No collections found"}) 261 | 262 | collections = [ 263 | {"id": col.get("id"), "title": col.get("title")} 264 | for col in data.get("collections", []) 265 | ] 266 | 267 | return json.dumps({"collections": collections}) 268 | except Exception as e: 269 | error_response = {"error": str(e)} 270 | return json.dumps( 271 | self._add_retry_context(error_response, "list_collections") 272 | ) 273 | 274 | async def get_single_collection( 275 | self, 276 | collection_id: str, 277 | ) -> str: 278 | """ 279 | Get detailed information about a specific collection. 
280 | 
281 |         Args:
282 |             collection_id: The collection ID
283 | 
284 |         Returns:
285 |             JSON string with collection information
286 |         """
287 |         try:
288 |             data = await self.api_client.make_request(
289 |                 "COLLECTION_INFO", path_params=[collection_id]
290 |             )
291 | 
292 |             return json.dumps(data)
293 |         except Exception as e:
294 |             error_response = {"error": str(e)}
295 |             return json.dumps(
296 |                 self._add_retry_context(error_response, "get_single_collection")
297 |             )
298 | 
299 |     async def get_single_collection_queryables(
300 |         self,
301 |         collection_id: str,
302 |     ) -> str:
303 |         """
304 |         Get the list of queryable properties for a collection.
305 | 
306 |         Args:
307 |             collection_id: The collection ID
308 | 
309 |         Returns:
310 |             JSON string with queryable properties
311 |         """
312 |         try:
313 |             data = await self.api_client.make_request(
314 |                 "COLLECTION_QUERYABLES", path_params=[collection_id]
315 |             )
316 | 
317 |             return json.dumps(data)
318 |         except Exception as e:
319 |             error_response = {"error": str(e)}
320 |             return json.dumps(
321 |                 self._add_retry_context(error_response, "get_single_collection_queryables")
322 |             )
323 | 
324 |     async def search_features(
325 |         self,
326 |         collection_id: str,
327 |         bbox: Optional[str] = None,
328 |         crs: Optional[str] = None,
329 |         limit: int = 10,
330 |         offset: int = 0,
331 |         filter: Optional[str] = None,
332 |         filter_lang: Optional[str] = "cql-text",
333 |         query_attr: Optional[str] = None,
334 |         query_attr_value: Optional[str] = None,
335 |     ) -> str:
336 |         """Search for features in a collection with full CQL2 filter support."""
337 |         try:
338 |             params: Dict[str, Union[str, int]] = {}
339 | 
340 |             if limit:
341 |                 params["limit"] = min(limit, 100)
342 |             if offset:
343 |                 params["offset"] = max(0, offset)
344 |             if bbox:
345 |                 params["bbox"] = bbox
346 |             if crs:
347 |                 params["crs"] = crs
348 |             if filter:
349 |                 if len(filter) > 1000:
350 |                     raise ValueError("Filter too long")
351 |                 dangerous_patterns = [
352 |                     r";\s*--",
353 |                     r";\s*/\*",
354 |
r"\bUNION\b", 355 | r"\bSELECT\b", 356 | r"\bINSERT\b", 357 | r"\bUPDATE\b", 358 | r"\bDELETE\b", 359 | r"\bDROP\b", 360 | r"\bCREATE\b", 361 | r"\bALTER\b", 362 | r"\bTRUNCATE\b", 363 | r"\bEXEC\b", 364 | r"\bEXECUTE\b", 365 | r"\bSP_\b", 366 | r"\bXP_\b", 367 | r"<script\b", 368 | r"javascript:", 369 | r"vbscript:", 370 | r"onload\s*=", 371 | r"onerror\s*=", 372 | r"onclick\s*=", 373 | r"\beval\s*\(", 374 | r"document\.", 375 | r"window\.", 376 | r"location\.", 377 | r"cookie", 378 | r"innerHTML", 379 | r"outerHTML", 380 | r"alert\s*\(", 381 | r"confirm\s*\(", 382 | r"prompt\s*\(", 383 | r"setTimeout\s*\(", 384 | r"setInterval\s*\(", 385 | r"Function\s*\(", 386 | r"constructor", 387 | r"prototype", 388 | r"__proto__", 389 | r"process\.", 390 | r"require\s*\(", 391 | r"import\s+", 392 | r"from\s+.*import", 393 | r"\.\./", 394 | r"file://", 395 | r"ftp://", 396 | r"data:", 397 | r"blob:", 398 | r"\\x[0-9a-fA-F]{2}", 399 | r"%[0-9a-fA-F]{2}", 400 | r"&#x[0-9a-fA-F]+;", 401 | r"&[a-zA-Z]+;", 402 | r"\$\{", 403 | r"#\{", 404 | r"<%", 405 | r"%>", 406 | r"{{", 407 | r"}}", 408 | r"\\\w+", 409 | r"\0", 410 | r"\r\n", 411 | r"\n\r", 412 | ] 413 | 414 | for pattern in dangerous_patterns: 415 | if re.search(pattern, filter, re.IGNORECASE): 416 | raise ValueError("Invalid filter content") 417 | 418 | if filter.count("'") % 2 != 0: 419 | raise ValueError("Unmatched quotes in filter") 420 | 421 | params["filter"] = filter.strip() 422 | if filter_lang: 423 | params["filter-lang"] = filter_lang 424 | 425 | elif query_attr and query_attr_value: 426 | if not re.match(r"^[a-zA-Z_][a-zA-Z0-9_]*$", query_attr): 427 | raise ValueError("Invalid field name") 428 | 429 | escaped_value = str(query_attr_value).replace("'", "''") 430 | params["filter"] = f"{query_attr} = '{escaped_value}'" 431 | if filter_lang: 432 | params["filter-lang"] = filter_lang 433 | 434 | if self.workflow_planner: 435 | valid_collections = set( 436 | self.workflow_planner.basic_collections_info.keys() 437 | ) 438 
| if collection_id not in valid_collections: 439 | return json.dumps( 440 | { 441 | "error": f"Invalid collection '{collection_id}'. Valid collections: {sorted(valid_collections)[:10]}...", 442 | "suggestion": "Call get_workflow_context() to see all available collections", 443 | } 444 | ) 445 | 446 | data = await self.api_client.make_request( 447 | "COLLECTION_FEATURES", params=params, path_params=[collection_id] 448 | ) 449 | 450 | return json.dumps(data) 451 | except ValueError as ve: 452 | error_response = {"error": f"Invalid input: {str(ve)}"} 453 | return json.dumps( 454 | self._add_retry_context(error_response, "search_features") 455 | ) 456 | except Exception as e: 457 | error_response = {"error": str(e)} 458 | return json.dumps( 459 | self._add_retry_context(error_response, "search_features") 460 | ) 461 | 462 | async def get_feature( 463 | self, 464 | collection_id: str, 465 | feature_id: str, 466 | crs: Optional[str] = None, 467 | ) -> str: 468 | """ 469 | Get a specific feature by ID. 470 | 471 | Args: 472 | collection_id: The collection ID 473 | feature_id: The feature ID 474 | crs: Coordinate reference system for the response 475 | 476 | Returns: 477 | JSON string with feature data 478 | """ 479 | try: 480 | params: Dict[str, str] = {} 481 | if crs: 482 | params["crs"] = crs 483 | 484 | data = await self.api_client.make_request( 485 | "COLLECTION_FEATURE_BY_ID", 486 | params=params, 487 | path_params=[collection_id, feature_id], 488 | ) 489 | 490 | return json.dumps(data) 491 | except Exception as e: 492 | error_response = {"error": f"Error getting feature: {str(e)}"} 493 | return json.dumps(self._add_retry_context(error_response, "get_feature")) 494 | 495 | async def get_linked_identifiers( 496 | self, 497 | identifier_type: str, 498 | identifier: str, 499 | feature_type: Optional[str] = None, 500 | ) -> str: 501 | """ 502 | Get linked identifiers for a specified identifier. 
503 | 504 | Args: 505 | identifier_type: The type of identifier (e.g., 'TOID', 'UPRN') 506 | identifier: The identifier value 507 | feature_type: Optional feature type to filter results 508 | 509 | Returns: 510 | JSON string with linked identifiers or filtered results 511 | """ 512 | try: 513 | data = await self.api_client.make_request( 514 | "LINKED_IDENTIFIERS", path_params=[identifier_type, identifier] 515 | ) 516 | 517 | if feature_type: 518 | # Filter results by feature type 519 | filtered_results = [] 520 | for item in data.get("results", []): 521 | if item.get("featureType") == feature_type: 522 | filtered_results.append(item) 523 | return json.dumps({"results": filtered_results}) 524 | 525 | return json.dumps(data) 526 | except Exception as e: 527 | error_response = {"error": str(e)} 528 | return json.dumps( 529 | self._add_retry_context(error_response, "get_linked_identifiers") 530 | ) 531 | 532 | async def get_bulk_features( 533 | self, 534 | collection_id: str, 535 | identifiers: List[str], 536 | query_by_attr: Optional[str] = None, 537 | ) -> str: 538 | """ 539 | Get multiple features in a single call. 
540 | 541 | Args: 542 | collection_id: The collection ID 543 | identifiers: List of feature identifiers 544 | query_by_attr: Attribute to query by (if not provided, assumes feature IDs) 545 | 546 | Returns: 547 | JSON string with features data 548 | """ 549 | try: 550 | tasks: List[Any] = [] 551 | for identifier in identifiers: 552 | if query_by_attr: 553 | task = self.search_features( 554 | collection_id=collection_id, 555 | query_attr=query_by_attr, 556 | query_attr_value=identifier, 557 | limit=1, 558 | ) 559 | else: 560 | task = self.get_feature(collection_id, identifier) 561 | 562 | tasks.append(task) 563 | 564 | results = await asyncio.gather(*tasks) 565 | 566 | parsed_results = [json.loads(result) for result in results] 567 | 568 | return json.dumps({"results": parsed_results}) 569 | except Exception as e: 570 | error_response = {"error": str(e)} 571 | return json.dumps( 572 | self._add_retry_context(error_response, "get_bulk_features") 573 | ) 574 | 575 | async def get_bulk_linked_features( 576 | self, 577 | identifier_type: str, 578 | identifiers: List[str], 579 | feature_type: Optional[str] = None, 580 | ) -> str: 581 | """ 582 | Get linked features for multiple identifiers in a single call. 
583 | 584 | Args: 585 | identifier_type: The type of identifier (e.g., 'TOID', 'UPRN') 586 | identifiers: List of identifier values 587 | feature_type: Optional feature type to filter results 588 | 589 | Returns: 590 | JSON string with linked features data 591 | """ 592 | try: 593 | tasks = [ 594 | self.get_linked_identifiers(identifier_type, identifier, feature_type) 595 | for identifier in identifiers 596 | ] 597 | 598 | results = await asyncio.gather(*tasks) 599 | 600 | parsed_results = [json.loads(result) for result in results] 601 | 602 | return json.dumps({"results": parsed_results}) 603 | except Exception as e: 604 | error_response = {"error": str(e)} 605 | return json.dumps( 606 | self._add_retry_context(error_response, "get_bulk_linked_features") 607 | ) 608 | 609 | async def get_prompt_templates( 610 | self, 611 | category: Optional[str] = None, 612 | ) -> str: 613 | """ 614 | Get standard prompt templates for interacting with this service. 615 | 616 | Args: 617 | category: Optional category of templates to return 618 | (general, collections, features, linked_identifiers) 619 | 620 | Returns: 621 | JSON string containing prompt templates 622 | """ 623 | if category and category in PROMPT_TEMPLATES: 624 | return json.dumps({category: PROMPT_TEMPLATES[category]}) 625 | 626 | return json.dumps(PROMPT_TEMPLATES) 627 | 628 | async def fetch_detailed_collections(self, collection_ids: str) -> str: 629 | """ 630 | Fetch detailed queryables for specific collections mentioned in LLM workflow plan. 631 | 632 | This is mainly to reduce the size of the context for the LLM. 633 | 634 | Only fetch what you really need. 635 | 636 | Args: 637 | collection_ids: Comma-separated list of collection IDs (e.g., "lus-fts-site-1,trn-ntwk-street-1") 638 | 639 | Returns: 640 | JSON string with detailed queryables for the specified collections 641 | """ 642 | try: 643 | if not self.workflow_planner: 644 | return json.dumps( 645 | { 646 | "error": "Workflow planner not initialized. 
Call get_workflow_context() first." 647 | } 648 | ) 649 | 650 | requested_collections = [cid.strip() for cid in collection_ids.split(",")] 651 | 652 | valid_collections = set(self.workflow_planner.basic_collections_info.keys()) 653 | invalid_collections = [ 654 | cid for cid in requested_collections if cid not in valid_collections 655 | ] 656 | 657 | if invalid_collections: 658 | return json.dumps( 659 | { 660 | "error": f"Invalid collection IDs: {invalid_collections}", 661 | "valid_collections": sorted(valid_collections), 662 | } 663 | ) 664 | 665 | cached_collections = [ 666 | cid 667 | for cid in requested_collections 668 | if cid in self.workflow_planner.detailed_collections_cache 669 | ] 670 | 671 | collections_to_fetch = [ 672 | cid 673 | for cid in requested_collections 674 | if cid not in self.workflow_planner.detailed_collections_cache 675 | ] 676 | 677 | if collections_to_fetch: 678 | logger.info(f"Fetching detailed queryables for: {collections_to_fetch}") 679 | detailed_queryables = ( 680 | await self.api_client.fetch_collections_queryables( 681 | collections_to_fetch 682 | ) 683 | ) 684 | 685 | for coll_id, queryables in detailed_queryables.items(): 686 | self.workflow_planner.detailed_collections_cache[coll_id] = { 687 | "id": queryables.id, 688 | "title": queryables.title, 689 | "description": queryables.description, 690 | "all_queryables": queryables.all_queryables, 691 | "enum_queryables": queryables.enum_queryables, 692 | "has_enum_filters": queryables.has_enum_filters, 693 | "total_queryables": queryables.total_queryables, 694 | "enum_count": queryables.enum_count, 695 | } 696 | 697 | context = self.workflow_planner.get_detailed_context(requested_collections) 698 | 699 | return json.dumps( 700 | { 701 | "success": True, 702 | "collections_processed": requested_collections, 703 | "collections_fetched_from_api": collections_to_fetch, 704 | "collections_from_cache": cached_collections, 705 | "detailed_collections": context["available_collections"], 
706 |                     "message": f"Detailed queryables now available for: {', '.join(requested_collections)}",
707 |                 }
708 |             )
709 | 
710 |         except Exception as e:
711 |             logger.error(f"Error fetching detailed collections: {e}")
712 |             return json.dumps(
713 |                 {"error": str(e), "suggestion": "Check collection IDs and try again"}
714 |             )
715 | 
716 |     async def get_routing_data(
717 |         self,
718 |         bbox: Optional[str] = None,
719 |         limit: int = 100,
720 |         include_nodes: bool = True,
721 |         include_edges: bool = True,
722 |         build_network: bool = True,
723 |     ) -> str:
724 |         """
725 |         Get routing data - builds network and returns nodes/edges as flat tables.
726 | 
727 |         Args:
728 |             bbox: Optional bounding box (format: "minx,miny,maxx,maxy")
729 |             limit: Maximum number of road links to process (default: 100)
730 |             include_nodes: Whether to include nodes in response (default: True)
731 |             include_edges: Whether to include edges in response (default: True)
732 |             build_network: Whether to build network first (default: True)
733 | 
734 |         Returns:
735 |             JSON string with routing network data
736 |         """
737 |         try:
738 |             result = {}
739 | 
740 |             if build_network:
741 |                 build_result = await self.routing_service.build_routing_network(
742 |                     bbox, limit
743 |                 )
744 |                 result["build_status"] = build_result
745 | 
746 |                 if build_result.get("status") != "success":
747 |                     return json.dumps(result)
748 | 
749 |             if include_nodes:
750 |                 nodes_result = self.routing_service.get_flat_nodes()
751 |                 result["nodes"] = nodes_result.get("nodes", [])
752 | 
753 |             if include_edges:
754 |                 edges_result = self.routing_service.get_flat_edges()
755 |                 result["edges"] = edges_result.get("edges", [])
756 | 
757 |             summary = self.routing_service.get_network_info()
758 |             result["summary"] = summary.get("network", {})
759 |             result["status"] = "success"
760 | 
761 |             return json.dumps(result)
762 |         except Exception as e:
763 |             return json.dumps({"error": str(e)})
764 | 
```
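The `_require_workflow_context` wrapper in `os_service.py` gates every tool behind a one-time context fetch. A minimal standalone sketch of that gating pattern (the `Gate` class and its attribute names are illustrative, not part of the module):

```python
import asyncio
import functools
import json


class Gate:
    """Illustrative sketch of the context-gating decorator in os_service.py."""

    # Tools allowed through before the workflow context exists
    SKIP = {"get_workflow_context", "hello_world", "check_api_key"}

    def __init__(self) -> None:
        self.context_ready = False  # stands in for workflow_planner being initialised

    def require_context(self, func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            if not self.context_ready and func.__name__ not in self.SKIP:
                # Block the call and name the gated tool, as the real wrapper does
                return json.dumps(
                    {"error": "WORKFLOW CONTEXT REQUIRED", "blocked_tool": func.__name__}
                )
            return await func(*args, **kwargs)

        return wrapper


gate = Gate()


@gate.require_context
async def search_features() -> str:
    return json.dumps({"results": []})


blocked = asyncio.run(search_features())  # gated: context not fetched yet
gate.context_ready = True
allowed = asyncio.run(search_features())  # passes through once context exists
```

The real implementation keys the gate on `self.workflow_planner is None`, so fetching the context once unlocks every subsequent tool call.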
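The filter sanitisation inside `search_features` can be exercised in isolation. A minimal sketch of the same three checks (length cap, dangerous-pattern scan, balanced single quotes) using a small subset of the pattern list; `validate_cql_filter` is a hypothetical helper, not a function in the module:

```python
import re

# Subset of the dangerous_patterns list screened by search_features;
# the full list in os_service.py is considerably longer.
DANGEROUS_PATTERNS = [
    r";\s*--", r"\bUNION\b", r"\bSELECT\b", r"\bDROP\b",
    r"<script\b", r"\$\{", r"\.\./",
]


def validate_cql_filter(filter_text: str) -> str:
    """Apply the same checks search_features performs on its filter argument."""
    if len(filter_text) > 1000:
        raise ValueError("Filter too long")
    for pattern in DANGEROUS_PATTERNS:
        if re.search(pattern, filter_text, re.IGNORECASE):
            raise ValueError("Invalid filter content")
    if filter_text.count("'") % 2 != 0:
        raise ValueError("Unmatched quotes in filter")
    return filter_text.strip()


ok = validate_cql_filter("oslandusetertiarygroup = 'Cinema'")  # accepted

try:
    validate_cql_filter("usrn = 1; DROP TABLE features")  # SQL keyword rejected
    rejected = False
except ValueError:
    rejected = True
```

Note this is a deny-list, so it rejects some legitimate CQL2 text too (the real code pairs it with the `query_attr` path, which escapes values by doubling single quotes instead).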
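`get_bulk_features` and `get_bulk_linked_features` both use the same fan-out shape: build one coroutine per identifier, run them concurrently with `asyncio.gather`, then parse each JSON result. A sketch of that shape with a stubbed fetch (`fetch_one`/`fetch_bulk` are hypothetical stand-ins for the API-backed methods):

```python
import asyncio
import json


async def fetch_one(collection_id: str, feature_id: str) -> str:
    """Stand-in for get_feature; the real method calls the OS NGD API."""
    await asyncio.sleep(0)  # simulate awaiting I/O
    return json.dumps({"collection": collection_id, "id": feature_id})


async def fetch_bulk(collection_id: str, identifiers: list[str]) -> dict:
    # One task per identifier, gathered concurrently - same shape as get_bulk_features
    tasks = [fetch_one(collection_id, fid) for fid in identifiers]
    results = await asyncio.gather(*tasks)
    return {"results": [json.loads(r) for r in results]}


bulk = asyncio.run(fetch_bulk("lus-fts-site-1", ["A1", "A2"]))
```

`asyncio.gather` preserves input order, so `results[i]` always corresponds to `identifiers[i]`, which is what lets the bulk tools return a flat list without tagging each entry.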
-------------------------------------------------------------------------------- /src/config_docs/ogcapi-features-1.yaml: -------------------------------------------------------------------------------- ```yaml 1 | openapi: 3.0.2 2 | info: 3 | title: "Building Blocks specified in OGC API - Features - Part 1: Core" 4 | description: |- 5 | Common components used in the 6 | [OGC standard "OGC API - Features - Part 1: Core"](http://docs.opengeospatial.org/is/17-069r3/17-069r3.html). 7 | 8 | OGC API - Features - Part 1: Core 1.0 is an OGC Standard. 9 | Copyright (c) 2019 Open Geospatial Consortium. 10 | To obtain additional rights of use, visit http://www.opengeospatial.org/legal/ . 11 | 12 | This document is also available on 13 | [OGC](http://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/ogcapi-features-1.yaml). 14 | version: '1.0.0' 15 | contact: 16 | name: Clemens Portele 17 | email: [email protected] 18 | license: 19 | name: OGC License 20 | url: 'http://www.opengeospatial.org/legal/' 21 | components: 22 | parameters: 23 | bbox: 24 | name: bbox 25 | in: query 26 | description: |- 27 | Only features that have a geometry that intersects the bounding box are selected. 28 | The bounding box is provided as four or six numbers, depending on whether the 29 | coordinate reference system includes a vertical axis (height or depth): 30 | 31 | * Lower left corner, coordinate axis 1 32 | * Lower left corner, coordinate axis 2 33 | * Minimum value, coordinate axis 3 (optional) 34 | * Upper right corner, coordinate axis 1 35 | * Upper right corner, coordinate axis 2 36 | * Maximum value, coordinate axis 3 (optional) 37 | 38 | If the value consists of four numbers, the coordinate reference system is 39 | WGS 84 longitude/latitude (http://www.opengis.net/def/crs/OGC/1.3/CRS84) 40 | unless a different coordinate reference system is specified in the parameter `bbox-crs`. 
41 | 42 | If the value consists of six numbers, the coordinate reference system is WGS 84 43 | longitude/latitude/ellipsoidal height (http://www.opengis.net/def/crs/OGC/0/CRS84h) 44 | unless a different coordinate reference system is specified in the parameter `bbox-crs`. 45 | 46 | The query parameter `bbox-crs` is specified in OGC API - Features - Part 2: Coordinate 47 | Reference Systems by Reference. 48 | 49 | For WGS 84 longitude/latitude the values are in most cases the sequence of 50 | minimum longitude, minimum latitude, maximum longitude and maximum latitude. 51 | However, in cases where the box spans the antimeridian the first value 52 | (west-most box edge) is larger than the third value (east-most box edge). 53 | 54 | If the vertical axis is included, the third and the sixth number are 55 | the bottom and the top of the 3-dimensional bounding box. 56 | 57 | If a feature has multiple spatial geometry properties, it is the decision of the 58 | server whether only a single spatial geometry property is used to determine 59 | the extent or all relevant geometries. 60 | required: false 61 | schema: 62 | type: array 63 | oneOf: 64 | - minItems: 4 65 | maxItems: 4 66 | - minItems: 6 67 | maxItems: 6 68 | items: 69 | type: number 70 | style: form 71 | explode: false 72 | collectionId: 73 | name: collectionId 74 | in: path 75 | description: local identifier of a collection 76 | required: true 77 | schema: 78 | type: string 79 | datetime: 80 | name: datetime 81 | in: query 82 | description: |- 83 | Either a date-time or an interval. Date and time expressions adhere to RFC 3339. 84 | Intervals may be bounded or half-bounded (double-dots at start or end). 85 | 86 | Examples: 87 | 88 | * A date-time: "2018-02-12T23:20:50Z" 89 | * A bounded interval: "2018-02-12T00:00:00Z/2018-03-18T12:31:12Z" 90 | * Half-bounded intervals: "2018-02-12T00:00:00Z/.." 
or "../2018-03-18T12:31:12Z" 91 | 92 | Only features that have a temporal property that intersects the value of 93 | `datetime` are selected. 94 | 95 | If a feature has multiple temporal properties, it is the decision of the 96 | server whether only a single temporal property is used to determine 97 | the extent or all relevant temporal properties. 98 | required: false 99 | schema: 100 | type: string 101 | style: form 102 | explode: false 103 | featureId: 104 | name: featureId 105 | in: path 106 | description: local identifier of a feature 107 | required: true 108 | schema: 109 | type: string 110 | limit: 111 | name: limit 112 | in: query 113 | description: |- 114 | The optional limit parameter limits the number of items that are presented in the response document. 115 | 116 | Only items are counted that are on the first level of the collection in the response document. 117 | Nested objects contained within the explicitly requested items shall not be counted. 118 | 119 | Minimum = 1. Maximum = 10000. Default = 10. 120 | required: false 121 | schema: 122 | type: integer 123 | minimum: 1 124 | maximum: 10000 125 | default: 10 126 | style: form 127 | explode: false 128 | schemas: 129 | collection: 130 | type: object 131 | required: 132 | - id 133 | - links 134 | properties: 135 | id: 136 | description: identifier of the collection used, for example, in URIs 137 | type: string 138 | example: address 139 | title: 140 | description: human readable title of the collection 141 | type: string 142 | example: address 143 | description: 144 | description: a description of the features in the collection 145 | type: string 146 | example: An address. 
147 | links: 148 | type: array 149 | items: 150 | $ref: "#/components/schemas/link" 151 | example: 152 | - href: http://data.example.com/buildings 153 | rel: item 154 | - href: http://example.com/concepts/buildings.html 155 | rel: describedby 156 | type: text/html 157 | extent: 158 | $ref: "#/components/schemas/extent" 159 | itemType: 160 | description: indicator about the type of the items in the collection (the default value is 'feature'). 161 | type: string 162 | default: feature 163 | crs: 164 | description: the list of coordinate reference systems supported by the service 165 | type: array 166 | items: 167 | type: string 168 | default: 169 | - http://www.opengis.net/def/crs/OGC/1.3/CRS84 170 | example: 171 | - http://www.opengis.net/def/crs/OGC/1.3/CRS84 172 | - http://www.opengis.net/def/crs/EPSG/0/4326 173 | collections: 174 | type: object 175 | required: 176 | - links 177 | - collections 178 | properties: 179 | links: 180 | type: array 181 | items: 182 | $ref: "#/components/schemas/link" 183 | collections: 184 | type: array 185 | items: 186 | $ref: "#/components/schemas/collection" 187 | confClasses: 188 | type: object 189 | required: 190 | - conformsTo 191 | properties: 192 | conformsTo: 193 | type: array 194 | items: 195 | type: string 196 | exception: 197 | type: object 198 | description: |- 199 | Information about the exception: an error code plus an optional description. 200 | required: 201 | - code 202 | properties: 203 | code: 204 | type: string 205 | description: 206 | type: string 207 | extent: 208 | type: object 209 | description: |- 210 | The extent of the features in the collection. In the Core only spatial and temporal 211 | extents are specified. Extensions may add additional members to represent other 212 | extents, for example, thermal or pressure ranges. 213 | properties: 214 | spatial: 215 | description: |- 216 | The spatial extent of the features in the collection. 
217 | type: object 218 | properties: 219 | bbox: 220 | description: |- 221 | One or more bounding boxes that describe the spatial extent of the dataset. 222 | In the Core only a single bounding box is supported. Extensions may support 223 | additional areas. If multiple areas are provided, the union of the bounding 224 | boxes describes the spatial extent. 225 | type: array 226 | minItems: 1 227 | items: 228 | description: |- 229 | Each bounding box is provided as four or six numbers, depending on 230 | whether the coordinate reference system includes a vertical axis 231 | (height or depth): 232 | 233 | * Lower left corner, coordinate axis 1 234 | * Lower left corner, coordinate axis 2 235 | * Minimum value, coordinate axis 3 (optional) 236 | * Upper right corner, coordinate axis 1 237 | * Upper right corner, coordinate axis 2 238 | * Maximum value, coordinate axis 3 (optional) 239 | 240 | The coordinate reference system of the values is WGS 84 longitude/latitude 241 | (http://www.opengis.net/def/crs/OGC/1.3/CRS84) unless a different coordinate 242 | reference system is specified in `crs`. 243 | 244 | For WGS 84 longitude/latitude the values are in most cases the sequence of 245 | minimum longitude, minimum latitude, maximum longitude and maximum latitude. 246 | However, in cases where the box spans the antimeridian the first value 247 | (west-most box edge) is larger than the third value (east-most box edge). 248 | 249 | If the vertical axis is included, the third and the sixth number are 250 | the bottom and the top of the 3-dimensional bounding box. 251 | 252 | If a feature has multiple spatial geometry properties, it is the decision of the 253 | server whether only a single spatial geometry property is used to determine 254 | the extent or all relevant geometries. 
255 | type: array 256 | oneOf: 257 | - minItems: 4 258 | maxItems: 4 259 | - minItems: 6 260 | maxItems: 6 261 | items: 262 | type: number 263 | example: 264 | - -180 265 | - -90 266 | - 180 267 | - 90 268 | crs: 269 | description: |- 270 | Coordinate reference system of the coordinates in the spatial extent 271 | (property `bbox`). The default reference system is WGS 84 longitude/latitude. 272 | In the Core this is the only supported coordinate reference system. 273 | Extensions may support additional coordinate reference systems and add 274 | additional enum values. 275 | type: string 276 | enum: 277 | - 'http://www.opengis.net/def/crs/OGC/1.3/CRS84' 278 | default: 'http://www.opengis.net/def/crs/OGC/1.3/CRS84' 279 | temporal: 280 | description: |- 281 | The temporal extent of the features in the collection. 282 | type: object 283 | properties: 284 | interval: 285 | description: |- 286 | One or more time intervals that describe the temporal extent of the dataset. 287 | The value `null` is supported and indicates an unbounded interval end. 288 | In the Core only a single time interval is supported. Extensions may support 289 | multiple intervals. If multiple intervals are provided, the union of the 290 | intervals describes the temporal extent. 291 | type: array 292 | minItems: 1 293 | items: 294 | description: |- 295 | Begin and end times of the time interval. The timestamps are in the 296 | temporal coordinate reference system specified in `trs`. By default 297 | this is the Gregorian calendar. 298 | type: array 299 | minItems: 2 300 | maxItems: 2 301 | items: 302 | type: string 303 | format: date-time 304 | nullable: true 305 | example: 306 | - '2011-11-11T12:22:11Z' 307 | - null 308 | trs: 309 | description: |- 310 | Coordinate reference system of the coordinates in the temporal extent 311 | (property `interval`). The default reference system is the Gregorian calendar. 312 | In the Core this is the only supported temporal coordinate reference system. 
                Extensions may support additional temporal coordinate reference systems and add
                additional enum values.
              type: string
              enum:
                - 'http://www.opengis.net/def/uom/ISO-8601/0/Gregorian'
              default: 'http://www.opengis.net/def/uom/ISO-8601/0/Gregorian'
    featureCollectionGeoJSON:
      type: object
      required:
        - type
        - features
      properties:
        type:
          type: string
          enum:
            - FeatureCollection
        features:
          type: array
          items:
            $ref: "#/components/schemas/featureGeoJSON"
        links:
          type: array
          items:
            $ref: "#/components/schemas/link"
        timeStamp:
          $ref: "#/components/schemas/timeStamp"
        numberMatched:
          $ref: "#/components/schemas/numberMatched"
        numberReturned:
          $ref: "#/components/schemas/numberReturned"
    featureGeoJSON:
      type: object
      required:
        - type
        - geometry
        - properties
      properties:
        type:
          type: string
          enum:
            - Feature
        geometry:
          $ref: "#/components/schemas/geometryGeoJSON"
        properties:
          type: object
          nullable: true
        id:
          oneOf:
            - type: string
            - type: integer
        links:
          type: array
          items:
            $ref: "#/components/schemas/link"
    geometryGeoJSON:
      oneOf:
        - $ref: "#/components/schemas/pointGeoJSON"
        - $ref: "#/components/schemas/multipointGeoJSON"
        - $ref: "#/components/schemas/linestringGeoJSON"
        - $ref: "#/components/schemas/multilinestringGeoJSON"
        - $ref: "#/components/schemas/polygonGeoJSON"
        - $ref: "#/components/schemas/multipolygonGeoJSON"
        - $ref: "#/components/schemas/geometrycollectionGeoJSON"
    geometrycollectionGeoJSON:
      type: object
      required:
        - type
        - geometries
      properties:
        type:
          type: string
          enum:
            - GeometryCollection
        geometries:
          type: array
          items:
            $ref:
"#/components/schemas/geometryGeoJSON" 390 | landingPage: 391 | type: object 392 | required: 393 | - links 394 | properties: 395 | title: 396 | type: string 397 | example: Buildings in Bonn 398 | description: 399 | type: string 400 | example: Access to data about buildings in the city of Bonn via a Web API that conforms to the OGC API Features specification. 401 | links: 402 | type: array 403 | items: 404 | $ref: "#/components/schemas/link" 405 | linestringGeoJSON: 406 | type: object 407 | required: 408 | - type 409 | - coordinates 410 | properties: 411 | type: 412 | type: string 413 | enum: 414 | - LineString 415 | coordinates: 416 | type: array 417 | minItems: 2 418 | items: 419 | type: array 420 | minItems: 2 421 | items: 422 | type: number 423 | link: 424 | type: object 425 | required: 426 | - href 427 | properties: 428 | href: 429 | type: string 430 | example: http://data.example.com/buildings/123 431 | rel: 432 | type: string 433 | example: alternate 434 | type: 435 | type: string 436 | example: application/geo+json 437 | hreflang: 438 | type: string 439 | example: en 440 | title: 441 | type: string 442 | example: Trierer Strasse 70, 53115 Bonn 443 | length: 444 | type: integer 445 | multilinestringGeoJSON: 446 | type: object 447 | required: 448 | - type 449 | - coordinates 450 | properties: 451 | type: 452 | type: string 453 | enum: 454 | - MultiLineString 455 | coordinates: 456 | type: array 457 | items: 458 | type: array 459 | minItems: 2 460 | items: 461 | type: array 462 | minItems: 2 463 | items: 464 | type: number 465 | multipointGeoJSON: 466 | type: object 467 | required: 468 | - type 469 | - coordinates 470 | properties: 471 | type: 472 | type: string 473 | enum: 474 | - MultiPoint 475 | coordinates: 476 | type: array 477 | items: 478 | type: array 479 | minItems: 2 480 | items: 481 | type: number 482 | multipolygonGeoJSON: 483 | type: object 484 | required: 485 | - type 486 | - coordinates 487 | properties: 488 | type: 489 | type: string 490 | enum: 
            - MultiPolygon
        coordinates:
          type: array
          items:
            type: array
            items:
              type: array
              minItems: 4
              items:
                type: array
                minItems: 2
                items:
                  type: number
    numberMatched:
      description: |-
        The number of features of the feature type that match the selection
        parameters like `bbox`.
      type: integer
      minimum: 0
      example: 127
    numberReturned:
      description: |-
        The number of features in the feature collection.

        A server may omit this information in a response, if the information
        about the number of features is not known or difficult to compute.

        If the value is provided, the value shall be identical to the number
        of items in the "features" array.
      type: integer
      minimum: 0
      example: 10
    pointGeoJSON:
      type: object
      required:
        - type
        - coordinates
      properties:
        type:
          type: string
          enum:
            - Point
        coordinates:
          type: array
          minItems: 2
          items:
            type: number
    polygonGeoJSON:
      type: object
      required:
        - type
        - coordinates
      properties:
        type:
          type: string
          enum:
            - Polygon
        coordinates:
          type: array
          items:
            type: array
            minItems: 4
            items:
              type: array
              minItems: 2
              items:
                type: number
    timeStamp:
      description: This property indicates the time and date when the response was generated.
      type: string
      format: date-time
      example: '2017-08-17T08:05:32Z'
  responses:
    LandingPage:
      description: |-
        The landing page provides links to the API definition
        (link relations `service-desc` and `service-doc`),
        the Conformance declaration (path `/conformance`,
        link relation `conformance`), and the Feature
        Collections (path `/collections`, link relation
        `data`).
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/landingPage'
          example:
            title: Buildings in Bonn
            description: Access to data about buildings in the city of Bonn via a Web API that conforms to the OGC API Features specification.
            links:
              - href: 'http://data.example.org/'
                rel: self
                type: application/json
                title: this document
              - href: 'http://data.example.org/api'
                rel: service-desc
                type: application/vnd.oai.openapi+json;version=3.0
                title: the API definition
              - href: 'http://data.example.org/api.html'
                rel: service-doc
                type: text/html
                title: the API documentation
              - href: 'http://data.example.org/conformance'
                rel: conformance
                type: application/json
                title: OGC API conformance classes implemented by this server
              - href: 'http://data.example.org/collections'
                rel: data
                type: application/json
                title: Information about the feature collections
        text/html:
          schema:
            type: string
    ConformanceDeclaration:
      description: |-
        The URIs of all conformance classes supported by the server.

        To support "generic" clients that want to access multiple
        OGC API Features implementations - and not "just" a specific
        API / server, the server declares the conformance
        classes it implements and conforms to.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/confClasses'
          example:
            conformsTo:
              - 'http://www.opengis.net/spec/ogcapi-features-1/1.0/conf/core'
              - 'http://www.opengis.net/spec/ogcapi-features-1/1.0/conf/oas30'
              - 'http://www.opengis.net/spec/ogcapi-features-1/1.0/conf/html'
              - 'http://www.opengis.net/spec/ogcapi-features-1/1.0/conf/geojson'
        text/html:
          schema:
            type: string
    Collections:
      description: |-
        The feature collections shared by this API.

        The dataset is organized as one or more feature collections. This resource
        provides information about and access to the collections.

        The response contains the list of collections. For each collection, a link
        to the items in the collection (path `/collections/{collectionId}/items`,
        link relation `items`) is provided, as well as key information about the
        collection. This information includes:

        * A local identifier for the collection that is unique for the dataset;
        * A list of coordinate reference systems (CRS) in which geometries may be returned by the server. The first CRS is the default coordinate reference system (the default is always WGS 84 with axis order longitude/latitude);
        * An optional title and description for the collection;
        * An optional extent that can be used to provide an indication of the spatial and temporal extent of the collection - typically derived from the data;
        * An optional indicator about the type of the items in the collection (the default value, if the indicator is not provided, is 'feature').
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/collections'
          example:
            links:
              - href: 'http://data.example.org/collections.json'
                rel: self
                type: application/json
                title: this document
              - href: 'http://data.example.org/collections.html'
                rel: alternate
                type: text/html
                title: this document as HTML
              - href: 'http://schemas.example.org/1.0/buildings.xsd'
                rel: describedby
                type: application/xml
                title: GML application schema for Acme Corporation building data
              - href: 'http://download.example.org/buildings.gpkg'
                rel: enclosure
                type: application/geopackage+sqlite3
                title: Bulk download (GeoPackage)
                length: 472546
            collections:
              - id: buildings
                title: Buildings
                description: Buildings in the city of Bonn.
                extent:
                  spatial:
                    bbox:
                      - - 7.01
                        - 50.63
                        - 7.22
                        - 50.78
                  temporal:
                    interval:
                      - - '2010-02-15T12:34:56Z'
                        - null
                links:
                  - href: 'http://data.example.org/collections/buildings/items'
                    rel: items
                    type: application/geo+json
                    title: Buildings
                  - href: 'http://data.example.org/collections/buildings/items.html'
                    rel: items
                    type: text/html
                    title: Buildings
                  - href: 'https://creativecommons.org/publicdomain/zero/1.0/'
                    rel: license
                    type: text/html
                    title: CC0-1.0
                  - href: 'https://creativecommons.org/publicdomain/zero/1.0/rdf'
                    rel: license
                    type: application/rdf+xml
                    title: CC0-1.0
        text/html:
          schema:
            type: string
    Collection:
      description: |-
        Information about the feature collection with id `collectionId`.

        The response contains a link to the items in the collection
        (path `/collections/{collectionId}/items`, link relation `items`)
        as well as key information about the collection. This information
        includes:

        * A local identifier for the collection that is unique for the dataset;
        * A list of coordinate reference systems (CRS) in which geometries may be returned by the server. The first CRS is the default coordinate reference system (the default is always WGS 84 with axis order longitude/latitude);
        * An optional title and description for the collection;
        * An optional extent that can be used to provide an indication of the spatial and temporal extent of the collection - typically derived from the data;
        * An optional indicator about the type of the items in the collection (the default value, if the indicator is not provided, is 'feature').
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/collection'
          example:
            id: buildings
            title: Buildings
            description: Buildings in the city of Bonn.
            extent:
              spatial:
                bbox:
                  - - 7.01
                    - 50.63
                    - 7.22
                    - 50.78
              temporal:
                interval:
                  - - '2010-02-15T12:34:56Z'
                    - null
            links:
              - href: 'http://data.example.org/collections/buildings/items'
                rel: items
                type: application/geo+json
                title: Buildings
              - href: 'http://data.example.org/collections/buildings/items.html'
                rel: items
                type: text/html
                title: Buildings
              - href: 'https://creativecommons.org/publicdomain/zero/1.0/'
                rel: license
                type: text/html
                title: CC0-1.0
              - href: 'https://creativecommons.org/publicdomain/zero/1.0/rdf'
                rel: license
                type: application/rdf+xml
                title: CC0-1.0
        text/html:
          schema:
            type: string
    Features:
      description: |-
        The response is a document consisting of features in the collection.
        The features included in the response are determined by the server
        based on the query parameters of the request. To support access to
        larger collections without overloading the client, the API supports
        paged access with links to the next page, if more features are selected
        than the page size.

        The `bbox` and `datetime` parameters can be used to select only a
        subset of the features in the collection (the features that are in the
        bounding box or time interval). The `bbox` parameter matches all features
        in the collection that are not associated with a location, too. The
        `datetime` parameter matches all features in the collection that are
        not associated with a time stamp or interval, too.

        The `limit` parameter may be used to control the subset of the
        selected features that should be returned in the response, the page size.
        Each page may include information about the number of selected and
        returned features (`numberMatched` and `numberReturned`) as well as
        links to support paging (link relation `next`).
      content:
        application/geo+json:
          schema:
            $ref: '#/components/schemas/featureCollectionGeoJSON'
          example:
            type: FeatureCollection
            links:
              - href: 'http://data.example.com/collections/buildings/items.json'
                rel: self
                type: application/geo+json
                title: this document
              - href: 'http://data.example.com/collections/buildings/items.html'
                rel: alternate
                type: text/html
                title: this document as HTML
              - href: 'http://data.example.com/collections/buildings/items.json&offset=10&limit=2'
                rel: next
                type: application/geo+json
                title: next page
            timeStamp: '2018-04-03T14:52:23Z'
            numberMatched: 123
            numberReturned: 2
            features:
              - type: Feature
                id: '123'
                geometry:
                  type: Polygon
                  coordinates:
                    - ...
                properties:
                  function: residential
                  floors: '2'
                  lastUpdate: '2015-08-01T12:34:56Z'
              - type: Feature
                id: '132'
                geometry:
                  type: Polygon
                  coordinates:
                    - ...
                properties:
                  function: public use
                  floors: '10'
                  lastUpdate: '2013-12-03T10:15:37Z'
        text/html:
          schema:
            type: string
    Feature:
      description: |-
        Fetch the feature with id `featureId` in the feature collection
        with id `collectionId`.
      content:
        application/geo+json:
          schema:
            $ref: '#/components/schemas/featureGeoJSON'
          example:
            type: Feature
            links:
              - href: 'http://data.example.com/id/building/123'
                rel: canonical
                title: canonical URI of the building
              - href: 'http://data.example.com/collections/buildings/items/123.json'
                rel: self
                type: application/geo+json
                title: this document
              - href: 'http://data.example.com/collections/buildings/items/123.html'
                rel: alternate
                type: text/html
                title: this document as HTML
              - href: 'http://data.example.com/collections/buildings'
                rel: collection
                type: application/geo+json
                title: the collection document
            id: '123'
            geometry:
              type: Polygon
              coordinates:
                - ...
            properties:
              function: residential
              floors: '2'
              lastUpdate: '2015-08-01T12:34:56Z'
        text/html:
          schema:
            type: string
    InvalidParameter:
      description: |-
        A query parameter has an invalid value.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/exception'
        text/html:
          schema:
            type: string
    NotFound:
      description: |-
        The requested resource does not exist on the server. For example, a path parameter had an incorrect value.
    NotAcceptable:
      description: |-
        Content negotiation failed. For example, the `Accept` header submitted in the request did not support any of the media types supported by the server for the requested resource.
    ServerError:
      description: |-
        A server error occurred.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/exception'
        text/html:
          schema:
            type: string
```
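The `extent.spatial.bbox` schema above allows each bounding box to carry four or six numbers, with the vertical axis interleaved as the third and sixth values, and notes that a box spanning the antimeridian has its west-most edge larger than its east-most edge. As a rough sketch of how a client of this API might normalize that shape (the `SpatialExtent` type and `parse_bbox`/`crosses_antimeridian` helpers are illustrative, not part of this repository):

```python
from typing import NamedTuple, Optional


class SpatialExtent(NamedTuple):
    """A bbox from the `extent.spatial.bbox` schema, normalized to named fields."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float
    min_z: Optional[float] = None  # only present for 6-number boxes
    max_z: Optional[float] = None


def parse_bbox(bbox: list[float]) -> SpatialExtent:
    """Accept the 4- or 6-number form described by the schema.

    4 numbers: [min_lon, min_lat, max_lon, max_lat]
    6 numbers: the vertical axis appears as elements 3 and 6.
    """
    if len(bbox) == 4:
        return SpatialExtent(bbox[0], bbox[1], bbox[2], bbox[3])
    if len(bbox) == 6:
        return SpatialExtent(bbox[0], bbox[1], bbox[3], bbox[4], bbox[2], bbox[5])
    raise ValueError(f"bbox must have 4 or 6 numbers, got {len(bbox)}")


def crosses_antimeridian(extent: SpatialExtent) -> bool:
    # Per the schema text, a box spanning the antimeridian has its
    # west-most edge (first value) larger than its east-most edge (third).
    return extent.min_lon > extent.max_lon
```

For the Bonn example in the `Collections` response, `parse_bbox([7.01, 50.63, 7.22, 50.78])` yields a plain WGS 84 box with no vertical axis, while a box such as `[170, -10, -170, 10]` is flagged as antimeridian-crossing rather than rejected as inverted.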