genomoncology/biomcp # codebase.md

This is page 5 of 19. Use http://codebase.md/genomoncology/biomcp?lines=true&page={x} to view the full context.

# Directory Structure

```
├── .github
│   ├── actions
│   │   └── setup-python-env
│   │       └── action.yml
│   ├── dependabot.yml
│   └── workflows
│       ├── ci.yml
│       ├── deploy-docs.yml
│       ├── main.yml.disabled
│       ├── on-release-main.yml
│       └── validate-codecov-config.yml
├── .gitignore
├── .pre-commit-config.yaml
├── BIOMCP_DATA_FLOW.md
├── CHANGELOG.md
├── CNAME
├── codecov.yaml
├── docker-compose.yml
├── Dockerfile
├── docs
│   ├── apis
│   │   ├── error-codes.md
│   │   ├── overview.md
│   │   └── python-sdk.md
│   ├── assets
│   │   ├── biomcp-cursor-locations.png
│   │   ├── favicon.ico
│   │   ├── icon.png
│   │   ├── logo.png
│   │   ├── mcp_architecture.txt
│   │   └── remote-connection
│   │       ├── 00_connectors.png
│   │       ├── 01_add_custom_connector.png
│   │       ├── 02_connector_enabled.png
│   │       ├── 03_connect_to_biomcp.png
│   │       ├── 04_select_google_oauth.png
│   │       └── 05_success_connect.png
│   ├── backend-services-reference
│   │   ├── 01-overview.md
│   │   ├── 02-biothings-suite.md
│   │   ├── 03-cbioportal.md
│   │   ├── 04-clinicaltrials-gov.md
│   │   ├── 05-nci-cts-api.md
│   │   ├── 06-pubtator3.md
│   │   └── 07-alphagenome.md
│   ├── blog
│   │   ├── ai-assisted-clinical-trial-search-analysis.md
│   │   ├── images
│   │   │   ├── deep-researcher-video.png
│   │   │   ├── researcher-announce.png
│   │   │   ├── researcher-drop-down.png
│   │   │   ├── researcher-prompt.png
│   │   │   ├── trial-search-assistant.png
│   │   │   └── what_is_biomcp_thumbnail.png
│   │   └── researcher-persona-resource.md
│   ├── changelog.md
│   ├── CNAME
│   ├── concepts
│   │   ├── 01-what-is-biomcp.md
│   │   ├── 02-the-deep-researcher-persona.md
│   │   └── 03-sequential-thinking-with-the-think-tool.md
│   ├── developer-guides
│   │   ├── 01-server-deployment.md
│   │   ├── 02-contributing-and-testing.md
│   │   ├── 03-third-party-endpoints.md
│   │   ├── 04-transport-protocol.md
│   │   ├── 05-error-handling.md
│   │   ├── 06-http-client-and-caching.md
│   │   ├── 07-performance-optimizations.md
│   │   └── generate_endpoints.py
│   ├── faq-condensed.md
│   ├── FDA_SECURITY.md
│   ├── genomoncology.md
│   ├── getting-started
│   │   ├── 01-quickstart-cli.md
│   │   ├── 02-claude-desktop-integration.md
│   │   └── 03-authentication-and-api-keys.md
│   ├── how-to-guides
│   │   ├── 01-find-articles-and-cbioportal-data.md
│   │   ├── 02-find-trials-with-nci-and-biothings.md
│   │   ├── 03-get-comprehensive-variant-annotations.md
│   │   ├── 04-predict-variant-effects-with-alphagenome.md
│   │   ├── 05-logging-and-monitoring-with-bigquery.md
│   │   └── 06-search-nci-organizations-and-interventions.md
│   ├── index.md
│   ├── policies.md
│   ├── reference
│   │   ├── architecture-diagrams.md
│   │   ├── quick-architecture.md
│   │   ├── quick-reference.md
│   │   └── visual-architecture.md
│   ├── robots.txt
│   ├── stylesheets
│   │   ├── announcement.css
│   │   └── extra.css
│   ├── troubleshooting.md
│   ├── tutorials
│   │   ├── biothings-prompts.md
│   │   ├── claude-code-biomcp-alphagenome.md
│   │   ├── nci-prompts.md
│   │   ├── openfda-integration.md
│   │   ├── openfda-prompts.md
│   │   ├── pydantic-ai-integration.md
│   │   └── remote-connection.md
│   ├── user-guides
│   │   ├── 01-command-line-interface.md
│   │   ├── 02-mcp-tools-reference.md
│   │   └── 03-integrating-with-ides-and-clients.md
│   └── workflows
│       └── all-workflows.md
├── example_scripts
│   ├── mcp_integration.py
│   └── python_sdk.py
├── glama.json
├── LICENSE
├── lzyank.toml
├── Makefile
├── mkdocs.yml
├── package-lock.json
├── package.json
├── pyproject.toml
├── README.md
├── scripts
│   ├── check_docs_in_mkdocs.py
│   ├── check_http_imports.py
│   └── generate_endpoints_doc.py
├── smithery.yaml
├── src
│   └── biomcp
│       ├── __init__.py
│       ├── __main__.py
│       ├── articles
│       │   ├── __init__.py
│       │   ├── autocomplete.py
│       │   ├── fetch.py
│       │   ├── preprints.py
│       │   ├── search_optimized.py
│       │   ├── search.py
│       │   └── unified.py
│       ├── biomarkers
│       │   ├── __init__.py
│       │   └── search.py
│       ├── cbioportal_helper.py
│       ├── circuit_breaker.py
│       ├── cli
│       │   ├── __init__.py
│       │   ├── articles.py
│       │   ├── biomarkers.py
│       │   ├── diseases.py
│       │   ├── health.py
│       │   ├── interventions.py
│       │   ├── main.py
│       │   ├── openfda.py
│       │   ├── organizations.py
│       │   ├── server.py
│       │   ├── trials.py
│       │   └── variants.py
│       ├── connection_pool.py
│       ├── constants.py
│       ├── core.py
│       ├── diseases
│       │   ├── __init__.py
│       │   ├── getter.py
│       │   └── search.py
│       ├── domain_handlers.py
│       ├── drugs
│       │   ├── __init__.py
│       │   └── getter.py
│       ├── exceptions.py
│       ├── genes
│       │   ├── __init__.py
│       │   └── getter.py
│       ├── http_client_simple.py
│       ├── http_client.py
│       ├── individual_tools.py
│       ├── integrations
│       │   ├── __init__.py
│       │   ├── biothings_client.py
│       │   └── cts_api.py
│       ├── interventions
│       │   ├── __init__.py
│       │   ├── getter.py
│       │   └── search.py
│       ├── logging_filter.py
│       ├── metrics_handler.py
│       ├── metrics.py
│       ├── openfda
│       │   ├── __init__.py
│       │   ├── adverse_events_helpers.py
│       │   ├── adverse_events.py
│       │   ├── cache.py
│       │   ├── constants.py
│       │   ├── device_events_helpers.py
│       │   ├── device_events.py
│       │   ├── drug_approvals.py
│       │   ├── drug_labels_helpers.py
│       │   ├── drug_labels.py
│       │   ├── drug_recalls_helpers.py
│       │   ├── drug_recalls.py
│       │   ├── drug_shortages_detail_helpers.py
│       │   ├── drug_shortages_helpers.py
│       │   ├── drug_shortages.py
│       │   ├── exceptions.py
│       │   ├── input_validation.py
│       │   ├── rate_limiter.py
│       │   ├── utils.py
│       │   └── validation.py
│       ├── organizations
│       │   ├── __init__.py
│       │   ├── getter.py
│       │   └── search.py
│       ├── parameter_parser.py
│       ├── prefetch.py
│       ├── query_parser.py
│       ├── query_router.py
│       ├── rate_limiter.py
│       ├── render.py
│       ├── request_batcher.py
│       ├── resources
│       │   ├── __init__.py
│       │   ├── getter.py
│       │   ├── instructions.md
│       │   └── researcher.md
│       ├── retry.py
│       ├── router_handlers.py
│       ├── router.py
│       ├── shared_context.py
│       ├── thinking
│       │   ├── __init__.py
│       │   ├── sequential.py
│       │   └── session.py
│       ├── thinking_tool.py
│       ├── thinking_tracker.py
│       ├── trials
│       │   ├── __init__.py
│       │   ├── getter.py
│       │   ├── nci_getter.py
│       │   ├── nci_search.py
│       │   └── search.py
│       ├── utils
│       │   ├── __init__.py
│       │   ├── cancer_types_api.py
│       │   ├── cbio_http_adapter.py
│       │   ├── endpoint_registry.py
│       │   ├── gene_validator.py
│       │   ├── metrics.py
│       │   ├── mutation_filter.py
│       │   ├── query_utils.py
│       │   ├── rate_limiter.py
│       │   └── request_cache.py
│       ├── variants
│       │   ├── __init__.py
│       │   ├── alphagenome.py
│       │   ├── cancer_types.py
│       │   ├── cbio_external_client.py
│       │   ├── cbioportal_mutations.py
│       │   ├── cbioportal_search_helpers.py
│       │   ├── cbioportal_search.py
│       │   ├── constants.py
│       │   ├── external.py
│       │   ├── filters.py
│       │   ├── getter.py
│       │   ├── links.py
│       │   └── search.py
│       └── workers
│           ├── __init__.py
│           ├── worker_entry_stytch.js
│           ├── worker_entry.js
│           └── worker.py
├── tests
│   ├── bdd
│   │   ├── cli_help
│   │   │   ├── help.feature
│   │   │   └── test_help.py
│   │   ├── conftest.py
│   │   ├── features
│   │   │   └── alphagenome_integration.feature
│   │   ├── fetch_articles
│   │   │   ├── fetch.feature
│   │   │   └── test_fetch.py
│   │   ├── get_trials
│   │   │   ├── get.feature
│   │   │   └── test_get.py
│   │   ├── get_variants
│   │   │   ├── get.feature
│   │   │   └── test_get.py
│   │   ├── search_articles
│   │   │   ├── autocomplete.feature
│   │   │   ├── search.feature
│   │   │   ├── test_autocomplete.py
│   │   │   └── test_search.py
│   │   ├── search_trials
│   │   │   ├── search.feature
│   │   │   └── test_search.py
│   │   ├── search_variants
│   │   │   ├── search.feature
│   │   │   └── test_search.py
│   │   └── steps
│   │       └── test_alphagenome_steps.py
│   ├── config
│   │   └── test_smithery_config.py
│   ├── conftest.py
│   ├── data
│   │   ├── ct_gov
│   │   │   ├── clinical_trials_api_v2.yaml
│   │   │   ├── trials_NCT04280705.json
│   │   │   └── trials_NCT04280705.txt
│   │   ├── myvariant
│   │   │   ├── myvariant_api.yaml
│   │   │   ├── myvariant_field_descriptions.csv
│   │   │   ├── variants_full_braf_v600e.json
│   │   │   ├── variants_full_braf_v600e.txt
│   │   │   └── variants_part_braf_v600_multiple.json
│   │   ├── openfda
│   │   │   ├── drugsfda_detail.json
│   │   │   ├── drugsfda_search.json
│   │   │   ├── enforcement_detail.json
│   │   │   └── enforcement_search.json
│   │   └── pubtator
│   │       ├── pubtator_autocomplete.json
│   │       └── pubtator3_paper.txt
│   ├── integration
│   │   ├── test_openfda_integration.py
│   │   ├── test_preprints_integration.py
│   │   ├── test_simple.py
│   │   └── test_variants_integration.py
│   ├── tdd
│   │   ├── articles
│   │   │   ├── test_autocomplete.py
│   │   │   ├── test_cbioportal_integration.py
│   │   │   ├── test_fetch.py
│   │   │   ├── test_preprints.py
│   │   │   ├── test_search.py
│   │   │   └── test_unified.py
│   │   ├── conftest.py
│   │   ├── drugs
│   │   │   ├── __init__.py
│   │   │   └── test_drug_getter.py
│   │   ├── openfda
│   │   │   ├── __init__.py
│   │   │   ├── test_adverse_events.py
│   │   │   ├── test_device_events.py
│   │   │   ├── test_drug_approvals.py
│   │   │   ├── test_drug_labels.py
│   │   │   ├── test_drug_recalls.py
│   │   │   ├── test_drug_shortages.py
│   │   │   └── test_security.py
│   │   ├── test_biothings_integration_real.py
│   │   ├── test_biothings_integration.py
│   │   ├── test_circuit_breaker.py
│   │   ├── test_concurrent_requests.py
│   │   ├── test_connection_pool.py
│   │   ├── test_domain_handlers.py
│   │   ├── test_drug_approvals.py
│   │   ├── test_drug_recalls.py
│   │   ├── test_drug_shortages.py
│   │   ├── test_endpoint_documentation.py
│   │   ├── test_error_scenarios.py
│   │   ├── test_europe_pmc_fetch.py
│   │   ├── test_mcp_integration.py
│   │   ├── test_mcp_tools.py
│   │   ├── test_metrics.py
│   │   ├── test_nci_integration.py
│   │   ├── test_nci_mcp_tools.py
│   │   ├── test_network_policies.py
│   │   ├── test_offline_mode.py
│   │   ├── test_openfda_unified.py
│   │   ├── test_pten_r173_search.py
│   │   ├── test_render.py
│   │   ├── test_request_batcher.py.disabled
│   │   ├── test_retry.py
│   │   ├── test_router.py
│   │   ├── test_shared_context.py.disabled
│   │   ├── test_unified_biothings.py
│   │   ├── thinking
│   │   │   ├── __init__.py
│   │   │   └── test_sequential.py
│   │   ├── trials
│   │   │   ├── test_backward_compatibility.py
│   │   │   ├── test_getter.py
│   │   │   └── test_search.py
│   │   ├── utils
│   │   │   ├── test_gene_validator.py
│   │   │   ├── test_mutation_filter.py
│   │   │   ├── test_rate_limiter.py
│   │   │   └── test_request_cache.py
│   │   ├── variants
│   │   │   ├── constants.py
│   │   │   ├── test_alphagenome_api_key.py
│   │   │   ├── test_alphagenome_comprehensive.py
│   │   │   ├── test_alphagenome.py
│   │   │   ├── test_cbioportal_mutations.py
│   │   │   ├── test_cbioportal_search.py
│   │   │   ├── test_external_integration.py
│   │   │   ├── test_external.py
│   │   │   ├── test_extract_gene_aa_change.py
│   │   │   ├── test_filters.py
│   │   │   ├── test_getter.py
│   │   │   ├── test_links.py
│   │   │   └── test_search.py
│   │   └── workers
│   │       └── test_worker_sanitization.js
│   └── test_pydantic_ai_integration.py
├── THIRD_PARTY_ENDPOINTS.md
├── tox.ini
├── uv.lock
└── wrangler.toml
```

# Files

--------------------------------------------------------------------------------
/tests/tdd/test_drug_approvals.py:
--------------------------------------------------------------------------------

```python
"""Tests for FDA drug approvals module."""

import json
from pathlib import Path
from unittest.mock import AsyncMock, patch

import pytest

from biomcp.openfda.drug_approvals import (
    get_drug_approval,
    search_drug_approvals,
)

# Load mock data
MOCK_DIR = Path(__file__).parent.parent / "data" / "openfda"
MOCK_APPROVALS_SEARCH = json.loads(
    (MOCK_DIR / "drugsfda_search.json").read_text()
)
MOCK_APPROVAL_DETAIL = json.loads(
    (MOCK_DIR / "drugsfda_detail.json").read_text()
)


class TestDrugApprovals:
    """Test drug approvals functionality."""

    @pytest.mark.asyncio
    async def test_search_drug_approvals_success(self):
        """Test successful drug approval search."""
        with patch(
            "biomcp.openfda.drug_approvals.make_openfda_request",
            new_callable=AsyncMock,
        ) as mock_request:
            mock_request.return_value = (MOCK_APPROVALS_SEARCH, None)

            result = await search_drug_approvals(
                drug="pembrolizumab",
                limit=10,
            )

            assert "FDA Drug Approval Records" in result
            assert "pembrolizumab" in result.lower()
            assert "Application" in result
            assert "BLA125514" in result
            mock_request.assert_called_once()

    @pytest.mark.asyncio
    async def test_search_drug_approvals_with_filters(self):
        """Test drug approval search with multiple filters."""
        with patch(
            "biomcp.openfda.drug_approvals.make_openfda_request",
            new_callable=AsyncMock,
        ) as mock_request:
            mock_request.return_value = (MOCK_APPROVALS_SEARCH, None)

            result = await search_drug_approvals(
                drug="keytruda",
                application_number="BLA125514",
                approval_year="2014",
                limit=5,
                api_key="test-key",
            )

            assert "FDA Drug Approval Records" in result
            # Verify API key was passed as the 4th positional argument
            call_args = mock_request.call_args
            assert (
                call_args[0][3] == "test-key"
            )  # api_key is 4th positional arg

    @pytest.mark.asyncio
    async def test_search_drug_approvals_no_results(self):
        """Test drug approval search with no results."""
        with patch(
            "biomcp.openfda.drug_approvals.make_openfda_request",
            new_callable=AsyncMock,
        ) as mock_request:
            mock_request.return_value = ({"results": []}, None)

            result = await search_drug_approvals(drug="nonexistent-drug")

            assert "No drug approval records found" in result

    @pytest.mark.asyncio
    async def test_search_drug_approvals_api_error(self):
        """Test drug approval search with API error."""
        with patch(
            "biomcp.openfda.drug_approvals.make_openfda_request",
            new_callable=AsyncMock,
        ) as mock_request:
            mock_request.return_value = (None, "API rate limit exceeded")

            result = await search_drug_approvals(drug="test")

            assert "Error searching drug approvals" in result
            assert "API rate limit exceeded" in result

    @pytest.mark.asyncio
    async def test_get_drug_approval_success(self):
        """Test getting specific drug approval details."""
        with patch(
            "biomcp.openfda.drug_approvals.make_openfda_request",
            new_callable=AsyncMock,
        ) as mock_request:
            mock_request.return_value = (MOCK_APPROVAL_DETAIL, None)

            result = await get_drug_approval("BLA125514")

            # Should have detailed approval info
            assert "BLA125514" in result or "Drug Approval Details" in result
            assert "BLA125514" in result
            assert "Products" in result
            assert "Submission" in result

    @pytest.mark.asyncio
    async def test_get_drug_approval_not_found(self):
        """Test getting drug approval that doesn't exist."""
        with patch(
            "biomcp.openfda.drug_approvals.make_openfda_request",
            new_callable=AsyncMock,
        ) as mock_request:
            mock_request.return_value = ({"results": []}, None)

            result = await get_drug_approval("INVALID123")

            assert "No approval record found" in result
            assert "INVALID123" in result

    @pytest.mark.asyncio
    async def test_get_drug_approval_with_api_key(self):
        """Test getting drug approval with API key."""
        with patch(
            "biomcp.openfda.drug_approvals.make_openfda_request",
            new_callable=AsyncMock,
        ) as mock_request:
            mock_request.return_value = (MOCK_APPROVAL_DETAIL, None)

            result = await get_drug_approval(
                "BLA125514",
                api_key="test-api-key",
            )

            # Should have detailed approval info
            assert "BLA125514" in result or "Drug Approval Details" in result
            # Verify API key was passed as the 4th positional argument
            call_args = mock_request.call_args
            assert (
                call_args[0][3] == "test-api-key"
            )  # api_key is 4th positional arg

    @pytest.mark.asyncio
    async def test_search_drug_approvals_pagination(self):
        """Test drug approval search pagination."""
        with patch(
            "biomcp.openfda.drug_approvals.make_openfda_request",
            new_callable=AsyncMock,
        ) as mock_request:
            mock_response = {
                "meta": {"results": {"total": 100}},
                "results": MOCK_APPROVALS_SEARCH["results"],
            }
            mock_request.return_value = (mock_response, None)

            result = await search_drug_approvals(
                drug="cancer",
                limit=10,
                skip=20,
            )

            # The output format is different - just check for the total
            assert "100" in result
            # Verify skip parameter was passed (2nd positional arg)
            call_args = mock_request.call_args
            assert (
                call_args[0][1]["skip"] == "20"
            )  # params is 2nd positional arg, value is string

    @pytest.mark.asyncio
    async def test_approval_year_validation(self):
        """Test that approval year is properly formatted."""
        with patch(
            "biomcp.openfda.drug_approvals.make_openfda_request",
            new_callable=AsyncMock,
        ) as mock_request:
            mock_request.return_value = (MOCK_APPROVALS_SEARCH, None)

            await search_drug_approvals(
                approval_year="2023",
            )

            # Check that year was properly formatted in query
            call_args = mock_request.call_args
            params = call_args[0][1]  # params is 2nd positional arg
            assert "marketing_status_date" in params["search"]
            assert "[2023-01-01 TO 2023-12-31]" in params["search"]
```
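
The tests above share one pattern: replace the request helper with an `AsyncMock` that returns a `(data, error)` tuple, then inspect `call_args` to check what was passed positionally. The sketch below reproduces that pattern without importing BioMCP. Everything in it is hypothetical: `search`, `year_range`, the placeholder URL, and the positional order (endpoint, params, label, api_key) are assumptions inferred from the assertions on `call_args[0][1]` and `call_args[0][3]`, not the real `search_drug_approvals` implementation.

```python
import asyncio
from unittest.mock import AsyncMock


def year_range(year: str) -> str:
    """Format a calendar year as the date range the last test looks for."""
    return f"[{year}-01-01 TO {year}-12-31]"


async def search(request, drug, approval_year=None, skip=0, api_key=None):
    """Hypothetical caller that awaits an injected request helper."""
    terms = [drug]
    if approval_year:
        # Field name taken from the test assertion; the full path is assumed.
        terms.append(f"marketing_status_date:{year_range(approval_year)}")
    params = {"search": " AND ".join(terms), "skip": str(skip)}

    # Assumed positional order: endpoint, params, label, api_key.
    data, error = await request(
        "https://example.invalid/drug", params, "drug", api_key
    )
    if error:
        return f"Error searching drug approvals: {error}"
    if not data or not data.get("results"):
        return "No drug approval records found"
    return f"Found {len(data['results'])} record(s)"


def run_demo():
    """Drive the caller with a mock and return what the tests would inspect."""
    fake_request = AsyncMock(return_value=({"results": [{"id": 1}]}, None))
    ok = asyncio.run(
        search(fake_request, "keytruda", "2023", skip=20, api_key="test-key")
    )
    positional = fake_request.call_args[0]  # tuple of positional arguments

    # Switch the mock to the error branch of the (data, error) contract.
    fake_request.return_value = (None, "API rate limit exceeded")
    failed = asyncio.run(search(fake_request, "test"))
    return ok, positional, failed
```

Injecting the request helper as a parameter keeps the sketch free of `patch`; the real tests patch `make_openfda_request` by its import path instead, which has the same effect on the call under test.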

--------------------------------------------------------------------------------
/src/biomcp/articles/fetch.py:
--------------------------------------------------------------------------------

```python
import json
import re
from ssl import TLSVersion
from typing import Annotated, Any

from pydantic import BaseModel, Field, computed_field

from .. import http_client, render
from ..constants import PUBTATOR3_FULLTEXT_URL
from ..http_client import RequestError


class PassageInfo(BaseModel):
    section_type: str | None = Field(
        None,
        description="Type of the section.",
    )
    passage_type: str | None = Field(
        None,
        alias="type",
        description="Type of the passage.",
    )


class Passage(BaseModel):
    info: PassageInfo | None = Field(
        None,
        alias="infons",
    )
    text: str | None = None

    @property
    def section_type(self) -> str:
        section_type = None
        if self.info is not None:
            section_type = self.info.section_type or self.info.passage_type
        section_type = section_type or "UNKNOWN"
        return section_type.upper()

    @property
    def is_title(self) -> bool:
        return self.section_type == "TITLE"

    @property
    def is_abstract(self) -> bool:
        return self.section_type == "ABSTRACT"

    @property
    def is_text(self) -> bool:
        return self.section_type in {
            "INTRO",
            "RESULTS",
            "METHODS",
            "DISCUSS",
            "CONCL",
            "FIG",
            "TABLE",
        }


class Article(BaseModel):
    pmid: int | None = Field(
        None,
        description="PubMed ID of the reference article.",
    )
    pmcid: str | None = Field(
        None,
        description="PubMed Central ID of the reference article.",
    )
    date: str | None = Field(
        None,
        description="Date of the reference article's publication.",
    )
    journal: str | None = Field(
        None,
        description="Journal name.",
    )
    authors: list[str] | None = Field(
        None,
        description="List of authors.",
    )
    passages: list[Passage] = Field(
        ...,
        alias="passages",
        description="List of passages in the reference article.",
        exclude=True,
    )

    @computed_field
    def title(self) -> str:
        lines = []
        for passage in filter(lambda p: p.is_title, self.passages):
            if passage.text:
                lines.append(passage.text)
        return " ... ".join(lines) or f"Article: {self.pmid}"

    @computed_field
    def abstract(self) -> str:
        lines = []
        for passage in filter(lambda p: p.is_abstract, self.passages):
            if passage.text:
                lines.append(passage.text)
        return "\n\n".join(lines) or f"Article: {self.pmid}"

    @computed_field
    def full_text(self) -> str:
        lines = []
        for passage in filter(lambda p: p.is_text, self.passages):
            if passage.text:
                lines.append(passage.text)
        return "\n\n".join(lines) or ""

    @computed_field
    def pubmed_url(self) -> str | None:
        url = None
        if self.pmid:
            url = f"https://pubmed.ncbi.nlm.nih.gov/{self.pmid}/"
        return url

    @computed_field
    def pmc_url(self) -> str | None:
        """Generates the PMC URL if PMCID exists."""
        url = None
        if self.pmcid:
            url = f"https://www.ncbi.nlm.nih.gov/pmc/articles/{self.pmcid}/"
        return url


class FetchArticlesResponse(BaseModel):
    articles: list[Article] = Field(
        ...,
        alias="PubTator3",
        description="List of full texts Articles retrieved from PubTator3.",
    )

    def get_abstract(self, pmid: int | None) -> str | None:
        for article in self.articles:
            if pmid and article.pmid == pmid:
                return str(article.abstract)
        return None


async def call_pubtator_api(
    pmids: list[int],
    full: bool,
) -> tuple[FetchArticlesResponse | None, RequestError | None]:
    """Fetch the text of a list of PubMed IDs."""

    request = {
        "pmids": ",".join(str(pmid) for pmid in pmids),
        "full": str(full).lower(),
    }

    response, error = await http_client.request_api(
        url=PUBTATOR3_FULLTEXT_URL,
        request=request,
        response_model_type=FetchArticlesResponse,
        tls_version=TLSVersion.TLSv1_2,
        domain="pubmed",
    )
    return response, error


async def fetch_articles(
    pmids: list[int],
    full: bool,
    output_json: bool = False,
) -> str:
    """Fetch the text of a list of PubMed IDs."""

    response, error = await call_pubtator_api(pmids, full)

    # PubTator API returns full text even when full=False
    exclude_fields = {"full_text"} if not full else set()

    # noinspection DuplicatedCode
    if error:
        data: list[dict[str, Any]] = [
            {"error": f"Error {error.code}: {error.message}"}
        ]
    else:
        data = [
            article.model_dump(
                mode="json",
                exclude_none=True,
                exclude=exclude_fields,
            )
            for article in (response.articles if response else [])
        ]

    if data and not output_json:
        return render.to_markdown(data)
    else:
        return json.dumps(data, indent=2)


def is_doi(identifier: str) -> bool:
    """Check if the identifier is a DOI."""
    # DOI pattern: starts with 10. followed by numbers/slash/alphanumeric
    doi_pattern = r"^10\.\d{4,9}/[\-._;()/:\w]+$"
    return bool(re.match(doi_pattern, str(identifier)))


def is_pmid(identifier: str) -> bool:
    """Check if the identifier is a PubMed ID."""
    # PMID is a numeric string
    return str(identifier).isdigit()


async def _article_details(
    call_benefit: Annotated[
        str,
        "Define and summarize why this function is being called and the intended benefit",
    ],
    pmid,
) -> str:
    """
    Retrieves details for a single article given its identifier.

    Parameters:
    - call_benefit: Define and summarize why this function is being called and the intended benefit
    - pmid: An article identifier - either a PubMed ID (e.g., 34397683) or DOI (e.g., 10.1101/2024.01.20.23288905)

    Process:
    - For PMIDs: Calls the PubTator3 API to fetch the article's title, abstract, and full text (if available)
    - For DOIs: Calls Europe PMC API to fetch preprint details

    Output: A JSON formatted string containing the retrieved article content.
    """
    identifier = str(pmid)

    # Check if it's a DOI (Europe PMC preprint)
    if is_doi(identifier):
        from .preprints import fetch_europe_pmc_article

        return await fetch_europe_pmc_article(identifier, output_json=True)
    # Check if it's a PMID (PubMed article)
    elif is_pmid(identifier):
        return await fetch_articles(
            [int(identifier)], full=True, output_json=True
        )
    else:
        # Unknown identifier format
        return json.dumps(
            [
                {
                    "error": f"Invalid identifier format: {identifier}. Expected either a PMID (numeric) or DOI (10.xxxx/xxxx format)."
                }
            ],
            indent=2,
        )
```
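
To see how `_article_details` routes an identifier, the sketch below copies the `is_doi` and `is_pmid` predicates from the listing and wraps them in a `classify` helper. `classify` is hypothetical and added only for illustration: it mirrors the branch order (DOI first, then PMID, otherwise invalid) and makes no network calls, so it says nothing about what PubTator3 or Europe PMC would return.

```python
import re

# Same pattern as is_doi in the listing above.
DOI_PATTERN = r"^10\.\d{4,9}/[\-._;()/:\w]+$"


def is_doi(identifier: str) -> bool:
    """Check if the identifier looks like a DOI."""
    return bool(re.match(DOI_PATTERN, str(identifier)))


def is_pmid(identifier: str) -> bool:
    """Check if the identifier is a purely numeric string."""
    return str(identifier).isdigit()


def classify(identifier) -> str:
    """Hypothetical helper: report which branch an identifier would take."""
    text = str(identifier)
    if is_doi(text):
        return "doi"
    if is_pmid(text):
        return "pmid"
    return "invalid"


if __name__ == "__main__":
    for candidate in (34397683, "10.1101/2024.01.20.23288905", "PMC123"):
        print(candidate, "->", classify(candidate))
```

Because the identifier is converted with `str()` before matching, integer PMIDs and string PMIDs take the same branch. A PMCID such as `PMC123` matches neither predicate and falls through to the invalid-format error.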

--------------------------------------------------------------------------------
/docs/concepts/02-the-deep-researcher-persona.md:
--------------------------------------------------------------------------------

```markdown
  1 | # The Deep Researcher Persona
  2 | 
  3 | ## Overview
  4 | 
  5 | The Deep Researcher Persona is a core philosophy of BioMCP that transforms AI assistants into systematic biomedical research partners. This persona embodies the methodical approach of a dedicated biomedical researcher, enabling AI agents to conduct thorough literature reviews, analyze complex datasets, and synthesize findings into actionable insights.
  6 | 
  7 | ## Why the Deep Researcher Persona?
  8 | 
  9 | Traditional AI interactions often result in surface-level responses. The Deep Researcher Persona addresses this by:
 10 | 
 11 | - **Enforcing Systematic Thinking**: Requiring the use of the `think` tool before any research operation
 12 | - **Preventing Premature Conclusions**: Breaking complex queries into manageable research steps
 13 | - **Ensuring Comprehensive Analysis**: Following a structured 10-step methodology
 14 | - **Maintaining Research Rigor**: Documenting thought processes and decision rationale
 15 | 
 16 | ## Core Traits and Personality
 17 | 
 18 | The Deep Researcher embodies these characteristics:
 19 | 
 20 | - **Curious and Methodical**: Always seeking deeper understanding through systematic investigation
 21 | - **Evidence-Based**: Grounding all conclusions in concrete data from multiple sources
 22 | - **Professional Voice**: Clear, concise scientific communication
 23 | - **Collaborative**: Working as a research partner, not just an information retriever
 24 | - **Objective**: Presenting balanced findings including contradictory evidence
 25 | 
 26 | ## The 10-Step Sequential Thinking Process
 27 | 
 28 | This methodology ensures comprehensive research coverage:
 29 | 
 30 | ### 1. Problem Definition and Scope
 31 | 
 32 | - Parse the research question to identify key concepts
 33 | - Define clear objectives and expected deliverables
 34 | - Establish research boundaries and constraints
 35 | 
 36 | ### 2. Initial Knowledge Assessment
 37 | 
 38 | - Evaluate existing knowledge on the topic
 39 | - Identify knowledge gaps requiring investigation
 40 | - Form initial hypotheses to guide research
 41 | 
 42 | ### 3. Search Strategy Development
 43 | 
 44 | - Design comprehensive search queries
 45 | - Select appropriate databases and tools
 46 | - Plan iterative search refinements
 47 | 
 48 | ### 4. Data Collection and Retrieval
 49 | 
 50 | - Execute searches across multiple sources (PubTator3, ClinicalTrials.gov, variant databases)
 51 | - Collect relevant articles, trials, and annotations
 52 | - Document search parameters and results
 53 | 
 54 | ### 5. Quality Assessment and Filtering
 55 | 
 56 | - Evaluate source credibility and relevance
 57 | - Apply inclusion/exclusion criteria
 58 | - Prioritize high-impact findings
 59 | 
 60 | ### 6. Information Extraction
 61 | 
 62 | - Extract key findings, methodologies, and conclusions
 63 | - Identify patterns and relationships
 64 | - Note contradictions and uncertainties
 65 | 
 66 | ### 7. Synthesis and Integration
 67 | 
 68 | - Combine findings from multiple sources
 69 | - Resolve contradictions when possible
 70 | - Build coherent narrative from evidence
 71 | 
 72 | ### 8. Critical Analysis
 73 | 
 74 | - Evaluate strength of evidence
 75 | - Identify limitations and biases
 76 | - Consider alternative interpretations
 77 | 
 78 | ### 9. Knowledge Synthesis
 79 | 
 80 | - Create structured summary of findings
 81 | - Highlight key insights and implications
 82 | - Prepare actionable recommendations
 83 | 
 84 | ### 10. Communication and Reporting
 85 | 
 86 | - Format findings for target audience
 87 | - Include proper citations and references
 88 | - Provide clear next steps
 89 | 
 90 | ## Mandatory Think Tool Usage
 91 | 
 92 | **CRITICAL**: The `think` tool must ALWAYS be called before any other BioMCP operation. This is not optional.
 93 | 
 94 | ```python
 95 | # Correct pattern - ALWAYS start with think
 96 | think(thought="Breaking down the research question...", thoughtNumber=1)
 97 | # Then proceed with searches
 98 | article_searcher(genes=["BRAF"], diseases=["melanoma"])
 99 | 
100 | # INCORRECT - Never skip the think step
101 | article_searcher(genes=["BRAF"])  # ❌ Will produce suboptimal results
102 | ```
103 | 
104 | ## Implementation in Practice
105 | 
106 | ### Example Research Flow
107 | 
108 | 1. **User Query**: "What are the treatment options for BRAF V600E melanoma?"
109 | 
110 | 2. **Think Step 1**: Problem decomposition
111 | 
112 |    ```
113 |    think(thought="Breaking down query: Need to find 1) BRAF V600E mutation significance, 2) current treatments, 3) clinical trials", thoughtNumber=1)
114 |    ```
115 | 
116 | 3. **Think Step 2**: Search strategy
117 | 
118 |    ```
119 |    think(thought="Will search articles for BRAF inhibitors, then trials for V600E-specific treatments", thoughtNumber=2)
120 |    ```
121 | 
122 | 4. **Execute Searches**: Following the planned strategy
123 | 5. **Synthesize**: Combine findings into comprehensive brief
124 | 
125 | ### Research Brief Format
126 | 
127 | Every research session concludes with a structured brief:
128 | 
129 | ```markdown
130 | ## Research Brief: [Topic]
131 | 
132 | ### Executive Summary
133 | 
134 | - 3-5 bullet points of key findings
135 | - Clear, actionable insights
136 | 
137 | ### Detailed Findings
138 | 
139 | 1. **Literature Review** (X papers analyzed)
140 | 
141 |    - Key discoveries
142 |    - Consensus findings
143 |    - Contradictions noted
144 | 
145 | 2. **Clinical Evidence** (Y trials reviewed)
146 | 
147 |    - Current treatment landscape
148 |    - Emerging therapies
149 |    - Trial enrollment opportunities
150 | 
151 | 3. **Molecular Insights**
152 |    - Variant annotations
153 |    - Pathway implications
154 |    - Biomarker relevance
155 | 
156 | ### Recommendations
157 | 
158 | - Evidence-based suggestions
159 | - Areas for further investigation
160 | - Clinical considerations
161 | 
162 | ### References
163 | 
164 | - Full citations for all sources
165 | - Direct links to primary data
166 | ```
167 | 
168 | ## Tool Inventory and Usage
169 | 
170 | The Deep Researcher has access to 24 specialized tools:
171 | 
172 | ### Core Research Tools
173 | 
174 | - **think**: Sequential reasoning and planning
175 | - **article_searcher**: PubMed/PubTator3 literature search
176 | - **trial_searcher**: Clinical trials discovery
177 | - **variant_searcher**: Genetic variant annotations
178 | 
179 | ### Specialized Analysis Tools
180 | 
181 | - **gene_getter**: Gene function and pathway data
182 | - **drug_getter**: Medication information
183 | - **disease_getter**: Disease ontology and synonyms
184 | - **alphagenome_predictor**: Variant effect prediction
185 | 
186 | ### Integration Features
187 | 
188 | - **Automatic cBioPortal Integration**: Cancer genomics context for all gene searches
189 | - **BioThings Suite Access**: Real-time biomedical annotations
190 | - **NCI Database Integration**: Comprehensive cancer trial data
191 | 
192 | ## Best Practices
193 | 
194 | 1. **Always Think First**: Never skip the sequential thinking process
195 | 2. **Use Multiple Sources**: Cross-reference findings across databases
196 | 3. **Document Reasoning**: Explain why certain searches or filters were chosen
197 | 4. **Consider Context**: Account for disease stage, prior treatments, and patient factors
198 | 5. **Stay Current**: Leverage preprint integration for latest findings
199 | 
200 | ## Community Impact
201 | 
202 | The Deep Researcher Persona has transformed how researchers interact with biomedical data:
203 | 
204 | - **Reduced Research Time**: From days to minutes for comprehensive reviews
205 | - **Improved Accuracy**: Systematic approach reduces missed connections
206 | - **Enhanced Collaboration**: Consistent methodology enables team research
207 | - **Democratized Access**: Complex research capabilities available to all
208 | 
209 | ## Getting Started
210 | 
211 | To use the Deep Researcher Persona:
212 | 
213 | 1. Ensure BioMCP is installed and configured
214 | 2. Load the persona resource when starting your AI session
215 | 3. Always begin research queries with the think tool
216 | 4. Follow the 10-step methodology for comprehensive results
217 | 
218 | Remember: The Deep Researcher Persona is not just a tool configuration—it's a systematic approach to biomedical research that ensures thorough, evidence-based insights every time.
219 | 
```
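The think-first rule from the persona document above can be sketched as a small guard that refuses tool calls until a thought has been recorded. This is an illustration only: `ResearchSession`, `ThinkFirstError`, and `fake_article_searcher` are hypothetical names invented for this sketch, not part of BioMCP, and the real server may enforce the rule differently (or only through prompting).

```python
from dataclasses import dataclass, field
from typing import Any, Callable


class ThinkFirstError(RuntimeError):
    """Raised when a research tool is called before any think step."""


@dataclass
class ResearchSession:
    """Hypothetical wrapper that records thoughts and gates tool calls."""

    thoughts: list[str] = field(default_factory=list)

    def think(self, thought: str) -> int:
        """Record a reasoning step and return its 1-based number."""
        self.thoughts.append(thought)
        return len(self.thoughts)

    def call(self, tool: Callable[..., Any], **kwargs: Any) -> Any:
        """Run a tool only after at least one thought has been recorded."""
        if not self.thoughts:
            raise ThinkFirstError("call think() before any research tool")
        return tool(**kwargs)


def fake_article_searcher(genes: list[str], diseases: list[str]) -> str:
    """Stand-in for a real search tool; it only describes the query."""
    return f"searching {genes} in {diseases}"


session = ResearchSession()
session.think("Breaking down the research question...")
result = session.call(
    fake_article_searcher, genes=["BRAF"], diseases=["melanoma"]
)
print(result)
```

Calling `session.call(...)` on a fresh `ResearchSession` raises `ThinkFirstError`, which mirrors the "never skip the think step" rule in a form that can be tested.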

--------------------------------------------------------------------------------
/src/biomcp/render.py:
--------------------------------------------------------------------------------

```python
  1 | import json
  2 | import re
  3 | import textwrap
  4 | from typing import Any
  5 | 
  6 | MAX_WIDTH = 72
  7 | 
  8 | REMOVE_MULTI_LINES = re.compile(r"\s+")
  9 | 
 10 | 
 11 | def dedupe_list_keep_order(lst: list[Any]) -> list[Any]:
 12 |     """
 13 |     Remove duplicates from a list while preserving order.
 14 |     Uses string to handle elements like dicts that are not hashable.
 15 |     """
 16 |     seen = set()
 17 |     data = []
 18 |     for x in lst:
 19 |         if str(x) not in seen:
 20 |             data.append(x)
 21 |             seen.add(str(x))
 22 |     return data
 23 | 
 24 | 
 25 | def to_markdown(data: str | list | dict) -> str:
 26 |     """Convert a JSON string or already-parsed data (dict or list) into
 27 |     a simple Markdown representation.
 28 | 
 29 |     :param data: The input data, either as a JSON string, or a parsed list/dict.
 30 |     :return: A string containing the generated Markdown output.
 31 |     """
 32 |     if isinstance(data, str):
 33 |         data = json.loads(data)
 34 | 
 35 |     if isinstance(data, list):
 36 |         new_data = []
 37 |         for index, item in enumerate(data, start=1):
 38 |             new_data.append({f"Record {index}": item})
 39 |         data = new_data
 40 | 
 41 |     lines: list[str] = []
 42 |     process_any(data, [], lines)
 43 |     return ("\n".join(lines)).strip() + "\n"
 44 | 
 45 | 
 46 | def wrap_preserve_newlines(text: str, width: int) -> list[str]:
 47 |     """For each line in the text (split by newlines), wrap it to 'width' columns.
 48 |     Blank lines are preserved. Returns a list of wrapped lines without
 49 |     inserting extra blank lines.
 50 | 
 51 |     :param text: The multiline string to wrap.
 52 |     :param width: Maximum line width for wrapping.
 53 |     :return: A list of lines after wrapping.
 54 |     """
 55 |     wrapped_lines: list[str] = []
 56 |     for line in text.splitlines(keepends=False):
 57 |         if not line.strip():
 58 |             wrapped_lines.append("")
 59 |             continue
 60 |         # remove excessive spaces (pmid=38296628)
 61 |         line = REMOVE_MULTI_LINES.sub(" ", line)
 62 |         pieces = textwrap.wrap(line, width=width)
 63 |         wrapped_lines.extend(pieces)
 64 |     return wrapped_lines
 65 | 
 66 | 
 67 | def append_line(lines: list[str], line: str) -> None:
 68 |     """Append a line to 'lines' after stripping trailing whitespace.
 69 | 
 70 |     :param lines: The running list of lines to which we add.
 71 |     :param line: The line to append.
 72 |     """
 73 |     line = line.rstrip()
 74 |     lines.append(line)
 75 | 
 76 | 
 77 | def process_any(
 78 |     value: Any,
 79 |     path_keys: list[str],
 80 |     lines: list[str],
 81 | ) -> None:
 82 |     """Dispatch function to handle dict, list, or scalar (str/int/float/bool).
 83 | 
 84 |     :param value: The current JSON data node.
 85 |     :param path_keys: The list of keys leading to this node (for headings).
 86 |     :param lines: The running list of output Markdown lines.
 87 |     """
 88 |     if isinstance(value, dict):
 89 |         process_dict(value, path_keys, lines)
 90 |     elif isinstance(value, list):
 91 |         process_list(value, path_keys, lines)
 92 |     elif value is not None:
 93 |         render_key_value(lines, path_keys[-1], value)
 94 | 
 95 | 
 96 | def process_dict(dct: dict, path_keys: list[str], lines: list[str]) -> None:
 97 |     """Handle a dictionary by printing a heading for the current path (if any),
 98 |     then processing key/value pairs in order: scalars first, then nested dicts, then lists.
 99 | 
100 |     :param dct: The dictionary to process.
101 |     :param path_keys: The list of keys leading to this dict (for heading).
102 |     :param lines: The running list of output Markdown lines.
103 |     """
104 |     if path_keys:
105 |         level = min(len(path_keys), 5)
106 |         heading_hash = "#" * level
107 |         heading_text = transform_key(path_keys[-1])
108 |         # Blank line, then heading
109 |         append_line(lines, "")
110 |         append_line(lines, f"{heading_hash} {heading_text}")
111 | 
112 |     # Group keys by value type
113 |     scalar_keys = []
114 |     dict_keys = []
115 |     list_keys = []
116 | 
117 |     for key, val in dct.items():
118 |         if isinstance(val, str | int | float | bool) or val is None:
119 |             scalar_keys.append(key)
120 |         elif isinstance(val, dict):
121 |             dict_keys.append(key)
122 |         elif isinstance(val, list):
123 |             list_keys.append(key)
124 | 
125 |     # Process scalars first
126 |     for key in scalar_keys:
127 |         next_path = path_keys + [key]
128 |         process_any(dct[key], next_path, lines)
129 | 
130 |     # Process dicts second
131 |     for key in dict_keys:
132 |         next_path = path_keys + [key]
133 |         process_any(dct[key], next_path, lines)
134 | 
135 |     # Process lists last
136 |     for key in list_keys:
137 |         next_path = path_keys + [key]
138 |         process_any(dct[key], next_path, lines)
139 | 
140 | 
141 | def process_list(lst: list, path_keys: list[str], lines: list[str]) -> None:
142 |     """If all items in the list are scalar, attempt to render them on one line
143 |     if it fits, otherwise use bullet points. Otherwise, we recursively
144 |     process each item.
145 | 
146 |     :param lst: The list of items to process.
147 |     :param path_keys: The keys leading to this list.
148 |     :param lines: The running list of Markdown lines.
149 |     """
150 |     all_scalars = all(isinstance(i, str | int | float | bool) for i in lst)
151 |     lst = dedupe_list_keep_order(lst)
152 |     if path_keys and all_scalars:
153 |         key = path_keys[-1]
154 |         process_scalar_list(key, lines, lst)
155 |     else:
156 |         for item in lst:
157 |             process_any(item, path_keys, lines)
158 | 
159 | 
160 | def process_scalar_list(key: str, lines: list[str], lst: list) -> None:
161 |     """Print a list of scalars either on one line as "Key: item1, item2, ..."
162 |     if it fits within MAX_WIDTH, otherwise print a bullet list.
163 | 
164 |     :param key: The key name for this list of scalars.
165 |     :param lines: The running list of Markdown lines.
166 |     :param lst: The actual list of scalar items.
167 |     """
168 |     label = transform_key(key)
169 |     items_str = ", ".join(str(item) for item in lst)
170 |     single_line = f"{label}: {items_str}"
171 |     if len(single_line) <= MAX_WIDTH:
172 |         append_line(lines, single_line)
173 |     else:
174 |         # bullet list
175 |         append_line(lines, f"{label}:")
176 |         for item in lst:
177 |             bullet = f"- {item}"
178 |             append_line(lines, bullet)
179 | 
180 | 
181 | def render_key_value(lines: list[str], key: str, value: Any) -> None:
182 |     """Render a single "key: value" pair. If the value is a long string,
183 |     we do multiline wrapping with an indentation for clarity. Otherwise,
184 |     it appears on the same line.
185 | 
186 |     :param lines: The running list of Markdown lines.
187 |     :param key: The raw key name (untransformed).
188 |     :param value: The value associated with this key.
189 |     """
190 |     label = transform_key(key)
191 |     val_str = str(value)
192 | 
193 |     # If the value is a fairly long string, do multiline
194 |     if isinstance(value, str) and len(value) > MAX_WIDTH:
195 |         append_line(lines, f"{label}:")
196 |         for wrapped in wrap_preserve_newlines(val_str, MAX_WIDTH):
197 |             append_line(lines, "  " + wrapped)
198 |     else:
199 |         append_line(lines, f"{label}: {val_str}")
200 | 
201 | 
202 | def transform_key(s: str) -> str:
203 |     # Replace underscores with spaces.
204 |     s = s.replace("_", " ")
205 |     # Insert a space between an uppercase letter followed by an uppercase letter then a lowercase letter.
206 |     s = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", s)
207 |     # Insert a space between a lowercase letter or digit and an uppercase letter.
208 |     s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", s)
209 | 
210 |     words = s.split()
211 |     transformed_words = []
212 |     for word in words:
213 |         transformed_words.append(word.capitalize())
214 |     return " ".join(transformed_words)
215 | 
```
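Two helpers in `render.py` above have behavior that is easy to misread, so a short standalone check may help. The functions below restate `transform_key` and `dedupe_list_keep_order` with the same logic so the snippet runs without installing the package; the sample keys are made up for illustration.

```python
import re
from typing import Any


def transform_key(s: str) -> str:
    """Same logic as render.transform_key, restated for a standalone run."""
    s = s.replace("_", " ")
    s = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", s)
    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", s)
    return " ".join(word.capitalize() for word in s.split())


def dedupe_list_keep_order(lst: list[Any]) -> list[Any]:
    """Same logic as render.dedupe_list_keep_order: compare by str(x)."""
    seen: set[str] = set()
    data = []
    for x in lst:
        if str(x) not in seen:
            data.append(x)
            seen.add(str(x))
    return data


# snake_case and camelCase keys both become title-style labels.
print(transform_key("gene_symbol"))   # Gene Symbol
print(transform_key("pubmedId"))      # Pubmed Id
# str.capitalize() lowercases the rest of each word, so acronyms lose case.
print(transform_key("DNASequence"))   # Dna Sequence

# Unhashable dicts are deduplicated through their string form.
print(dedupe_list_keep_order([{"a": 1}, {"a": 1}, {"b": 2}]))
# Values with the same string form collapse, even across types.
print(dedupe_list_keep_order([1, "1", 2]))  # [1, 2]
```

The last two outputs are consequences of the current implementation worth knowing when reading rendered output: acronyms such as `DNA` appear as `Dna`, and an integer and a string with the same text count as duplicates.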

--------------------------------------------------------------------------------
/docs/getting-started/02-claude-desktop-integration.md:
--------------------------------------------------------------------------------

```markdown
  1 | # Claude Desktop Integration
  2 | 
  3 | This guide covers how to integrate BioMCP with Claude Desktop, enabling AI-powered biomedical research directly in your Claude conversations.
  4 | 
  5 | ## Prerequisites
  6 | 
  7 | - [Claude Desktop](https://claude.ai/download) application
  8 | - One of the following:
  9 |   - **Option A**: Python 3.10+ and [uv](https://docs.astral.sh/uv/) (recommended)
 10 |   - **Option B**: [Docker](https://www.docker.com/products/docker-desktop/)
 11 | 
 12 | ## Installation Methods
 13 | 
 14 | ### Option A: Using uv (Recommended)
 15 | 
 16 | This method is fastest and easiest for most users.
 17 | 
 18 | #### 1. Install uv
 19 | 
 20 | ```bash
 21 | # macOS/Linux
 22 | curl -LsSf https://astral.sh/uv/install.sh | sh
 23 | 
 24 | # Windows
 25 | powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
 26 | ```
 27 | 
 28 | #### 2. Configure Claude Desktop
 29 | 
 30 | Add BioMCP to your Claude Desktop configuration file:
 31 | 
 32 | **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
 33 | **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
 34 | 
 35 | ```json
 36 | {
 37 |   "mcpServers": {
 38 |     "biomcp": {
 39 |       "command": "uv",
 40 |       "args": ["run", "--with", "biomcp-python", "biomcp", "run"],
 41 |       "env": {
 42 |         "NCI_API_KEY": "your-nci-api-key-here",
 43 |         "ALPHAGENOME_API_KEY": "your-alphagenome-key-here",
 44 |         "CBIO_TOKEN": "your-cbioportal-token-here"
 45 |       }
 46 |     }
 47 |   }
 48 | }
 49 | ```
 50 | 
 51 | ### Option B: Using Docker
 52 | 
 53 | This method provides better isolation and consistency across systems.
 54 | 
 55 | #### 1. Create a Dockerfile
 56 | 
 57 | Create a file named `Dockerfile`:
 58 | 
 59 | ```dockerfile
 60 | FROM python:3.11-slim
 61 | 
 62 | # Install BioMCP
 63 | RUN pip install biomcp-python
 64 | 
 65 | # Set the entrypoint
 66 | ENTRYPOINT ["biomcp", "run"]
 67 | ```
 68 | 
 69 | #### 2. Build the Docker Image
 70 | 
 71 | ```bash
 72 | docker build -t biomcp:latest .
 73 | ```
 74 | 
 75 | #### 3. Configure Claude Desktop
 76 | 
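 77 | Add BioMCP to your configuration file. Each `-e NAME` flag forwards that variable from `env` into the container; without the flags, the keys reach only the `docker` CLI process:
 78 | 
 79 | ```json
 80 | {
 81 |   "mcpServers": {
 82 |     "biomcp": {
 83 |       "command": "docker",
 84 |       "args": ["run", "-i", "--rm", "-e", "NCI_API_KEY", "-e", "ALPHAGENOME_API_KEY", "-e", "CBIO_TOKEN", "biomcp:latest"],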
 85 |       "env": {
 86 |         "NCI_API_KEY": "your-nci-api-key-here",
 87 |         "ALPHAGENOME_API_KEY": "your-alphagenome-key-here",
 88 |         "CBIO_TOKEN": "your-cbioportal-token-here"
 89 |       }
 90 |     }
 91 |   }
 92 | }
 93 | ```
 94 | 
 95 | ## Verification
 96 | 
 97 | 1. Restart Claude Desktop after updating the configuration
 98 | 2. Start a new conversation
 99 | 3. Look for the 🔌 icon indicating MCP is connected
100 | 4. Test with: "Can you search for articles about BRAF mutations in melanoma?"
101 | 
102 | ## Setting Up API Keys
103 | 
104 | While BioMCP works without API keys, some features require them for full functionality:
105 | 
106 | ### NCI API Key (Optional)
107 | 
108 | Enables access to NCI's clinical trials database with advanced filters:
109 | 
110 | - Get your key from [NCI API Portal](https://api.cancer.gov)
111 | - Add to configuration as `NCI_API_KEY`
112 | 
113 | ### AlphaGenome API Key (Optional)
114 | 
115 | Enables variant effect predictions using Google DeepMind's AlphaGenome:
116 | 
117 | - Register at [AlphaGenome Portal](https://alphagenome.google.com)
118 | - Add to configuration as `ALPHAGENOME_API_KEY`
119 | 
120 | ### cBioPortal Token (Optional)
121 | 
122 | Enables enhanced cancer genomics queries:
123 | 
124 | - Get token from [cBioPortal](https://www.cbioportal.org/webAPI)
125 | - Add to configuration as `CBIO_TOKEN`
126 | 
127 | ## Usage Examples
128 | 
129 | Once configured, you can ask Claude to perform various biomedical research tasks:
130 | 
131 | ### Literature Search
132 | 
133 | ```
134 | "Find recent articles about CAR-T therapy for B-cell lymphomas"
135 | ```
136 | 
137 | ### Clinical Trials
138 | 
139 | ```
140 | "Search for actively recruiting trials for EGFR-mutant lung cancer"
141 | ```
142 | 
143 | ### Variant Analysis
144 | 
145 | ```
146 | "What is known about the pathogenicity of BRCA1 c.5266dupC?"
147 | ```
148 | 
149 | ### Drug Information
150 | 
151 | ```
152 | "Tell me about the mechanism of action and indications for pembrolizumab"
153 | ```
154 | 
155 | ### Complex Research
156 | 
157 | ```
158 | "I need a comprehensive analysis of treatment options for a patient with
159 | BRAF V600E melanoma who has progressed on dabrafenib/trametinib"
160 | ```
161 | 
162 | ## The Deep Researcher Persona
163 | 
164 | BioMCP includes a specialized "Deep Researcher" persona that enhances Claude's biomedical research capabilities:
165 | 
166 | - **Sequential Thinking**: Automatically uses the `think` tool for systematic analysis
167 | - **Comprehensive Coverage**: Searches multiple databases and synthesizes findings
168 | - **Evidence-Based**: Provides citations and links to primary sources
169 | - **Clinical Focus**: Understands medical context and terminology
170 | 
171 | To activate, simply ask biomedical questions naturally. The persona automatically engages for research tasks.
172 | 
173 | ## Troubleshooting
174 | 
175 | ### "MCP Connection Failed"
176 | 
177 | 1. Verify the configuration file path is correct
178 | 2. Check JSON syntax (no trailing commas)
179 | 3. Ensure Claude Desktop has been restarted
180 | 4. Check that uv or Docker is properly installed
181 | 
182 | ### "Command Not Found"
183 | 
184 | **For uv**:
185 | 
186 | ```bash
187 | # Verify uv installation
188 | uv --version
189 | 
190 | # Ensure PATH includes uv
191 | echo $PATH | grep -q "\.local/bin" || echo "PATH needs updating"
192 | ```
193 | 
194 | **For Docker**:
195 | 
196 | ```bash
197 | # Verify Docker is running
198 | docker ps
199 | 
200 | # Test BioMCP container
201 | docker run -it --rm biomcp:latest --help
202 | ```
203 | 
204 | ### "No Results Found"
205 | 
206 | - Check your internet connection
207 | - Verify API keys are correctly set (if using optional features)
208 | - Try simpler queries first
209 | - Use official gene symbols (e.g., "TP53" not "p53")
210 | 
211 | ### Performance Issues
212 | 
213 | **For uv**:
214 | 
215 | - First run may be slow due to package downloads
216 | - Subsequent runs use cached environments
217 | 
218 | **For Docker**:
219 | 
220 | - Ensure Docker has sufficient memory allocated
221 | - Consider building with `--platform` flag for Apple Silicon
222 | 
223 | ## Advanced Configuration
224 | 
225 | ### Custom Environment Variables
226 | 
227 | Add any additional environment variables your research requires:
228 | 
229 | ```json
230 | {
231 |   "mcpServers": {
232 |     "biomcp": {
233 |       "command": "uv",
234 |       "args": ["run", "--with", "biomcp-python", "biomcp", "run"],
235 |       "env": {
236 |         "BIOMCP_LOG_LEVEL": "DEBUG",
237 |         "BIOMCP_CACHE_DIR": "/path/to/cache",
238 |         "HTTP_PROXY": "http://your-proxy:8080"
239 |       }
240 |     }
241 |   }
242 | }
243 | ```
244 | 
245 | ### Multiple Configurations
246 | 
247 | You can run multiple BioMCP instances with different settings:
248 | 
249 | ```json
250 | {
251 |   "mcpServers": {
252 |     "biomcp-prod": {
253 |       "command": "uv",
254 |       "args": ["run", "--with", "biomcp-python", "biomcp", "run"],
255 |       "env": {
256 |         "BIOMCP_ENV": "production"
257 |       }
258 |     },
259 |     "biomcp-dev": {
260 |       "command": "uv",
261 |       "args": ["run", "--with", "biomcp-python@latest", "biomcp", "run"],
262 |       "env": {
263 |         "BIOMCP_ENV": "development",
264 |         "BIOMCP_LOG_LEVEL": "DEBUG"
265 |       }
266 |     }
267 |   }
268 | }
269 | ```
270 | 
271 | ## Best Practices
272 | 
273 | 1. **Start Simple**: Test with basic queries before complex research tasks
274 | 2. **Be Specific**: Use official gene symbols and disease names
275 | 3. **Iterate**: Refine queries based on initial results
276 | 4. **Verify Sources**: Always check the provided citations
277 | 5. **Save Important Findings**: Export conversation or copy key results
278 | 
279 | ## Getting Help
280 | 
281 | - **Documentation**: [BioMCP Docs](https://github.com/genomoncology/biomcp)
282 | - **Issues**: [GitHub Issues](https://github.com/genomoncology/biomcp/issues)
283 | - **Community**: [Discussions](https://github.com/genomoncology/biomcp/discussions)
284 | 
285 | ## Next Steps
286 | 
287 | Now that BioMCP is integrated with Claude Desktop:
288 | 
289 | 1. Try the [example queries](#usage-examples) above
290 | 2. Explore [How-to Guides](../how-to-guides/01-find-articles-and-cbioportal-data.md) for specific research workflows
291 | 3. Learn about [Sequential Thinking](../concepts/03-sequential-thinking-with-the-think-tool.md) for complex analyses
292 | 4. Set up [additional API keys](03-authentication-and-api-keys.md) for enhanced features
293 | 
```
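The first troubleshooting steps in the Claude Desktop guide above (check the file path, check JSON syntax) can be automated with a small script. This is a sketch under assumptions: `check_config` and `default_config_path` are hypothetical helpers written for this example, only the macOS and Windows paths named in the guide are covered, and the checks are limited to what the guide's sample configuration shows.

```python
import json
import os
import platform
from pathlib import Path

CONFIG_NAME = "claude_desktop_config.json"


def default_config_path() -> Path:
    """Return the config location the guide lists for Windows or macOS."""
    if platform.system() == "Windows":
        return Path(os.environ.get("APPDATA", "")) / "Claude" / CONFIG_NAME
    # Assumption: any other platform falls back to the macOS location.
    return (
        Path.home() / "Library" / "Application Support" / "Claude" / CONFIG_NAME
    )


def check_config(path: Path) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    try:
        config = json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return [f"config file not found: {path}"]
    except json.JSONDecodeError as exc:
        # Trailing commas and similar syntax errors end up here.
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]

    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict) or "biomcp" not in servers:
        return ['no "biomcp" entry under "mcpServers"']

    entry = servers["biomcp"]
    if not isinstance(entry, dict):
        return ['"biomcp" entry is not a JSON object']

    problems = []
    if not entry.get("command"):
        problems.append('"biomcp" entry has no "command"')
    if not isinstance(entry.get("args"), list):
        problems.append('"biomcp" entry has no "args" list')
    return problems


if __name__ == "__main__":
    for problem in check_config(default_config_path()) or ["no problems found"]:
        print(problem)
```

Running the script before restarting Claude Desktop gives a line and column for JSON syntax errors, which is usually quicker than inspecting the file by eye.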

--------------------------------------------------------------------------------
/src/biomcp/articles/unified.py:
--------------------------------------------------------------------------------

```python
  1 | """Unified article search combining PubMed and preprint sources."""
  2 | 
  3 | import asyncio
  4 | import json
  5 | import logging
  6 | from collections.abc import Coroutine
  7 | from typing import Any
  8 | 
  9 | from .. import render
 10 | from .preprints import search_preprints
 11 | from .search import PubmedRequest, search_articles
 12 | 
 13 | logger = logging.getLogger(__name__)
 14 | 
 15 | 
 16 | def _deduplicate_articles(articles: list[dict]) -> list[dict]:
 17 |     """Remove duplicate articles based on DOI; articles without a DOI are kept."""
 18 |     seen_dois = set()
 19 |     unique_articles = []
 20 |     for article in articles:
 21 |         doi = article.get("doi")
 22 |         if doi and doi in seen_dois:
 23 |             continue
 24 |         if doi:
 25 |             seen_dois.add(doi)
 26 |         unique_articles.append(article)
 27 |     return unique_articles
 28 | 
 29 | 
 30 | def _parse_search_results(results: list) -> list[dict]:
 31 |     """Parse search results from JSON strings."""
 32 |     all_articles = []
 33 |     for result in results:
 34 |         if isinstance(result, str):
 35 |             try:
 36 |                 articles = json.loads(result)
 37 |                 if isinstance(articles, list):
 38 |                     all_articles.extend(articles)
 39 |             except json.JSONDecodeError:
 40 |                 continue
 41 |     return all_articles
 42 | 
 43 | 
 44 | async def _extract_mutation_pattern(
 45 |     keywords: list[str],
 46 | ) -> tuple[str | None, str | None]:
 47 |     """Extract mutation pattern from keywords asynchronously."""
 48 |     if not keywords:
 49 |         return None, None
 50 | 
 51 |     # Use asyncio.to_thread for CPU-bound regex operations
 52 |     import re
 53 | 
 54 |     def _extract_sync():
 55 |         for keyword in keywords:
 56 |             # Check for specific mutations (e.g., F57Y, V600E)
 57 |             if re.match(r"^[A-Z]\d+[A-Z*]$", keyword):
 58 |                 if keyword.endswith("*"):
 59 |                     return keyword, None  # mutation_pattern
 60 |                 else:
 61 |                     return None, keyword  # specific_mutation
 62 |         return None, None
 63 | 
 64 |     # Run CPU-bound operation in thread pool
 65 |     return await asyncio.to_thread(_extract_sync)
 66 | 
 67 | 
 68 | async def _get_mutation_summary(
 69 |     gene: str, mutation: str | None, pattern: str | None
 70 | ) -> str | None:
 71 |     """Get mutation-specific cBioPortal summary."""
 72 |     from ..variants.cbioportal_mutations import (
 73 |         CBioPortalMutationClient,
 74 |         format_mutation_search_result,
 75 |     )
 76 | 
 77 |     mutation_client = CBioPortalMutationClient()
 78 | 
 79 |     if mutation:
 80 |         logger.info(f"Searching for specific mutation {gene} {mutation}")
 81 |         result = await mutation_client.search_specific_mutation(
 82 |             gene=gene, mutation=mutation, max_studies=20
 83 |         )
 84 |     else:
 85 |         logger.info(f"Searching for mutation pattern {gene} {pattern}")
 86 |         result = await mutation_client.search_specific_mutation(
 87 |             gene=gene, pattern=pattern, max_studies=20
 88 |         )
 89 | 
 90 |     return format_mutation_search_result(result) if result else None
 91 | 
 92 | 
 93 | async def _get_gene_summary(gene: str) -> str | None:
 94 |     """Get regular gene cBioPortal summary."""
 95 |     from ..variants.cbioportal_search import (
 96 |         CBioPortalSearchClient,
 97 |         format_cbioportal_search_summary,
 98 |     )
 99 | 
100 |     client = CBioPortalSearchClient()
101 |     summary = await client.get_gene_search_summary(gene, max_studies=5)
102 |     return format_cbioportal_search_summary(summary) if summary else None
103 | 
104 | 
105 | async def _get_cbioportal_summary(request: PubmedRequest) -> str | None:
106 |     """Get cBioPortal summary for the search request."""
107 |     if not request.genes:
108 |         return None
109 | 
110 |     try:
111 |         gene = request.genes[0]
112 |         mutation_pattern, specific_mutation = await _extract_mutation_pattern(
113 |             request.keywords
114 |         )
115 | 
116 |         if specific_mutation or mutation_pattern:
117 |             return await _get_mutation_summary(
118 |                 gene, specific_mutation, mutation_pattern
119 |             )
120 |         else:
121 |             return await _get_gene_summary(gene)
122 | 
123 |     except Exception as e:
124 |         logger.warning(
125 |             f"Failed to get cBioPortal summary for gene search: {e}"
126 |         )
127 |         return None
128 | 
129 | 
130 | async def search_articles_unified(  # noqa: C901
131 |     request: PubmedRequest,
132 |     include_pubmed: bool = True,
133 |     include_preprints: bool = False,
134 |     include_cbioportal: bool = True,
135 |     output_json: bool = False,
136 | ) -> str:
137 |     """Search for articles across PubMed and preprint sources."""
138 |     # Import here to avoid circular imports
139 |     from ..shared_context import SearchContextManager
140 | 
141 |     # Use shared context to avoid redundant validations
142 |     with SearchContextManager() as context:
143 |         # Pre-validate genes once
144 |         if request.genes:
145 |             valid_genes = []
146 |             for gene in request.genes:
147 |                 if await context.validate_gene(gene):
148 |                     valid_genes.append(gene)
149 |             request.genes = valid_genes
150 | 
151 |         tasks: list[Coroutine[Any, Any, Any]] = []
152 |         task_labels = []
153 | 
154 |         if include_pubmed:
155 |             tasks.append(search_articles(request, output_json=True))
156 |             task_labels.append("pubmed")
157 | 
158 |         if include_preprints:
159 |             tasks.append(search_preprints(request, output_json=True))
160 |             task_labels.append("preprints")
161 | 
162 |         # Add cBioPortal to parallel execution
163 |         if include_cbioportal and request.genes:
164 |             tasks.append(_get_cbioportal_summary(request))
165 |             task_labels.append("cbioportal")
166 | 
167 |         if not tasks:
168 |             return json.dumps([]) if output_json else render.to_markdown([])
169 | 
170 |         # Run all operations in parallel
171 |         results = await asyncio.gather(*tasks, return_exceptions=True)
172 | 
173 |         # Create result map for easier processing
174 |         result_map = dict(zip(task_labels, results, strict=False))
175 | 
176 |         # Extract cBioPortal summary if it was included
177 |         cbioportal_summary: str | None = None
178 |         if "cbioportal" in result_map:
179 |             result = result_map["cbioportal"]
180 |             if not isinstance(result, Exception) and isinstance(result, str):
181 |                 cbioportal_summary = result
182 | 
183 |         # Parse article search results
184 |         article_results = []
185 |         for label, result in result_map.items():
186 |             if label != "cbioportal" and not isinstance(result, Exception):
187 |                 article_results.append(result)
188 | 
189 |         # Parse and deduplicate results
190 |         all_articles = _parse_search_results(article_results)
191 |         unique_articles = _deduplicate_articles(all_articles)
192 | 
193 |         # Peer-reviewed first, then newest; reverse=True puts rank 1 ahead
194 |         unique_articles.sort(
195 |             key=lambda x: (
196 |                 1
197 |                 if x.get("publication_state", "peer_reviewed")
198 |                 == "peer_reviewed"
199 |                 else 0,
200 |                 x.get("date", "0000-00-00"),
201 |             ),
202 |             reverse=True,
203 |         )
204 | 
205 |         if unique_articles and not output_json:
206 |             result = render.to_markdown(unique_articles)
207 |             if cbioportal_summary and isinstance(cbioportal_summary, str):
208 |                 # Add cBioPortal summary at the beginning
209 |                 result = cbioportal_summary + "\n\n---\n\n" + result
210 |             return result
211 |         else:
212 |             if cbioportal_summary:
213 |                 return json.dumps(
214 |                     {
215 |                         "cbioportal_summary": cbioportal_summary,
216 |                         "articles": unique_articles,
217 |                     },
218 |                     indent=2,
219 |                 )
220 |             return json.dumps(unique_articles, indent=2)
221 | 
```
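
The tail of the search function above fans out several coroutines with `asyncio.gather(..., return_exceptions=True)`, pairs each result with a label, drops the failures, and sorts what is left. The sketch below reproduces that pattern in isolation. The three `fetch_*` coroutines and their data are hypothetical stand-ins, not BioMCP functions, and the sort key assumes the intent stated in the source comment (peer-reviewed first, then newest first).

```python
import asyncio
from typing import Any


async def fetch_pubmed() -> list[dict[str, Any]]:
    """Hypothetical stand-in for a peer-reviewed article search."""
    return [
        {"title": "A", "publication_state": "peer_reviewed", "date": "2023-05-01"},
        {"title": "B", "publication_state": "peer_reviewed", "date": "2024-01-15"},
    ]


async def fetch_preprints() -> list[dict[str, Any]]:
    """Hypothetical stand-in for a preprint search."""
    return [{"title": "C", "publication_state": "preprint", "date": "2024-06-30"}]


async def fetch_failing() -> list[dict[str, Any]]:
    """Hypothetical source that fails, to show the error path."""
    raise RuntimeError("upstream unavailable")


async def gather_labeled() -> tuple[list[dict[str, Any]], list[str]]:
    labeled = {
        "pubmed": fetch_pubmed(),
        "preprints": fetch_preprints(),
        "broken": fetch_failing(),
    }
    # return_exceptions=True turns a raised exception into a returned value,
    # so one failing source does not cancel the others.
    results = await asyncio.gather(*labeled.values(), return_exceptions=True)

    articles: list[dict[str, Any]] = []
    failed: list[str] = []
    for label, result in zip(labeled, results):
        if isinstance(result, BaseException):
            failed.append(label)
        else:
            articles.extend(result)

    # With reverse=True the larger key comes first: True (peer-reviewed)
    # sorts ahead of False, and later ISO dates sort ahead of earlier ones.
    articles.sort(
        key=lambda a: (a["publication_state"] == "peer_reviewed", a["date"]),
        reverse=True,
    )
    return articles, failed
```

Calling `asyncio.run(gather_labeled())` yields the two peer-reviewed articles (newest first), then the preprint, along with the label of the source that failed.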

--------------------------------------------------------------------------------
/src/biomcp/openfda/adverse_events.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | OpenFDA Drug Adverse Events (FAERS) integration.
  3 | """
  4 | 
  5 | import logging
  6 | 
  7 | from .adverse_events_helpers import (
  8 |     format_drug_details,
  9 |     format_reaction_details,
 10 |     format_report_metadata,
 11 |     format_report_summary,
 12 |     format_search_summary,
 13 |     format_top_reactions,
 14 | )
 15 | from .constants import (
 16 |     OPENFDA_DEFAULT_LIMIT,
 17 |     OPENFDA_DISCLAIMER,
 18 |     OPENFDA_DRUG_EVENTS_URL,
 19 |     OPENFDA_MAX_LIMIT,
 20 | )
 21 | from .exceptions import (
 22 |     OpenFDAConnectionError,
 23 |     OpenFDARateLimitError,
 24 |     OpenFDATimeoutError,
 25 | )
 26 | from .input_validation import sanitize_input
 27 | from .utils import clean_text, make_openfda_request
 28 | 
 29 | logger = logging.getLogger(__name__)
 30 | 
 31 | 
 32 | def _build_search_query(
 33 |     drug: str | None, reaction: str | None, serious: bool | None
 34 | ) -> str:
 35 |     """Build the search query for adverse events."""
 36 |     search_parts = []
 37 | 
 38 |     if drug:
 39 |         # Sanitize drug input to prevent injection
 40 |         drug = sanitize_input(drug, max_length=100)
 41 |         if drug:
 42 |             drug_query = (
 43 |                 f'(patient.drug.medicinalproduct:"{drug}" OR '
 44 |                 f'patient.drug.openfda.brand_name:"{drug}" OR '
 45 |                 f'patient.drug.openfda.generic_name:"{drug}")'
 46 |             )
 47 |             search_parts.append(drug_query)
 48 | 
 49 |     if reaction:
 50 |         # Sanitize reaction input
 51 |         reaction = sanitize_input(reaction, max_length=200)
 52 |         if reaction:
 53 |             search_parts.append(
 54 |                 f'patient.reaction.reactionmeddrapt:"{reaction}"'
 55 |             )
 56 | 
 57 |     if serious is not None:
 58 |         serious_value = "1" if serious else "2"
 59 |         search_parts.append(f"serious:{serious_value}")
 60 | 
 61 |     return " AND ".join(search_parts)
 62 | 
 63 | 
 64 | async def search_adverse_events(  # noqa: C901
 65 |     drug: str | None = None,
 66 |     reaction: str | None = None,
 67 |     serious: bool | None = None,
 68 |     limit: int = OPENFDA_DEFAULT_LIMIT,
 69 |     skip: int = 0,
 70 |     api_key: str | None = None,
 71 | ) -> str:
 72 |     """
 73 |     Search FDA adverse event reports (FAERS).
 74 | 
 75 |     Args:
 76 |         drug: Drug name to search for
 77 |         reaction: Adverse reaction term to search for
 78 |         serious: True for serious events only, False for non-serious, None for all
 79 |         limit: Maximum number of results
 80 |         skip: Number of results to skip
 81 |         api_key: Optional OpenFDA API key (overrides OPENFDA_API_KEY env var)
 82 | 
 83 |     Returns:
 84 |         Formatted string with adverse event information
 85 |     """
 86 |     if not drug and not reaction:
 87 |         return (
 88 |             "⚠️ Please specify either a drug name or reaction term to search "
 89 |             "adverse events.\n\n"
 90 |             "Examples:\n"
 91 |             "- Search by drug: --drug 'imatinib'\n"
 92 |             "- Search by reaction: --reaction 'nausea'\n"
 93 |             "- Both: --drug 'imatinib' --reaction 'nausea'"
 94 |         )
 95 | 
 96 |     # Build and execute search
 97 |     search_query = _build_search_query(drug, reaction, serious)
 98 |     params = {
 99 |         "search": search_query,
100 |         "limit": min(limit, OPENFDA_MAX_LIMIT),
101 |         "skip": skip,
102 |     }
103 | 
104 |     try:
105 |         response, error = await make_openfda_request(
106 |             OPENFDA_DRUG_EVENTS_URL, params, "openfda_adverse_events", api_key
107 |         )
108 |     except OpenFDARateLimitError:
109 |         return (
110 |             "⚠️ **FDA API Rate Limit Exceeded**\n\n"
111 |             "You've exceeded the FDA's rate limit. Options:\n"
112 |             "• Wait a moment and try again\n"
113 |             "• Provide an FDA API key for higher limits (240/min vs 40/min)\n"
114 |             "• Get a free key at: https://open.fda.gov/apis/authentication/"
115 |         )
116 |     except OpenFDATimeoutError:
117 |         return (
118 |             "⏱️ **Request Timeout**\n\n"
119 |             "The FDA API is taking too long to respond. This may be due to:\n"
120 |             "• High server load\n"
121 |             "• Complex query\n"
122 |             "• Network issues\n\n"
123 |             "Please try again in a moment."
124 |         )
125 |     except OpenFDAConnectionError as e:
126 |         return (
127 |             "🔌 **Connection Error**\n\n"
128 |             f"Unable to connect to FDA API: {e}\n\n"
129 |             "Please check your internet connection and try again."
130 |         )
131 | 
132 |     if error:
133 |         return f"⚠️ Error searching adverse events: {error}"
134 | 
135 |     if not response or not response.get("results"):
136 |         search_desc = []
137 |         if drug:
138 |             search_desc.append(f"drug '{drug}'")
139 |         if reaction:
140 |             search_desc.append(f"reaction '{reaction}'")
141 |         return (
142 |             f"No adverse event reports found for {' and '.join(search_desc)}."
143 |         )
144 | 
145 |     results = response["results"]
146 |     total = (
147 |         response.get("meta", {}).get("results", {}).get("total", len(results))
148 |     )
149 | 
150 |     # Build output
151 |     output = ["## FDA Adverse Event Reports\n"]
152 |     output.extend(format_search_summary(drug, reaction, serious, total))
153 | 
154 |     # Add top reactions if searching by drug
155 |     if drug and not reaction:
156 |         output.extend(format_top_reactions(results))
157 | 
158 |     # Add sample reports
159 |     output.append(
160 |         f"### Sample Reports (showing {min(len(results), 3)} of {total}):\n"
161 |     )
162 |     for i, result in enumerate(results[:3], 1):
163 |         output.extend(format_report_summary(result, i))
164 | 
165 |     output.append(f"\n{OPENFDA_DISCLAIMER}")
166 |     return "\n".join(output)
167 | 
168 | 
169 | async def get_adverse_event(report_id: str, api_key: str | None = None) -> str:
170 |     """
171 |     Get detailed information for a specific adverse event report.
172 | 
173 |     Args:
174 |         report_id: Safety report ID
175 |         api_key: Optional OpenFDA API key (overrides OPENFDA_API_KEY env var)
176 | 
177 |     Returns:
178 |         Formatted string with detailed report information
179 |     """
180 |     params = {
181 |         "search": f'safetyreportid:"{report_id}"',
182 |         "limit": 1,
183 |     }
184 | 
185 |     response, error = await make_openfda_request(
186 |         OPENFDA_DRUG_EVENTS_URL,
187 |         params,
188 |         "openfda_adverse_event_detail",
189 |         api_key,
190 |     )
191 | 
192 |     if error:
193 |         return f"⚠️ Error retrieving adverse event report: {error}"
194 | 
195 |     if not response or not response.get("results"):
196 |         return f"Adverse event report '{report_id}' not found."
197 | 
198 |     result = response["results"][0]
199 |     patient = result.get("patient", {})
200 | 
201 |     # Build detailed output
202 |     output = [f"## Adverse Event Report: {report_id}\n"]
203 | 
204 |     # Patient Information
205 |     output.extend(_format_patient_info(patient))
206 | 
207 |     # Drug Information
208 |     if drugs := patient.get("drug", []):
209 |         output.extend(format_drug_details(drugs))
210 | 
211 |     # Reactions
212 |     if reactions := patient.get("reaction", []):
213 |         output.extend(format_reaction_details(reactions))
214 | 
215 |     # Event Summary
216 |     if summary := patient.get("summary", {}).get("narrativeincludeclinical"):
217 |         output.append("### Event Narrative")
218 |         output.append(clean_text(summary))
219 |         output.append("")
220 | 
221 |     # Report metadata
222 |     output.extend(format_report_metadata(result))
223 | 
224 |     output.append(f"\n{OPENFDA_DISCLAIMER}")
225 |     return "\n".join(output)
226 | 
227 | 
228 | def _format_patient_info(patient: dict) -> list[str]:
229 |     """Format patient information section."""
230 |     output = ["### Patient Information"]
231 | 
232 |     if age := patient.get("patientonsetage"):
233 |         output.append(f"- **Age**: {age} years")
234 | 
235 |     # openFDA encodes patientsex as a string code ("1", "2"), so key on str
236 |     sex_map = {"0": "Unknown", "1": "Male", "2": "Female"}
237 |     sex_code = patient.get("patientsex")
238 |     sex_key = str(sex_code) if sex_code is not None else "0"
239 |     sex = sex_map.get(sex_key, "Unknown")
240 |     output.append(f"- **Sex**: {sex}")
241 | 
242 |     if weight := patient.get("patientweight"):
243 |         output.append(f"- **Weight**: {weight} kg")
244 | 
245 |     output.append("")
246 |     return output
247 | 
```

--------------------------------------------------------------------------------
/docs/how-to-guides/01-find-articles-and-cbioportal-data.md:
--------------------------------------------------------------------------------

```markdown
  1 | # How to Find Articles and cBioPortal Data
  2 | 
  3 | This guide walks you through searching biomedical literature with automatic cancer genomics integration from cBioPortal.
  4 | 
  5 | ## Overview
  6 | 
  7 | When searching for articles about genes, BioMCP automatically enriches your results with:
  8 | 
  9 | - **cBioPortal Summary**: Mutation frequencies, hotspots, and cancer type distribution ([API Reference](../backend-services-reference/03-cbioportal.md))
 10 | - **PubMed Articles**: Peer-reviewed research with entity annotations ([PubTator3 Reference](../backend-services-reference/06-pubtator3.md))
 11 | - **Preprints**: Latest findings from bioRxiv and medRxiv
 12 | 
 13 | ## Basic Article Search
 14 | 
 15 | ### Search by Gene
 16 | 
 17 | Find articles about a specific gene:
 18 | 
 19 | ```bash
 20 | # CLI
 21 | biomcp article search --gene BRAF --limit 5
 22 | 
 23 | # Python
 24 | articles = await client.articles.search(genes=["BRAF"], limit=5)
 25 | 
 26 | # MCP Tool
 27 | article_searcher(genes=["BRAF"], limit=5)
 28 | ```
 29 | 
 30 | This automatically includes:
 31 | 
 32 | 1. cBioPortal summary showing BRAF mutation frequency across cancers
 33 | 2. Top mutation hotspots (e.g., V600E)
 34 | 3. Recent articles mentioning BRAF
 35 | 
 36 | ### Search by Disease
 37 | 
 38 | Find articles about a specific disease:
 39 | 
 40 | ```bash
 41 | # CLI
 42 | biomcp article search --disease melanoma --limit 10
 43 | 
 44 | # Python
 45 | articles = await client.articles.search(diseases=["melanoma"])
 46 | 
 47 | # MCP Tool
 48 | article_searcher(diseases=["melanoma"])
 49 | ```
 50 | 
 51 | ## Advanced Search Techniques
 52 | 
 53 | ### Combining Multiple Filters
 54 | 
 55 | Search for articles at the intersection of genes, diseases, and chemicals:
 56 | 
 57 | ```bash
 58 | # CLI - EGFR mutations in lung cancer treated with erlotinib
 59 | biomcp article search \
 60 |   --gene EGFR \
 61 |   --disease "lung cancer" \
 62 |   --chemical erlotinib \
 63 |   --limit 20
 64 | 
 65 | # Python
 66 | articles = await client.articles.search(
 67 |     genes=["EGFR"],
 68 |     diseases=["lung cancer"],
 69 |     chemicals=["erlotinib"]
 70 | )
 71 | ```
 72 | 
 73 | ### Using OR Logic in Keywords
 74 | 
 75 | Find articles mentioning different notations of the same variant:
 76 | 
 77 | ```bash
 78 | # CLI - Find any notation of BRAF V600E
 79 | biomcp article search \
 80 |   --gene BRAF \
 81 |   --keyword "V600E|p.V600E|c.1799T>A"
 82 | 
 83 | # Python - Different names for same concept
 84 | articles = await client.articles.search(
 85 |     diseases=["NSCLC|non-small cell lung cancer"],
 86 |     chemicals=["pembrolizumab|Keytruda|anti-PD-1"]
 87 | )
 88 | ```
 89 | 
 90 | ### Excluding Preprints
 91 | 
 92 | For peer-reviewed articles only:
 93 | 
 94 | ```bash
 95 | # CLI
 96 | biomcp article search --gene TP53 --no-preprints
 97 | 
 98 | # Python
 99 | articles = await client.articles.search(
100 |     genes=["TP53"],
101 |     include_preprints=False
102 | )
103 | ```
104 | 
105 | ## Understanding cBioPortal Integration
106 | 
107 | ### What cBioPortal Provides
108 | 
109 | When you search for a gene, the results begin with a cBioPortal summary like this:
110 | 
111 | ```markdown
112 | ### cBioPortal Summary for BRAF
113 | 
114 | - **Mutation Frequency**: 76.7% (368 mutations in 480 samples)
115 | - **Studies**: 1 of 5 studies have mutations
116 | 
117 | **Top Hotspots:**
118 | 
119 | 1. V600E: 310 mutations (84.2%)
120 | 2. V600K: 23 mutations (6.3%)
121 | 3. V600M: 12 mutations (3.3%)
122 | 
123 | **Cancer Type Distribution:**
124 | 
125 | - Skin Cancer, Non-Melanoma: 156 mutations
126 | - Melanoma: 91 mutations
127 | - Thyroid Cancer: 87 mutations
128 | ```
129 | 
130 | ### Mutation-Specific Searches
131 | 
132 | Search for articles about specific mutations:
133 | 
134 | ```python
135 | # Search for BRAF V600E specifically
136 | articles = await client.articles.search(
137 |     genes=["BRAF"],
138 |     keywords=["V600E"],
139 |     include_cbioportal=True  # Default
140 | )
141 | ```
142 | 
143 | The cBioPortal summary will highlight the specific mutation if found.
144 | 
145 | ### Disabling cBioPortal
146 | 
147 | If you don't need cancer genomics data:
148 | 
149 | ```bash
150 | # CLI
151 | biomcp article search --gene BRCA1 --no-cbioportal
152 | 
153 | # Python
154 | articles = await client.articles.search(
155 |     genes=["BRCA1"],
156 |     include_cbioportal=False
157 | )
158 | ```
159 | 
160 | ## Practical Examples
161 | 
162 | ### Example 1: Resistance Mechanism Research
163 | 
164 | Find articles about EGFR T790M resistance:
165 | 
166 | ```python
167 | # Using think tool first (for MCP)
168 | think(
169 |     thought="Researching EGFR T790M resistance mechanisms in lung cancer",
170 |     thoughtNumber=1
171 | )
172 | 
173 | # Search with multiple relevant terms
174 | articles = await article_searcher(
175 |     genes=["EGFR"],
176 |     diseases=["lung cancer|NSCLC"],
177 |     keywords=["T790M|p.T790M|resistance|resistant"],
178 |     chemicals=["osimertinib|gefitinib|erlotinib"]
179 | )
180 | ```
181 | 
182 | ### Example 2: Combination Therapy Research
183 | 
184 | Research BRAF/MEK combination therapy:
185 | 
186 | ```bash
187 | # CLI approach
188 | biomcp article search \
189 |   --gene BRAF --gene MEK1 --gene MEK2 \
190 |   --disease melanoma \
191 |   --chemical dabrafenib --chemical trametinib \
192 |   --keyword "combination therapy|combined treatment"
193 | ```
194 | 
195 | ### Example 3: Biomarker Discovery
196 | 
197 | Find articles about potential biomarkers:
198 | 
199 | ```python
200 | # Search for PD-L1 as a biomarker
201 | articles = await client.articles.search(
202 |     genes=["CD274"],  # PD-L1 gene symbol
203 |     keywords=["biomarker|predictive|prognostic"],
204 |     diseases=["cancer"],
205 |     limit=50
206 | )
207 | 
208 | # Filter results programmatically
209 | biomarker_articles = [
210 |     a for a in articles
211 |     if "biomarker" in a.title.lower() or "predictive" in a.abstract.lower()
212 | ]
213 | ```
214 | 
215 | ## Working with Results
216 | 
217 | ### Extracting Key Information
218 | 
219 | ```python
220 | # Process article results
221 | for article in articles:
222 |     print(f"Title: {article.title}")
223 |     print(f"PMID: {article.pmid}")
224 |     print(f"URL: {article.url}")
225 | 
226 |     # Extract annotated entities
227 |     genes = article.metadata.get("genes", [])
228 |     diseases = article.metadata.get("diseases", [])
229 |     chemicals = article.metadata.get("chemicals", [])
230 | 
231 |     print(f"Genes mentioned: {', '.join(genes)}")
232 |     print(f"Diseases: {', '.join(diseases)}")
233 |     print(f"Chemicals: {', '.join(chemicals)}")
234 | ```
235 | 
236 | ### Fetching Full Article Details
237 | 
238 | Get complete article information:
239 | 
240 | ```python
241 | # Get article by PMID
242 | full_article = await client.articles.get("38768446")
243 | 
244 | # Access full abstract
245 | print(full_article.abstract)
246 | 
247 | # Check for full text availability
248 | if full_article.full_text_url:
249 |     print(f"Full text: {full_article.full_text_url}")
250 | ```
251 | 
252 | ## Tips for Effective Searches
253 | 
254 | ### 1. Use Official Gene Symbols
255 | 
256 | ```python
257 | # ✅ Correct - Official HGNC symbol
258 | articles = await client.articles.search(genes=["ERBB2"])
259 | 
260 | # ❌ Avoid - Common name
261 | articles = await client.articles.search(genes=["HER2"])  # May miss results
262 | ```
263 | 
264 | ### 2. Include Synonyms for Diseases
265 | 
266 | ```python
267 | # Cover all variations
268 | articles = await client.articles.search(
269 |     diseases=["GIST|gastrointestinal stromal tumor|gastrointestinal stromal tumour"]
270 | )
271 | ```
272 | 
273 | ### 3. Leverage PubTator Annotations
274 | 
275 | PubTator automatically annotates articles with:
276 | 
277 | - Gene mentions (normalized to official symbols)
278 | - Disease concepts (mapped to MeSH terms)
279 | - Chemical/drug entities
280 | - Genetic variants
281 | - Species
282 | 
283 | ### 4. Combine with Other Tools
284 | 
285 | ```python
286 | # 1. Find articles about a gene
287 | articles = await article_searcher(genes=["ALK"])
288 | 
289 | # 2. Get gene details for context
290 | gene_info = await gene_getter("ALK")
291 | 
292 | # 3. Find relevant trials
293 | trials = await trial_searcher(
294 |     other_terms=["ALK positive", "ALK rearrangement"]
295 | )
296 | ```
297 | 
298 | ## Troubleshooting
299 | 
300 | ### No Results Found
301 | 
302 | 1. **Check gene symbols**: Use [genenames.org](https://www.genenames.org)
303 | 2. **Broaden search**: Remove filters one by one
304 | 3. **Try synonyms**: Especially for diseases and drugs
305 | 
306 | ### cBioPortal Data Missing
307 | 
308 | - Some genes may not have cancer genomics data
309 | - Try searching for cancer-related genes
310 | - Check if gene symbol is correct
311 | 
312 | ### Preprint Issues
313 | 
314 | - Europe PMC may have delays in indexing
315 | - Some preprints may not have DOIs
316 | - Try searching by title keywords instead
317 | 
318 | ## Next Steps
319 | 
320 | - Learn to [find trials with NCI and BioThings](02-find-trials-with-nci-and-biothings.md)
321 | - Explore [variant annotations](03-get-comprehensive-variant-annotations.md)
322 | - Set up [API keys](../getting-started/03-authentication-and-api-keys.md) for enhanced features
323 | 
```
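
The guide's OR-logic examples pass pipe-delimited strings such as `"V600E|p.V600E|c.1799T>A"`. How BioMCP expands these internally is not shown in this section. As a rough illustration only, assuming each pipe-delimited term becomes one alternative in a parenthesized OR group, the expansion could look like the hypothetical helper below (`expand_or_terms` is not a BioMCP function).

```python
def expand_or_terms(keyword: str) -> str:
    """Turn 'a|b|c' into '(a OR b OR c)'; hypothetical helper for illustration."""
    # Split on the pipe and discard empty fragments such as a trailing "|".
    terms = [term.strip() for term in keyword.split("|") if term.strip()]
    if not terms:
        return ""
    if len(terms) == 1:
        # A single term needs no grouping.
        return terms[0]
    return "(" + " OR ".join(terms) + ")"
```

With this sketch, `expand_or_terms("V600E|p.V600E|c.1799T>A")` gives `(V600E OR p.V600E OR c.1799T>A)`, while a keyword with no pipe is returned unchanged.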

--------------------------------------------------------------------------------
/tests/tdd/test_network_policies.py:
--------------------------------------------------------------------------------

```python
  1 | """Comprehensive tests for network policies and HTTP centralization."""
  2 | 
  3 | from pathlib import Path
  4 | from unittest.mock import patch
  5 | 
  6 | import pytest
  7 | 
  8 | from biomcp.http_client import request_api
  9 | from biomcp.utils.endpoint_registry import (
 10 |     DataType,
 11 |     EndpointCategory,
 12 |     EndpointInfo,
 13 |     EndpointRegistry,
 14 |     get_registry,
 15 | )
 16 | 
 17 | 
 18 | class TestEndpointRegistry:
 19 |     """Test the endpoint registry functionality."""
 20 | 
 21 |     def test_registry_initialization(self):
 22 |         """Test that registry initializes with known endpoints."""
 23 |         registry = EndpointRegistry()
 24 |         endpoints = registry.get_all_endpoints()
 25 | 
 26 |         # Check we have endpoints registered
 27 |         assert len(endpoints) > 0
 28 | 
 29 |         # Check specific endpoints exist
 30 |         assert "pubtator3_search" in endpoints
 31 |         assert "clinicaltrials_search" in endpoints
 32 |         assert "myvariant_query" in endpoints
 33 |         assert "cbioportal_api" in endpoints
 34 | 
 35 |     def test_get_endpoints_by_category(self):
 36 |         """Test filtering endpoints by category."""
 37 |         registry = EndpointRegistry()
 38 | 
 39 |         # Get biomedical literature endpoints
 40 |         lit_endpoints = registry.get_endpoints_by_category(
 41 |             EndpointCategory.BIOMEDICAL_LITERATURE
 42 |         )
 43 |         assert len(lit_endpoints) > 0
 44 |         assert all(
 45 |             e.category == EndpointCategory.BIOMEDICAL_LITERATURE
 46 |             for e in lit_endpoints.values()
 47 |         )
 48 | 
 49 |         # Get clinical trials endpoints
 50 |         trial_endpoints = registry.get_endpoints_by_category(
 51 |             EndpointCategory.CLINICAL_TRIALS
 52 |         )
 53 |         assert len(trial_endpoints) > 0
 54 |         assert all(
 55 |             e.category == EndpointCategory.CLINICAL_TRIALS
 56 |             for e in trial_endpoints.values()
 57 |         )
 58 | 
 59 |     def test_get_unique_domains(self):
 60 |         """Test getting unique domains."""
 61 |         registry = EndpointRegistry()
 62 |         domains = registry.get_unique_domains()
 63 | 
 64 |         assert len(domains) > 0
 65 |         assert "www.ncbi.nlm.nih.gov" in domains
 66 |         assert "clinicaltrials.gov" in domains
 67 |         assert "myvariant.info" in domains
 68 |         assert "www.cbioportal.org" in domains
 69 | 
 70 |     def test_endpoint_info_properties(self):
 71 |         """Test EndpointInfo dataclass properties."""
 72 |         endpoint = EndpointInfo(
 73 |             url="https://api.example.com/test",
 74 |             category=EndpointCategory.BIOMEDICAL_LITERATURE,
 75 |             data_types=[DataType.RESEARCH_ARTICLES],
 76 |             description="Test endpoint",
 77 |             compliance_notes="Test compliance",
 78 |             rate_limit="10 requests/second",
 79 |             authentication="API key required",
 80 |         )
 81 | 
 82 |         assert endpoint.domain == "api.example.com"
 83 |         assert endpoint.category == EndpointCategory.BIOMEDICAL_LITERATURE
 84 |         assert DataType.RESEARCH_ARTICLES in endpoint.data_types
 85 | 
 86 |     def test_markdown_report_generation(self):
 87 |         """Test markdown report generation."""
 88 |         registry = EndpointRegistry()
 89 |         report = registry.generate_markdown_report()
 90 | 
 91 |         # Check report contains expected sections
 92 |         assert "# Third-Party Endpoints Used by BioMCP" in report
 93 |         assert "## Overview" in report
 94 |         assert "## Endpoints by Category" in report
 95 |         assert "## Domain Summary" in report
 96 |         assert "## Compliance and Privacy" in report
 97 |         assert "## Network Control" in report
 98 | 
 99 |         # Check it mentions offline mode
100 |         assert "BIOMCP_OFFLINE" in report
101 | 
102 |         # Check it contains actual endpoints
103 |         assert "pubtator3" in report
104 |         assert "clinicaltrials.gov" in report
105 |         assert "myvariant.info" in report
106 | 
107 |     def test_save_markdown_report(self, tmp_path):
108 |         """Test saving markdown report to file."""
109 |         registry = EndpointRegistry()
110 |         output_path = tmp_path / "test_endpoints.md"
111 | 
112 |         saved_path = registry.save_markdown_report(output_path)
113 | 
114 |         assert saved_path == output_path
115 |         assert output_path.exists()
116 | 
117 |         # Read and verify content
118 |         content = output_path.read_text()
119 |         assert "Third-Party Endpoints Used by BioMCP" in content
120 | 
121 | 
122 | class TestEndpointTracking:
123 |     """Test endpoint tracking in HTTP client."""
124 | 
125 |     @pytest.mark.asyncio
126 |     async def test_valid_endpoint_key(self):
127 |         """Test that valid endpoint keys are accepted."""
128 |         with patch("biomcp.http_client.call_http") as mock_call:
129 |             mock_call.return_value = (200, '{"data": "test"}')
130 | 
131 |             # Should not raise an error
132 |             result, error = await request_api(
133 |                 url="https://www.ncbi.nlm.nih.gov/research/pubtator3-api/search/",
134 |                 request={"text": "BRAF"},
135 |                 endpoint_key="pubtator3_search",
136 |                 cache_ttl=0,
137 |             )
138 | 
139 |             assert result == {"data": "test"}
140 |             assert error is None
141 | 
142 |     @pytest.mark.asyncio
143 |     async def test_invalid_endpoint_key_raises_error(self):
144 |         """Test that invalid endpoint keys raise an error."""
145 |         with pytest.raises(ValueError, match="Unknown endpoint key"):
146 |             await request_api(
147 |                 url="https://api.example.com/test",
148 |                 request={"test": "data"},
149 |                 endpoint_key="invalid_endpoint_key",
150 |                 cache_ttl=0,
151 |             )
152 | 
153 |     @pytest.mark.asyncio
154 |     async def test_no_endpoint_key_allowed(self):
155 |         """Test that requests without endpoint keys are allowed."""
156 |         with patch("biomcp.http_client.call_http") as mock_call:
157 |             mock_call.return_value = (200, '{"data": "test"}')
158 | 
159 |             # Should not raise an error
160 |             result, error = await request_api(
161 |                 url="https://api.example.com/test",
162 |                 request={"test": "data"},
163 |                 cache_ttl=0,
164 |             )
165 | 
166 |             assert result == {"data": "test"}
167 |             assert error is None
168 | 
169 | 
170 | class TestHTTPImportChecks:
171 |     """Test the HTTP import checking script."""
172 | 
173 |     def test_check_script_exists(self):
174 |         """Test that the check script exists."""
175 |         script_path = (
176 |             Path(__file__).parent.parent.parent
177 |             / "scripts"
178 |             / "check_http_imports.py"
179 |         )
180 |         assert script_path.exists()
181 | 
182 |     def test_allowed_files_configured(self):
183 |         """Test that allowed files are properly configured."""
184 |         # Import the script module
185 |         import sys
186 | 
187 |         script_path = Path(__file__).parent.parent.parent / "scripts"
188 |         sys.path.insert(0, str(script_path))
189 | 
190 |         try:
191 |             from check_http_imports import ALLOWED_FILES, HTTP_LIBRARIES
192 | 
193 |             # Check essential files are allowed
194 |             assert "http_client.py" in ALLOWED_FILES
195 |             assert "http_client_simple.py" in ALLOWED_FILES
196 | 
197 |             # Check we're checking for the right libraries
198 |             assert "httpx" in HTTP_LIBRARIES
199 |             assert "aiohttp" in HTTP_LIBRARIES
200 |             assert "requests" in HTTP_LIBRARIES
201 |         finally:
202 |             sys.path.pop(0)
203 | 
204 | 
205 | class TestGlobalRegistry:
206 |     """Test the global registry instance."""
207 | 
208 |     def test_get_registry_returns_same_instance(self):
209 |         """Test that get_registry returns the same instance."""
210 |         registry1 = get_registry()
211 |         registry2 = get_registry()
212 | 
213 |         assert registry1 is registry2
214 | 
215 |     def test_global_registry_has_endpoints(self):
216 |         """Test that the global registry has endpoints."""
217 |         registry = get_registry()
218 |         endpoints = registry.get_all_endpoints()
219 | 
220 |         assert len(endpoints) > 0
221 | 
```
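
The tests above exercise three behaviors: an endpoint exposes a `domain` derived from its URL, the registry can list its unique domains, and an unknown endpoint key is rejected with a `ValueError`. The sketch below shows one minimal way such a registry could be put together. `EndpointSketch` and `RegistrySketch` are hypothetical names, and nothing here is taken from the actual `biomcp.utils.endpoint_registry` implementation, which is not part of this section.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class EndpointSketch:
    """Hypothetical, simplified endpoint record."""

    url: str
    description: str = ""

    @property
    def domain(self) -> str:
        # netloc is the host part of the URL, e.g. "api.example.com".
        return urlparse(self.url).netloc


class RegistrySketch:
    """Hypothetical registry mapping endpoint keys to endpoint records."""

    def __init__(self) -> None:
        self._endpoints: dict[str, EndpointSketch] = {}

    def register(self, key: str, endpoint: EndpointSketch) -> None:
        self._endpoints[key] = endpoint

    def get_unique_domains(self) -> set[str]:
        return {endpoint.domain for endpoint in self._endpoints.values()}

    def require(self, key: str) -> EndpointSketch:
        # Reject keys that were never registered, as the tests expect.
        if key not in self._endpoints:
            raise ValueError(f"Unknown endpoint key: {key}")
        return self._endpoints[key]


registry = RegistrySketch()
registry.register("example_search", EndpointSketch("https://api.example.com/search"))
registry.register("example_fetch", EndpointSketch("https://api.example.com/fetch"))
registry.register("other_api", EndpointSketch("https://data.example.org/v1"))
```

Two of the three registered URLs share a host, so `registry.get_unique_domains()` contains two entries, and `registry.require("missing")` raises `ValueError`.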

--------------------------------------------------------------------------------
/docs/index.md:
--------------------------------------------------------------------------------

```markdown
  1 | # BioMCP: AI-Powered Biomedical Research
  2 | 
  3 | [![Release](https://img.shields.io/github/v/tag/genomoncology/biomcp)](https://github.com/genomoncology/biomcp/tags)
  4 | [![Build status](https://img.shields.io/github/actions/workflow/status/genomoncology/biomcp/main.yml?branch=main)](https://github.com/genomoncology/biomcp/actions/workflows/main.yml?query=branch%3Amain)
  5 | [![License](https://img.shields.io/github/license/genomoncology/biomcp)](https://img.shields.io/github/license/genomoncology/biomcp)
  6 | 
  7 | **Transform how you search and analyze biomedical data** with BioMCP - a powerful tool that connects AI assistants and researchers to critical biomedical databases through natural language.
  8 | 
  9 | ### Built and Maintained by <a href="https://www.genomoncology.com"><img src="./assets/logo.png" width=200 valign="middle" /></a>
 10 | 
 11 | <div class="announcement-banner">
 12 |   <div class="announcement-content">
 13 |     <h2>
 14 |       <span class="badge-new">NEW</span>
 15 |       Remote BioMCP Now Available!
 16 |     </h2>
 17 |     <p>Connect to BioMCP instantly through Claude - no installation required!</p>
 18 | 
 19 |     <div class="announcement-features">
 20 |       <div class="feature-item">
 21 |         <strong>🚀 Instant Access</strong>
 22 |         <span>Start using BioMCP in under 2 minutes</span>
 23 |       </div>
 24 |       <div class="feature-item">
 25 |         <strong>☁️ Cloud-Powered</strong>
 26 |         <span>Always up-to-date with latest features</span>
 27 |       </div>
 28 |       <div class="feature-item">
 29 |         <strong>🔒 Secure Auth</strong>
 30 |         <span>Google OAuth authentication</span>
 31 |       </div>
 32 |       <div class="feature-item">
 33 |         <strong>🛠️ 23+ Tools</strong>
 34 |         <span>Full suite of biomedical research tools</span>
 35 |       </div>
 36 |     </div>
 37 | 
 38 |     <a href="tutorials/remote-connection/" class="cta-button">
 39 |       Connect to Remote BioMCP Now
 40 |     </a>
 41 | 
 42 |   </div>
 43 | </div>
 44 | 
 45 | ## What Can You Do with BioMCP?
 46 | 
 47 | ### Search Research Literature
 48 | 
 49 | Find articles about genes, variants, diseases, and drugs, automatically enriched with cancer genomics data from cBioPortal.
 50 | 
 51 | ```bash
 52 | biomcp article search --gene BRAF --disease melanoma
 53 | ```
 54 | 
 55 | ### Discover Clinical Trials
 56 | 
 57 | Search active trials by condition, location, phase, and eligibility criteria, including genetic biomarkers.
 58 | 
 59 | ```bash
 60 | biomcp trial search --condition "lung cancer" --status RECRUITING
 61 | ```
 62 | 
 63 | ### Analyze Genetic Variants
 64 | 
 65 | Query variant databases, predict effects, and understand clinical significance.
 66 | 
 67 | ```bash
 68 | biomcp variant search --gene TP53 --significance pathogenic
 69 | ```
 70 | 
 71 | ### AI-Powered Analysis
 72 | 
 73 | Use BioMCP with Claude Desktop for conversational biomedical research with sequential thinking.
 74 | 
 75 | ```python
 76 | # Claude automatically uses BioMCP tools
 77 | "What BRAF mutations are found in melanoma?"
 78 | ```
 79 | 
 80 | ## 5-Minute Quick Start
 81 | 
 82 | ### Choose Your Interface
 83 | 
 84 | === "Claude Desktop (Recommended)"
 85 | 
 86 |     **Best for**: Conversational research, complex queries, AI-assisted analysis
 87 | 
 88 |     1. **Install Claude Desktop** from [claude.ai/desktop](https://claude.ai/desktop)
 89 | 
 90 |     2. **Configure BioMCP**:
 91 |        ```json
 92 |        {
 93 |          "mcpServers": {
 94 |            "biomcp": {
 95 |              "command": "uv",
 96 |              "args": [
 97 |         "run", "--with", "biomcp-python",
 98 |         "biomcp", "run"
 99 |       ]
100 |            }
101 |          }
102 |        }
103 |        ```
104 | 
105 |     3. **Start researching**: Ask Claude about any biomedical topic!
106 | 
107 |     [Full Claude Desktop Guide →](getting-started/02-claude-desktop-integration.md)
108 | 
109 | === "Command Line"
110 | 
111 |     **Best for**: Direct queries, scripting, automation
112 | 
113 |     1. **Install BioMCP**:
114 |        ```bash
115 |        # Using uv (recommended)
116 |        uv tool install biomcp-python
117 | 
118 |        # Or using pip
119 |        pip install biomcp-python
120 |        ```
121 | 
122 |     2. **Run your first search**:
123 |        ```bash
124 |        biomcp article search \
125 |          --gene BRAF --disease melanoma \
126 |          --limit 5
127 |        ```
128 | 
129 |     [CLI Reference →](user-guides/01-command-line-interface.md)
130 | 
131 | === "Python SDK"
132 | 
133 |     **Best for**: Integration, custom applications, bulk operations
134 | 
135 |     1. **Install the package**:
136 |        ```bash
137 |        pip install biomcp-python
138 |        ```
139 | 
140 |     2. **Use in your code**:
141 |        ```python
142 |        from biomcp import BioMCPClient
143 | 
144 |        async with BioMCPClient() as client:
145 |            articles = await client.articles.search(
146 |                genes=["BRAF"],
147 |                diseases=["melanoma"]
148 |            )
149 |        ```
150 | 
151 |     [Python SDK Docs →](apis/python-sdk.md)
152 | 
153 | ## Key Features
154 | 
155 | ### Unified Search Across Databases
156 | 
157 | - **PubMed/PubTator3**: 30M+ research articles with entity recognition
158 | - **ClinicalTrials.gov**: 400K+ clinical trials worldwide
159 | - **MyVariant.info**: Comprehensive variant annotations
160 | - **cBioPortal**: Automatic cancer genomics integration
161 | 
162 | ### Intelligent Query Processing
163 | 
164 | - Natural language to structured queries
165 | - Automatic synonym expansion
166 | - OR logic support for flexible matching
167 | - Cross-domain relationship discovery
168 | 
169 | ### Built for AI Integration
170 | 
171 | - 24 specialized MCP tools
172 | - Sequential thinking for complex analysis
173 | - Streaming responses for real-time updates
174 | - Context preservation across queries
175 | 
176 | [Explore All Features →](concepts/01-what-is-biomcp.md)
177 | 
178 | ## Learn by Example
179 | 
180 | ### Find Articles About a Specific Mutation
181 | 
182 | ```bash
183 | # Search for BRAF V600E mutations
184 | biomcp article search --gene BRAF \
185 |   --keyword "V600E|p.V600E|c.1799T>A"
186 | ```
187 | 
188 | ### Discover Trials Near You
189 | 
190 | ```bash
191 | # Find cancer trials in Boston area
192 | biomcp trial search --condition cancer \
193 |   --latitude 42.3601 --longitude -71.0589 \
194 |   --distance 50
195 | ```
196 | 
197 | ### Get Gene Information
198 | 
199 | ```bash
200 | # Get comprehensive gene data
201 | biomcp gene get TP53
202 | ```
203 | 
204 | [More Examples →](tutorials/biothings-prompts.md)
205 | 
206 | ## Popular Workflows
207 | 
208 | ### Literature Review
209 | 
210 | Systematic search across papers, preprints, and clinical trials
211 | [Workflow Guide →](workflows/all-workflows.md#1-literature-review-workflow)
212 | 
213 | ### Variant Interpretation
214 | 
215 | From variant ID to clinical significance and treatment implications
216 | [Workflow Guide →](workflows/all-workflows.md#3-variant-interpretation-workflow)
217 | 
218 | ### Trial Matching
219 | 
220 | Find eligible trials based on patient criteria and biomarkers
221 | [Workflow Guide →](workflows/all-workflows.md#2-clinical-trial-matching-workflow)
222 | 
223 | ### Drug Research
224 | 
225 | Connect drugs to targets, trials, and research literature
226 | [Workflow Guide →](workflows/all-workflows.md)
227 | 
228 | ## Advanced Features
229 | 
230 | - **[NCI Integration](getting-started/03-authentication-and-api-keys.md#nci-clinical-trials-api)**: Enhanced cancer trial search with biomarker filtering
231 | - **[AlphaGenome](how-to-guides/04-predict-variant-effects-with-alphagenome.md)**: Predict variant effects on gene regulation
232 | - **[BigQuery Logging](how-to-guides/05-logging-and-monitoring-with-bigquery.md)**: Monitor usage and performance
233 | - **[HTTP Server Mode](developer-guides/01-server-deployment.md)**: Deploy as a service
234 | 
235 | ## Documentation
236 | 
237 | - **[Getting Started](getting-started/01-quickstart-cli.md)** - Installation and first steps
238 | - **[User Guides](user-guides/01-command-line-interface.md)** - Detailed usage instructions
239 | - **[API Reference](apis/overview.md)** - Technical documentation
240 | - **[FAQ](faq-condensed.md)** - Quick answers to common questions
241 | 
242 | ## Community & Support
243 | 
244 | - **GitHub**: [github.com/genomoncology/biomcp](https://github.com/genomoncology/biomcp)
245 | - **Issues**: [Report bugs or request features](https://github.com/genomoncology/biomcp/issues)
246 | - **Discussions**: [Ask questions and share tips](https://github.com/genomoncology/biomcp/discussions)
247 | 
248 | ## License
249 | 
250 | BioMCP is licensed under the MIT License. See [LICENSE](https://github.com/genomoncology/biomcp/blob/main/LICENSE) for details.
251 | 
```
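
The "Command Line" tab in `docs/index.md` above pitches the CLI for scripting and automation. As a minimal sketch of that idea, the snippet below assembles the argument list for the quick-start search using only the flags shown in that file (`--gene`, `--disease`, `--limit`). The helper name `article_search_argv` is hypothetical and not part of BioMCP.

```python
import shlex


def article_search_argv(gene: str, disease: str, limit: int = 5) -> list[str]:
    """Build the argv for `biomcp article search` (hypothetical helper)."""
    return [
        "biomcp", "article", "search",
        "--gene", gene,
        "--disease", disease,
        "--limit", str(limit),
    ]


argv = article_search_argv("BRAF", "melanoma")
print(shlex.join(argv))

# To actually run the search (requires biomcp-python to be installed):
#   import subprocess
#   subprocess.run(argv, check=True)
```

Passing a list to `subprocess.run` rather than a shell string avoids quoting problems with multi-word values such as `"lung cancer"`.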

--------------------------------------------------------------------------------
/docs/tutorials/claude-code-biomcp-alphagenome.md:
--------------------------------------------------------------------------------

```markdown
  1 | # Using Claude Code with BioMCP for AlphaGenome Variant Analysis
  2 | 
  3 | This tutorial shows how to use Claude Code with BioMCP to analyze genetic variants with Google DeepMind's AlphaGenome. It covers both the MCP server integration and the CLI, and how Claude Code works with each interface.
  4 | 
  5 | ## Prerequisites
  6 | 
  7 | - **Claude Code**: Latest version with MCP support
  8 | - **Python 3.11+**: Required for BioMCP and AlphaGenome
  9 | - **uv**: Modern Python package manager ([installation guide](https://docs.astral.sh/uv/getting-started/installation/))
 10 | - **AlphaGenome API Key**: Get free access at [Google DeepMind AlphaGenome](https://deepmind.google.com/science/alphagenome)
 11 | 
 12 | ## Setup Overview
 13 | 
 14 | BioMCP offers two interfaces that work with Claude Code:
 15 | 
 16 | 1. **MCP Server**: Integrated directly into Claude Code for seamless workflows
 17 | 2. **CLI**: Command-line interface for direct terminal access
 18 | 
 19 | Both produce identical results, giving you flexibility in how you work.
 20 | 
 21 | ## Part 1: MCP Server Setup
 22 | 
 23 | ### Step 1: Install BioMCP CLI
 24 | 
 25 | ```bash
 26 | # Install BioMCP CLI globally (note: biomcp-python, not biomcp!)
 27 | uv tool install -q biomcp-python
 28 | 
 29 | # Verify installation
 30 | biomcp --version
 31 | ```
 32 | 
 33 | ### Step 2: Configure MCP Server
 34 | 
 35 | Add BioMCP to your Claude Code MCP configuration:
 36 | 
 37 | ```bash
 38 | # Basic setup (requires ALPHAGENOME_API_KEY environment variable)
 39 | claude mcp add biomcp -- uv run --with biomcp-python biomcp run
 40 | 
 41 | # Or with API key in configuration
 42 | claude mcp add biomcp -e ALPHAGENOME_API_KEY=your-api-key-here -- uv run --with biomcp-python biomcp run
 43 | ```
 44 | 
 45 | Verify the setup:
 46 | 
 47 | ```bash
 48 | claude mcp list
 49 | claude mcp get biomcp
 50 | ```
 51 | 
 52 | ### Step 3: Set Environment Variable
 53 | 
 54 | ```bash
 55 | # Add this line to your shell profile (~/.zshrc or ~/.bashrc) to persist it
 56 | export ALPHAGENOME_API_KEY='your-api-key-here'
 57 | 
 58 | # Or run the same command in the current shell to set it for this session only
 59 | export ALPHAGENOME_API_KEY='your-api-key-here'
 60 | ```
 61 | 
 62 | ### Step 4: Install AlphaGenome
 63 | 
 64 | ```bash
 65 | # Clone and install AlphaGenome
 66 | git clone https://github.com/google-deepmind/alphagenome.git
 67 | cd alphagenome && uv pip install .
 68 | ```
 69 | 
 70 | ## Part 2: Testing with Claude Code
 71 | 
 72 | ### Example: DLG1 Exon Skipping Variant
 73 | 
 74 | Let's analyze the variant `chr3:197081044:TACTC>T` from the AlphaGenome paper, which demonstrates exon skipping in the DLG1 gene.
 75 | 
 76 | #### Using MCP Server (Recommended)
 77 | 
 78 | ```python
 79 | # Claude Code automatically uses MCP when available
 80 | mcp__biomcp__alphagenome_predictor(
 81 |     chromosome="chr3",
 82 |     position=197081044,
 83 |     reference="TACTC",
 84 |     alternate="T"
 85 | )
 86 | ```
 87 | 
 88 | **Result:**
 89 | 
 90 | ```markdown
 91 | ## AlphaGenome Variant Effect Predictions
 92 | 
 93 | **Variant**: chr3:197081044 TACTC>T
 94 | **Analysis window**: 131,072 bp
 95 | 
 96 | ### Gene Expression
 97 | 
 98 | - **MELTF**: +2.57 log₂ fold change (↑ increases expression)
 99 | 
100 | ### Chromatin Accessibility
101 | 
102 | - **EFO:0005719 DNase-seq**: +17.27 log₂ change (↑ increases accessibility)
103 | 
104 | ### Splicing
105 | 
106 | - Potential splicing alterations detected
107 | 
108 | ### Summary
109 | 
110 | - Analyzed 11796 regulatory tracks
111 | - 6045 tracks show substantial changes (|log₂| > 0.5)
112 | ```
113 | 
114 | #### Using CLI Interface
115 | 
116 | ```bash
117 | # Same analysis via CLI
118 | export ALPHAGENOME_API_KEY='your-api-key-here'
119 | uv run biomcp variant predict chr3 197081044 TACTC T
120 | ```
121 | 
122 | **Result:** Identical output to MCP server.
123 | 
124 | ## Part 3: Why Both Interfaces Matter
125 | 
126 | ### MCP Server Advantages 🔌
127 | 
128 | - **Persistent State**: No need to re-export environment variables
129 | - **Workflow Integration**: Seamless chaining with other biomedical tools
130 | - **Structured Data**: Direct programmatic access to results
131 | - **Auto-Documentation**: Built-in parameter validation
132 | 
133 | ### CLI Advantages 💻
134 | 
135 | - **Immediate Access**: No server setup required
136 | - **Debugging**: Direct command-line testing
137 | - **Scripting**: Easy integration into bash scripts
138 | - **Standalone Use**: Works without Claude Code
139 | 
140 | ### Claude Code Perspective
141 | 
142 | From Claude Code's perspective, both interfaces work equally well. The **MCP approach provides slight benefits**:
143 | 
144 | - Results persist across conversation turns
145 | - Built-in error handling and validation
146 | - Automatic integration with thinking and search workflows
147 | - No need to manage environment variables per session
148 | 
149 | **Trade-off**: MCP requires initial setup, while CLI is immediately available.
150 | 
151 | ## Part 4: Advanced Usage Examples
152 | 
153 | ### Multi-Variant Analysis
154 | 
155 | ```python
156 | # Analyze multiple variants from AlphaGenome paper
157 | variants = [
158 |     ("chr3", 197081044, "TACTC", "T"),      # DLG1 exon skipping
159 |     ("chr21", 46126238, "G", "C"),          # COL6A2 splice junction
160 |     ("chr16", 173694, "A", "G")             # HBA2 polyadenylation
161 | ]
162 | 
163 | for chrom, pos, ref, alt in variants:  # `chrom` avoids shadowing the built-in chr()
164 |     result = mcp__biomcp__alphagenome_predictor(
165 |         chromosome=chrom,
166 |         position=pos,
167 |         reference=ref,
168 |         alternate=alt
169 |     )
170 |     print(f"{chrom}:{pos} {ref}>{alt}: {result}")
171 | ```
172 | 
173 | ### Tissue-Specific Analysis
174 | 
175 | ```python
176 | # Analyze with tissue context
177 | mcp__biomcp__alphagenome_predictor(
178 |     chromosome="chr7",
179 |     position=140753336,
180 |     reference="A",
181 |     alternate="T",
182 |     tissue_types=["UBERON:0000310"]  # breast tissue
183 | )
184 | ```
185 | 
186 | ### Combined BioMCP Workflow
187 | 
188 | ```python
189 | # 1. First, search for known annotations
190 | variant_data = mcp__biomcp__variant_searcher(gene="BRAF")
191 | 
192 | # 2. Then predict regulatory effects
193 | regulatory_effects = mcp__biomcp__alphagenome_predictor(
194 |     chromosome="chr7",
195 |     position=140753336,
196 |     reference="A",
197 |     alternate="T"
198 | )
199 | 
200 | # 3. Search literature for context
201 | literature = mcp__biomcp__article_searcher(
202 |     genes=["BRAF"],
203 |     variants=["V600E"]
204 | )
205 | ```
206 | 
207 | ## Part 5: Validation and Quality Assurance
208 | 
209 | ### How We Validated the Integration
210 | 
211 | 1. **Raw API Testing**: Directly tested Google's AlphaGenome API
212 | 2. **Source Code Analysis**: Verified BioMCP uses correct API methods (`score_variant` + `get_recommended_scorers`)
213 | 3. **Cross-Validation**: Confirmed identical results across all three approaches:
214 |    - Raw Python API: MELTF +2.57 log₂
215 |    - BioMCP CLI: MELTF +2.57 log₂
216 |    - BioMCP MCP: MELTF +2.57 log₂
217 | 
218 | ### Key Scientific Finding
219 | 
220 | The variant `chr3:197081044:TACTC>T` most strongly affects **MELTF** (+2.57 log₂ fold change), not DLG1 as initially expected. This demonstrates that AlphaGenome considers the full regulatory landscape, not just the nearest gene.
221 | 
222 | ## Part 6: Best Practices
223 | 
224 | ### For MCP Usage
225 | 
226 | - Use structured thinking with `mcp__biomcp__think` for complex analyses
227 | - Use the `call_benefit` parameter to improve result quality
228 | - Chain multiple tools for comprehensive variant characterization
229 | 
230 | ### For CLI Usage
231 | 
232 | - Set `ALPHAGENOME_API_KEY` in your shell profile
233 | - Use `--help` to explore all available parameters
234 | - Combine with other CLI tools via pipes and scripts
235 | 
236 | ### General Tips
237 | 
238 | - Start with the default 131 kb (131,072 bp) analysis window
239 | - Use tissue-specific analysis when relevant
240 | - Validate surprising results with literature search
241 | - Consider both gene expression and chromatin accessibility effects
242 | 
243 | ## Conclusion
244 | 
245 | BioMCP's dual interface approach (MCP + CLI) provides robust variant analysis capabilities. Claude Code works seamlessly with both, offering flexibility for different workflows. The MCP integration provides slight advantages for interactive analysis, while the CLI excels for scripting and debugging.
246 | 
247 | The combination of AlphaGenome's predictive power with BioMCP's comprehensive biomedical data access creates a powerful platform for genetic variant analysis and interpretation.
248 | 
249 | ## Resources
250 | 
251 | - [BioMCP Documentation](https://biomcp.org)
252 | - [AlphaGenome Paper](https://deepmind.google/science/alphagenome)
253 | - [Claude Code MCP Guide](https://docs.anthropic.com/claude/docs/model-context-protocol)
254 | - [uv Documentation](https://docs.astral.sh/uv/)
255 | 
```
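
The tutorial above writes variants as `chr3:197081044:TACTC>T`, while the predictor call takes separate `chromosome`, `position`, `reference`, and `alternate` arguments. A minimal sketch of bridging the two follows. `Variant` and `parse_variant` are hypothetical helpers, not part of BioMCP, and the accepted chromosome names (`chr1`..`chr22`, `chrX`, `chrY`, `chrM`) are an assumption.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    chromosome: str
    position: int
    reference: str
    alternate: str


# Hypothetical pattern for "chrN:position:REF>ALT" strings.
_VARIANT_RE = re.compile(r"^(chr(?:\d+|X|Y|M)):(\d+):([ACGT]+)>([ACGT]+)$")


def parse_variant(text: str) -> Variant:
    """Split a 'chr:pos:REF>ALT' string into its four fields."""
    match = _VARIANT_RE.match(text.strip())
    if match is None:
        raise ValueError(f"Unrecognized variant format: {text!r}")
    chrom, pos, ref, alt = match.groups()
    return Variant(chrom, int(pos), ref, alt)


variant = parse_variant("chr3:197081044:TACTC>T")
print(variant._asdict())
```

The resulting fields map one-to-one onto the keyword arguments used in the `mcp__biomcp__alphagenome_predictor` examples, and onto the positional arguments of `biomcp variant predict`.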

--------------------------------------------------------------------------------
/tests/tdd/articles/test_search.py:
--------------------------------------------------------------------------------

```python
  1 | import json
  2 | from unittest.mock import patch
  3 | 
  4 | import pytest
  5 | 
  6 | from biomcp.articles.search import (
  7 |     PubmedRequest,
  8 |     ResultItem,
  9 |     SearchResponse,
 10 |     convert_request,
 11 |     search_articles,
 12 | )
 13 | 
 14 | 
 15 | async def test_convert_search_query(anyio_backend):
 16 |     pubmed_request = PubmedRequest(
 17 |         chemicals=["Caffeine"],
 18 |         diseases=["non-small cell lung cancer"],
 19 |         genes=["BRAF"],
 20 |         variants=["BRAF V600E"],
 21 |         keywords=["therapy"],
 22 |     )
 23 |     pubtator_request = await convert_request(request=pubmed_request)
 24 | 
 25 |     # The API may or may not return prefixed entity IDs, so we check for both possibilities
 26 |     query_text = pubtator_request.text
 27 | 
 28 |     # Keywords should always be first
 29 |     assert query_text.startswith("therapy AND ")
 30 | 
 31 |     # Check that all terms are present (with or without prefixes)
 32 |     assert "Caffeine" in query_text or "@CHEMICAL_Caffeine" in query_text
 33 |     assert (
 34 |         "non-small cell lung cancer" in query_text.lower()
 35 |         or "carcinoma" in query_text.lower()
 36 |         or "@DISEASE_" in query_text
 37 |     )
 38 |     assert "BRAF" in query_text or "@GENE_BRAF" in query_text
 39 |     assert (
 40 |         "V600E" in query_text
 41 |         or "p.V600E" in query_text
 42 |         or "@VARIANT_" in query_text
 43 |     )
 44 | 
 45 |     # All terms should be joined with AND
 46 |     assert (
 47 |         query_text.count(" AND ") >= 4
 48 |     )  # At least 4 AND operators for 5 terms
 49 | 
 50 |     # default page request (changed to 10 for token efficiency)
 51 |     assert pubtator_request.size == 10
 52 | 
 53 | 
 54 | async def test_convert_search_query_with_or_logic(anyio_backend):
 55 |     """Test that keywords with pipe separators are converted to OR queries."""
 56 |     pubmed_request = PubmedRequest(
 57 |         genes=["PTEN"],
 58 |         keywords=["R173|Arg173|p.R173", "mutation"],
 59 |     )
 60 |     pubtator_request = await convert_request(request=pubmed_request)
 61 | 
 62 |     query_text = pubtator_request.text
 63 | 
 64 |     # Check that OR logic is properly formatted
 65 |     assert "(R173 OR Arg173 OR p.R173)" in query_text
 66 |     assert "mutation" in query_text
 67 |     assert "PTEN" in query_text or "@GENE_PTEN" in query_text
 68 | 
 69 |     # Check overall structure
 70 |     assert (
 71 |         query_text.count(" AND ") >= 2
 72 |     )  # At least 2 AND operators for 3 terms
 73 | 
 74 | 
 75 | async def test_search(anyio_backend):
 76 |     """Test search with real API call - may be flaky due to network dependency.
 77 | 
 78 |     This test makes real API calls to PubTator3 and can fail due to:
 79 |     - Network connectivity issues (Error 599)
 80 |     - API rate limiting
 81 |     - Changes in search results over time
 82 | 
 83 |     Consider using test_search_mocked for more reliable testing.
 84 |     """
 85 |     query = {
 86 |         "genes": ["BRAF"],
 87 |         "diseases": ["NSCLC", "Non - Small Cell Lung Cancer"],
 88 |         "keywords": ["BRAF mutations NSCLC"],
 89 |         "variants": ["mutation", "mutations"],
 90 |     }
 91 | 
 92 |     query = PubmedRequest(**query)
 93 |     output = await search_articles(query, output_json=True)
 94 |     data = json.loads(output)
 95 |     assert isinstance(data, list)
 96 | 
 97 |     # Handle potential errors - if the first item has an 'error' key, it's an error response
 98 |     if data and isinstance(data[0], dict) and "error" in data[0]:
 99 |         # Skip rather than fail when the live API is unavailable
100 |         # (pytest is already imported at module level).
101 |         pytest.skip(f"API returned error: {data[0]['error']}")
102 | 
103 |     assert len(data) == 10  # Changed from 40 to 10 for token efficiency
104 |     result = ResultItem.model_validate(data[0])
105 |     # todo: this might be flaky.
106 |     assert (
107 |         result.title
108 |         == "[Expert consensus on the diagnosis and treatment in advanced "
109 |         "non-small cell lung cancer with BRAF mutation in China]."
110 |     )
111 | 
112 | 
113 | @pytest.mark.asyncio
114 | async def test_search_mocked(anyio_backend):
115 |     """Test search with mocked API response to avoid network dependency."""
116 |     query = {
117 |         "genes": ["BRAF"],
118 |         "diseases": ["NSCLC", "Non - Small Cell Lung Cancer"],
119 |         "keywords": ["BRAF mutations NSCLC"],
120 |         "variants": ["mutation", "mutations"],
121 |     }
122 | 
123 |     # Create mock response - don't include abstract here as it will be added by add_abstracts
124 |     mock_response = SearchResponse(
125 |         results=[
126 |             ResultItem(
127 |                 pmid=37495419,
128 |                 title="[Expert consensus on the diagnosis and treatment in advanced "
129 |                 "non-small cell lung cancer with BRAF mutation in China].",
130 |                 journal="Zhonghua Zhong Liu Za Zhi",
131 |                 authors=["Zhang", "Li", "Wang"],
132 |                 date="2023-07-23",
133 |                 doi="10.3760/cma.j.cn112152-20230314-00115",
134 |             )
135 |             for _ in range(10)  # Create 10 results
136 |         ],
137 |         page_size=10,
138 |         current=1,
139 |         count=10,
140 |         total_pages=1,
141 |     )
142 | 
143 |     with patch("biomcp.http_client.request_api") as mock_request:
144 |         mock_request.return_value = (mock_response, None)
145 | 
146 |         # Mock the autocomplete calls
147 |         with patch("biomcp.articles.search.autocomplete") as mock_autocomplete:
148 |             mock_autocomplete.return_value = (
149 |                 None  # Simplified - no entity mapping
150 |             )
151 | 
152 |             # Mock the call_pubtator_api function
153 |             with patch(
154 |                 "biomcp.articles.search.call_pubtator_api"
155 |             ) as mock_pubtator:
156 |                 from biomcp.articles.fetch import (
157 |                     Article,
158 |                     FetchArticlesResponse,
159 |                     Passage,
160 |                     PassageInfo,
161 |                 )
162 | 
163 |                 # Create a mock response with abstracts
164 |                 mock_fetch_response = FetchArticlesResponse(
165 |                     PubTator3=[
166 |                         Article(
167 |                             pmid=37495419,
168 |                             passages=[
169 |                                 Passage(
170 |                                     text="This is a test abstract about BRAF mutations in NSCLC.",
171 |                                     infons=PassageInfo(
172 |                                         section_type="ABSTRACT"
173 |                                     ),
174 |                                 )
175 |                             ],
176 |                         )
177 |                     ]
178 |                 )
179 |                 mock_pubtator.return_value = (mock_fetch_response, None)
180 | 
181 |                 query_obj = PubmedRequest(**query)
182 |                 output = await search_articles(query_obj, output_json=True)
183 |                 data = json.loads(output)
184 | 
185 |                 assert isinstance(data, list)
186 |                 assert (
187 |                     len(data) == 10
188 |                 )  # Changed from 40 to 10 for token efficiency
189 |                 result = ResultItem.model_validate(data[0])
190 |                 assert (
191 |                     result.title
192 |                     == "[Expert consensus on the diagnosis and treatment in advanced "
193 |                     "non-small cell lung cancer with BRAF mutation in China]."
194 |                 )
195 |                 assert (
196 |                     result.abstract
197 |                     == "This is a test abstract about BRAF mutations in NSCLC."
198 |                 )
199 | 
200 | 
201 | @pytest.mark.asyncio
202 | async def test_search_network_error(anyio_backend):
203 |     """Test search handles network errors gracefully."""
204 |     query = PubmedRequest(genes=["BRAF"])
205 | 
206 |     with patch("biomcp.http_client.request_api") as mock_request:
207 |         from biomcp.http_client import RequestError
208 | 
209 |         mock_request.return_value = (
210 |             None,
211 |             RequestError(code=599, message="Network connectivity error"),
212 |         )
213 | 
214 |         output = await search_articles(query, output_json=True)
215 |         data = json.loads(output)
216 | 
217 |         assert isinstance(data, list)
218 |         assert len(data) == 1
219 |         assert "error" in data[0]
220 |         assert "Error 599: Network connectivity error" in data[0]["error"]
221 | 
```
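
The conversion tests above assert three properties of the generated query text: keywords come first, pipe-separated keywords become a parenthesized `OR` group, and all terms are joined with `AND`. The sketch below reproduces just that string-building behavior so the assertions are easier to read. It is not the actual `convert_request` implementation, and `keyword_to_clause` and `build_query` are hypothetical names.

```python
def keyword_to_clause(keyword: str) -> str:
    """Turn 'A|B|C' into '(A OR B OR C)'; leave single keywords unchanged."""
    parts = [part.strip() for part in keyword.split("|") if part.strip()]
    if len(parts) > 1:
        return "(" + " OR ".join(parts) + ")"
    return parts[0] if parts else ""


def build_query(keywords: list[str], entities: list[str]) -> str:
    """Join keyword clauses first, then entity terms, with AND."""
    clauses = [keyword_to_clause(keyword) for keyword in keywords]
    clauses.extend(entities)
    return " AND ".join(clause for clause in clauses if clause)


query = build_query(["R173|Arg173|p.R173", "mutation"], ["PTEN"])
print(query)  # (R173 OR Arg173 OR p.R173) AND mutation AND PTEN
```

The real converter also resolves entities through autocomplete, which is why the tests accept either plain terms or `@GENE_`-style prefixed IDs.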

--------------------------------------------------------------------------------
/BIOMCP_DATA_FLOW.md:
--------------------------------------------------------------------------------

```markdown
  1 | # BioMCP Data Flow Diagram
  2 | 
  3 | This document illustrates how BioMCP (Biomedical Model Context Protocol) works, showing the interaction between AI clients, the MCP server, domains, and external data sources.
  4 | 
  5 | ## High-Level Architecture
  6 | 
  7 | ```mermaid
  8 | graph TB
  9 |     subgraph "AI Client Layer"
 10 |         AI[AI Assistant<br/>e.g., Claude, GPT]
 11 |     end
 12 | 
 13 |     subgraph "MCP Server Layer"
 14 |         MCP[MCP Server<br/>router.py]
 15 |         SEARCH[search tool]
 16 |         FETCH[fetch tool]
 17 |     end
 18 | 
 19 |     subgraph "Domain Routing Layer"
 20 |         ROUTER[Query Router]
 21 |         PARSER[Query Parser]
 22 |         UNIFIED[Unified Query<br/>Language]
 23 |     end
 24 | 
 25 |     subgraph "Domain Handlers"
 26 |         ARTICLES[Articles Domain<br/>Handler]
 27 |         TRIALS[Trials Domain<br/>Handler]
 28 |         VARIANTS[Variants Domain<br/>Handler]
 29 |         THINKING[Thinking Domain<br/>Handler]
 30 |     end
 31 | 
 32 |     subgraph "External APIs"
 33 |         subgraph "Article Sources"
 34 |             PUBMED[PubTator3/<br/>PubMed]
 35 |             BIORXIV[bioRxiv/<br/>medRxiv]
 36 |             EUROPEPMC[Europe PMC]
 37 |         end
 38 | 
 39 |         subgraph "Clinical Data"
 40 |             CLINICALTRIALS[ClinicalTrials.gov]
 41 |         end
 42 | 
 43 |         subgraph "Variant Sources"
 44 |             MYVARIANT[MyVariant.info]
 45 |             TCGA[TCGA]
 46 |             KG[1000 Genomes]
 47 |             CBIO[cBioPortal]
 48 |         end
 49 |     end
 50 | 
 51 |     %% Connections
 52 |     AI -->|MCP Protocol| MCP
 53 |     MCP --> SEARCH
 54 |     MCP --> FETCH
 55 | 
 56 |     SEARCH --> ROUTER
 57 |     ROUTER --> PARSER
 58 |     PARSER --> UNIFIED
 59 | 
 60 |     ROUTER --> ARTICLES
 61 |     ROUTER --> TRIALS
 62 |     ROUTER --> VARIANTS
 63 |     ROUTER --> THINKING
 64 | 
 65 |     ARTICLES --> PUBMED
 66 |     ARTICLES --> BIORXIV
 67 |     ARTICLES --> EUROPEPMC
 68 |     ARTICLES -.->|Gene enrichment| CBIO
 69 | 
 70 |     TRIALS --> CLINICALTRIALS
 71 | 
 72 |     VARIANTS --> MYVARIANT
 73 |     MYVARIANT --> TCGA
 74 |     MYVARIANT --> KG
 75 |     VARIANTS --> CBIO
 76 | 
 77 |     THINKING -->|Internal| THINKING
 78 | 
 79 |     classDef clientClass fill:#e1f5fe,stroke:#01579b,stroke-width:2px
 80 |     classDef serverClass fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
 81 |     classDef domainClass fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
 82 |     classDef apiClass fill:#fff3e0,stroke:#e65100,stroke-width:2px
 83 | 
 84 |     class AI clientClass
 85 |     class MCP,SEARCH,FETCH serverClass
 86 |     class ARTICLES,TRIALS,VARIANTS,THINKING domainClass
 87 |     class PUBMED,BIORXIV,EUROPEPMC,CLINICALTRIALS,MYVARIANT,TCGA,KG,CBIO apiClass
 88 | ```
 89 | 
 90 | ## Detailed Search Flow
 91 | 
 92 | ```mermaid
 93 | sequenceDiagram
 94 |     participant AI as AI Client
 95 |     participant MCP as MCP Server
 96 |     participant Router as Query Router
 97 |     participant Domain as Domain Handler
 98 |     participant API as External API
 99 | 
100 |     AI->>MCP: search(query="gene:BRAF AND disease:melanoma")
101 |     MCP->>Router: Parse & route query
102 | 
103 |     alt Unified Query
104 |         Router->>Router: Parse field syntax
105 |         Router->>Router: Create routing plan
106 | 
107 |         par Search Articles
108 |             Router->>Domain: Search articles (BRAF, melanoma)
109 |             Domain->>API: PubTator3 API call
110 |             API-->>Domain: Article results
111 |             Domain->>API: cBioPortal enrichment
112 |             API-->>Domain: Mutation data
113 |         and Search Trials
114 |             Router->>Domain: Search trials (melanoma)
115 |             Domain->>API: ClinicalTrials.gov API
116 |             API-->>Domain: Trial results
117 |         and Search Variants
118 |             Router->>Domain: Search variants (BRAF)
119 |             Domain->>API: MyVariant.info API
120 |             API-->>Domain: Variant results
121 |         end
122 |     else Domain-specific
123 |         Router->>Domain: Direct domain search
124 |         Domain->>API: Single API call
125 |         API-->>Domain: Domain results
126 |     else Sequential Thinking
127 |         Router->>Domain: Process thought
128 |         Domain->>Domain: Update session state
129 |         Domain-->>Router: Thought response
130 |     end
131 | 
132 |     Domain-->>Router: Formatted results
133 |     Router-->>MCP: Aggregated results
134 |     MCP-->>AI: Standardized response
135 | ```
136 | 
137 | ## Search Tool Parameters
138 | 
139 | ```mermaid
140 | graph LR
141 |     subgraph "Search Tool Input"
142 |         PARAMS[Parameters]
143 |         QUERY[query: string]
144 |         DOMAIN[domain: article/trial/variant/thinking]
145 |         GENES[genes: list]
146 |         DISEASES[diseases: list]
147 |         CONDITIONS[conditions: list]
148 |         LAT[lat/long: coordinates]
149 |         THOUGHT[thought parameters]
150 |     end
151 | 
152 |     subgraph "Search Modes"
153 |         MODE1[Unified Query Mode<br/>Uses 'query' param]
154 |         MODE2[Domain-Specific Mode<br/>Uses domain + params]
155 |         MODE3[Thinking Mode<br/>Uses thought params]
156 |     end
157 | 
158 |     PARAMS --> MODE1
159 |     PARAMS --> MODE2
160 |     PARAMS --> MODE3
161 | ```
162 | 
163 | ## Domain-Specific Data Sources
164 | 
165 | ```mermaid
166 | graph TD
167 |     subgraph "Articles Domain"
168 |         A1[PubTator3/PubMed<br/>- Published articles<br/>- Annotations]
169 |         A2[bioRxiv/medRxiv<br/>- Preprints<br/>- Early research]
170 |         A3[Europe PMC<br/>- Open access<br/>- Full text]
171 |         A4[cBioPortal Integration<br/>- Auto-enrichment when genes specified<br/>- Mutation summaries & hotspots]
172 |     end
173 | 
174 |     subgraph "Trials Domain"
175 |         T1[ClinicalTrials.gov<br/>- Active trials<br/>- Trial details<br/>- Location search]
176 |     end
177 | 
178 |     subgraph "Variants Domain"
179 |         V1[MyVariant.info<br/>- Variant annotations<br/>- Clinical significance]
180 |         V2[TCGA<br/>- Cancer variants<br/>- Somatic mutations]
181 |         V3[1000 Genomes<br/>- Population frequency<br/>- Allele data]
182 |         V4[cBioPortal<br/>- Cancer mutations<br/>- Hotspots]
183 |     end
184 | 
185 |     A1 -.->|When genes present| A4
186 |     A2 -.->|When genes present| A4
187 |     A3 -.->|When genes present| A4
188 | ```
189 | 
190 | ## Unified Query Language
191 | 
192 | ```mermaid
193 | graph TD
194 |     QUERY["Unified Query<br/>gene:BRAF AND disease:melanoma"]
195 | 
196 |     QUERY --> PARSE[Query Parser]
197 | 
198 |     PARSE --> F1[Field: gene<br/>Value: BRAF]
199 |     PARSE --> F2[Field: disease<br/>Value: melanoma]
200 | 
201 |     F1 --> D1[Articles Domain]
202 |     F1 --> D2[Variants Domain]
203 |     F2 --> D1
204 |     F2 --> D3[Trials Domain]
205 | 
206 |     D1 --> R1[PubMed Results]
207 |     D2 --> R2[Variant Results]
208 |     D3 --> R3[Trial Results]
209 | 
210 |     R1 --> AGG[Aggregated Results]
211 |     R2 --> AGG
212 |     R3 --> AGG
213 | ```
214 | 
215 | ## Example: Location-Based Trial Search
216 | 
217 | ```mermaid
218 | sequenceDiagram
219 |     participant User as User
220 |     participant AI as AI Client
221 |     participant MCP as BioMCP
222 |     participant GEO as Geocoding Service
223 |     participant CT as ClinicalTrials.gov
224 | 
225 |     User->>AI: Find active trials in Cleveland for NSCLC
226 |     AI->>AI: Recognize location needs geocoding
227 |     AI->>GEO: Geocode "Cleveland"
228 |     GEO-->>AI: lat: 41.4993, long: -81.6944
229 | 
230 |     AI->>MCP: search(domain="trial",<br/>diseases=["NSCLC"],<br/>lat=41.4993,<br/>long=-81.6944,<br/>distance=50)
231 | 
232 |     MCP->>CT: API call with geo filter
233 |     CT-->>MCP: Trials near Cleveland
234 |     MCP-->>AI: Formatted trial results
235 |     AI-->>User: Here are X active NSCLC trials in the Cleveland area
236 | ```
237 | 
238 | ## Key Features
239 | 
240 | 1. **Parallel Execution**: Multiple domains are searched simultaneously for unified queries
241 | 2. **Smart Enrichment**: Article searches automatically include cBioPortal mutation summaries when genes are specified, providing clinical context alongside literature results
242 | 3. **Location Awareness**: Trial searches support geographic filtering with lat/long coordinates
243 | 4. **Sequential Thinking**: Built-in reasoning system for complex biomedical questions
244 | 5. **Standardized Output**: All search results follow the OpenAI MCP-compatible format (`id`, `title`, `text`, `url`) for consistency
245 | 
246 | ## Response Format
247 | 
248 | All search results follow this standardized structure:
249 | 
250 | ```json
251 | {
252 |   "results": [
253 |     {
254 |       "id": "PMID12345678",
255 |       "title": "BRAF V600E mutation in melanoma",
256 |       "text": "This study investigates BRAF mutations...",
257 |       "url": "https://pubmed.ncbi.nlm.nih.gov/12345678"
258 |     }
259 |   ]
260 | }
261 | ```
262 | 
263 | Fetch results include additional domain-specific metadata in the response.
264 | 
```
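
The "Unified Query Language" diagram above can be illustrated with a small sketch. This is not BioMCP's actual parser, which is not shown in this file. The function names `parse_unified_query` and `route_domains` and the `FIELD_DOMAINS` table are hypothetical, and the routing is inferred only from the arrows in the diagram (`gene` feeds articles and variants, `disease` feeds articles and trials).

```python
import re

# Hypothetical routing table inferred from the diagram, not taken from BioMCP.
FIELD_DOMAINS = {
    "gene": ["article", "variant"],
    "disease": ["article", "trial"],
}


def parse_unified_query(query: str) -> dict[str, str]:
    """Split 'field:value AND field:value' into a {field: value} mapping."""
    fields: dict[str, str] = {}
    for term in re.split(r"\s+AND\s+", query.strip()):
        name, sep, value = term.partition(":")
        if not sep or not name.strip() or not value.strip():
            raise ValueError(f"expected field:value, got {term!r}")
        fields[name.strip().lower()] = value.strip()
    return fields


def route_domains(fields: dict[str, str]) -> list[str]:
    """Collect the domains each field feeds, keeping first-seen order."""
    domains: list[str] = []
    for name in fields:
        for domain in FIELD_DOMAINS.get(name, []):
            if domain not in domains:
                domains.append(domain)
    return domains


fields = parse_unified_query("gene:BRAF AND disease:melanoma")
domains = route_domains(fields)
print(fields)
print(domains)
```

Each selected domain would then be searched and its results merged into the aggregated list, as the last row of the diagram shows.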

--------------------------------------------------------------------------------
/src/biomcp/openfda/drug_labels_helpers.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | Helper functions for OpenFDA drug labels to reduce complexity.
  3 | """
  4 | 
  5 | from typing import Any
  6 | 
  7 | from .input_validation import sanitize_input
  8 | from .utils import clean_text, extract_drug_names, truncate_text
  9 | 
 10 | 
 11 | def build_label_search_query(
 12 |     name: str | None,
 13 |     indication: str | None,
 14 |     boxed_warning: bool,
 15 |     section: str | None,
 16 | ) -> str:
 17 |     """Build the search query for drug labels."""
 18 |     search_parts = []
 19 | 
 20 |     if name:
 21 |         # Sanitize input to prevent injection
 22 |         name = sanitize_input(name, max_length=100)
 23 |     # Re-check: sanitization may leave the name empty
 24 |     if name:
 25 |         name_query = (
 26 |             f'(openfda.brand_name:"{name}" OR '
 27 |             f'openfda.generic_name:"{name}" OR '
 28 |             f'openfda.substance_name:"{name}")'
 29 |         )
 30 |         search_parts.append(name_query)
 31 | 
 32 |     if indication:
 33 |         # Sanitize indication input
 34 |         indication = sanitize_input(indication, max_length=200)
 35 |         if indication:
 36 |             search_parts.append(f'indications_and_usage:"{indication}"')
 37 | 
 38 |     if boxed_warning:
 39 |         search_parts.append("_exists_:boxed_warning")
 40 | 
 41 |     if section:
 42 |         # Map common section names to FDA fields
 43 |         section_map = {
 44 |             "indications": "indications_and_usage",
 45 |             "dosage": "dosage_and_administration",
 46 |             "contraindications": "contraindications",
 47 |             "warnings": "warnings_and_precautions",
 48 |             "adverse": "adverse_reactions",
 49 |             "interactions": "drug_interactions",
 50 |             "pregnancy": "pregnancy",
 51 |             "pediatric": "pediatric_use",
 52 |             "geriatric": "geriatric_use",
 53 |             "overdose": "overdosage",
 54 |         }
 55 |         field_name = section_map.get(section.lower(), section)
 56 |         search_parts.append(f"_exists_:{field_name}")
 57 | 
 58 |     return " AND ".join(search_parts)
 59 | 
 60 | 
 61 | def format_label_summary(result: dict[str, Any], index: int) -> list[str]:
 62 |     """Format a single drug label summary."""
 63 |     output = []
 64 | 
 65 |     # Extract drug names
 66 |     drug_names = extract_drug_names(result)
 67 |     primary_name = drug_names[0] if drug_names else "Unknown Drug"
 68 | 
 69 |     output.append(f"#### {index}. {primary_name}")
 70 | 
 71 |     # Get OpenFDA data
 72 |     openfda = result.get("openfda", {})
 73 | 
 74 |     # Show all names if multiple
 75 |     if len(drug_names) > 1:
 76 |         output.append(f"**Also known as**: {', '.join(drug_names[1:])}")
 77 | 
 78 |     # Basic info
 79 |     output.extend(_format_label_basic_info(openfda))
 80 | 
 81 |     # Boxed warning
 82 |     if "boxed_warning" in result:
 83 |         warning_text = clean_text(" ".join(result["boxed_warning"]))
 84 |         output.append(
 85 |             f"\n⚠️ **BOXED WARNING**: {truncate_text(warning_text, 200)}"
 86 |         )
 87 | 
 88 |     # Key sections
 89 |     output.extend(_format_label_key_sections(result))
 90 | 
 91 |     # Set ID for retrieval
 92 |     if "set_id" in result:
 93 |         output.append(f"\n*Label ID: {result['set_id']}*")
 94 | 
 95 |     output.append("")
 96 |     return output
 97 | 
 98 | 
 99 | def _format_label_basic_info(openfda: dict) -> list[str]:
100 |     """Format basic label information from OpenFDA data."""
101 |     output = []
102 | 
103 |     # Application number
104 |     if app_numbers := openfda.get("application_number", []):
105 |         output.append(f"**FDA Application**: {app_numbers[0]}")
106 | 
107 |     # Manufacturer
108 |     if manufacturers := openfda.get("manufacturer_name", []):
109 |         output.append(f"**Manufacturer**: {manufacturers[0]}")
110 | 
111 |     # Route
112 |     if routes := openfda.get("route", []):
113 |         output.append(f"**Route**: {', '.join(routes)}")
114 | 
115 |     return output
116 | 
117 | 
118 | def _format_label_key_sections(result: dict) -> list[str]:
119 |     """Format key label sections."""
120 |     output = []
121 | 
122 |     # Indications
123 |     if "indications_and_usage" in result:
124 |         indications_text = clean_text(
125 |             " ".join(result["indications_and_usage"])
126 |         )
127 |         output.append(
128 |             f"\n**Indications**: {truncate_text(indications_text, 300)}"
129 |         )
130 | 
131 |     # Contraindications
132 |     if "contraindications" in result:
133 |         contra_text = clean_text(" ".join(result["contraindications"]))
134 |         output.append(
135 |             f"\n**Contraindications**: {truncate_text(contra_text, 200)}"
136 |         )
137 | 
138 |     return output
139 | 
140 | 
141 | def format_label_header(result: dict[str, Any], set_id: str) -> list[str]:
142 |     """Format the header for detailed drug label."""
143 |     output = []
144 | 
145 |     drug_names = extract_drug_names(result)
146 |     primary_name = drug_names[0] if drug_names else "Unknown Drug"
147 | 
148 |     output.append(f"## FDA Drug Label: {primary_name}\n")
149 | 
150 |     # Basic information
151 |     openfda = result.get("openfda", {})
152 | 
153 |     if len(drug_names) > 1:
154 |         output.append(f"**Other Names**: {', '.join(drug_names[1:])}")
155 | 
156 |     output.extend(_format_detailed_metadata(openfda))
157 |     output.append(f"**Label ID**: {set_id}\n")
158 | 
159 |     return output
160 | 
161 | 
162 | def _format_detailed_metadata(openfda: dict) -> list[str]:
163 |     """Format detailed metadata from OpenFDA."""
164 |     output = []
165 | 
166 |     # FDA application numbers
167 |     if app_numbers := openfda.get("application_number", []):
168 |         output.append(f"**FDA Application**: {', '.join(app_numbers)}")
169 | 
170 |     # Manufacturers
171 |     if manufacturers := openfda.get("manufacturer_name", []):
172 |         output.append(f"**Manufacturer**: {', '.join(manufacturers)}")
173 | 
174 |     # Routes of administration
175 |     if routes := openfda.get("route", []):
176 |         output.append(f"**Route of Administration**: {', '.join(routes)}")
177 | 
178 |     # Pharmacologic class
179 |     if pharm_classes := openfda.get("pharm_class_epc", []):
180 |         output.append(f"**Pharmacologic Class**: {', '.join(pharm_classes)}")
181 | 
182 |     return output
183 | 
184 | 
185 | def format_label_section(
186 |     result: dict[str, Any], section: str, section_titles: dict[str, str]
187 | ) -> list[str]:
188 |     """Format a single label section."""
189 |     output: list[str] = []
190 | 
191 |     if section not in result:
192 |         return output
193 | 
194 |     title = section_titles.get(section, section.upper().replace("_", " "))
195 |     output.append(f"### {title}\n")
196 | 
197 |     section_text = result[section]
198 |     if isinstance(section_text, list):
199 |         section_text = " ".join(section_text)
200 | 
201 |     cleaned_text = clean_text(section_text)
202 | 
203 |     # For very long sections, provide a truncated version
204 |     if len(cleaned_text) > 3000:
205 |         output.append(truncate_text(cleaned_text, 3000))
206 |         output.append("\n*[Section truncated for brevity]*")
207 |     else:
208 |         output.append(cleaned_text)
209 | 
210 |     output.append("")
211 |     return output
212 | 
213 | 
214 | def get_default_sections() -> list[str]:
215 |     """Get the default sections to display."""
216 |     return [
217 |         "indications_and_usage",
218 |         "dosage_and_administration",
219 |         "contraindications",
220 |         "warnings_and_precautions",
221 |         "adverse_reactions",
222 |         "drug_interactions",
223 |         "use_in_specific_populations",
224 |         "clinical_pharmacology",
225 |         "clinical_studies",
226 |     ]
227 | 
228 | 
229 | def get_section_titles() -> dict[str, str]:
230 |     """Get the mapping of section names to display titles."""
231 |     return {
232 |         "indications_and_usage": "INDICATIONS AND USAGE",
233 |         "dosage_and_administration": "DOSAGE AND ADMINISTRATION",
234 |         "contraindications": "CONTRAINDICATIONS",
235 |         "warnings_and_precautions": "WARNINGS AND PRECAUTIONS",
236 |         "adverse_reactions": "ADVERSE REACTIONS",
237 |         "drug_interactions": "DRUG INTERACTIONS",
238 |         "use_in_specific_populations": "USE IN SPECIFIC POPULATIONS",
239 |         "clinical_pharmacology": "CLINICAL PHARMACOLOGY",
240 |         "clinical_studies": "CLINICAL STUDIES",
241 |         "how_supplied": "HOW SUPPLIED",
242 |         "storage_and_handling": "STORAGE AND HANDLING",
243 |         "patient_counseling_information": "PATIENT COUNSELING INFORMATION",
244 |         "pregnancy": "PREGNANCY",
245 |         "nursing_mothers": "NURSING MOTHERS",
246 |         "pediatric_use": "PEDIATRIC USE",
247 |         "geriatric_use": "GERIATRIC USE",
248 |         "overdosage": "OVERDOSAGE",
249 |     }
250 | 
```
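
To make the shape of the generated search string concrete, here is a minimal, self-contained sketch that mirrors the name, indication, and boxed-warning branches of `build_label_search_query`. The real `sanitize_input` lives in `input_validation` and is not shown in this file, so `sanitize_stub` below is an assumed stand-in (it only strips double quotes and whitespace and caps the length). `build_query` is likewise a hypothetical name for the simplified copy.

```python
from typing import Optional


def sanitize_stub(value: str, max_length: int) -> str:
    """Assumed stand-in for sanitize_input: drop quotes, trim, cap length."""
    return value.replace('"', "").strip()[:max_length]


def build_query(
    name: Optional[str] = None,
    indication: Optional[str] = None,
    boxed_warning: bool = False,
) -> str:
    """Simplified copy of the query-building logic, without the section map."""
    parts: list[str] = []

    clean_name = sanitize_stub(name, 100) if name else ""
    if clean_name:
        # A name matches any of the three openFDA name fields.
        parts.append(
            f'(openfda.brand_name:"{clean_name}" OR '
            f'openfda.generic_name:"{clean_name}" OR '
            f'openfda.substance_name:"{clean_name}")'
        )

    clean_indication = sanitize_stub(indication, 200) if indication else ""
    if clean_indication:
        parts.append(f'indications_and_usage:"{clean_indication}"')

    if boxed_warning:
        parts.append("_exists_:boxed_warning")

    # An empty list joins to "", i.e. no search clause at all.
    return " AND ".join(parts)


query = build_query(name="pembrolizumab", boxed_warning=True)
print(query)
```

Clauses are joined with `AND`, so every filter the caller supplies narrows the result set.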

--------------------------------------------------------------------------------
/tests/tdd/test_drug_shortages.py:
--------------------------------------------------------------------------------

```python
  1 | """Tests for FDA drug shortages module."""
  2 | 
  3 | from datetime import datetime
  4 | from unittest.mock import AsyncMock, patch
  5 | 
  6 | import pytest
  7 | 
  8 | from biomcp.openfda.drug_shortages import (
  9 |     get_drug_shortage,
 10 |     search_drug_shortages,
 11 | )
 12 | 
 13 | 
 14 | class TestDrugShortages:
 15 |     """Test drug shortages functionality."""
 16 | 
 17 |     @pytest.mark.asyncio
 18 |     async def test_search_drug_shortages_no_data_available(self):
 19 |         """Test drug shortage search when FDA data is unavailable."""
 20 |         with patch(
 21 |             "biomcp.openfda.drug_shortages._get_cached_shortage_data",
 22 |             new_callable=AsyncMock,
 23 |         ) as mock_get_data:
 24 |             mock_get_data.return_value = None
 25 | 
 26 |             result = await search_drug_shortages(drug="cisplatin")
 27 | 
 28 |             assert "Drug Shortage Data Temporarily Unavailable" in result
 29 |             assert "FDA drug shortage database cannot be accessed" in result
 30 |             assert (
 31 |                 "https://www.accessdata.fda.gov/scripts/drugshortages/"
 32 |                 in result
 33 |             )
 34 |             assert (
 35 |                 "https://www.ashp.org/drug-shortages/current-shortages"
 36 |                 in result
 37 |             )
 38 | 
 39 |     @pytest.mark.asyncio
 40 |     async def test_get_drug_shortage_no_data_available(self):
 41 |         """Test getting specific drug shortage when FDA data is unavailable."""
 42 |         with patch(
 43 |             "biomcp.openfda.drug_shortages._get_cached_shortage_data",
 44 |             new_callable=AsyncMock,
 45 |         ) as mock_get_data:
 46 |             mock_get_data.return_value = None
 47 | 
 48 |             result = await get_drug_shortage("cisplatin")
 49 | 
 50 |             assert "Drug Shortage Data Temporarily Unavailable" in result
 51 |             assert "FDA drug shortage database cannot be accessed" in result
 52 |             assert "Alternative Options:" in result
 53 | 
 54 |     @pytest.mark.asyncio
 55 |     async def test_mock_data_not_used_in_production(self):
 56 |         """Test that mock data is never returned in production scenarios."""
 57 |         with patch(
 58 |             "biomcp.openfda.drug_shortages._get_cached_shortage_data",
 59 |             new_callable=AsyncMock,
 60 |         ) as mock_get_data:
 61 |             # Simulate no data available (cache miss and fetch failure)
 62 |             mock_get_data.return_value = None
 63 | 
 64 |             result = await search_drug_shortages(drug="test")
 65 | 
 66 |             assert "Drug Shortage Data Temporarily Unavailable" in result
 67 |             # Ensure mock data is not present
 68 |             assert "Cisplatin Injection" not in result
 69 |             assert "Methotrexate" not in result
 70 | 
 71 |     # Cache functionality test removed - was testing private implementation details
 72 |     # The public API is tested through search_drug_shortages and get_drug_shortage
 73 | 
 74 |     # Cache expiry test removed - was testing private implementation details
 75 |     # The caching behavior is an implementation detail not part of the public API
 76 | 
 77 |     @pytest.mark.asyncio
 78 |     async def test_search_with_filters(self):
 79 |         """Test drug shortage search with various filters."""
 80 |         mock_data = {
 81 |             "_fetched_at": datetime.now().isoformat(),
 82 |             "shortages": [
 83 |                 {
 84 |                     "generic_name": "Drug A",
 85 |                     "brand_names": ["Brand A"],
 86 |                     "status": "Current Shortage",
 87 |                     "therapeutic_category": "Oncology",
 88 |                 },
 89 |                 {
 90 |                     "generic_name": "Drug B",
 91 |                     "brand_names": ["Brand B"],
 92 |                     "status": "Resolved",
 93 |                     "therapeutic_category": "Cardiology",
 94 |                 },
 95 |                 {
 96 |                     "generic_name": "Drug C",
 97 |                     "brand_names": ["Brand C"],
 98 |                     "status": "Current Shortage",
 99 |                     "therapeutic_category": "Oncology",
100 |                 },
101 |             ],
102 |         }
103 | 
104 |         with patch(
105 |             "biomcp.openfda.drug_shortages._get_cached_shortage_data",
106 |             new_callable=AsyncMock,
107 |         ) as mock_get_data:
108 |             mock_get_data.return_value = mock_data
109 | 
110 |             # Test status filter
111 |             result = await search_drug_shortages(status="current")
112 |             assert "Drug A" in result
113 |             assert "Drug C" in result
114 |             assert "Drug B" not in result
115 | 
116 |             # Test therapeutic category filter
117 |             result = await search_drug_shortages(
118 |                 therapeutic_category="Oncology"
119 |             )
120 |             assert "Drug A" in result
121 |             assert "Drug C" in result
122 |             assert "Drug B" not in result
123 | 
124 |             # Test drug name filter
125 |             result = await search_drug_shortages(drug="Drug B")
126 |             assert "Drug B" in result
127 |             assert "Drug A" not in result
128 | 
129 |     @pytest.mark.asyncio
130 |     async def test_get_specific_drug_shortage(self):
131 |         """Test getting details for a specific drug shortage."""
132 |         mock_data = {
133 |             "_fetched_at": datetime.now().isoformat(),
134 |             "shortages": [
135 |                 {
136 |                     "generic_name": "Cisplatin Injection",
137 |                     "brand_names": ["Platinol"],
138 |                     "status": "Current Shortage",
139 |                     "shortage_start_date": "2023-02-10",
140 |                     "estimated_resolution": "Q2 2024",
141 |                     "reason": "Manufacturing delays",
142 |                     "therapeutic_category": "Oncology",
143 |                     "notes": "Limited supplies available",
144 |                 },
145 |             ],
146 |         }
147 | 
148 |         with patch(
149 |             "biomcp.openfda.drug_shortages._get_cached_shortage_data",
150 |             new_callable=AsyncMock,
151 |         ) as mock_get_data:
152 |             mock_get_data.return_value = mock_data
153 | 
154 |             result = await get_drug_shortage("cisplatin")
155 | 
156 |             assert "Cisplatin Injection" in result
157 |             assert "Current Shortage" in result
158 |             assert "Manufacturing delays" in result
159 |             assert "Oncology" in result
160 |             assert "Limited supplies available" in result
161 | 
162 |     @pytest.mark.asyncio
163 |     async def test_get_drug_shortage_not_found(self):
164 |         """Test getting drug shortage for non-existent drug."""
165 |         mock_data = {
166 |             "_fetched_at": datetime.now().isoformat(),
167 |             "shortages": [
168 |                 {
169 |                     "generic_name": "Drug A",
170 |                     "status": "Current Shortage",
171 |                 },
172 |             ],
173 |         }
174 | 
175 |         with patch(
176 |             "biomcp.openfda.drug_shortages._get_cached_shortage_data",
177 |             new_callable=AsyncMock,
178 |         ) as mock_get_data:
179 |             mock_get_data.return_value = mock_data
180 | 
181 |             result = await get_drug_shortage("nonexistent-drug")
182 | 
183 |             assert "No shortage information found" in result
184 |             assert "nonexistent-drug" in result
185 | 
186 |     @pytest.mark.asyncio
187 |     async def test_api_key_parameter_ignored(self):
188 |         """Test that API key parameter is accepted but not used (FDA limitation)."""
189 |         mock_data = {
190 |             "_fetched_at": datetime.now().isoformat(),
191 |             "shortages": [
192 |                 {
193 |                     "generic_name": "Test Drug",
194 |                     "status": "Current Shortage",
195 |                     "therapeutic_category": "Test Category",
196 |                 }
197 |             ],
198 |         }
199 | 
200 |         with patch(
201 |             "biomcp.openfda.drug_shortages._get_cached_shortage_data",
202 |             new_callable=AsyncMock,
203 |         ) as mock_get_data:
204 |             mock_get_data.return_value = mock_data
205 | 
206 |             # API key should be accepted but not affect functionality
207 |             result = await search_drug_shortages(
208 |                 drug="test",
209 |                 api_key="test-key",
210 |             )
211 | 
212 |             # When there's data, it should format results
213 |             assert "FDA Drug Shortage Information" in result
214 |             assert "Test Drug" in result
215 | 
216 |     # Mock data function has been removed - no longer needed
217 | 
```

--------------------------------------------------------------------------------
/tests/tdd/thinking/test_sequential.py:
--------------------------------------------------------------------------------

```python
  1 | """Tests for sequential thinking functionality."""
  2 | 
  3 | from datetime import datetime
  4 | 
  5 | import pytest
  6 | 
  7 | from biomcp.thinking import sequential
  8 | from biomcp.thinking.session import ThoughtEntry, _session_manager
  9 | 
 10 | 
 11 | @pytest.fixture(autouse=True)
 12 | def clear_thinking_state():
 13 |     """Clear thinking state before each test."""
 14 |     _session_manager.clear_all_sessions()
 15 |     yield
 16 |     _session_manager.clear_all_sessions()
 17 | 
 18 | 
 19 | class TestSequentialThinking:
 20 |     """Test the sequential thinking MCP tool."""
 21 | 
 22 |     @pytest.mark.anyio
 23 |     async def test_basic_sequential_thinking(self):
 24 |         """Test basic sequential thinking flow."""
 25 |         result = await sequential._sequential_thinking(
 26 |             thought="First step: analyze the problem",
 27 |             nextThoughtNeeded=True,
 28 |             thoughtNumber=1,
 29 |             totalThoughts=3,
 30 |         )
 31 | 
 32 |         assert "Added thought 1 to main sequence" in result
 33 |         assert "Progress: 1/3 thoughts" in result
 34 |         assert "Next thought needed" in result
 35 | 
 36 |         # Get current session
 37 |         session = _session_manager.get_session()
 38 |         assert session is not None
 39 |         assert len(session.thought_history) == 1
 40 | 
 41 |         # Verify thought structure
 42 |         thought = session.thought_history[0]
 43 |         assert thought.thought == "First step: analyze the problem"
 44 |         assert thought.thought_number == 1
 45 |         assert thought.total_thoughts == 3
 46 |         assert thought.next_thought_needed is True
 47 |         assert thought.is_revision is False
 48 | 
 49 |     @pytest.mark.anyio
 50 |     async def test_multiple_sequential_thoughts(self):
 51 |         """Test adding multiple thoughts in sequence."""
 52 |         # Add first thought
 53 |         await sequential._sequential_thinking(
 54 |             thought="First step",
 55 |             nextThoughtNeeded=True,
 56 |             thoughtNumber=1,
 57 |             totalThoughts=3,
 58 |         )
 59 | 
 60 |         # Add second thought
 61 |         await sequential._sequential_thinking(
 62 |             thought="Second step",
 63 |             nextThoughtNeeded=True,
 64 |             thoughtNumber=2,
 65 |             totalThoughts=3,
 66 |         )
 67 | 
 68 |         # Add final thought
 69 |         result = await sequential._sequential_thinking(
 70 |             thought="Final step",
 71 |             nextThoughtNeeded=False,
 72 |             thoughtNumber=3,
 73 |             totalThoughts=3,
 74 |         )
 75 | 
 76 |         assert "Added thought 3 to main sequence" in result
 77 |         assert "Thinking sequence complete" in result
 78 |         session = _session_manager.get_session()
 79 |         assert len(session.thought_history) == 3
 80 | 
 81 |     @pytest.mark.anyio
 82 |     async def test_thought_revision(self):
 83 |         """Test revising a previous thought."""
 84 |         # Add initial thought
 85 |         await sequential._sequential_thinking(
 86 |             thought="Initial analysis",
 87 |             nextThoughtNeeded=True,
 88 |             thoughtNumber=1,
 89 |             totalThoughts=2,
 90 |         )
 91 | 
 92 |         # Revise the thought
 93 |         result = await sequential._sequential_thinking(
 94 |             thought="Better analysis",
 95 |             nextThoughtNeeded=True,
 96 |             thoughtNumber=1,
 97 |             totalThoughts=2,
 98 |             isRevision=True,
 99 |             revisesThought=1,
100 |         )
101 | 
102 |         assert "Revised thought 1" in result
103 |         session = _session_manager.get_session()
104 |         assert len(session.thought_history) == 1
105 |         assert session.thought_history[0].thought == "Better analysis"
106 |         assert session.thought_history[0].is_revision is True
107 | 
108 |     @pytest.mark.anyio
109 |     async def test_branching_logic(self):
110 |         """Test creating thought branches."""
111 |         # Add main sequence thoughts
112 |         await sequential._sequential_thinking(
113 |             thought="Main thought 1",
114 |             nextThoughtNeeded=True,
115 |             thoughtNumber=1,
116 |             totalThoughts=3,
117 |         )
118 | 
119 |         await sequential._sequential_thinking(
120 |             thought="Main thought 2",
121 |             nextThoughtNeeded=True,
122 |             thoughtNumber=2,
123 |             totalThoughts=3,
124 |         )
125 | 
126 |         # Create a branch
127 |         result = await sequential._sequential_thinking(
128 |             thought="Alternative approach",
129 |             nextThoughtNeeded=True,
130 |             thoughtNumber=3,
131 |             totalThoughts=3,
132 |             branchFromThought=2,
133 |         )
134 | 
135 |         assert "Added thought 3 to branch 'branch_2'" in result
136 |         session = _session_manager.get_session()
137 |         assert len(session.thought_history) == 2
138 |         assert len(session.thought_branches) == 1
139 |         assert "branch_2" in session.thought_branches
140 |         assert len(session.thought_branches["branch_2"]) == 1
141 | 
142 |     @pytest.mark.anyio
143 |     async def test_validation_errors(self):
144 |         """Test input validation errors."""
145 |         # Test invalid thought number
146 |         result = await sequential._sequential_thinking(
147 |             thought="Test",
148 |             nextThoughtNeeded=False,
149 |             thoughtNumber=0,
150 |             totalThoughts=1,
151 |         )
152 |         assert "thoughtNumber must be >= 1" in result
153 | 
154 |         # Test invalid total thoughts
155 |         result = await sequential._sequential_thinking(
156 |             thought="Test",
157 |             nextThoughtNeeded=False,
158 |             thoughtNumber=1,
159 |             totalThoughts=0,
160 |         )
161 |         assert "totalThoughts must be >= 1" in result
162 | 
163 |         # Test revision without specifying which thought
164 |         result = await sequential._sequential_thinking(
165 |             thought="Test",
166 |             nextThoughtNeeded=False,
167 |             thoughtNumber=1,
168 |             totalThoughts=1,
169 |             isRevision=True,
170 |         )
171 |         assert (
172 |             "revisesThought must be specified when isRevision=True" in result
173 |         )
174 | 
175 |     @pytest.mark.anyio
176 |     async def test_needs_more_thoughts(self):
177 |         """Test the needsMoreThoughts parameter."""
178 |         result = await sequential._sequential_thinking(
179 |             thought="This problem is more complex than expected",
180 |             nextThoughtNeeded=True,
181 |             thoughtNumber=3,
182 |             totalThoughts=3,
183 |             needsMoreThoughts=True,
184 |         )
185 | 
186 |         assert "Added thought 3 to main sequence" in result
187 |         session = _session_manager.get_session()
188 |         assert len(session.thought_history) == 1
189 |         assert (
190 |             session.thought_history[0].metadata.get("needsMoreThoughts")
191 |             is True
192 |         )
193 | 
194 | 
195 | class TestUtilityFunctions:
196 |     """Test utility functions."""
197 | 
198 |     def test_get_current_timestamp(self):
199 |         """Test timestamp generation."""
200 |         timestamp = sequential.get_current_timestamp()
201 |         assert isinstance(timestamp, str)
202 |         # Should be able to parse as ISO format
203 |         parsed = datetime.fromisoformat(
204 |             timestamp.replace("Z", "+00:00").replace("T", " ").split(".")[0]
205 |         )
206 |         assert isinstance(parsed, datetime)
207 | 
208 |     def test_session_management(self):
209 |         """Test session management functionality."""
210 |         # Clear any existing sessions
211 |         _session_manager.clear_all_sessions()
212 | 
213 |         # Create a new session
214 |         session = _session_manager.create_session()
215 |         assert session is not None
216 |         assert session.session_id is not None
217 | 
218 |         # Add a thought entry
219 |         entry = ThoughtEntry(
220 |             thought="Test thought",
221 |             thought_number=1,
222 |             total_thoughts=1,
223 |             next_thought_needed=False,
224 |         )
225 |         session.add_thought(entry)
226 |         assert len(session.thought_history) == 1
227 |         assert session.thought_history[0].thought == "Test thought"
228 | 
229 |         # Test branch creation
230 |         branch_entry = ThoughtEntry(
231 |             thought="Branch thought",
232 |             thought_number=2,
233 |             total_thoughts=2,
234 |             next_thought_needed=False,
235 |             branch_id="test-branch",
236 |             branch_from_thought=1,
237 |         )
238 |         session.add_thought(branch_entry)
239 |         assert len(session.thought_branches) == 1
240 |         assert "test-branch" in session.thought_branches
241 |         assert len(session.thought_branches["test-branch"]) == 1
242 | 
```
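
The behavior these tests assert (a revision replaces the earlier thought in place, and a branched thought is stored under its branch id instead of the main history) can be sketched with two small dataclasses. This is an illustration only: `Thought` and `Session` are hypothetical and much simpler than the real `ThoughtEntry` and session classes in `biomcp.thinking.session`, which are not shown in this file.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Thought:
    text: str
    number: int
    revises: Optional[int] = None    # number of the thought being revised
    branch_id: Optional[str] = None  # set when the thought belongs to a branch


@dataclass
class Session:
    history: list[Thought] = field(default_factory=list)
    branches: dict[str, list[Thought]] = field(default_factory=dict)

    def add(self, thought: Thought) -> None:
        # Branch thoughts never touch the main history.
        if thought.branch_id is not None:
            self.branches.setdefault(thought.branch_id, []).append(thought)
            return
        # A revision overwrites the matching entry instead of appending.
        if thought.revises is not None:
            for i, existing in enumerate(self.history):
                if existing.number == thought.revises:
                    self.history[i] = thought
                    return
        self.history.append(thought)


session = Session()
session.add(Thought("Initial analysis", 1))
session.add(Thought("Better analysis", 1, revises=1))
session.add(Thought("Alternative approach", 2, branch_id="branch_1"))
print(len(session.history), list(session.branches))
```

In this sketch a revision that targets a missing thought number is appended to the history; the real implementation may handle that case differently.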

--------------------------------------------------------------------------------
/tests/tdd/openfda/test_drug_labels.py:
--------------------------------------------------------------------------------

```python
  1 | """
  2 | Unit tests for OpenFDA drug labels integration.
  3 | """
  4 | 
  5 | from unittest.mock import patch
  6 | 
  7 | import pytest
  8 | 
  9 | from biomcp.openfda.drug_labels import get_drug_label, search_drug_labels
 10 | 
 11 | 
 12 | @pytest.mark.asyncio
 13 | async def test_search_drug_labels_by_name():
 14 |     """Test searching drug labels by name."""
 15 |     mock_response = {
 16 |         "meta": {"results": {"total": 5}},
 17 |         "results": [
 18 |             {
 19 |                 "set_id": "abc123",
 20 |                 "openfda": {
 21 |                     "brand_name": ["KEYTRUDA"],
 22 |                     "generic_name": ["PEMBROLIZUMAB"],
 23 |                     "application_number": ["BLA125514"],
 24 |                     "manufacturer_name": ["MERCK"],
 25 |                     "route": ["INTRAVENOUS"],
 26 |                 },
 27 |                 "indications_and_usage": [
 28 |                     "KEYTRUDA is indicated for the treatment of patients with unresectable or metastatic melanoma."
 29 |                 ],
 30 |                 "boxed_warning": [
 31 |                     "Immune-mediated adverse reactions can occur."
 32 |                 ],
 33 |             }
 34 |         ],
 35 |     }
 36 | 
 37 |     with patch(
 38 |         "biomcp.openfda.drug_labels.make_openfda_request"
 39 |     ) as mock_request:
 40 |         mock_request.return_value = (mock_response, None)
 41 | 
 42 |         result = await search_drug_labels(name="pembrolizumab", limit=10)
 43 | 
 44 |         # Verify request
 45 |         mock_request.assert_called_once()
 46 |         call_args = mock_request.call_args
 47 |         assert "pembrolizumab" in call_args[0][1]["search"].lower()
 48 | 
 49 |         # Check output
 50 |         assert "FDA Drug Labels" in result
 51 |         assert "KEYTRUDA" in result
 52 |         assert "PEMBROLIZUMAB" in result
 53 |         assert "melanoma" in result
 54 |         assert "BOXED WARNING" in result
 55 |         assert "Immune-mediated" in result
 56 |         assert "abc123" in result
 57 | 
 58 | 
 59 | @pytest.mark.asyncio
 60 | async def test_search_drug_labels_by_indication():
 61 |     """Test searching drug labels by indication."""
 62 |     mock_response = {
 63 |         "meta": {"results": {"total": 10}},
 64 |         "results": [
 65 |             {
 66 |                 "set_id": "xyz789",
 67 |                 "openfda": {
 68 |                     "brand_name": ["DRUG X"],
 69 |                     "generic_name": ["GENERIC X"],
 70 |                 },
 71 |                 "indications_and_usage": [
 72 |                     "Indicated for breast cancer treatment"
 73 |                 ],
 74 |             }
 75 |         ],
 76 |     }
 77 | 
 78 |     with patch(
 79 |         "biomcp.openfda.drug_labels.make_openfda_request"
 80 |     ) as mock_request:
 81 |         mock_request.return_value = (mock_response, None)
 82 | 
 83 |         result = await search_drug_labels(indication="breast cancer")
 84 | 
 85 |         # Verify request
 86 |         call_args = mock_request.call_args
 87 |         assert "breast cancer" in call_args[0][1]["search"].lower()
 88 | 
 89 |         # Check output
 90 |         assert "breast cancer" in result
 91 |         assert "10 labels" in result
 92 | 
 93 | 
 94 | @pytest.mark.asyncio
 95 | async def test_search_drug_labels_no_params():
 96 |     """Test that searching without parameters returns helpful message."""
 97 |     result = await search_drug_labels()
 98 | 
 99 |     assert "Please specify" in result
100 |     assert "drug name, indication, or label section" in result
101 |     assert "Examples:" in result
102 | 
103 | 
104 | @pytest.mark.asyncio
105 | async def test_search_drug_labels_boxed_warning_filter():
106 |     """Test filtering for drugs with boxed warnings."""
107 |     mock_response = {
108 |         "meta": {"results": {"total": 3}},
109 |         "results": [
110 |             {
111 |                 "set_id": "warn123",
112 |                 "openfda": {"brand_name": ["WARNING DRUG"]},
113 |                 "boxed_warning": ["Serious warning text"],
114 |             }
115 |         ],
116 |     }
117 | 
118 |     with patch(
119 |         "biomcp.openfda.drug_labels.make_openfda_request"
120 |     ) as mock_request:
121 |         mock_request.return_value = (mock_response, None)
122 | 
123 |         result = await search_drug_labels(boxed_warning=True)
124 | 
125 |         # Verify boxed warning filter in search
126 |         call_args = mock_request.call_args
127 |         assert "_exists_:boxed_warning" in call_args[0][1]["search"]
128 | 
129 |         # Check output
130 |         assert "WARNING DRUG" in result
131 |         assert "Serious warning" in result
132 | 
133 | 
134 | @pytest.mark.asyncio
135 | async def test_get_drug_label_detail():
136 |     """Test getting detailed drug label."""
137 |     mock_response = {
138 |         "results": [
139 |             {
140 |                 "set_id": "detail123",
141 |                 "openfda": {
142 |                     "brand_name": ["DETAILED DRUG"],
143 |                     "generic_name": ["GENERIC DETAILED"],
144 |                     "application_number": ["NDA123456"],
145 |                     "manufacturer_name": ["PHARMA CORP"],
146 |                     "route": ["ORAL"],
147 |                     "pharm_class_epc": ["KINASE INHIBITOR"],
148 |                 },
149 |                 "boxed_warning": ["Serious boxed warning"],
150 |                 "indications_and_usage": ["Indicated for cancer"],
151 |                 "dosage_and_administration": ["Take once daily"],
152 |                 "contraindications": ["Do not use if allergic"],
153 |                 "warnings_and_precautions": ["Monitor liver function"],
154 |                 "adverse_reactions": ["Common: nausea, fatigue"],
155 |                 "drug_interactions": ["Avoid with CYP3A4 inhibitors"],
156 |                 "clinical_pharmacology": ["Mechanism of action details"],
157 |                 "clinical_studies": ["Phase 3 trial results"],
158 |             }
159 |         ]
160 |     }
161 | 
162 |     with patch(
163 |         "biomcp.openfda.drug_labels.make_openfda_request"
164 |     ) as mock_request:
165 |         mock_request.return_value = (mock_response, None)
166 | 
167 |         result = await get_drug_label("detail123")
168 | 
169 |         # Verify request
170 |         mock_request.assert_called_once()
171 |         call_args = mock_request.call_args
172 |         assert "detail123" in call_args[0][1]["search"]
173 | 
174 |         # Check detailed output
175 |         assert "DETAILED DRUG" in result
176 |         assert "GENERIC DETAILED" in result
177 |         assert "NDA123456" in result
178 |         assert "PHARMA CORP" in result
179 |         assert "ORAL" in result
180 |         assert "KINASE INHIBITOR" in result
181 |         assert "BOXED WARNING" in result
182 |         assert "Serious boxed warning" in result
183 |         assert "INDICATIONS AND USAGE" in result
184 |         assert "Indicated for cancer" in result
185 |         assert "DOSAGE AND ADMINISTRATION" in result
186 |         assert "Take once daily" in result
187 |         assert "CONTRAINDICATIONS" in result
188 |         assert "WARNINGS AND PRECAUTIONS" in result
189 |         assert "ADVERSE REACTIONS" in result
190 |         assert "DRUG INTERACTIONS" in result
191 | 
192 | 
193 | @pytest.mark.asyncio
194 | async def test_get_drug_label_specific_sections():
195 |     """Test getting specific sections of drug label."""
196 |     mock_response = {
197 |         "results": [
198 |             {
199 |                 "set_id": "section123",
200 |                 "openfda": {"brand_name": ["SECTION DRUG"]},
201 |                 "indications_and_usage": ["Cancer indication"],
202 |                 "adverse_reactions": ["Side effects list"],
203 |                 "clinical_studies": ["Study data"],
204 |             }
205 |         ]
206 |     }
207 | 
208 |     with patch(
209 |         "biomcp.openfda.drug_labels.make_openfda_request"
210 |     ) as mock_request:
211 |         mock_request.return_value = (mock_response, None)
212 | 
213 |         sections = ["indications_and_usage", "adverse_reactions"]
214 |         result = await get_drug_label("section123", sections)
215 | 
216 |         # Check that requested sections are included
217 |         assert "INDICATIONS AND USAGE" in result
218 |         assert "Cancer indication" in result
219 |         assert "ADVERSE REACTIONS" in result
220 |         assert "Side effects list" in result
221 |         # Clinical studies should not be in output since not requested
222 |         assert "CLINICAL STUDIES" not in result
223 | 
224 | 
225 | @pytest.mark.asyncio
226 | async def test_get_drug_label_not_found():
227 |     """Test handling when drug label is not found."""
228 |     with patch(
229 |         "biomcp.openfda.drug_labels.make_openfda_request"
230 |     ) as mock_request:
231 |         mock_request.return_value = ({"results": []}, None)
232 | 
233 |         result = await get_drug_label("NOTFOUND456")
234 | 
235 |         assert "NOTFOUND456" in result
236 |         assert "not found" in result
237 | 
```
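
The tests above patch an async request helper and then inspect `call_args[0][1]["search"]`. The sketch below is a self-contained illustration of what that indexing refers to. It assumes the helper is called positionally as `(url, params)` and returns a `(response, error)` tuple, as the mocked return values suggest. `find_labels`, the URL, and the `openfda.generic_name` query string are hypothetical stand-ins, not BioMCP code.

```python
import asyncio
from unittest.mock import AsyncMock


async def find_labels(request_fn, name: str) -> str:
    """Hypothetical caller: awaits request_fn(url, params) and formats the result."""
    params = {"search": f'openfda.generic_name:"{name}"', "limit": 1}
    response, error = await request_fn(
        "https://example.invalid/drug/label.json", params
    )
    if error:
        return f"Error: {error}"
    total = response["meta"]["results"]["total"]
    return f"Found {total} labels"


def run_demo():
    # AsyncMock produces an awaitable, so `await` yields the (response, error) tuple.
    mock_request = AsyncMock(
        return_value=({"meta": {"results": {"total": 5}}}, None)
    )
    output = asyncio.run(find_labels(mock_request, "pembrolizumab"))

    # call_args[0] is the tuple of positional arguments; index 1 is the params dict.
    search_expr = mock_request.call_args[0][1]["search"]
    return output, search_expr
```

If the helper were called with keyword arguments instead, the params would appear under `call_args.kwargs` and the positional lookup would raise an `IndexError`, so assertions of this form are tied to the call style.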

--------------------------------------------------------------------------------
/docs/getting-started/03-authentication-and-api-keys.md:
--------------------------------------------------------------------------------

```markdown
  1 | # Authentication and API Keys
  2 | 
  3 | BioMCP integrates with multiple biomedical databases. Many features work without authentication, but some advanced capabilities require API keys.
  4 | 
  5 | ## Overview of API Keys
  6 | 
  7 | | Service         | Required?  | Features Enabled                                  | Get Key                                                                |
  8 | | --------------- | ---------- | ------------------------------------------------- | ---------------------------------------------------------------------- |
  9 | | **NCI API**     | Optional   | Advanced clinical trial filters, biomarker search | [api.cancer.gov](https://api.cancer.gov)                               |
 10 | | **AlphaGenome** | Required\* | Variant effect predictions                        | [deepmind.google.com](https://deepmind.google.com/science/alphagenome) |
 11 | | **cBioPortal**  | Optional   | Enhanced cancer genomics queries                  | [cbioportal.org](https://www.cbioportal.org/webAPI)                    |
 12 | 
 13 | \*Required only when using AlphaGenome features
 14 | 
 15 | ## Setting Up API Keys
 16 | 
 17 | ### Method 1: Environment Variables (Recommended for Personal Use)
 18 | 
 19 | Set environment variables in your shell configuration:
 20 | 
 21 | ```bash
 22 | # Add to ~/.bashrc, ~/.zshrc, or equivalent
 23 | export NCI_API_KEY="your-nci-api-key"
 24 | export ALPHAGENOME_API_KEY="your-alphagenome-key"
 25 | export CBIO_TOKEN="your-cbioportal-token"
 26 | ```
 27 | 
 28 | ### Method 2: Configuration Files
 29 | 
 30 | #### For Claude Desktop
 31 | 
 32 | Add keys to your Claude Desktop configuration:
 33 | 
 34 | ```json
 35 | {
 36 |   "mcpServers": {
 37 |     "biomcp": {
 38 |       "command": "uv",
 39 |       "args": ["run", "--with", "biomcp-python", "biomcp", "run"],
 40 |       "env": {
 41 |         "NCI_API_KEY": "your-nci-api-key",
 42 |         "ALPHAGENOME_API_KEY": "your-alphagenome-key",
 43 |         "CBIO_TOKEN": "your-cbioportal-token"
 44 |       }
 45 |     }
 46 |   }
 47 | }
 48 | ```
 49 | 
 50 | #### For Docker Deployments
 51 | 
 52 | Include in your Docker run command:
 53 | 
 54 | ```bash
 55 | docker run -e NCI_API_KEY="your-key" \
 56 |            -e ALPHAGENOME_API_KEY="your-key" \
 57 |            -e CBIO_TOKEN="your-token" \
 58 |            biomcp:latest
 59 | ```
 60 | 
 61 | ### Method 3: Per-Request Keys (For Hosted Environments)
 62 | 
 63 | When using BioMCP through AI assistants or hosted services, provide keys in your request:
 64 | 
 65 | ```
 66 | "Predict effects of BRAF V600E mutation. My AlphaGenome API key is YOUR_KEY_HERE"
 67 | ```
 68 | 
 69 | The AI will recognize patterns like "My [service] API key is..." and use the key for that request only.
 70 | 
 71 | ## Individual Service Setup
 72 | 
 73 | ### NCI Clinical Trials API
 74 | 
 75 | The National Cancer Institute API provides advanced clinical trial search capabilities.
 76 | 
 77 | #### Getting Your Key
 78 | 
 79 | 1. Visit [api.cancer.gov](https://api.cancer.gov)
 80 | 2. Click "Get API Key"
 81 | 3. Complete registration
 82 | 4. Key is emailed immediately
 83 | 
 84 | #### Features Enabled
 85 | 
 86 | - Advanced biomarker-based trial search
 87 | - Organization and investigator lookups
 88 | - Intervention and disease vocabularies
 89 | - Higher rate limits (1000 requests/day vs 100)
 90 | 
 91 | #### Usage Example
 92 | 
 93 | ```bash
 94 | # With API key set
 95 | export NCI_API_KEY="your-key"
 96 | 
 97 | # Search trials with biomarker criteria
 98 | biomcp trial search --condition melanoma --source nci \
 99 |   --required-mutations "BRAF V600E" --allow-brain-mets true
100 | ```
101 | 
102 | ### AlphaGenome
103 | 
104 | Google DeepMind's AlphaGenome predicts variant effects on gene expression and chromatin accessibility.
105 | 
106 | #### Getting Your Key
107 | 
108 | 1. Visit [AlphaGenome Portal](https://deepmind.google.com/science/alphagenome)
109 | 2. Register for non-commercial use
110 | 3. Receive API key via email
111 | 4. Accept terms of service
112 | 
113 | #### Features Enabled
114 | 
115 | - Gene expression predictions
116 | - Chromatin accessibility analysis
117 | - Splicing effect predictions
118 | - Tissue-specific analyses
119 | 
120 | #### Usage Examples
121 | 
122 | **CLI with environment variable:**
123 | 
124 | ```bash
125 | export ALPHAGENOME_API_KEY="your-key"
126 | biomcp variant predict chr7 140753336 A T
127 | ```
128 | 
129 | **CLI with per-request key:**
130 | 
131 | ```bash
132 | biomcp variant predict chr7 140753336 A T --api-key YOUR_KEY
133 | ```
134 | 
135 | **Through AI assistant:**
136 | 
137 | ```
138 | "Predict regulatory effects of BRAF V600E (chr7:140753336 A>T).
139 | My AlphaGenome API key is YOUR_KEY_HERE"
140 | ```
141 | 
142 | ### cBioPortal
143 | 
144 | The cBioPortal token enables enhanced cancer genomics queries.
145 | 
146 | #### Getting Your Token
147 | 
148 | 1. Create account at [cbioportal.org](https://www.cbioportal.org)
149 | 2. Navigate to "Web API" section
150 | 3. Generate a personal access token
151 | 4. Copy the token (shown only once)
152 | 
153 | #### Features Enabled
154 | 
155 | - Higher API rate limits
156 | - Access to private studies (if authorized)
157 | - Batch query capabilities
158 | - Extended timeout limits
159 | 
160 | #### Usage
161 | 
162 | cBioPortal integration is automatic when searching for genes. The token enables:
163 | 
164 | ```bash
165 | # Enhanced gene search with cancer genomics
166 | export CBIO_TOKEN="your-token"
167 | biomcp article search --gene BRAF --disease melanoma
168 | ```
169 | 
170 | ## Security Best Practices
171 | 
172 | ### DO:
173 | 
174 | - Store keys in environment variables or secure config files
175 | - Use per-request keys in shared/hosted environments
176 | - Rotate keys periodically
177 | - Use separate keys for development/production
178 | 
179 | ### DON'T:
180 | 
181 | - Commit keys to version control
182 | - Share keys with others
183 | - Include keys in code or documentation
184 | - Store keys in plain text files
185 | 
186 | ### Git Security
187 | 
188 | Add to `.gitignore`:
189 | 
190 | ```
191 | .env
192 | .env.local
193 | *.key
194 | config/secrets/
195 | ```
196 | 
197 | Use git-secrets to prevent accidental commits:
198 | 
199 | ```bash
200 | # Install git-secrets
201 | brew install git-secrets  # macOS
202 | # or follow instructions at github.com/awslabs/git-secrets
203 | 
204 | # Set up in your repo
205 | git secrets --install
206 | git secrets --register-aws  # Adds AWS credential patterns; use "git secrets --add <regex>" for other key formats
207 | ```
208 | 
209 | ## Troubleshooting
210 | 
211 | ### "API Key Required" Errors
212 | 
213 | **For AlphaGenome:**
214 | 
215 | - This service always requires a key
216 | - Provide it via environment variable or per-request
217 | - Check key spelling and format
218 | 
219 | **For NCI:**
220 | 
221 | - Basic search works without key
222 | - Advanced features require authentication
223 | - Verify key is active at api.cancer.gov
224 | 
225 | ### "Invalid API Key" Errors
226 | 
227 | 1. Check for extra spaces or quotes
228 | 2. Ensure key hasn't expired
229 | 3. Verify you're using the correct service's key
230 | 4. Test key directly with the service's API
231 | 
232 | ### Rate Limit Errors
233 | 
234 | **Without API keys:**
235 | 
236 | - Public limits are restrictive (e.g., 100 requests/day)
237 | - Add delays between requests
238 | - Consider getting API keys
239 | 
240 | **With API keys:**
241 | 
242 | - Limits are much higher but still exist
243 | - Implement exponential backoff
244 | - Cache results when possible
245 | 
246 | ## Testing Your Setup
247 | 
248 | ### Check Environment Variables
249 | 
250 | ```bash
251 | # List the BioMCP API key variables that are set (prints their values, so avoid shared terminals)
252 | env | grep -E "(NCI_API_KEY|ALPHAGENOME_API_KEY|CBIO_TOKEN)"
253 | ```
254 | 
255 | ### Test Each Service
256 | 
257 | ```bash
258 | # Test NCI API
259 | biomcp trial search --condition cancer --source nci --limit 1
260 | 
261 | # Test AlphaGenome (requires key)
262 | biomcp variant predict chr7 140753336 A T --limit 1
263 | 
264 | # Test cBioPortal integration
265 | biomcp article search --gene TP53 --limit 1
266 | ```
267 | 
268 | ## API Key Management Tools
269 | 
270 | For managing multiple API keys securely:
271 | 
272 | ### 1. direnv (Recommended)
273 | 
274 | ```bash
275 | # Install direnv
276 | brew install direnv  # macOS
277 | # Add to shell: eval "$(direnv hook zsh)"
278 | 
279 | # Create .envrc in project
280 | echo 'export NCI_API_KEY="your-key"' > .envrc
281 | direnv allow
282 | ```
283 | 
284 | ### 2. 1Password CLI
285 | 
286 | ```bash
287 | # Store in 1Password
288 | op item create --category=password \
289 |   --title="BioMCP API Keys" \
290 |   --vault="Development" \
291 |   NCI_API_KEY="your-key"
292 | 
293 | # Load in shell
294 | export NCI_API_KEY="$(op read "op://Development/BioMCP API Keys/NCI_API_KEY")"
295 | ```
296 | 
297 | ### 3. AWS Secrets Manager
298 | 
299 | ```bash
300 | # Store secret
301 | aws secretsmanager create-secret \
302 |   --name biomcp/api-keys \
303 |   --secret-string '{"NCI_API_KEY":"your-key"}'
304 | 
305 | # Retrieve in script
306 | export NCI_API_KEY="$(aws secretsmanager get-secret-value \
307 |   --secret-id biomcp/api-keys \
308 |   --query SecretString \
309 |   --output text | jq -r .NCI_API_KEY)"
310 | ```
311 | 
312 | ## Next Steps
313 | 
314 | Now that you have API keys configured:
315 | 
316 | 1. Test each service to ensure keys work
317 | 2. Explore [How-to Guides](../how-to-guides/01-find-articles-and-cbioportal-data.md) for advanced features
318 | 3. Set up [logging and monitoring](../how-to-guides/05-logging-and-monitoring-with-bigquery.md)
319 | 4. Review [security policies](../policies.md) for your organization
320 | 
```
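
The guide above describes two behaviours that a short sketch can make concrete: a per-request key taking priority over an environment variable, and exponential backoff on rate-limit errors. The environment variable names come from the guide. Everything else here (`resolve_api_key`, `call_with_backoff`, the `RateLimited` exception, and the precedence order itself) is an assumption for illustration and not BioMCP's actual implementation.

```python
import os
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Environment variable names as documented in the guide.
ENV_VARS = {
    "nci": "NCI_API_KEY",
    "alphagenome": "ALPHAGENOME_API_KEY",
    "cbioportal": "CBIO_TOKEN",
}


def resolve_api_key(service: str, per_request_key: Optional[str] = None) -> Optional[str]:
    """Prefer a per-request key, then the environment; return None if neither is set."""
    if per_request_key and per_request_key.strip():
        return per_request_key.strip()
    # Stripping guards against the "extra spaces" problem from the troubleshooting list.
    value = os.environ.get(ENV_VARS[service], "").strip()
    return value or None


class RateLimited(Exception):
    """Hypothetical error raised by the wrapped call on a rate-limit response."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry fn on RateLimited, doubling the delay each attempt and adding jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            delay = base_delay * (2 ** attempt)
            # Up to 10% jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the retry schedule can be checked in tests without waiting. With the defaults, the delays before retries are roughly 1, 2, 4, and 8 seconds.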

--------------------------------------------------------------------------------
/docs/concepts/03-sequential-thinking-with-the-think-tool.md:
--------------------------------------------------------------------------------

```markdown
  1 | # Sequential Thinking with the Think Tool
  2 | 
  3 | ## CRITICAL: The Think Tool is MANDATORY
  4 | 
  5 | **The 'think' tool must be your FIRST action when using BioMCP. This is not optional.**
  6 | 
  7 | For detailed technical documentation on the think tool parameters and usage, see the [MCP Tools Reference - Think Tool](../user-guides/02-mcp-tools-reference.md#3-think).
  8 | 
  9 | ## Why Sequential Thinking?
 10 | 
 11 | Biomedical research is inherently complex, requiring systematic analysis of interconnected data from multiple sources. The think tool enforces a structured approach that:
 12 | 
 13 | - **Prevents Information Overload**: Breaks complex queries into manageable steps
 14 | - **Ensures Comprehensive Coverage**: Systematic thinking catches details that might be missed
 15 | - **Documents Reasoning**: Creates an audit trail of research decisions
 16 | - **Improves Accuracy**: Thoughtful planning leads to better search strategies
 17 | 
 18 | ## Mandatory Usage Requirements
 19 | 
 20 | 🚨 **REQUIRED USAGE:**
 21 | 
 22 | - You MUST call 'think' BEFORE any search or fetch operations
 23 | - EVERY biomedical research query requires thinking first
 24 | - ALL multi-step analyses must begin with the think tool
 25 | - ANY task using BioMCP tools requires prior planning with think
 26 | 
 27 | ⚠️ **WARNING - Skipping the think tool will result in:**
 28 | 
 29 | - Incomplete analysis
 30 | - Poor search strategies
 31 | - Missing critical connections
 32 | - Suboptimal results
 33 | - Frustrated users
 34 | 
 35 | ## How to Use the Think Tool
 36 | 
 37 | The think tool accepts these parameters:
 38 | 
 39 | ```python
 40 | think(
 41 |     thought="Your reasoning about the current step",
 42 |     thoughtNumber=1,  # Sequential number starting from 1
 43 |     totalThoughts=5,  # Optional: estimated total thoughts needed
 44 |     nextThoughtNeeded=True  # Set to False only when analysis is complete
 45 | )
 46 | ```
 47 | 
 48 | ## Sequential Thinking Patterns
 49 | 
 50 | ### Pattern 1: Initial Query Decomposition
 51 | 
 52 | Always start by breaking down the user's query:
 53 | 
 54 | ```python
 55 | # User asks: "What are the treatment options for BRAF V600E melanoma?"
 56 | 
 57 | think(
 58 |     thought="Breaking down query: Need to find 1) BRAF V600E mutation significance in melanoma, 2) approved treatments for BRAF-mutant melanoma, 3) clinical trials for new therapies, 4) resistance mechanisms and combination strategies",
 59 |     thoughtNumber=1,
 60 |     nextThoughtNeeded=True
 61 | )
 62 | ```
 63 | 
 64 | ### Pattern 2: Search Strategy Planning
 65 | 
 66 | Plan your data collection approach:
 67 | 
 68 | ```python
 69 | think(
 70 |     thought="Search strategy: First use gene_getter for BRAF context, then article_searcher for BRAF V600E melanoma treatments focusing on FDA-approved drugs, followed by trial_searcher for ongoing studies with BRAF inhibitors",
 71 |     thoughtNumber=2,
 72 |     nextThoughtNeeded=True
 73 | )
 74 | ```
 75 | 
 76 | ### Pattern 3: Progressive Refinement
 77 | 
 78 | Document findings and adjust strategy:
 79 | 
 80 | ```python
 81 | think(
 82 |     thought="Found 3 FDA-approved BRAF inhibitors (vemurafenib, dabrafenib, encorafenib). Need to search for combination therapies with MEK inhibitors based on resistance patterns identified in literature",
 83 |     thoughtNumber=3,
 84 |     nextThoughtNeeded=True
 85 | )
 86 | ```
 87 | 
 88 | ### Pattern 4: Synthesis Planning
 89 | 
 90 | Before creating final output:
 91 | 
 92 | ```python
 93 | think(
 94 |     thought="Ready to synthesize: Will organize findings into 1) First-line treatments (BRAF+MEK combos), 2) Second-line options (immunotherapy), 3) Emerging therapies from trials, 4) Resistance mechanisms to consider",
 95 |     thoughtNumber=4,
 96 |     nextThoughtNeeded=False  # Analysis complete
 97 | )
 98 | ```
 99 | 
100 | ## Common Think Tool Workflows
101 | 
102 | ### Literature Review Workflow
103 | 
104 | ```python
105 | # Step 1: Problem definition
106 | think(thought="User wants comprehensive review of CDK4/6 inhibitors in breast cancer...", thoughtNumber=1)
107 | 
108 | # Step 2: Search parameters
109 | think(thought="Will search for palbociclib, ribociclib, abemaciclib in HR+/HER2- breast cancer...", thoughtNumber=2)
110 | 
111 | # Step 3: Quality filtering
112 | think(thought="Found 47 articles, filtering for Phase III trials and meta-analyses...", thoughtNumber=3)
113 | 
114 | # Step 4: Evidence synthesis
115 | think(thought="Identified consistent PFS benefit across trials, now analyzing OS data...", thoughtNumber=4)
116 | ```
117 | 
118 | ### Clinical Trial Analysis Workflow
119 | 
120 | ```python
121 | # Step 1: Criteria identification
122 | think(thought="Patient has EGFR L858R lung cancer, progressed on osimertinib...", thoughtNumber=1)
123 | 
124 | # Step 2: Trial search strategy
125 | think(thought="Searching for trials accepting EGFR-mutant NSCLC after TKI resistance...", thoughtNumber=2)
126 | 
127 | # Step 3: Eligibility assessment
128 | think(thought="Found 12 trials, checking for brain metastases eligibility...", thoughtNumber=3)
129 | 
130 | # Step 4: Prioritization
131 | think(thought="Ranking trials by proximity, novel mechanisms, and enrollment status...", thoughtNumber=4)
132 | ```
133 | 
134 | ### Variant Interpretation Workflow
135 | 
136 | ```python
137 | # Step 1: Variant identification
138 | think(thought="Analyzing TP53 R248Q mutation found in patient's tumor...", thoughtNumber=1)
139 | 
140 | # Step 2: Database queries
141 | think(thought="Will check MyVariant for population frequency, cBioPortal for cancer prevalence...", thoughtNumber=2)
142 | 
143 | # Step 3: Functional assessment
144 | think(thought="Variant is pathogenic, affects DNA binding domain, common in multiple cancers...", thoughtNumber=3)
145 | 
146 | # Step 4: Clinical implications
147 | think(thought="Synthesizing prognostic impact and potential therapeutic vulnerabilities...", thoughtNumber=4)
148 | ```
149 | 
150 | ## Think Tool Best Practices
151 | 
152 | ### DO:
153 | 
154 | - Start EVERY BioMCP session with think
155 | - Use sequential numbering (1, 2, 3...)
156 | - Document key findings in each thought
157 | - Adjust strategy based on intermediate results
158 | - Use think to track progress through complex analyses
159 | 
160 | ### DON'T:
161 | 
162 | - Skip think and jump to searches
163 | - Use think only at the beginning
164 | - Set nextThoughtNeeded=false prematurely
165 | - Use generic thoughts without specific content
166 | - Forget to document decision rationale
167 | 
168 | ## Integration with Other Tools
169 | 
170 | The think tool should wrap around other tool usage:
171 | 
172 | ```python
173 | # CORRECT PATTERN
174 | think(thought="Planning BRAF melanoma research...", thoughtNumber=1)
175 | gene_info = gene_getter("BRAF")
176 | 
177 | think(thought="BRAF is a serine/threonine kinase, V600E creates constitutive activation. Searching for targeted therapies...", thoughtNumber=2)
178 | articles = article_searcher(genes=["BRAF"], diseases=["melanoma"], keywords=["vemurafenib", "dabrafenib"])
179 | 
180 | think(thought="Found key trials showing BRAF+MEK combination superiority. Checking for active trials...", thoughtNumber=3)
181 | trials = trial_searcher(conditions=["melanoma"], interventions=["BRAF inhibitor"])
182 | 
183 | # INCORRECT PATTERN - NO THINKING
184 | gene_info = gene_getter("BRAF")  # ❌ Started without thinking
185 | articles = article_searcher(...)  # ❌ No strategy planning
186 | ```
187 | 
188 | ## Reminder System
189 | 
190 | BioMCP includes automatic reminders if you forget to use think:
191 | 
192 | - Search results will include a reminder message
193 | - The reminder appears as a system message
194 | - It prompts you to use think for better results
195 | - This ensures consistent methodology
196 | 
197 | ## Advanced Sequential Thinking
198 | 
199 | ### Branching Logic
200 | 
201 | Use think to handle conditional paths:
202 | 
203 | ```python
204 | think(
205 |     thought="No direct trials found for this rare mutation. Pivoting to search for basket trials and mutation-agnostic approaches...",
206 |     thoughtNumber=5,
207 |     nextThoughtNeeded=True
208 | )
209 | ```
210 | 
211 | ### Error Recovery
212 | 
213 | Document and adjust when searches fail:
214 | 
215 | ```python
216 | think(
217 |     thought="MyVariant query failed for this structural variant. Will use article search to find functional studies instead...",
218 |     thoughtNumber=6,
219 |     nextThoughtNeeded=True
220 | )
221 | ```
222 | 
223 | ### Complex Integration
224 | 
225 | Coordinate multiple data sources:
226 | 
227 | ```python
228 | think(
229 |     thought="Integrating findings: cBioPortal shows 15% frequency in lung adenocarcinoma, articles describe resistance mechanisms, trials testing combination strategies...",
230 |     thoughtNumber=7,
231 |     nextThoughtNeeded=True
232 | )
233 | ```
234 | 
235 | ## Conclusion
236 | 
237 | The think tool is more than a requirement: it is a research companion that keeps biomedical research systematic, thorough, and reproducible. By following sequential thinking patterns, you'll deliver comprehensive insights that address all aspects of complex biomedical queries.
238 | 
239 | Remember: **Always think first, then search. Document your reasoning. Only mark thinking complete when your analysis is truly finished.**
240 | 
```
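
The rules in the document above (think before any search, number thoughts sequentially from 1, and set `nextThoughtNeeded` to false only when the analysis is finished) can be expressed as a small state check. The sketch below is a hypothetical in-memory model, not BioMCP's implementation. The parameter names mirror the documented `think` call, while `ThoughtLog`, `require_thinking`, and the default of `nextThoughtNeeded=True` are assumptions. The real tool may also allow branches or revisions that this strict version would reject.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Thought:
    text: str
    number: int
    total: Optional[int] = None
    next_needed: bool = True


@dataclass
class ThoughtLog:
    """Hypothetical record of think calls that enforces the documented ordering."""

    thoughts: List[Thought] = field(default_factory=list)

    def think(self, thought, thoughtNumber, totalThoughts=None, nextThoughtNeeded=True):
        if not thought.strip():
            raise ValueError("thought must contain specific content")
        if self.thoughts and not self.thoughts[-1].next_needed:
            raise ValueError("analysis was already marked complete")
        expected = len(self.thoughts) + 1
        if thoughtNumber != expected:
            raise ValueError(f"expected thoughtNumber={expected}, got {thoughtNumber}")
        self.thoughts.append(
            Thought(thought, thoughtNumber, totalThoughts, nextThoughtNeeded)
        )

    def require_thinking(self) -> None:
        """Call before any search or fetch; raises if nothing has been planned yet."""
        if not self.thoughts:
            raise RuntimeError("call think before searching or fetching")

    @property
    def complete(self) -> bool:
        # Complete only once the latest thought says no further thought is needed.
        return bool(self.thoughts) and not self.thoughts[-1].next_needed
```

A caller would invoke `require_thinking()` at the top of each search wrapper, which reproduces the "correct pattern" from the integration section: a search issued before the first thought fails fast instead of returning unplanned results.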