# Directory Structure
```
├── .gitignore
├── .python-version
├── assets
│   ├── use-backlinks-mcp-on-cursor.png
│   └── use-keyword-mcp-on-cursor.png
├── LICENSE
├── main.py
├── pyproject.toml
├── README_CN.md
├── README.md
├── src
│   └── seo_mcp
│       ├── __init__.py
│       ├── backlinks.py
│       ├── keywords.py
│       ├── logger.py
│       ├── server.py
│       └── traffic.py
└── uv.lock
```
# Files
--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------
```
1 | 3.10
```
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
```
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | *.egg-info/
24 | .installed.cfg
25 | *.egg
26 | MANIFEST
27 |
28 | # PyInstaller
29 | *.manifest
30 | *.spec
31 |
32 | # Installer logs
33 | pip-log.txt
34 | pip-delete-this-directory.txt
35 |
36 | # Unit test / coverage reports
37 | htmlcov/
38 | .tox/
39 | .nox/
40 | .coverage
41 | .coverage.*
42 | .cache
43 | nosetests.xml
44 | coverage.xml
45 | *.cover
46 | .hypothesis/
47 | .pytest_cache/
48 |
49 | # Environments
50 | .env
51 | .venv
52 | env/
53 | venv/
54 | ENV/
55 | env.bak/
56 | venv.bak/
57 |
58 | # Jupyter Notebook
59 | .ipynb_checkpoints
60 |
61 | # VS Code
62 | .vscode/
63 |
64 | # PyCharm
65 | .idea/
66 |
67 | # Signature cache
68 | signature_cache.json
69 |
70 | temp/
71 |
72 | logs/
```
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
```markdown
1 | # SEO MCP
2 |
3 | An MCP (Model Context Protocol) SEO tool service based on Ahrefs data. Includes features such as backlink analysis, keyword research, traffic estimation, and more.
4 |
5 | [中文](./README_CN.md)
6 |
7 | ## Overview
8 |
9 | This service provides an API to retrieve SEO data from Ahrefs. It handles the entire process, including solving the CAPTCHA, authentication, and data retrieval. The results are cached to improve performance and reduce API costs.
10 |
11 | > This MCP service is for educational purposes only. Please do not misuse it. This project is inspired by `@哥飞社群`.
12 |
13 | ## Features
14 |
15 | - 🔍 Backlink Analysis
16 |
17 | - Get detailed backlink data for any domain
18 | - View domain rating, anchor text, and link attributes
19 | - Filter educational and government domains
20 |
21 | - 🎯 Keyword Research
22 |
23 | - Generate keyword ideas from a seed keyword
24 | - Get keyword difficulty score
25 | - View search volume and trends
26 |
27 | - 📊 Traffic Analysis
28 |
29 | - Estimate website traffic
30 | - View traffic history and trends
31 | - Analyze popular pages and country distribution
32 | - Track keyword rankings
33 |
34 | - 🚀 Performance Optimization
35 |
36 | - Use CapSolver to automatically solve CAPTCHA
37 | - Response caching
38 |
39 | ## Installation
40 |
41 | ### Prerequisites
42 |
43 | - Python 3.10 or higher
44 | - CapSolver account and API key ([register here](https://dashboard.capsolver.com/passport/register?inviteCode=1dTH7WQSfHD0))
45 |
46 | ### Install from PyPI
47 |
48 | ```bash
49 | pip install seo-mcp
50 | ```
51 |
52 | Or use `uv`:
53 |
54 | ```bash
55 | uv pip install seo-mcp
56 | ```
57 |
58 | ### Manual Installation
59 |
60 | 1. Clone the repository:
61 |
62 | ```bash
63 | git clone https://github.com/cnych/seo-mcp.git
64 | cd seo-mcp
65 | ```
66 |
67 | 2. Install dependencies:
68 |
69 | ```bash
70 | pip install -e .
71 | # Or
72 | uv pip install -e .
73 | ```
74 |
75 | 3. Set the CapSolver API key:
76 |
77 | ```bash
78 | export CAPSOLVER_API_KEY="your-capsolver-api-key"
79 | ```
80 |
81 | ## Usage
82 |
83 | ### Run the service
84 |
85 | You can run the service in the following ways:
86 |
87 | #### Use in Cursor IDE
88 |
89 | In the Cursor settings, switch to the MCP tab, click the `+Add new global MCP server` button, and enter:
90 |
91 | ```json
92 | {
93 |   "mcpServers": {
94 |     "SEO MCP": {
95 |       "command": "uvx",
96 |       "args": ["--python", "3.10", "seo-mcp"],
97 |       "env": {
98 |         "CAPSOLVER_API_KEY": "CAP-xxxxxx"
99 |       }
100 |     }
101 |   }
102 | }
103 | ```
104 |
105 | You can also create a `.cursor/mcp.json` file in the project root directory, with the same content.
106 |
107 | ### API Reference
108 |
109 | The service provides the following MCP tools:
110 |
111 | #### `get_backlinks_list(domain: str)`
112 |
113 | Get the backlinks of a domain.
114 |
115 | **Parameters:**
116 |
117 | - `domain` (string): The domain to analyze (e.g. "example.com")
118 |
119 | **Returns:**
120 |
121 | ```json
122 | {
123 |   "overview": {
124 |     "domainRating": 76,
125 |     "backlinks": 1500,
126 |     "refDomains": 300
127 |   },
128 |   "backlinks": [
129 |     {
130 |       "anchor": "Example link",
131 |       "domainRating": 76,
132 |       "title": "Page title",
133 |       "urlFrom": "https://referringsite.com/page",
134 |       "urlTo": "https://example.com/page",
135 |       "edu": false,
136 |       "gov": false
137 |     }
138 |   ]
139 | }
140 | ```
141 |
142 | #### `keyword_generator(keyword: str, country: str = "us", search_engine: str = "Google")`
143 |
144 | Generate keyword ideas.
145 |
146 | **Parameters:**
147 |
148 | - `keyword` (string): The seed keyword
149 | - `country` (string): Country code (default: "us")
150 | - `search_engine` (string): Search engine (default: "Google")
151 |
152 | **Returns:**
153 |
154 | ```json
155 | [
156 |   {
157 |     "label": "keyword ideas",
158 |     "value": {"keyword": "Example keyword", "country": "us",
159 |               "difficulty": "Easy", "volume": "1.2K",
160 |               "updatedAt": "2025-05-12"}
161 |   }
162 | ]
163 | ```
164 |
165 | #### `get_traffic(domain_or_url: str, country: str = "None", mode: str = "subdomains")`
166 |
167 | Get the traffic estimation.
168 |
169 | **Parameters:**
170 |
171 | - `domain_or_url` (string): The domain or URL to analyze
172 | - `country` (string): Country filter (default: "None")
173 | - `mode` (string): Analysis mode ("subdomains" or "exact")
174 |
175 | **Returns:**
176 |
177 | ```json
178 | {
179 |   "traffic_history": [...],
180 |   "traffic": {
181 |     "trafficMonthlyAvg": 50000,
182 |     "costMontlyAvg": 25000
183 |   },
184 |   "top_pages": [...],
185 |   "top_countries": [...],
186 |   "top_keywords": [...]
187 | }
188 | ```
189 |
190 | #### `keyword_difficulty(keyword: str, country: str = "us")`
191 |
192 | Get the keyword difficulty score.
193 |
194 | **Parameters:**
195 |
196 | - `keyword` (string): The keyword to analyze
197 | - `country` (string): Country code (default: "us")
198 |
199 | **Returns:**
200 |
201 | ```json
202 | {
203 |   "difficulty": 45,
204 |   "shortage": 0,
205 |   "serp": {"results": [...]}
206 | }
207 | ```
208 |
209 | ## Development
210 |
211 | For development:
212 |
213 | ```bash
214 | git clone https://github.com/cnych/seo-mcp.git
215 | cd seo-mcp
216 | uv sync
217 | ```
218 |
219 | ## How it works
220 |
221 | 1. The user sends a request through MCP
222 | 2. The service uses CapSolver to solve the Cloudflare Turnstile CAPTCHA
223 | 3. The service gets the authentication token from Ahrefs
224 | 4. The service retrieves the requested SEO data
225 | 5. The service processes and returns the formatted results
226 |
227 | ## Troubleshooting
228 |
229 | - **CapSolver API key error**: Check the `CAPSOLVER_API_KEY` environment variable
230 | - **Rate limiting**: Reduce request frequency
231 | - **No results**: The domain may not be indexed by Ahrefs
232 | - **Other issues**: See the [GitHub repository](https://github.com/cnych/seo-mcp)
233 |
234 | ## License
235 |
236 | MIT License - See LICENSE file
237 |
```
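The five-step flow in "How it works" above can be sketched as a small pipeline, with the CAPTCHA solver and the Ahrefs fetch injected as callables (`solve_captcha` and `fetch_data` are hypothetical stand-ins for illustration, not part of the package):

```python
from typing import Any, Callable, Dict, Optional

def run_seo_query(
    solve_captcha: Callable[[str], Optional[str]],
    fetch_data: Callable[[str, str], Optional[Dict[str, Any]]],
    site_url: str,
    domain: str,
) -> Dict[str, Any]:
    """Orchestrate the solve-CAPTCHA -> authenticate/fetch -> format pipeline."""
    token = solve_captcha(site_url)          # step 2: solve the Turnstile CAPTCHA
    if not token:
        raise RuntimeError("CAPTCHA solving failed")
    raw = fetch_data(token, domain)          # steps 3-4: authenticate and fetch data
    if raw is None:
        raise RuntimeError("data retrieval failed")
    return {"domain": domain, "data": raw}   # step 5: formatted result

# Stubbed collaborators stand in for CapSolver and Ahrefs:
result = run_seo_query(
    solve_captcha=lambda url: "token-123",
    fetch_data=lambda token, d: {"backlinks": []},
    site_url="https://ahrefs.com/backlink-checker/?input=example.com&mode=subdomains",
    domain="example.com",
)
print(result["domain"])  # example.com
```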
--------------------------------------------------------------------------------
/src/seo_mcp/__init__.py:
--------------------------------------------------------------------------------
```python
1 | """
2 | SEO MCP - A FastMCP service for retrieving SEO information for any domain using Ahrefs' data.
3 | """
4 |
5 | __version__ = "0.2.4"
6 |
```
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
```python
1 | from seo_mcp.server import main as server_main
2 |
3 | def main():
4 |     """Entry point for the seo-mcp package"""
5 | server_main()
6 |
7 | if __name__ == "__main__":
8 | main()
```
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
```toml
1 | [build-system]
2 | requires = ["pdm-backend>=2.4.0"]
3 | build-backend = "pdm.backend"
4 |
5 | [project]
6 | name = "seo-mcp"
7 | version = "0.2.4"
8 | description = "A free SEO tool MCP (Model Context Protocol) service based on Ahrefs data. Includes features such as backlinks, keyword ideas, and more."
9 | readme = "README.md"
10 | authors = [
11 | {name = "cnych", email = "[email protected]"}
12 | ]
13 | license = {text = "MIT"}
14 | classifiers = [
15 | "Programming Language :: Python :: 3",
16 | "License :: OSI Approved :: MIT License",
17 | "Operating System :: OS Independent",
18 | ]
19 | requires-python = ">=3.10"
20 | dependencies = [
21 | "fastmcp>=2.0.0",
22 | "requests>=2.32.3",
23 | "pydantic>=2.5.0",
24 | "pydantic-core>=2.14.0"
25 | ]
26 |
27 | [project.scripts]
28 | seo-mcp = "seo_mcp.server:main"
29 |
30 | [project.urls]
31 | "Homepage" = "https://github.com/cnych/seo-mcp"
32 | "Bug Tracker" = "https://github.com/cnych/seo-mcp/issues"
33 |
34 | [tool.setuptools]
35 | packages = ["seo_mcp"]
36 | package-dir = {"" = "src"}
37 |
```
--------------------------------------------------------------------------------
/src/seo_mcp/logger.py:
--------------------------------------------------------------------------------
```python
1 | """
2 | A general logging module providing a unified logging function
3 | """
4 | import logging
5 | import os
6 | from datetime import datetime
7 |
8 |
9 | DEBUG = os.environ.get("DEBUG", "False").lower() in ("true", "1", "yes")  # env vars are strings; a bare "False" would otherwise be truthy
10 |
11 |
12 | def setup_logger(name: str, log_dir: str = "logs", level: int = logging.INFO) -> logging.Logger:
13 | """
14 | Setup a logger for the given name
15 |
16 | Args:
17 | name: The name of the logger
18 | log_dir: The directory to save the log files
19 | level: The level of the logger
20 |
21 | Returns:
22 | logging.Logger: The configured logger
23 | """
24 | if not DEBUG:
25 | return logging.getLogger(name)
26 |
27 | # Create the log directory
28 | os.makedirs(log_dir, exist_ok=True)
29 |
30 | # Create the log file name, format: module_name_YYYYMMDD.log
31 | log_file = os.path.join(log_dir, f"{name}_{datetime.now().strftime('%Y%m%d')}.log")
32 |
33 | # Create the logger
34 | logger = logging.getLogger(name)
35 | logger.setLevel(level)
36 |
37 | # If the logger already has handlers, don't add a new handler
38 | if not logger.handlers:
39 | # Create the file handler
40 | file_handler = logging.FileHandler(log_file, encoding='utf-8')
41 | file_handler.setLevel(level)
42 |
43 | # Create the console handler
44 | console_handler = logging.StreamHandler()
45 | console_handler.setLevel(level)
46 |
47 | # Create the formatter
48 | formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
49 | file_handler.setFormatter(formatter)
50 | console_handler.setFormatter(formatter)
51 |
52 | # Add the handlers to the logger
53 | logger.addHandler(file_handler)
54 | logger.addHandler(console_handler)
55 |
56 | return logger
```
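`setup_logger` checks `logger.handlers` before attaching new handlers because `logging.getLogger(name)` returns a process-global singleton. A minimal sketch of that guard (simplified: console handler only, not the file handler used above):

```python
import logging

def get_console_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    """Simplified setup_logger: attach a console handler at most once."""
    logger = logging.getLogger(name)  # same object on every call with this name
    logger.setLevel(level)
    if not logger.handlers:  # without this guard, repeated setup stacks handlers
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s - %(name)s - %(levelname)s - %(message)s"))
        logger.addHandler(handler)
    return logger

# Repeated setup calls reuse the same handler; duplicates would make every
# log record print once per extra handler.
a = get_console_logger("demo")
b = get_console_logger("demo")
assert a is b
assert len(a.handlers) == 1
```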
--------------------------------------------------------------------------------
/src/seo_mcp/traffic.py:
--------------------------------------------------------------------------------
```python
1 | """
2 | Check the estimated search traffic for any website. Try Ahrefs' free traffic checker.
3 | """
4 |
5 | from typing import Optional, Dict, Any, Literal, List
6 | import requests
7 | import json
8 |
9 |
10 | def check_traffic(token: str, domain_or_url: str, mode: Literal["subdomains", "exact"] = "subdomains", country: str = "None") -> Optional[Dict[str, Any]]:
11 | """
12 | Check the estimated search traffic for any website.
13 |
14 | Args:
15 |         token (str): Verification token
16 |         domain_or_url (str): The domain or URL to query
17 | mode (str): Query mode, default is "subdomains"
18 | country (str): Country, default is "None"
19 |
20 | Returns:
21 | Optional[Dict[str, Any]]: Dictionary containing traffic data, returns None if request fails
22 | """
23 | if not token:
24 | return None
25 |
26 | url = "https://ahrefs.com/v4/stGetFreeTrafficOverview"
27 |
28 |     # Serialize the parameters to JSON and pass them as a single "input" query parameter
29 | params = {
30 | "input": json.dumps({
31 | "captcha": token,
32 | "country": country,
33 | "protocol": "None",
34 | "mode": mode,
35 | "url": domain_or_url
36 | })
37 | }
38 |
39 | headers = {
40 | "accept": "*/*",
41 | "content-type": "application/json",
42 | "referer": f"https://ahrefs.com/traffic-checker/?input={domain_or_url}&mode={mode}"
43 | }
44 |
45 | try:
46 | response = requests.get(url, params=params, headers=headers)
47 | if response.status_code != 200:
48 | return None
49 |
50 | data: Optional[List[Any]] = response.json()
51 |
52 |         # Validate the response format
53 | if not isinstance(data, list) or len(data) < 2 or data[0] != "Ok":
54 | return None
55 |
56 |         # Extract the payload
57 | traffic_data = data[1]
58 |
59 |         # Format the result
60 | result = {
61 | "traffic_history": traffic_data.get("traffic_history", []),
62 | "traffic": {
63 | "trafficMonthlyAvg": traffic_data.get("traffic", {}).get("trafficMonthlyAvg", 0),
64 | "costMontlyAvg": traffic_data.get("traffic", {}).get("costMontlyAvg", 0)
65 | },
66 | "top_pages": traffic_data.get("top_pages", []),
67 | "top_countries": traffic_data.get("top_countries", []),
68 | "top_keywords": traffic_data.get("top_keywords", [])
69 | }
70 |
71 | return result
72 |     except Exception:
73 | return None
```
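`check_traffic` expects Ahrefs to answer with a two-element `["Ok", payload]` array on success. That validation can be exercised in isolation (a sketch with a hypothetical helper name, not part of the package):

```python
from typing import Any, Optional

def unwrap_ok(data: Any) -> Optional[Any]:
    """Return the payload of an Ahrefs-style ["Ok", payload] response, else None."""
    if not isinstance(data, list) or len(data) < 2 or data[0] != "Ok":
        return None
    return data[1]

assert unwrap_ok(["Ok", {"traffic": {"trafficMonthlyAvg": 50000}}]) == {
    "traffic": {"trafficMonthlyAvg": 50000}
}
assert unwrap_ok(["Err", "rate limited"]) is None  # tagged error variant
assert unwrap_ok({"traffic": {}}) is None          # not a tagged list at all
```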
--------------------------------------------------------------------------------
/src/seo_mcp/server.py:
--------------------------------------------------------------------------------
```python
1 | """
2 | SEO MCP Server: A free SEO tool MCP (Model Context Protocol) service based on Ahrefs data. Includes features such as backlinks, keyword ideas, and more.
3 | """
4 | import requests
5 | import time
6 | import os
7 | import urllib.parse
8 | from typing import Dict, List, Optional, Any, Literal
9 |
10 | from fastmcp import FastMCP
11 |
12 | from seo_mcp.backlinks import get_backlinks, load_signature_from_cache, get_signature_and_overview
13 | from seo_mcp.keywords import get_keyword_ideas, get_keyword_difficulty
14 | from seo_mcp.traffic import check_traffic
15 |
16 |
17 | mcp = FastMCP("SEO MCP")
18 |
19 | # CapSolver website: https://dashboard.capsolver.com/passport/register?inviteCode=1dTH7WQSfHD0
20 | # Get API Key from environment variable - must be set for production use
21 | api_key = os.environ.get("CAPSOLVER_API_KEY")
22 |
23 |
24 | def get_capsolver_token(site_url: str) -> Optional[str]:
25 | """
26 | Use CapSolver to solve the captcha and get a token
27 |
28 | Args:
29 | site_url: Site URL to query
30 |
31 | Returns:
32 | Verification token or None if failed
33 | """
34 | if not api_key:
35 | return None
36 |
37 | payload = {
38 | "clientKey": api_key,
39 | "task": {
40 | "type": 'AntiTurnstileTaskProxyLess',
41 |             "websiteKey": "0x4AAAAAAAAzi9ITzSN9xKMi",  # Turnstile site key for the target site (ahrefs.com)
42 | "websiteURL": site_url,
43 | "metadata": {
44 | "action": "" # optional
45 | }
46 | }
47 | }
48 | res = requests.post("https://api.capsolver.com/createTask", json=payload)
49 | resp = res.json()
50 | task_id = resp.get("taskId")
51 | if not task_id:
52 | return None
53 |
54 |     for _ in range(120):  # poll for up to ~2 minutes instead of looping forever
55 |         time.sleep(1)  # delay between polls
56 |         payload = {"clientKey": api_key, "taskId": task_id}
57 |         res = requests.post("https://api.capsolver.com/getTaskResult", json=payload)
58 |         resp = res.json()
59 |         status = resp.get("status")
60 |         if status == "ready":
61 |             return resp.get("solution", {}).get("token")
62 |         if status == "failed" or resp.get("errorId"):
63 |             return None
64 |     return None
65 |
66 |
67 | @mcp.tool()
68 | def get_backlinks_list(domain: str) -> Optional[Dict[str, Any]]:
69 | """
70 | Get backlinks list for the specified domain
71 | Args:
72 | domain (str): The domain to query
73 | Returns:
74 | List of backlinks for the domain, containing title, URL, domain rating, etc.
75 | """
76 | # Try to get signature from cache
77 | signature, valid_until, overview_data = load_signature_from_cache(domain)
78 |
79 | # If no valid signature in cache, get a new one
80 | if not signature or not valid_until:
81 | # Step 1: Get token
82 | site_url = f"https://ahrefs.com/backlink-checker/?input={domain}&mode=subdomains"
83 | token = get_capsolver_token(site_url)
84 | if not token:
85 | raise Exception(f"Failed to get verification token for domain: {domain}")
86 |
87 | # Step 2: Get signature and validUntil
88 | signature, valid_until, overview_data = get_signature_and_overview(token, domain)
89 | if not signature or not valid_until:
90 | raise Exception(f"Failed to get signature for domain: {domain}")
91 |
92 | # Step 3: Get backlinks list
93 | backlinks = get_backlinks(signature, valid_until, domain)
94 | return {
95 | "overview": overview_data,
96 | "backlinks": backlinks
97 | }
98 |
99 |
100 | @mcp.tool()
101 | def keyword_generator(keyword: str, country: str = "us", search_engine: str = "Google") -> Optional[List[Any]]:
102 | """
103 | Get keyword ideas for the specified keyword
104 | """
105 | site_url = f"https://ahrefs.com/keyword-generator/?country={country}&input={urllib.parse.quote(keyword)}"
106 | token = get_capsolver_token(site_url)
107 | if not token:
108 | raise Exception(f"Failed to get verification token for keyword: {keyword}")
109 | return get_keyword_ideas(token, keyword, country, search_engine)
110 |
111 |
112 | @mcp.tool()
113 | def get_traffic(domain_or_url: str, country: str = "None", mode: Literal["subdomains", "exact"] = "subdomains") -> Optional[Dict[str, Any]]:
114 | """
115 | Check the estimated search traffic for any website.
116 |
117 | Args:
118 | domain_or_url (str): The domain or URL to query
119 | country (str): The country to query, default is "None"
120 | mode (["subdomains", "exact"]): The mode to use for the query
121 | Returns:
122 | Traffic data for the specified domain or URL
123 | """
124 | site_url = f"https://ahrefs.com/traffic-checker/?input={domain_or_url}&mode={mode}"
125 | token = get_capsolver_token(site_url)
126 | if not token:
127 | raise Exception(f"Failed to get verification token for domain: {domain_or_url}")
128 | return check_traffic(token, domain_or_url, mode, country)
129 |
130 |
131 | @mcp.tool()
132 | def keyword_difficulty(keyword: str, country: str = "us") -> Optional[Dict[str, Any]]:
133 | """
134 | Get keyword difficulty for the specified keyword
135 | """
136 | site_url = f"https://ahrefs.com/keyword-difficulty/?country={country}&input={urllib.parse.quote(keyword)}"
137 | token = get_capsolver_token(site_url)
138 | if not token:
139 | raise Exception(f"Failed to get verification token for keyword: {keyword}")
140 | return get_keyword_difficulty(token, keyword, country)
141 |
142 |
143 | def main():
144 | """Run the MCP server"""
145 | mcp.run()
146 |
147 |
148 | if __name__ == "__main__":
149 | main()
150 |
```
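`get_capsolver_token` polls `getTaskResult` once per second until the task resolves. A bounded variant of that polling pattern, with the HTTP call injected as a callable so it can be tested without the network (`poll_until_ready` is a hypothetical helper, sketched for illustration):

```python
import time
from typing import Callable, Dict, Optional

def poll_until_ready(
    fetch: Callable[[], Dict],
    max_attempts: int = 120,
    delay: float = 1.0,
) -> Optional[str]:
    """Poll fetch() until it reports status "ready"; give up after max_attempts."""
    for _ in range(max_attempts):
        resp = fetch()
        if resp.get("status") == "ready":
            return resp.get("solution", {}).get("token")
        if resp.get("status") == "failed" or resp.get("errorId"):
            return None  # the task failed permanently; stop early
        time.sleep(delay)
    return None  # timed out without a result

# Simulate a task that becomes ready on the third poll:
responses = iter(
    [{"status": "processing"}] * 2
    + [{"status": "ready", "solution": {"token": "tok"}}]
)
assert poll_until_ready(lambda: next(responses), delay=0) == "tok"
```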
--------------------------------------------------------------------------------
/src/seo_mcp/keywords.py:
--------------------------------------------------------------------------------
```python
1 | from typing import List, Optional, Any, Dict
2 |
3 | import requests
4 |
5 |
6 | def format_keyword_ideas(keyword_data: Optional[List[Any]]) -> List[Any]:
7 | if not keyword_data or len(keyword_data) < 2:
8 | return ["\n❌ No valid keyword ideas retrieved"]
9 |
10 | data = keyword_data[1]
11 |
12 | result = []
13 |
14 |     # Regular keyword ideas (allIdeas)
15 | if "allIdeas" in data and "results" in data["allIdeas"]:
16 | all_ideas = data["allIdeas"]["results"]
17 | # total = data["allIdeas"].get("total", 0)
18 | for idea in all_ideas:
19 | simplified_idea = {
20 | "keyword": idea.get('keyword', 'No keyword'),
21 | "country": idea.get('country', '-'),
22 | "difficulty": idea.get('difficultyLabel', 'Unknown'),
23 | "volume": idea.get('volumeLabel', 'Unknown'),
24 | "updatedAt": idea.get('updatedAt', '-')
25 | }
26 | result.append({
27 | "label": "keyword ideas",
28 | "value": simplified_idea
29 | })
30 |
31 |     # Question-type keyword ideas (questionIdeas)
32 | if "questionIdeas" in data and "results" in data["questionIdeas"]:
33 | question_ideas = data["questionIdeas"]["results"]
34 | # total = data["questionIdeas"].get("total", 0)
35 | for idea in question_ideas:
36 | simplified_idea = {
37 | "keyword": idea.get('keyword', 'No keyword'),
38 | "country": idea.get('country', '-'),
39 | "difficulty": idea.get('difficultyLabel', 'Unknown'),
40 | "volume": idea.get('volumeLabel', 'Unknown'),
41 | "updatedAt": idea.get('updatedAt', '-')
42 | }
43 | result.append({
44 | "label": "question ideas",
45 | "value": simplified_idea
46 | })
47 |
48 | if not result:
49 | return ["\n❌ No valid keyword ideas retrieved"]
50 |
51 | return result
52 |
53 |
54 | def get_keyword_ideas(token: str, keyword: str, country: str = "us", search_engine: str = "Google") -> Optional[List[Any]]:
55 | if not token:
56 | return None
57 |
58 | url = "https://ahrefs.com/v4/stGetFreeKeywordIdeas"
59 | payload = {
60 | "withQuestionIdeas": True,
61 | "captcha": token,
62 | "searchEngine": search_engine,
63 | "country": country,
64 |         "keyword": ["Some", keyword]  # Ahrefs serializes optional values as ["Some", value]
65 | }
66 |
67 | headers = {
68 | "Content-Type": "application/json"
69 | }
70 |
71 | response = requests.post(url, json=payload, headers=headers)
72 | if response.status_code != 200:
73 | return None
74 |
75 | data = response.json()
76 |
77 | return format_keyword_ideas(data)
78 |
79 |
80 | def get_keyword_difficulty(token: str, keyword: str, country: str = "us") -> Optional[Dict[str, Any]]:
81 | """
82 | Get keyword difficulty information
83 |
84 | Args:
85 | token (str): Verification token
86 | keyword (str): Keyword to query
87 | country (str): Country/region code, default is "us"
88 |
89 | Returns:
90 | Optional[Dict[str, Any]]: Dictionary containing keyword difficulty information, returns None if request fails
91 | """
92 | if not token:
93 | return None
94 |
95 | url = "https://ahrefs.com/v4/stGetFreeSerpOverviewForKeywordDifficultyChecker"
96 |
97 | payload = {
98 | "captcha": token,
99 | "country": country,
100 | "keyword": keyword
101 | }
102 |
103 | headers = {
104 | "accept": "*/*",
105 | "content-type": "application/json; charset=utf-8",
106 | "referer": f"https://ahrefs.com/keyword-difficulty/?country={country}&input={keyword}"
107 | }
108 |
109 | try:
110 | response = requests.post(url, json=payload, headers=headers)
111 | if response.status_code != 200:
112 | return None
113 |
114 | data: Optional[List[Any]] = response.json()
115 |         # Validate the response format
116 | if not isinstance(data, list) or len(data) < 2 or data[0] != "Ok":
117 | return None
118 |
119 |         # Extract the payload
120 | kd_data = data[1]
121 |
122 |         # Format the result
123 | result = {
124 | "difficulty": kd_data.get("difficulty", 0), # Keyword difficulty
125 | "shortage": kd_data.get("shortage", 0), # Keyword shortage
126 | "lastUpdate": kd_data.get("lastUpdate", ""), # Last update time
127 | "serp": {
128 | "results": []
129 | }
130 | }
131 |
132 |         # Process SERP results
133 | if "serp" in kd_data and "results" in kd_data["serp"]:
134 | serp_results = []
135 | for item in kd_data["serp"]["results"]:
136 |                 # Only keep organic search results
137 | if item.get("content") and item["content"][0] == "organic":
138 | organic_data = item["content"][1]
139 | if "link" in organic_data and organic_data["link"][0] == "Some":
140 | link_data = organic_data["link"][1]
141 | result_item = {
142 | "title": link_data.get("title", ""),
143 | "url": link_data.get("url", [None, {}])[1].get("url", ""),
144 | "position": item.get("pos", 0)
145 | }
146 |
147 |                         # Attach metric data if present
148 | if "metrics" in link_data and link_data["metrics"]:
149 | metrics = link_data["metrics"]
150 | result_item.update({
151 | "domainRating": metrics.get("domainRating", 0),
152 | "urlRating": metrics.get("urlRating", 0),
153 | "traffic": metrics.get("traffic", 0),
154 | "keywords": metrics.get("keywords", 0),
155 | "topKeyword": metrics.get("topKeyword", ""),
156 | "topVolume": metrics.get("topVolume", 0)
157 | })
158 |
159 | serp_results.append(result_item)
160 |
161 | result["serp"]["results"] = serp_results
162 |
163 | return result
164 | except Exception:
165 | return None
166 |
```
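The SERP parser above unpacks OCaml-style tagged values from the Ahrefs JSON: variants arrive as `["organic", {...}]` and optional fields as `["Some", value]`. A sketch of that decoding with a hypothetical helper:

```python
from typing import Any, Optional

def unwrap_option(value: Any) -> Optional[Any]:
    """Decode an OCaml-style option serialized as ["Some", x]; anything else is None."""
    if isinstance(value, list) and len(value) == 2 and value[0] == "Some":
        return value[1]
    return None

# A simplified SERP item in the shape get_keyword_difficulty iterates over:
item = {"content": ["organic", {"link": ["Some", {"title": "Example"}]}]}
tag, body = item["content"]          # first element is the variant tag
if tag == "organic":
    link = unwrap_option(body["link"])
    assert link == {"title": "Example"}
assert unwrap_option("None") is None  # absent values decode to None
```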
--------------------------------------------------------------------------------
/src/seo_mcp/backlinks.py:
--------------------------------------------------------------------------------
```python
1 | from typing import Any, List, Optional, Dict, Tuple, cast
2 | import os
3 | import json
4 | import time
5 | from datetime import datetime
6 | import requests
7 |
8 | # Cache file path for storing signatures
9 | SIGNATURE_CACHE_FILE = "signature_cache.json"
10 |
11 |
12 | def iso_to_timestamp(iso_date_string: str) -> float:
13 | """
14 | Convert ISO 8601 format datetime string to timestamp
15 |     Example: "2025-04-12T14:59:18Z" -> 1744469958.0
16 | """
17 | # Handle UTC time represented by "Z"
18 | if iso_date_string.endswith('Z'):
19 | iso_date_string = iso_date_string[:-1] + '+00:00'
20 | dt = datetime.fromisoformat(iso_date_string)
21 | return dt.timestamp()
22 |
23 |
24 |
25 | def save_signature_to_cache(signature: str, valid_until: str, overview_data: Dict[str, Any], domain: str) -> bool:
26 | """
27 | Save signature information to local cache file
28 |
29 | Args:
30 | signature: Obtained signature
31 | valid_until: Signature expiration time
32 |         overview_data: Overview payload to cache; domain: Domain name
33 |
34 | Returns:
35 | True if saved successfully, False otherwise
36 | """
37 | # Read existing cache
38 | cache_data: Dict[str, Dict[str, Any]] = {}
39 | if os.path.exists(SIGNATURE_CACHE_FILE):
40 | try:
41 | with open(SIGNATURE_CACHE_FILE, 'r') as f:
42 | cache_data = json.load(f)
43 |         except Exception:
44 | pass
45 |
46 | # Update cache for current domain
47 | cache_data[domain] = {
48 | "signature": signature,
49 | "valid_until": valid_until,
50 | "overview_data": overview_data,
51 | "timestamp": datetime.now().timestamp()
52 | }
53 |
54 | try:
55 | with open(SIGNATURE_CACHE_FILE, 'w') as f:
56 | json.dump(cache_data, f)
57 | return True
58 |     except Exception:
59 | return False
60 |
61 |
62 | def load_signature_from_cache(domain: str) -> Tuple[Optional[str], Optional[str], Optional[Dict[str, Any]]]:
63 | """
64 | Load signature information for a specific domain from local cache file
65 |     Returns the signature, valid_until and overview data if the cache is valid, otherwise None
66 |
67 | Args:
68 | domain: Domain to query
69 |
70 | Returns:
71 |         (signature, valid_until, overview_data) tuple, or (None, None, None) if no valid cache
72 | """
73 | if not os.path.exists(SIGNATURE_CACHE_FILE):
74 | return None, None, None
75 |
76 | try:
77 | with open(SIGNATURE_CACHE_FILE, 'r') as f:
78 | cache_data = json.load(f)
79 |
80 | # Check if cache exists for current domain
81 | if domain not in cache_data:
82 | return None, None, None
83 |
84 | domain_cache = cache_data[domain]
85 |
86 | # Check if signature is expired
87 | valid_until = domain_cache.get("valid_until")
88 |
89 | if valid_until:
90 | # Convert ISO date string to timestamp for comparison
91 | valid_until_timestamp = iso_to_timestamp(valid_until)
92 | current_time = time.time()
93 |
94 | if current_time < valid_until_timestamp:
95 | return domain_cache.get("signature"), valid_until, domain_cache.get("overview_data")
96 | else:
97 | return None, None, None
98 | else:
99 | return None, None, None
100 | except Exception:
101 | return None, None, None
102 |
103 |
104 |
105 | def get_signature_and_overview(token: str, domain: str) -> Tuple[Optional[str], Optional[str], Optional[Dict[str, Any]]]:
106 | """
107 | Get signature and validUntil parameters using the token
108 |
109 | Args:
110 | token: Verification token
111 | domain: Domain to query
112 |
113 | Returns:
114 | (signature, valid_until, overview_data) tuple, or (None, None, None) if failed
115 | """
116 | url = "https://ahrefs.com/v4/stGetFreeBacklinksOverview"
117 | payload = {
118 | "captcha": token,
119 | "mode": "subdomains",
120 | "url": domain
121 | }
122 |
123 | headers = {
124 | "Content-Type": "application/json"
125 | }
126 |
127 | response = requests.post(url, json=payload, headers=headers)
128 | if response.status_code != 200:
129 | return None, None, None
130 |
131 | data = response.json()
132 |
133 | try:
134 | # Assuming data format is always ['Ok', {signature object}]
135 | if isinstance(data, list) and len(cast(List[Any], data)) > 1:
136 | second_element: Dict[str, Any] = cast(Dict[str, Any], data[1])
137 | signature: str = cast(str, second_element['signedInput']['signature'])
138 | valid_until: str = cast(str, second_element['signedInput']['input']['validUntil'])
139 | overview_data: Dict[str, Any] = cast(Dict[str, Any], second_element['data'])
140 |
141 | # Save the new signature to cache
142 | save_signature_to_cache(signature, valid_until, overview_data, domain)
143 |
144 | return signature, valid_until, overview_data
145 | else:
146 | return None, None, None
147 | except Exception:
148 | return None, None, None
149 |
150 |
151 | def format_backlinks(backlinks_data: List[Any], domain: str) -> List[Any]:
152 | """
153 | Format backlinks data
154 | """
155 | if backlinks_data and len(backlinks_data) > 1 and "topBacklinks" in backlinks_data[1]:
156 | backlinks = backlinks_data[1]["topBacklinks"]["backlinks"]
157 | # Only keep necessary fields
158 | simplified_backlinks = []
159 | for backlink in backlinks:
160 | simplified_backlink = {
161 | "anchor": backlink.get("anchor", ""),
162 | "domainRating": backlink.get("domainRating", 0),
163 | "title": backlink.get("title", ""),
164 | "urlFrom": backlink.get("urlFrom", ""),
165 | "urlTo": backlink.get("urlTo", ""),
166 | "edu": backlink.get("edu", False),
167 | "gov": backlink.get("gov", False),
168 | }
169 | simplified_backlinks.append(simplified_backlink)
170 | return simplified_backlinks
171 | else:
172 | return []
173 |
174 |
175 | def get_backlinks(signature: str, valid_until: str, domain: str) -> Optional[List[Any]]:
176 | if not signature or not valid_until:
177 | return None
178 |
179 | url = "https://ahrefs.com/v4/stGetFreeBacklinksList"
180 | payload = {
181 | "reportType": "TopBacklinks",
182 | "signedInput": {
183 | "signature": signature,
184 | "input": {
185 | "validUntil": valid_until,
186 | "mode": "subdomains",
187 | "url": f"{domain}/"
188 | }
189 | }
190 | }
191 |
192 | headers = {
193 | "Content-Type": "application/json"
194 | }
195 |
196 | response = requests.post(url, json=payload, headers=headers)
197 | if response.status_code != 200:
198 | return None
199 |
200 | data = response.json()
201 |
202 | return format_backlinks(data, domain)
203 |
204 |
205 |
206 | def get_backlinks_overview(signature: str, valid_until: str, domain: str) -> Optional[Dict[str, Any]]:
207 | """
208 | Retrieve backlinks overview data for a domain using Ahrefs API.
209 |
210 | Args:
211 | signature: The authentication signature
212 | valid_until: Signature expiration timestamp
213 | domain: The domain to get overview data for
214 |
215 | Returns:
216 | Dictionary containing backlinks overview data or None if request fails
217 | """
218 | if not signature or not valid_until:
219 | print("ERROR: No signature or valid_until, cannot proceed")
220 | return None
221 |
222 | url = "https://ahrefs.com/v4/stGetFreeBacklinksOverview"
223 | payload = {
224 | "captcha": signature,
225 | "mode": "subdomains",
226 | "url": domain
227 | }
228 |
229 | headers = {
230 | "Content-Type": "application/json",
231 | "accept": "*/*",
232 | "sec-fetch-site": "same-origin"
233 | }
234 |
235 | try:
236 | response = requests.post(url, json=payload, headers=headers)
237 | if response.status_code != 200:
238 | print(f"ERROR: Failed to get backlinks overview, status code: {response.status_code}, response: {response.text}")
239 | return None
240 |
241 | data = response.json()
242 | return data
243 | except Exception:
244 | return None
245 |
```