This is page 1 of 2. Use http://codebase.md/king-of-the-grackles/reddit-mcp-poc?page={x} to view the full context.

# Directory Structure

```
├── .env.sample
├── .gemini
│   └── settings.json
├── .gitignore
├── .python-version
├── .specify
│   ├── memory
│   │   └── constitution.md
│   ├── scripts
│   │   └── bash
│   │       ├── check-implementation-prerequisites.sh
│   │       ├── check-task-prerequisites.sh
│   │       ├── common.sh
│   │       ├── create-new-feature.sh
│   │       ├── get-feature-paths.sh
│   │       ├── setup-plan.sh
│   │       └── update-agent-context.sh
│   └── templates
│       ├── agent-file-template.md
│       ├── plan-template.md
│       ├── spec-template.md
│       └── tasks-template.md
├── package.json
├── pyproject.toml
├── README.md
├── reddit-research-agent.md
├── reports
│   ├── ai-llm-weekly-trends-reddit-analysis-2025-01-20.md
│   ├── saas-solopreneur-reddit-communities.md
│   ├── top-50-active-AI-subreddits.md
│   ├── top-50-subreddits-saas-ai-builders.md
│   └── top-50-subreddits-saas-solopreneurs.md
├── server.json
├── specs
│   ├── 003-fastmcp-context-integration.md
│   ├── 003-implementation-summary.md
│   ├── 003-phase-1-context-integration.md
│   ├── 003-phase-2-progress-monitoring.md
│   ├── agent-reasoning-visibility.md
│   ├── agentic-discovery-architecture.md
│   ├── chroma-proxy-architecture.md
│   ├── deep-research-reddit-architecture.md
│   └── reddit-research-agent-spec.md
├── src
│   ├── __init__.py
│   ├── chroma_client.py
│   ├── config.py
│   ├── models.py
│   ├── resources.py
│   ├── server.py
│   └── tools
│       ├── __init__.py
│       ├── comments.py
│       ├── discover.py
│       ├── posts.py
│       └── search.py
├── tests
│   ├── test_context_integration.py
│   └── test_tools.py
└── uv.lock
```

# Files

--------------------------------------------------------------------------------
/.python-version:
--------------------------------------------------------------------------------

```
3.12

```

--------------------------------------------------------------------------------
/.env.sample:
--------------------------------------------------------------------------------

```
# Reddit API Configuration (Required)
REDDIT_CLIENT_ID=your_client_id_here
REDDIT_CLIENT_SECRET=your_client_secret_here
REDDIT_USER_AGENT=RedditMCP/1.0 by u/your_username

# Descope Authentication (Required)
DESCOPE_PROJECT_ID=P2abc...123
SERVER_URL=http://localhost:8000
DESCOPE_BASE_URL=https://api.descope.com

# Vector Database Proxy Authentication (Optional)
# The hosted service handles this automatically.
# For development with your own proxy server:
# CHROMA_PROXY_URL=https://your-proxy.com
# CHROMA_PROXY_API_KEY=your_api_key_here
```

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------

```
# Critical Security - Environment Variables and Secrets
.env
.env.*
!.env.sample
!.env.example
*.key
*.pem
*.cert
*.crt
secrets/
credentials/
config/secrets.json

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
*.pyc
*.pyo
*.pyd
.pytest_cache/
.mypy_cache/
.dmypy.json
dmypy.json
.coverage
.coverage.*
htmlcov/
.tox/
.hypothesis/
.ruff_cache/
*.cover
*.log

# Virtual Environments
venv/
.venv/
env/
.env/
ENV/
env.bak/
venv.bak/
virtualenv/

# Package Management & Build
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
pip-log.txt
pip-delete-this-directory.txt
uv.lock

# IDEs and Editors
.vscode/
.idea/
*.swp
*.swo
*~
.project
.pydevproject
.settings/
*.sublime-project
*.sublime-workspace
.atom/
.brackets.json

# Operating System
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db
desktop.ini

# Project Specific
logs/
*.log
.claude/
*.db
*.sqlite
*.sqlite3
instance/

# Testing & Documentation
.nox/
docs/_build/
.scrapy/
target/

# Jupyter Notebook
.ipynb_checkpoints
*.ipynb_checkpoints/

# macOS
.AppleDouble
.LSOverride
Icon
.DocumentRevisions-V100
.fseventsd
.TemporaryItems
.VolumeIcon.icns
.com.apple.timemachine.donotpresent
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk

# Windows
$RECYCLE.BIN/
*.lnk
*.msi
*.msm
*.msp

# Backup files
*.bak
*.backup
*.old
*.orig
*.tmp
.history/

# FastMCP specific
.fastmcp/
fastmcp.db

# MCP Registry files
.mcpregistry_*
mcp-publisher

# Development & Research directories
fastmcp/
mcp-remote/
ai/*.rtf

```

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

```markdown
mcp-name: io.github.king-of-the-grackles/reddit-research-mcp

# 🔍 Reddit Research MCP Server

**Turn Reddit's chaos into structured insights with full citations**

[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![FastMCP](https://img.shields.io/badge/Built%20with-FastMCP-orange.svg)](https://github.com/jlowin/fastmcp)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

---

Your customers are on Reddit right now, comparing you to competitors, sharing pain points, requesting features. But finding those insights means hours of manual searching with no way to cite your sources.

This MCP server turns Reddit into a queryable research database that generates reports with links to every claim. Get comprehensive market research, competitive analysis, and customer insights in minutes instead of hours.

---

## 🚀 Quick Setup (60 Seconds)

**No credentials or configuration needed!** Connect to our hosted server:

### Claude Code
```bash
claude mcp add --scope local --transport http reddit-research-mcp https://reddit-research-mcp.fastmcp.app/mcp
```

### Cursor
```
cursor://anysphere.cursor-deeplink/mcp/install?name=reddit-research-mcp&config=eyJ1cmwiOiJodHRwczovL3JlZGRpdC1yZXNlYXJjaC1tY3AuZmFzdG1jcC5hcHAvbWNwIn0%3D
```

### OpenAI Codex CLI
```bash
codex mcp add reddit-research-mcp \
    npx -y mcp-remote \
    https://reddit-research-mcp.fastmcp.app/mcp \
    --auth-timeout 120 \
    --allow-http
```

### Gemini CLI
```bash
gemini mcp add reddit-research-mcp \
  npx -y mcp-remote \
  https://reddit-research-mcp.fastmcp.app/mcp \
  --auth-timeout 120 \
  --allow-http
```

### Direct MCP Server URL
For other AI assistants: `https://reddit-research-mcp.fastmcp.app/mcp`
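
Most MCP-capable clients accept a JSON entry for remote servers. A minimal sketch of such an entry (key names differ between clients, so check your client's docs; some use `httpUrl` instead of `url`):

```json
{
  "mcpServers": {
    "reddit-research-mcp": {
      "url": "https://reddit-research-mcp.fastmcp.app/mcp"
    }
  }
}
```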

---

## 🎯 What You Can Do

### Competitive Analysis
```
"What are developers saying about Next.js vs Remix?"
```
→ Get a comprehensive report comparing sentiment, feature requests, pain points, and migration experiences with links to every mentioned discussion.

### Customer Discovery
```
"Find the top complaints about existing CRM tools in small business communities"
```
→ Discover unmet needs, feature gaps, and pricing concerns directly from your target market with citations to real user feedback.

### Market Research
```
"Analyze sentiment about AI coding assistants across developer communities"
```
→ Track adoption trends, concerns, success stories, and emerging use cases with temporal analysis showing how opinions evolved.

### Product Validation
```
"What problems are SaaS founders having with subscription billing?"
```
→ Identify pain points and validate your solution with evidence from actual discussions, not assumptions.
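
Each of these questions maps to a short sequence of the operations documented under Technical Details below. A rough sketch of one such flow (parameter values here are placeholders, not a fixed recipe):

```python
# Illustrative research flow: discover communities, search them, then read the key thread.
communities = execute_operation("discover_subreddits", {"topic": "subscription billing", "limit": 10})
posts = execute_operation("search_all", {"query": "subscription billing pain points", "time_filter": "month", "limit": 25})
thread = execute_operation("fetch_comments", {"submission_id": "abc123", "comment_limit": 100, "sort": "best"})
```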

---

## ✨ Why This Server?

**Built for decision-makers who need evidence-based insights.** Every report links back to actual Reddit posts and comments. When you say "users are complaining about X," you'll have the receipts to prove it. Check the `/reports` folder for examples of deep-research reports with full citation trails.

**Zero-friction setup designed for non-technical users.** Most MCP servers require cloning repos, managing Python environments, and hunting for API keys in developer dashboards. This one? Just paste the URL into Claude and start researching. Our hosted solution means no terminal commands, no credential management, no setup headaches.

**Semantic search across 20,000+ active subreddits.** Reddit's API caps at 250 search results - useless for comprehensive research. We pre-indexed every active subreddit (2k+ members, active in last 7 days) with vector embeddings. Now you search conceptually across all of Reddit, finding relevant communities you didn't even know existed. Built with the [layered abstraction pattern](https://engineering.block.xyz/blog/build-mcp-tools-like-ogres-with-layers) for scalability.

---

## 📚 Specifications

Some of the AI-generated specs that were used to build this project with Claude Code:
- 📖 [Architecture Overview](specs/agentic-discovery-architecture.md) - System design and component interaction
- 🤖 [Research Agent Details](specs/reddit-research-agent-spec.md) - Agent implementation patterns
- 🔍 [Deep Research Architecture](specs/deep-research-reddit-architecture.md) - Research workflow and citation system
- 🗄️ [ChromaDB Proxy Architecture](specs/chroma-proxy-architecture.md) - Vector search and authentication layer

---

## Technical Details

<details>
<summary><strong>🛠️ Core MCP Tools</strong></summary>

#### Discover Communities
```python
execute_operation("discover_subreddits", {
    "topic": "machine learning",
    "limit": 15
})
```

#### Search Across Reddit
```python
execute_operation("search_all", {
    "query": "ChatGPT experiences",
    "time_filter": "week",
    "limit": 25
})
```

#### Batch Fetch Posts
```python
execute_operation("fetch_multiple", {
    "subreddit_names": ["technology", "programming"],
    "limit_per_subreddit": 10,
    "time_filter": "day"
})
```

#### Deep Dive with Comments
```python
execute_operation("fetch_comments", {
    "submission_id": "abc123",
    "comment_limit": 200,
    "sort": "best"
})
```
</details>

<details>
<summary><strong>📁 Project Structure</strong></summary>

```
reddit-research-mcp/
├── src/
│   ├── server.py          # FastMCP server
│   ├── config.py          # Reddit configuration
│   ├── chroma_client.py   # Vector database proxy
│   ├── resources.py       # MCP resources
│   ├── models.py          # Data models
│   └── tools/
│       ├── search.py      # Search operations
│       ├── posts.py       # Post fetching
│       ├── comments.py    # Comment retrieval
│       └── discover.py    # Subreddit discovery
├── tests/                 # Test suite
├── reports/               # Example reports
└── specs/                 # Architecture docs
```
</details>

<details>
<summary><strong>🚀 Contributing & Tech Stack</strong></summary>

This project uses:
- Python 3.11+ with type hints
- FastMCP for the server framework
- Vector search via authenticated proxy (Render.com)
- ChromaDB for semantic search
- PRAW for Reddit API interaction

</details>
---

<div align="center">

**Stop guessing. Start knowing what your market actually thinks.**

[GitHub](https://github.com/king-of-the-grackles/reddit-research-mcp) • [Report Issues](https://github.com/king-of-the-grackles/reddit-research-mcp/issues) • [Request Features](https://github.com/king-of-the-grackles/reddit-research-mcp/issues)

</div>
```

--------------------------------------------------------------------------------
/src/__init__.py:
--------------------------------------------------------------------------------

```python

```

--------------------------------------------------------------------------------
/src/tools/__init__.py:
--------------------------------------------------------------------------------

```python

```

--------------------------------------------------------------------------------
/.gemini/settings.json:
--------------------------------------------------------------------------------

```json
{
  "mcpServers": {
    "reddit-research-mcp": {
      "httpUrl": "https://reddit-research-mcp.fastmcp.app/mcp"
    },
    "local-reddit-research-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "http://localhost:8000/mcp",
        "--auth-timeout",
        "90",
        "--allow-http",
        "--debug"
      ]
    }
  }
}
```

--------------------------------------------------------------------------------
/.specify/scripts/bash/get-feature-paths.sh:
--------------------------------------------------------------------------------

```bash
#!/usr/bin/env bash
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/common.sh"
eval $(get_feature_paths)
check_feature_branch "$CURRENT_BRANCH" || exit 1
echo "REPO_ROOT: $REPO_ROOT"; echo "BRANCH: $CURRENT_BRANCH"; echo "FEATURE_DIR: $FEATURE_DIR"; echo "FEATURE_SPEC: $FEATURE_SPEC"; echo "IMPL_PLAN: $IMPL_PLAN"; echo "TASKS: $TASKS"

```

--------------------------------------------------------------------------------
/.specify/templates/agent-file-template.md:
--------------------------------------------------------------------------------

```markdown
# [PROJECT NAME] Development Guidelines

Auto-generated from all feature plans. Last updated: [DATE]

## Active Technologies
[EXTRACTED FROM ALL PLAN.MD FILES]

## Project Structure
```
[ACTUAL STRUCTURE FROM PLANS]
```

## Commands
[ONLY COMMANDS FOR ACTIVE TECHNOLOGIES]

## Code Style
[LANGUAGE-SPECIFIC, ONLY FOR LANGUAGES IN USE]

## Recent Changes
[LAST 3 FEATURES AND WHAT THEY ADDED]

<!-- MANUAL ADDITIONS START -->
<!-- MANUAL ADDITIONS END -->
```

--------------------------------------------------------------------------------
/server.json:
--------------------------------------------------------------------------------

```json
{
  "$schema": "https://static.modelcontextprotocol.io/schemas/2025-07-09/server.schema.json",
  "name": "io.github.king-of-the-grackles/reddit-research-mcp",
  "description": "Turn Reddit's chaos into structured insights with full citations - MCP server for Reddit research",
  "status": "active",
  "repository": {
    "url": "https://github.com/king-of-the-grackles/reddit-research-mcp",
    "source": "github"
  },
  "version": "0.1.1",
  "packages": [
    {
      "registry_type": "pypi",
      "identifier": "reddit-research-mcp",
      "version": "0.1.1",
      "transport": {
        "type": "stdio"
      },
      "environment_variables": []
    }
  ]
}
```

--------------------------------------------------------------------------------
/.specify/scripts/bash/setup-plan.sh:
--------------------------------------------------------------------------------

```bash
#!/usr/bin/env bash
set -e
JSON_MODE=false
for arg in "$@"; do case "$arg" in --json) JSON_MODE=true ;; --help|-h) echo "Usage: $0 [--json]"; exit 0 ;; esac; done
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/common.sh"
eval $(get_feature_paths)
check_feature_branch "$CURRENT_BRANCH" || exit 1
mkdir -p "$FEATURE_DIR"
TEMPLATE="$REPO_ROOT/.specify/templates/plan-template.md"
[[ -f "$TEMPLATE" ]] && cp "$TEMPLATE" "$IMPL_PLAN"
if $JSON_MODE; then
  printf '{"FEATURE_SPEC":"%s","IMPL_PLAN":"%s","SPECS_DIR":"%s","BRANCH":"%s"}\n' \
    "$FEATURE_SPEC" "$IMPL_PLAN" "$FEATURE_DIR" "$CURRENT_BRANCH"
else
  echo "FEATURE_SPEC: $FEATURE_SPEC"; echo "IMPL_PLAN: $IMPL_PLAN"; echo "SPECS_DIR: $FEATURE_DIR"; echo "BRANCH: $CURRENT_BRANCH"
fi

```

--------------------------------------------------------------------------------
/.specify/scripts/bash/common.sh:
--------------------------------------------------------------------------------

```bash
#!/usr/bin/env bash
# (Moved to scripts/bash/) Common functions and variables for all scripts

get_repo_root() { git rev-parse --show-toplevel; }
get_current_branch() { git rev-parse --abbrev-ref HEAD; }

check_feature_branch() {
    local branch="$1"
    if [[ ! "$branch" =~ ^[0-9]{3}- ]]; then
        echo "ERROR: Not on a feature branch. Current branch: $branch" >&2
        echo "Feature branches should be named like: 001-feature-name" >&2
        return 1
    fi; return 0
}

get_feature_dir() { echo "$1/specs/$2"; }

get_feature_paths() {
    local repo_root=$(get_repo_root)
    local current_branch=$(get_current_branch)
    local feature_dir=$(get_feature_dir "$repo_root" "$current_branch")
    cat <<EOF
REPO_ROOT='$repo_root'
CURRENT_BRANCH='$current_branch'
FEATURE_DIR='$feature_dir'
FEATURE_SPEC='$feature_dir/spec.md'
IMPL_PLAN='$feature_dir/plan.md'
TASKS='$feature_dir/tasks.md'
RESEARCH='$feature_dir/research.md'
DATA_MODEL='$feature_dir/data-model.md'
QUICKSTART='$feature_dir/quickstart.md'
CONTRACTS_DIR='$feature_dir/contracts'
EOF
}

check_file() { [[ -f "$1" ]] && echo "  ✓ $2" || echo "  ✗ $2"; }
check_dir() { [[ -d "$1" && -n $(ls -A "$1" 2>/dev/null) ]] && echo "  ✓ $2" || echo "  ✗ $2"; }

```

--------------------------------------------------------------------------------
/.specify/scripts/bash/check-task-prerequisites.sh:
--------------------------------------------------------------------------------

```bash
#!/usr/bin/env bash
set -e
JSON_MODE=false
for arg in "$@"; do case "$arg" in --json) JSON_MODE=true ;; --help|-h) echo "Usage: $0 [--json]"; exit 0 ;; esac; done
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/common.sh"
eval $(get_feature_paths)
check_feature_branch "$CURRENT_BRANCH" || exit 1
if [[ ! -d "$FEATURE_DIR" ]]; then echo "ERROR: Feature directory not found: $FEATURE_DIR"; echo "Run /specify first."; exit 1; fi
if [[ ! -f "$IMPL_PLAN" ]]; then echo "ERROR: plan.md not found in $FEATURE_DIR"; echo "Run /plan first."; exit 1; fi
if $JSON_MODE; then
  docs=(); [[ -f "$RESEARCH" ]] && docs+=("research.md"); [[ -f "$DATA_MODEL" ]] && docs+=("data-model.md"); ([[ -d "$CONTRACTS_DIR" ]] && [[ -n "$(ls -A "$CONTRACTS_DIR" 2>/dev/null)" ]]) && docs+=("contracts/"); [[ -f "$QUICKSTART" ]] && docs+=("quickstart.md");
  json_docs=$(printf '"%s",' "${docs[@]}"); json_docs="[${json_docs%,}]"; printf '{"FEATURE_DIR":"%s","AVAILABLE_DOCS":%s}\n' "$FEATURE_DIR" "$json_docs"
else
  echo "FEATURE_DIR:$FEATURE_DIR"; echo "AVAILABLE_DOCS:"; check_file "$RESEARCH" "research.md"; check_file "$DATA_MODEL" "data-model.md"; check_dir "$CONTRACTS_DIR" "contracts/"; check_file "$QUICKSTART" "quickstart.md"; fi

```

--------------------------------------------------------------------------------
/src/models.py:
--------------------------------------------------------------------------------

```python
from typing import List, Optional, Dict, Any
from pydantic import BaseModel, Field
from datetime import datetime


class RedditPost(BaseModel):
    """Model for a Reddit post/submission."""
    id: str
    title: str
    author: str
    subreddit: str
    score: int
    created_utc: float
    url: str
    num_comments: int
    selftext: Optional[str] = None
    upvote_ratio: Optional[float] = None
    permalink: Optional[str] = None


class SubredditInfo(BaseModel):
    """Model for subreddit metadata."""
    name: str
    subscribers: int
    description: str


class Comment(BaseModel):
    """Model for a Reddit comment."""
    id: str
    body: str
    author: str
    score: int
    created_utc: float
    depth: int
    replies: List['Comment'] = Field(default_factory=list)


class SearchResult(BaseModel):
    """Response model for search_reddit tool."""
    results: List[RedditPost]
    count: int


class SubredditPostsResult(BaseModel):
    """Response model for fetch_subreddit_posts tool."""
    posts: List[RedditPost]
    subreddit: SubredditInfo
    count: int


class SubmissionWithCommentsResult(BaseModel):
    """Response model for fetch_submission_with_comments tool."""
    submission: RedditPost
    comments: List[Comment]
    total_comments_fetched: int


# Allow recursive Comment model
Comment.model_rebuild()
```

--------------------------------------------------------------------------------
/.specify/scripts/bash/check-implementation-prerequisites.sh:
--------------------------------------------------------------------------------

```bash
#!/usr/bin/env bash
set -e
JSON_MODE=false
for arg in "$@"; do case "$arg" in --json) JSON_MODE=true ;; --help|-h) echo "Usage: $0 [--json]"; exit 0 ;; esac; done
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/common.sh"
eval $(get_feature_paths)
check_feature_branch "$CURRENT_BRANCH" || exit 1
if [[ ! -d "$FEATURE_DIR" ]]; then echo "ERROR: Feature directory not found: $FEATURE_DIR"; echo "Run /specify first."; exit 1; fi
if [[ ! -f "$IMPL_PLAN" ]]; then echo "ERROR: plan.md not found in $FEATURE_DIR"; echo "Run /plan first."; exit 1; fi
if [[ ! -f "$TASKS" ]]; then echo "ERROR: tasks.md not found in $FEATURE_DIR"; echo "Run /tasks first."; exit 1; fi
if $JSON_MODE; then
  docs=(); [[ -f "$RESEARCH" ]] && docs+=("research.md"); [[ -f "$DATA_MODEL" ]] && docs+=("data-model.md"); ([[ -d "$CONTRACTS_DIR" ]] && [[ -n "$(ls -A "$CONTRACTS_DIR" 2>/dev/null)" ]]) && docs+=("contracts/"); [[ -f "$QUICKSTART" ]] && docs+=("quickstart.md"); [[ -f "$TASKS" ]] && docs+=("tasks.md");
  json_docs=$(printf '"%s",' "${docs[@]}"); json_docs="[${json_docs%,}]"; printf '{"FEATURE_DIR":"%s","AVAILABLE_DOCS":%s}\n' "$FEATURE_DIR" "$json_docs"
else
  echo "FEATURE_DIR:$FEATURE_DIR"; echo "AVAILABLE_DOCS:"; check_file "$RESEARCH" "research.md"; check_file "$DATA_MODEL" "data-model.md"; check_dir "$CONTRACTS_DIR" "contracts/"; check_file "$QUICKSTART" "quickstart.md"; check_file "$TASKS" "tasks.md"; fi
```

--------------------------------------------------------------------------------
/src/config.py:
--------------------------------------------------------------------------------

```python
import praw
import os
from pathlib import Path
from dotenv import load_dotenv

def get_reddit_client() -> praw.Reddit:
    """Get configured Reddit client (read-only) from environment."""
    client_id = None
    client_secret = None
    user_agent = None
    
    # Method 1: Try environment variables
    client_id = os.environ.get("REDDIT_CLIENT_ID")
    client_secret = os.environ.get("REDDIT_CLIENT_SECRET")
    user_agent = os.environ.get("REDDIT_USER_AGENT", "RedditMCP/1.0")
    
    # Method 2: Try loading from .env file (local development)
    if not client_id or not client_secret:
        # Find .env file in project root
        env_path = Path(__file__).parent.parent / '.env'
        if env_path.exists():
            load_dotenv(env_path)
            client_id = os.getenv("REDDIT_CLIENT_ID")
            client_secret = os.getenv("REDDIT_CLIENT_SECRET")
            if not user_agent:
                user_agent = os.getenv("REDDIT_USER_AGENT", "RedditMCP/1.0")
    
    if not client_id or not client_secret:
        raise ValueError(
            "Reddit API credentials not found. Please set REDDIT_CLIENT_ID "
            "and REDDIT_CLIENT_SECRET either as OS environment variables or in a .env file"
        )
    
    # Create Reddit instance for read-only access
    reddit = praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
        redirect_uri="http://localhost:8080",  # Required even for read-only
        ratelimit_seconds=300  # Auto-handle rate limits
    )
    
    # Explicitly enable read-only mode
    reddit.read_only = True
    
    return reddit
```

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------

```toml
[project]
name = "reddit-research-mcp"
version = "0.1.1"
description = "A minimal Model Context Protocol server for Reddit content access"
readme = "README.md"
requires-python = ">=3.11"
authors = [
  { name="King of the Grackles", email="[email protected]" },
]
license = {text = "MIT"}
classifiers = [
    "Development Status :: 4 - Beta",
    "Intended Audience :: Developers",
    "Topic :: Software Development :: Libraries :: Python Modules",
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
    "Operating System :: OS Independent",
]
dependencies = [
    "aiohttp>=3.12.15",
    "praw>=7.7.1",
    "fastmcp>=2.12.4",
    "openai-agents>=0.2.8",
    "pydantic>=2.0.0",
    "python-dotenv>=1.0.0",
    "starlette>=0.32.0",
    "uvicorn>=0.30.0",
    "requests>=2.31.0",
]

[project.urls]
Homepage = "https://github.com/king-of-the-grackles/reddit-research-mcp"
Repository = "https://github.com/king-of-the-grackles/reddit-research-mcp"
Issues = "https://github.com/king-of-the-grackles/reddit-research-mcp/issues"
Documentation = "https://github.com/king-of-the-grackles/reddit-research-mcp#readme"

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-asyncio>=0.24.0",
    "pytest-mock>=3.14.0",
]

[project.scripts]
reddit-mcp = "src.server:main"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.metadata]
allow-direct-references = true

[tool.hatch.build.targets.wheel]
packages = ["src"]

[tool.pytest.ini_options]
asyncio_mode = "auto"
asyncio_default_fixture_loop_scope = "function"

```

--------------------------------------------------------------------------------
/.specify/scripts/bash/create-new-feature.sh:
--------------------------------------------------------------------------------

```bash
#!/usr/bin/env bash
# (Moved to scripts/bash/) Create a new feature with branch, directory structure, and template
set -e

JSON_MODE=false
ARGS=()
for arg in "$@"; do
    case "$arg" in
        --json) JSON_MODE=true ;;
        --help|-h) echo "Usage: $0 [--json] <feature_description>"; exit 0 ;;
        *) ARGS+=("$arg") ;;
    esac
done

FEATURE_DESCRIPTION="${ARGS[*]}"
if [ -z "$FEATURE_DESCRIPTION" ]; then
    echo "Usage: $0 [--json] <feature_description>" >&2
    exit 1
fi

REPO_ROOT=$(git rev-parse --show-toplevel)
SPECS_DIR="$REPO_ROOT/specs"
mkdir -p "$SPECS_DIR"

HIGHEST=0
if [ -d "$SPECS_DIR" ]; then
    for dir in "$SPECS_DIR"/*; do
        [ -d "$dir" ] || continue
        dirname=$(basename "$dir")
        number=$(echo "$dirname" | grep -o '^[0-9]\+' || echo "0")
        number=$((10#$number))
        if [ "$number" -gt "$HIGHEST" ]; then HIGHEST=$number; fi
    done
fi

NEXT=$((HIGHEST + 1))
FEATURE_NUM=$(printf "%03d" "$NEXT")

BRANCH_NAME=$(echo "$FEATURE_DESCRIPTION" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g' | sed 's/-\+/-/g' | sed 's/^-//' | sed 's/-$//')
WORDS=$(echo "$BRANCH_NAME" | tr '-' '\n' | grep -v '^$' | head -3 | tr '\n' '-' | sed 's/-$//')
BRANCH_NAME="${FEATURE_NUM}-${WORDS}"

git checkout -b "$BRANCH_NAME"

FEATURE_DIR="$SPECS_DIR/$BRANCH_NAME"
mkdir -p "$FEATURE_DIR"

TEMPLATE="$REPO_ROOT/templates/spec-template.md"
SPEC_FILE="$FEATURE_DIR/spec.md"
if [ -f "$TEMPLATE" ]; then cp "$TEMPLATE" "$SPEC_FILE"; else touch "$SPEC_FILE"; fi

if $JSON_MODE; then
    printf '{"BRANCH_NAME":"%s","SPEC_FILE":"%s","FEATURE_NUM":"%s"}\n' "$BRANCH_NAME" "$SPEC_FILE" "$FEATURE_NUM"
else
    echo "BRANCH_NAME: $BRANCH_NAME"
    echo "SPEC_FILE: $SPEC_FILE"
    echo "FEATURE_NUM: $FEATURE_NUM"
fi

```

--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------

```json
{
  "name": "@king-of-the-grackles/reddit-research-mcp",
  "version": "1.0.0",
  "description": "Reddit Research MCP Server - Transform Reddit into your personal research assistant",
  "author": "king-of-the-grackles",
  "license": "MIT",
  "homepage": "https://github.com/king-of-the-grackles/reddit-research-mcp",
  "repository": {
    "type": "git",
    "url": "https://github.com/king-of-the-grackles/reddit-research-mcp.git"
  },
  "scripts": {
    "start": "python src/server.py"
  },
  "mcp": {
    "type": "stdio",
    "command": "python",
    "args": ["src/server.py"],
    "configSchema": {
      "type": "object",
      "required": ["REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT"],
      "properties": {
        "REDDIT_CLIENT_ID": {
          "type": "string",
          "title": "Reddit Client ID",
          "description": "Your Reddit application client ID from reddit.com/prefs/apps"
        },
        "REDDIT_CLIENT_SECRET": {
          "type": "string",
          "title": "Reddit Client Secret",
          "description": "Your Reddit application client secret"
        },
        "REDDIT_USER_AGENT": {
          "type": "string",
          "title": "Reddit User Agent",
          "description": "User agent string for Reddit API (e.g., 'MCP:reddit-research:v1.0')"
        },
        "CHROMA_PROXY_URL": {
          "type": "string",
          "title": "ChromaDB Proxy URL",
          "description": "URL of the ChromaDB proxy server (optional, uses default if not set)"
        },
        "CHROMA_PROXY_API_KEY": {
          "type": "string",
          "title": "ChromaDB Proxy API Key",
          "description": "API key for authenticating with the ChromaDB proxy server"
        }
      }
    },
    "exampleConfig": {
      "REDDIT_CLIENT_ID": "your_client_id_here",
      "REDDIT_CLIENT_SECRET": "your_client_secret_here",
      "REDDIT_USER_AGENT": "MCP:reddit-research:v1.0 (by /u/yourusername)",
      "CHROMA_PROXY_URL": "https://reddit-mcp-vector-db.onrender.com",
      "CHROMA_PROXY_API_KEY": "your_proxy_api_key_here"
    }
  }
}
```

--------------------------------------------------------------------------------
/.specify/memory/constitution.md:
--------------------------------------------------------------------------------

```markdown
# [PROJECT_NAME] Constitution
<!-- Example: Spec Constitution, TaskFlow Constitution, etc. -->

## Core Principles

### [PRINCIPLE_1_NAME]
<!-- Example: I. Library-First -->
[PRINCIPLE_1_DESCRIPTION]
<!-- Example: Every feature starts as a standalone library; Libraries must be self-contained, independently testable, documented; Clear purpose required - no organizational-only libraries -->

### [PRINCIPLE_2_NAME]
<!-- Example: II. CLI Interface -->
[PRINCIPLE_2_DESCRIPTION]
<!-- Example: Every library exposes functionality via CLI; Text in/out protocol: stdin/args → stdout, errors → stderr; Support JSON + human-readable formats -->

### [PRINCIPLE_3_NAME]
<!-- Example: III. Test-First (NON-NEGOTIABLE) -->
[PRINCIPLE_3_DESCRIPTION]
<!-- Example: TDD mandatory: Tests written → User approved → Tests fail → Then implement; Red-Green-Refactor cycle strictly enforced -->

### [PRINCIPLE_4_NAME]
<!-- Example: IV. Integration Testing -->
[PRINCIPLE_4_DESCRIPTION]
<!-- Example: Focus areas requiring integration tests: New library contract tests, Contract changes, Inter-service communication, Shared schemas -->

### [PRINCIPLE_5_NAME]
<!-- Example: V. Observability, VI. Versioning & Breaking Changes, VII. Simplicity -->
[PRINCIPLE_5_DESCRIPTION]
<!-- Example: Text I/O ensures debuggability; Structured logging required; Or: MAJOR.MINOR.BUILD format; Or: Start simple, YAGNI principles -->

## [SECTION_2_NAME]
<!-- Example: Additional Constraints, Security Requirements, Performance Standards, etc. -->

[SECTION_2_CONTENT]
<!-- Example: Technology stack requirements, compliance standards, deployment policies, etc. -->

## [SECTION_3_NAME]
<!-- Example: Development Workflow, Review Process, Quality Gates, etc. -->

[SECTION_3_CONTENT]
<!-- Example: Code review requirements, testing gates, deployment approval process, etc. -->

## Governance
<!-- Example: Constitution supersedes all other practices; Amendments require documentation, approval, migration plan -->

[GOVERNANCE_RULES]
<!-- Example: All PRs/reviews must verify compliance; Complexity must be justified; Use [GUIDANCE_FILE] for runtime development guidance -->

**Version**: [CONSTITUTION_VERSION] | **Ratified**: [RATIFICATION_DATE] | **Last Amended**: [LAST_AMENDED_DATE]
<!-- Example: Version: 2.1.1 | Ratified: 2025-06-13 | Last Amended: 2025-07-16 -->
```

--------------------------------------------------------------------------------
/src/tools/search.py:
--------------------------------------------------------------------------------

```python
from typing import Optional, Dict, Any, Literal
import praw
from prawcore import NotFound, Forbidden
from fastmcp import Context
from ..models import SearchResult, RedditPost


def search_in_subreddit(
    subreddit_name: str,
    query: str,
    reddit: praw.Reddit,
    sort: Literal["relevance", "hot", "top", "new"] = "relevance",
    time_filter: Literal["all", "year", "month", "week", "day"] = "all",
    limit: int = 10,
    ctx: Context = None
) -> Dict[str, Any]:
    """
    Search for posts within a specific subreddit.

    Args:
        subreddit_name: Name of the subreddit to search in (required)
        query: Search query string
        reddit: Configured Reddit client
        sort: Sort method for results
        time_filter: Time filter for results
        limit: Maximum number of results (max 100, default 10)
        ctx: FastMCP context (auto-injected by decorator)

    Returns:
        Dictionary containing search results from the specified subreddit
    """
    # Phase 1: Accept context but don't use it yet

    try:
        # Validate limit
        limit = min(max(1, limit), 100)
        
        # Clean subreddit name (remove r/ prefix if present)
        clean_name = subreddit_name.replace("r/", "").replace("/r/", "").strip()
        
        # Search within the specified subreddit
        try:
            subreddit_obj = reddit.subreddit(clean_name)
            # Verify subreddit exists
            _ = subreddit_obj.display_name
            
            search_results = subreddit_obj.search(
                query,
                sort=sort,
                time_filter=time_filter,
                limit=limit
            )
        except NotFound:
            return {
                "error": f"Subreddit r/{clean_name} not found",
                "suggestion": "discover_subreddits({'query': 'topic'})"
            }
        except Forbidden:
            return {"error": f"Access to r/{clean_name} forbidden (may be private)"}
        
        # Parse results
        results = []
        for submission in search_results:
            results.append(RedditPost(
                id=submission.id,
                title=submission.title,
                author=str(submission.author) if submission.author else "[deleted]",
                subreddit=submission.subreddit.display_name,
                score=submission.score,
                created_utc=submission.created_utc,
                url=submission.url,
                num_comments=submission.num_comments,
                permalink=f"https://reddit.com{submission.permalink}"
            ))
        
        result = SearchResult(
            results=results,
            count=len(results)
        )
        
        return result.model_dump()
        
    except Exception as e:
        return {"error": f"Search in subreddit failed: {str(e)}"}
```

--------------------------------------------------------------------------------
/.specify/templates/spec-template.md:
--------------------------------------------------------------------------------

```markdown
# Feature Specification: [FEATURE NAME]

**Feature Branch**: `[###-feature-name]`  
**Created**: [DATE]  
**Status**: Draft  
**Input**: User description: "$ARGUMENTS"

## Execution Flow (main)
```
1. Parse user description from Input
   → If empty: ERROR "No feature description provided"
2. Extract key concepts from description
   → Identify: actors, actions, data, constraints
3. For each unclear aspect:
   → Mark with [NEEDS CLARIFICATION: specific question]
4. Fill User Scenarios & Testing section
   → If no clear user flow: ERROR "Cannot determine user scenarios"
5. Generate Functional Requirements
   → Each requirement must be testable
   → Mark ambiguous requirements
6. Identify Key Entities (if data involved)
7. Run Review Checklist
   → If any [NEEDS CLARIFICATION]: WARN "Spec has uncertainties"
   → If implementation details found: ERROR "Remove tech details"
8. Return: SUCCESS (spec ready for planning)
```

---

## ⚡ Quick Guidelines
- ✅ Focus on WHAT users need and WHY
- ❌ Avoid HOW to implement (no tech stack, APIs, code structure)
- 👥 Written for business stakeholders, not developers

### Section Requirements
- **Mandatory sections**: Must be completed for every feature
- **Optional sections**: Include only when relevant to the feature
- When a section doesn't apply, remove it entirely (don't leave as "N/A")

### For AI Generation
When creating this spec from a user prompt:
1. **Mark all ambiguities**: Use [NEEDS CLARIFICATION: specific question] for any assumption you'd need to make
2. **Don't guess**: If the prompt doesn't specify something (e.g., "login system" without auth method), mark it
3. **Think like a tester**: Every vague requirement should fail the "testable and unambiguous" checklist item
4. **Common underspecified areas**:
   - User types and permissions
   - Data retention/deletion policies  
   - Performance targets and scale
   - Error handling behaviors
   - Integration requirements
   - Security/compliance needs

---

## User Scenarios & Testing *(mandatory)*

### Primary User Story
[Describe the main user journey in plain language]

### Acceptance Scenarios
1. **Given** [initial state], **When** [action], **Then** [expected outcome]
2. **Given** [initial state], **When** [action], **Then** [expected outcome]

### Edge Cases
- What happens when [boundary condition]?
- How does system handle [error scenario]?

## Requirements *(mandatory)*

### Functional Requirements
- **FR-001**: System MUST [specific capability, e.g., "allow users to create accounts"]
- **FR-002**: System MUST [specific capability, e.g., "validate email addresses"]  
- **FR-003**: Users MUST be able to [key interaction, e.g., "reset their password"]
- **FR-004**: System MUST [data requirement, e.g., "persist user preferences"]
- **FR-005**: System MUST [behavior, e.g., "log all security events"]

*Example of marking unclear requirements:*
- **FR-006**: System MUST authenticate users via [NEEDS CLARIFICATION: auth method not specified - email/password, SSO, OAuth?]
- **FR-007**: System MUST retain user data for [NEEDS CLARIFICATION: retention period not specified]

### Key Entities *(include if feature involves data)*
- **[Entity 1]**: [What it represents, key attributes without implementation]
- **[Entity 2]**: [What it represents, relationships to other entities]

---

## Review & Acceptance Checklist
*GATE: Automated checks run during main() execution*

### Content Quality
- [ ] No implementation details (languages, frameworks, APIs)
- [ ] Focused on user value and business needs
- [ ] Written for non-technical stakeholders
- [ ] All mandatory sections completed

### Requirement Completeness
- [ ] No [NEEDS CLARIFICATION] markers remain
- [ ] Requirements are testable and unambiguous  
- [ ] Success criteria are measurable
- [ ] Scope is clearly bounded
- [ ] Dependencies and assumptions identified

---

## Execution Status
*Updated by main() during processing*

- [ ] User description parsed
- [ ] Key concepts extracted
- [ ] Ambiguities marked
- [ ] User scenarios defined
- [ ] Requirements generated
- [ ] Entities identified
- [ ] Review checklist passed

---

```

--------------------------------------------------------------------------------
/.specify/templates/tasks-template.md:
--------------------------------------------------------------------------------

```markdown
# Tasks: [FEATURE NAME]

**Input**: Design documents from `/specs/[###-feature-name]/`
**Prerequisites**: plan.md (required), research.md, data-model.md, contracts/

## Execution Flow (main)
```
1. Load plan.md from feature directory
   → If not found: ERROR "No implementation plan found"
   → Extract: tech stack, libraries, structure
2. Load optional design documents:
   → data-model.md: Extract entities → model tasks
   → contracts/: Each file → contract test task
   → research.md: Extract decisions → setup tasks
3. Generate tasks by category:
   → Setup: project init, dependencies, linting
   → Tests: contract tests, integration tests
   → Core: models, services, CLI commands
   → Integration: DB, middleware, logging
   → Polish: unit tests, performance, docs
4. Apply task rules:
   → Different files = mark [P] for parallel
   → Same file = sequential (no [P])
   → Tests before implementation (TDD)
5. Number tasks sequentially (T001, T002...)
6. Generate dependency graph
7. Create parallel execution examples
8. Validate task completeness:
   → All contracts have tests?
   → All entities have models?
   → All endpoints implemented?
9. Return: SUCCESS (tasks ready for execution)
```

## Format: `[ID] [P?] Description`
- **[P]**: Can run in parallel (different files, no dependencies)
- Include exact file paths in descriptions

## Path Conventions
- **Single project**: `src/`, `tests/` at repository root
- **Web app**: `backend/src/`, `frontend/src/`
- **Mobile**: `api/src/`, `ios/src/` or `android/src/`
- Paths shown below assume single project - adjust based on plan.md structure

## Phase 3.1: Setup
- [ ] T001 Create project structure per implementation plan
- [ ] T002 Initialize [language] project with [framework] dependencies
- [ ] T003 [P] Configure linting and formatting tools

## Phase 3.2: Tests First (TDD) ⚠️ MUST COMPLETE BEFORE 3.3
**CRITICAL: These tests MUST be written and MUST FAIL before ANY implementation**
- [ ] T004 [P] Contract test POST /api/users in tests/contract/test_users_post.py
- [ ] T005 [P] Contract test GET /api/users/{id} in tests/contract/test_users_get.py
- [ ] T006 [P] Integration test user registration in tests/integration/test_registration.py
- [ ] T007 [P] Integration test auth flow in tests/integration/test_auth.py

## Phase 3.3: Core Implementation (ONLY after tests are failing)
- [ ] T008 [P] User model in src/models/user.py
- [ ] T009 [P] UserService CRUD in src/services/user_service.py
- [ ] T010 [P] CLI --create-user in src/cli/user_commands.py
- [ ] T011 POST /api/users endpoint
- [ ] T012 GET /api/users/{id} endpoint
- [ ] T013 Input validation
- [ ] T014 Error handling and logging

## Phase 3.4: Integration
- [ ] T015 Connect UserService to DB
- [ ] T016 Auth middleware
- [ ] T017 Request/response logging
- [ ] T018 CORS and security headers

## Phase 3.5: Polish
- [ ] T019 [P] Unit tests for validation in tests/unit/test_validation.py
- [ ] T020 Performance tests (<200ms)
- [ ] T021 [P] Update docs/api.md
- [ ] T022 Remove duplication
- [ ] T023 Run manual-testing.md

## Dependencies
- Tests (T004-T007) before implementation (T008-T014)
- T008 blocks T009, T015
- T016 blocks T018
- Implementation before polish (T019-T023)

## Parallel Example
```
# Launch T004-T007 together:
Task: "Contract test POST /api/users in tests/contract/test_users_post.py"
Task: "Contract test GET /api/users/{id} in tests/contract/test_users_get.py"
Task: "Integration test registration in tests/integration/test_registration.py"
Task: "Integration test auth in tests/integration/test_auth.py"
```

## Notes
- [P] tasks = different files, no dependencies
- Verify tests fail before implementing
- Commit after each task
- Avoid: vague tasks, same file conflicts

## Task Generation Rules
*Applied during main() execution*

1. **From Contracts**:
   - Each contract file → contract test task [P]
   - Each endpoint → implementation task
   
2. **From Data Model**:
   - Each entity → model creation task [P]
   - Relationships → service layer tasks
   
3. **From User Stories**:
   - Each story → integration test [P]
   - Quickstart scenarios → validation tasks

4. **Ordering**:
   - Setup → Tests → Models → Services → Endpoints → Polish
   - Dependencies block parallel execution

## Validation Checklist
*GATE: Checked by main() before returning*

- [ ] All contracts have corresponding tests
- [ ] All entities have model tasks
- [ ] All tests come before implementation
- [ ] Parallel tasks truly independent
- [ ] Each task specifies exact file path
- [ ] No task modifies same file as another [P] task
```

--------------------------------------------------------------------------------
/src/chroma_client.py:
--------------------------------------------------------------------------------

```python
"""
ChromaDB proxy client for Reddit MCP.

Provides access to the hosted subreddit vector index through an authenticated
HTTP proxy, mirroring the parts of the ChromaDB client interface used by the tools.
"""

import os
from typing import Optional, List, Dict, Any
import requests


_client_instance = None


# ============= PROXY CLIENT CLASSES =============
class ChromaProxyClient:
    """Proxy client that mimics ChromaDB interface."""
    
    def __init__(self, proxy_url: Optional[str] = None):
        self.url = proxy_url or os.getenv(
            'CHROMA_PROXY_URL', 
            'https://reddit-mcp-vector-db.onrender.com'
        )
        self.api_key = os.getenv('CHROMA_PROXY_API_KEY')
        self.session = requests.Session()
        
        # Set API key in session headers if provided
        if self.api_key:
            self.session.headers['X-API-Key'] = self.api_key
    
    def query(self, query_texts: List[str], n_results: int = 10) -> Dict[str, Any]:
        """Query through proxy."""
        try:
            response = self.session.post(
                f"{self.url}/query",
                json={"query_texts": query_texts, "n_results": n_results},
                timeout=10
            )
            response.raise_for_status()
            return response.json()
        except requests.exceptions.HTTPError as e:
            if e.response.status_code == 401:
                raise ConnectionError("Authentication failed: API key required. Set CHROMA_PROXY_API_KEY environment variable.")
            elif e.response.status_code == 403:
                raise ConnectionError("Authentication failed: Invalid API key provided.")
            elif e.response.status_code == 429:
                raise ConnectionError("Rate limit exceeded. Please wait before retrying.")
            else:
                raise ConnectionError(f"Failed to query vector database: HTTP {e.response.status_code}")
        except requests.exceptions.RequestException as e:
            raise ConnectionError(f"Failed to query vector database: {e}")
    
    def list_collections(self) -> List[Dict[str, str]]:
        """Compatibility method."""
        return [{"name": "reddit_subreddits"}]
    
    def count(self) -> int:
        """Get document count."""
        try:
            response = self.session.get(f"{self.url}/stats", timeout=5)
            if response.status_code == 200:
                return response.json().get('total_subreddits', 20000)
            elif response.status_code == 401:
                print("Warning: Stats endpoint requires authentication. Using default count.")
            elif response.status_code == 403:
                print("Warning: Invalid API key for stats endpoint. Using default count.")
        except Exception:
            # Fall back to the approximate known count if the stats call fails
            pass
        return 20000


class ProxyCollection:
    """Wrapper to match Chroma collection interface."""
    
    def __init__(self, proxy_client: ChromaProxyClient):
        self.proxy_client = proxy_client
        self.name = "reddit_subreddits"
    
    def query(self, query_texts: List[str], n_results: int = 10) -> Dict[str, Any]:
        return self.proxy_client.query(query_texts, n_results)
    
    def count(self) -> int:
        return self.proxy_client.count()
# ============= END PROXY CLIENT CLASSES =============




def get_chroma_client():
    """
    Get ChromaDB proxy client for vector database access.
    
    Returns:
        ChromaProxyClient instance
    """
    global _client_instance
    
    # Return cached instance if available
    if _client_instance is not None:
        return _client_instance
    
    print("🌐 Using proxy for vector database access")
    _client_instance = ChromaProxyClient()
    return _client_instance


def reset_client_cache():
    """Reset the cached client instance (useful for testing)."""
    global _client_instance
    _client_instance = None


def get_collection(
    collection_name: str = "reddit_subreddits",
    client = None
):
    """
    Get ProxyCollection for vector database access.
    
    Args:
        collection_name: Name of the collection (always "reddit_subreddits")
        client: Optional client instance (uses default if not provided)
    
    Returns:
        ProxyCollection instance
    """
    if client is None:
        client = get_chroma_client()
    
    return ProxyCollection(client)


def test_connection() -> dict:
    """
    Test proxy connection and return status information.
    
    Returns:
        Dictionary with connection status and details
    """
    status = {
        'mode': 'proxy',
        'connected': False,
        'error': None,
        'collections': [],
        'document_count': 0,
        'authenticated': False
    }
    
    try:
        client = get_chroma_client()
        
        # Check if API key is configured
        if client.api_key:
            status['authenticated'] = True
        
        # Test connection
        status['connected'] = True
        status['collections'] = ['reddit_subreddits']
        status['document_count'] = client.count()
        
    except Exception as e:
        status['error'] = str(e)
    
    return status
```

--------------------------------------------------------------------------------
/specs/chroma-proxy-architecture.md:
--------------------------------------------------------------------------------

```markdown
# Minimal Chroma Proxy Architecture

## Problem
- You have a Chroma DB with 20,000+ indexed subreddits
- Users need to query it without having your credentials
- MCP server code must stay open source

## Solution
Create a minimal proxy service that handles Chroma queries. Users talk to your proxy, proxy talks to Chroma.

```
User → MCP Server → Your Proxy → Your Chroma DB
```

## Implementation

### Part 1: Proxy Service (Private Repo for Render)

Create a new private repository with just 2 files:

#### `server.py`
```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import chromadb
import os

app = FastAPI()

# Connect to your Chroma DB
client = chromadb.CloudClient(
    api_key=os.getenv('CHROMA_API_KEY'),
    tenant=os.getenv('CHROMA_TENANT'),
    database=os.getenv('CHROMA_DATABASE')
)

class QueryRequest(BaseModel):
    """JSON body sent by the MCP proxy client."""
    query_texts: list[str]
    n_results: int = 10

@app.post("/query")
async def query(request: QueryRequest):
    """Simple proxy for Chroma queries."""
    try:
        collection = client.get_collection("reddit_subreddits")
        return collection.query(query_texts=request.query_texts, n_results=request.n_results)
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/health")
async def health():
    return {"status": "ok"}
```

#### `requirements.txt`
```
fastapi
chromadb
uvicorn
```

### Part 2: Update MCP Server (Open Source Repo)

#### Add ONE new file: `src/chroma_proxy.py`
```python
"""Minimal proxy client for Chroma DB access."""
import os
import requests

class ChromaProxyClient:
    """Proxy client that mimics ChromaDB interface."""
    
    def __init__(self):
        self.url = os.getenv('CHROMA_PROXY_URL', 'https://your-reddit-proxy.onrender.com')
    
    def query(self, query_texts, n_results=10):
        """Query through proxy."""
        response = requests.post(
            f"{self.url}/query", 
            json={"query_texts": query_texts, "n_results": n_results},
            timeout=10
        )
        response.raise_for_status()
        return response.json()
    
    def list_collections(self):
        """Compatibility method."""
        return [{"name": "reddit_subreddits"}]
    
    def count(self):
        """Compatibility method."""
        return 20000  # Known count

class ProxyCollection:
    """Wrapper to match Chroma collection interface."""
    
    def __init__(self, client):
        self.client = client
    
    def query(self, query_texts, n_results=10):
        return self.client.query(query_texts, n_results)
    
    def count(self):
        return self.client.count()
```

#### Update `src/chroma_client.py` (modify 2 functions only):

1. Update `get_chroma_client()`:
```python
def get_chroma_client():
    """Get ChromaDB client - proxy if no credentials, direct if available."""
    global _client_instance
    
    if _client_instance is not None:
        return _client_instance
    
    # If no direct credentials, use proxy
    if not os.getenv('CHROMA_API_KEY'):
        from .chroma_proxy import ChromaProxyClient
        print("Using proxy for vector database")
        _client_instance = ChromaProxyClient()
        return _client_instance
    
    # Rest of existing code for direct connection...
    config = get_chroma_config()
    # ... existing CloudClient code ...
```

2. Update `get_collection()`:
```python
def get_collection(collection_name="reddit_subreddits", client=None):
    """Get collection - handle both proxy and direct clients."""
    if client is None:
        client = get_chroma_client()
    
    # Handle proxy client
    from .chroma_proxy import ChromaProxyClient, ProxyCollection
    if isinstance(client, ChromaProxyClient):
        return ProxyCollection(client)
    
    # Rest of existing code for direct client...
    try:
        return client.get_collection(collection_name)
    # ... existing error handling ...
```

#### Update `Dockerfile` (add 1 line before CMD):
```dockerfile
# Add this line near the end, before CMD
ENV CHROMA_PROXY_URL=https://your-reddit-proxy.onrender.com
```

#### Update `pyproject.toml` (ensure requests is in dependencies):
```toml
dependencies = [
    # ... existing dependencies ...
    "requests>=2.31.0",  # Add if not present
]
```

### Part 3: Deploy to Render

#### Deploy the Proxy:

1. Push proxy code to private GitHub repo
2. In Render Dashboard:
   - New → Web Service
   - Connect your private repo
   - Build Command: `pip install -r requirements.txt`
   - Start Command: `uvicorn server:app --host 0.0.0.0 --port $PORT`
   - Add Environment Variables:
     - `CHROMA_API_KEY` = your-key
     - `CHROMA_TENANT` = your-tenant  
     - `CHROMA_DATABASE` = your-database
3. Deploy and note the URL (e.g., `https://reddit-proxy-abc.onrender.com`)
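
Equivalently, the dashboard settings above can be captured in a `render.yaml` blueprint committed to the proxy repo. A sketch only (the service name is arbitrary and the exact blueprint keys should be checked against Render's current docs; secret values still get entered in the dashboard):

```yaml
services:
  - type: web
    name: reddit-proxy            # arbitrary service name
    runtime: python
    buildCommand: pip install -r requirements.txt
    startCommand: uvicorn server:app --host 0.0.0.0 --port $PORT
    envVars:
      - key: CHROMA_API_KEY
        sync: false               # value set in the Render dashboard, not committed
      - key: CHROMA_TENANT
        sync: false
      - key: CHROMA_DATABASE
        sync: false
```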

#### Update MCP Server:

1. Change the proxy URL in `Dockerfile` to your Render URL
2. Commit and push to GitHub
3. Deploy to Smithery

## That's It!

Total changes:
- **New files**: 1 proxy client file
- **Modified files**: 2 functions in chroma_client.py, 1 line in Dockerfile
- **Unchanged**: discover.py and all other tool files work as-is

## How It Works

1. When `discover.py` calls `get_chroma_client()`:
   - If no Chroma credentials → returns proxy client
   - If credentials present → returns direct client

2. Proxy client mimics Chroma's `query()` interface exactly

3. Users only need Reddit credentials, vector search "just works"

## Testing Locally

```bash
# Test proxy
cd reddit-proxy
CHROMA_API_KEY=xxx CHROMA_TENANT=yyy CHROMA_DATABASE=zzz uvicorn server:app --reload

# Test MCP with proxy
cd reddit-mcp-poc
CHROMA_PROXY_URL=http://localhost:8000 python src/server.py
```
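
With the proxy running locally, a quick smoke test against the two endpoints defined above (assumes the default uvicorn port 8000):

```bash
# Health check
curl http://localhost:8000/health
# expected: {"status":"ok"}

# Sample query against the vector index
curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"query_texts": ["indie saas founders"], "n_results": 5}'
```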

## Cost & Security Notes

- Render free tier works fine for testing
- Add rate limiting later if needed
- Proxy only exposes one endpoint (`/query`)
- No user authentication needed initially (can add later)

## Why This Approach

- **Minimal**: ~50 lines of new code total
- **No breaking changes**: discover.py unchanged
- **Simple deployment**: 2 files to Render, done
- **Flexible**: Users with own Chroma can still use direct connection
- **Secure**: Your credentials never exposed
```

--------------------------------------------------------------------------------
/src/tools/comments.py:
--------------------------------------------------------------------------------

```python
from typing import Optional, Dict, Any, Literal, List
import praw
from praw.models import Submission, Comment as PrawComment, MoreComments
from prawcore import NotFound, Forbidden
from fastmcp import Context
from ..models import SubmissionWithCommentsResult, RedditPost, Comment


def parse_comment_tree(
    comment: PrawComment,
    depth: int = 0,
    max_depth: int = 10,
    ctx: Context = None
) -> Comment:
    """
    Recursively parse a comment and its replies into our Comment model.

    Args:
        comment: PRAW comment object
        depth: Current depth in the comment tree
        max_depth: Maximum depth to traverse
        ctx: FastMCP context (optional)

    Returns:
        Parsed Comment object with nested replies
    """
    # Phase 1: Accept context but don't use it yet

    replies = []
    if depth < max_depth and hasattr(comment, 'replies'):
        for reply in comment.replies:
            if isinstance(reply, PrawComment):
                replies.append(parse_comment_tree(reply, depth + 1, max_depth, ctx))
            # Skip MoreComments objects for simplicity in MVP
    
    return Comment(
        id=comment.id,
        body=comment.body,
        author=str(comment.author) if comment.author else "[deleted]",
        score=comment.score,
        created_utc=comment.created_utc,
        depth=depth,
        replies=replies
    )


async def fetch_submission_with_comments(
    reddit: praw.Reddit,
    submission_id: Optional[str] = None,
    url: Optional[str] = None,
    comment_limit: int = 100,
    comment_sort: Literal["best", "top", "new"] = "best",
    ctx: Context = None
) -> Dict[str, Any]:
    """
    Fetch a Reddit submission with its comment tree.

    Args:
        reddit: Configured Reddit client
        submission_id: Reddit post ID
        url: Full URL to the post (alternative to submission_id)
        comment_limit: Maximum number of comments to fetch
        comment_sort: How to sort comments
        ctx: FastMCP context (auto-injected by decorator)

    Returns:
        Dictionary containing submission and comments
    """
    # Context, when provided, is used below to report comment-loading progress

    try:
        # Validate that we have either submission_id or url
        if not submission_id and not url:
            return {"error": "Either submission_id or url must be provided"}
        
        # Get submission
        try:
            if submission_id:
                submission = reddit.submission(id=submission_id)
            else:
                submission = reddit.submission(url=url)
            
            # Force fetch to check if submission exists
            _ = submission.title
        except NotFound:
            return {"error": "Submission not found"}
        except Forbidden:
            return {"error": "Access to submission forbidden"}
        except Exception as e:
            return {"error": f"Invalid submission reference: {str(e)}"}
        
        # Set comment sort
        submission.comment_sort = comment_sort
        
        # Replace "More Comments" with actual comments (up to limit)
        submission.comments.replace_more(limit=0)  # Don't expand "more" comments in MVP
        
        # Parse submission
        submission_data = RedditPost(
            id=submission.id,
            title=submission.title,
            selftext=submission.selftext if submission.selftext else "",
            author=str(submission.author) if submission.author else "[deleted]",
            subreddit=submission.subreddit.display_name,
            score=submission.score,
            upvote_ratio=submission.upvote_ratio,
            num_comments=submission.num_comments,
            created_utc=submission.created_utc,
            url=submission.url
        )
        
        # Parse comments
        comments = []
        comment_count = 0
        
        for top_level_comment in submission.comments:
            # In tests, we might get regular Mock objects instead of PrawComment
            # Check if it has the required attributes
            if hasattr(top_level_comment, 'id') and hasattr(top_level_comment, 'body'):
                if comment_count >= comment_limit:
                    break

                # Report progress before processing comment
                if ctx:
                    await ctx.report_progress(
                        progress=comment_count,
                        total=comment_limit,
                        message=f"Loading comments ({comment_count}/{comment_limit})"
                    )

                if isinstance(top_level_comment, PrawComment):
                    comments.append(parse_comment_tree(top_level_comment, ctx=ctx))
                else:
                    # Handle mock objects in tests
                    comments.append(Comment(
                        id=top_level_comment.id,
                        body=top_level_comment.body,
                        author=str(top_level_comment.author) if top_level_comment.author else "[deleted]",
                        score=top_level_comment.score,
                        created_utc=top_level_comment.created_utc,
                        depth=0,
                        replies=[]
                    ))
                # Count all comments including replies
                comment_count += 1 + count_replies(comments[-1])

        # Report final completion
        if ctx:
            await ctx.report_progress(
                progress=comment_count,
                total=comment_limit,
                message=f"Completed: {comment_count} comments loaded"
            )

        result = SubmissionWithCommentsResult(
            submission=submission_data,
            comments=comments,
            total_comments_fetched=comment_count
        )
        
        return result.model_dump()
        
    except Exception as e:
        return {"error": f"Failed to fetch submission: {str(e)}"}


def count_replies(comment: Comment) -> int:
    """Count total number of replies in a comment tree."""
    count = len(comment.replies)
    for reply in comment.replies:
        count += count_replies(reply)
    return count
```

--------------------------------------------------------------------------------
/.specify/scripts/bash/update-agent-context.sh:
--------------------------------------------------------------------------------

```bash
#!/usr/bin/env bash
set -e
REPO_ROOT=$(git rev-parse --show-toplevel)
CURRENT_BRANCH=$(git rev-parse --abbrev-ref HEAD)
FEATURE_DIR="$REPO_ROOT/specs/$CURRENT_BRANCH"
NEW_PLAN="$FEATURE_DIR/plan.md"
CLAUDE_FILE="$REPO_ROOT/CLAUDE.md"; GEMINI_FILE="$REPO_ROOT/GEMINI.md"; COPILOT_FILE="$REPO_ROOT/.github/copilot-instructions.md"; CURSOR_FILE="$REPO_ROOT/.cursor/rules/specify-rules.mdc"; QWEN_FILE="$REPO_ROOT/QWEN.md"; AGENTS_FILE="$REPO_ROOT/AGENTS.md"
AGENT_TYPE="$1"
[ -f "$NEW_PLAN" ] || { echo "ERROR: No plan.md found at $NEW_PLAN"; exit 1; }
echo "=== Updating agent context files for feature $CURRENT_BRANCH ==="
NEW_LANG=$(grep "^**Language/Version**: " "$NEW_PLAN" 2>/dev/null | head -1 | sed 's/^**Language\/Version**: //' | grep -v "NEEDS CLARIFICATION" || echo "")
NEW_FRAMEWORK=$(grep "^**Primary Dependencies**: " "$NEW_PLAN" 2>/dev/null | head -1 | sed 's/^**Primary Dependencies**: //' | grep -v "NEEDS CLARIFICATION" || echo "")
NEW_DB=$(grep "^**Storage**: " "$NEW_PLAN" 2>/dev/null | head -1 | sed 's/^**Storage**: //' | grep -v "N/A" | grep -v "NEEDS CLARIFICATION" || echo "")
NEW_PROJECT_TYPE=$(grep "^**Project Type**: " "$NEW_PLAN" 2>/dev/null | head -1 | sed 's/^**Project Type**: //' || echo "")
update_agent_file() { local target_file="$1" agent_name="$2"; echo "Updating $agent_name context file: $target_file"; local temp_file=$(mktemp); if [ ! -f "$target_file" ]; then
  echo "Creating new $agent_name context file..."; if [ -f "$REPO_ROOT/.specify/templates/agent-file-template.md" ]; then cp "$REPO_ROOT/.specify/templates/agent-file-template.md" "$temp_file"; else echo "ERROR: Template not found"; return 1; fi;
  sed -i.bak "s/\[PROJECT NAME\]/$(basename $REPO_ROOT)/" "$temp_file"; sed -i.bak "s/\[DATE\]/$(date +%Y-%m-%d)/" "$temp_file"; sed -i.bak "s/\[EXTRACTED FROM ALL PLAN.MD FILES\]/- $NEW_LANG + $NEW_FRAMEWORK ($CURRENT_BRANCH)/" "$temp_file";
  if [[ "$NEW_PROJECT_TYPE" == *"web"* ]]; then sed -i.bak "s|\[ACTUAL STRUCTURE FROM PLANS\]|backend/\nfrontend/\ntests/|" "$temp_file"; else sed -i.bak "s|\[ACTUAL STRUCTURE FROM PLANS\]|src/\ntests/|" "$temp_file"; fi;
  if [[ "$NEW_LANG" == *"Python"* ]]; then COMMANDS="cd src && pytest && ruff check ."; elif [[ "$NEW_LANG" == *"Rust"* ]]; then COMMANDS="cargo test && cargo clippy"; elif [[ "$NEW_LANG" == *"JavaScript"* ]] || [[ "$NEW_LANG" == *"TypeScript"* ]]; then COMMANDS="npm test && npm run lint"; else COMMANDS="# Add commands for $NEW_LANG"; fi; sed -i.bak "s|\[ONLY COMMANDS FOR ACTIVE TECHNOLOGIES\]|$COMMANDS|" "$temp_file";
  sed -i.bak "s|\[LANGUAGE-SPECIFIC, ONLY FOR LANGUAGES IN USE\]|$NEW_LANG: Follow standard conventions|" "$temp_file"; sed -i.bak "s|\[LAST 3 FEATURES AND WHAT THEY ADDED\]|- $CURRENT_BRANCH: Added $NEW_LANG + $NEW_FRAMEWORK|" "$temp_file"; rm "$temp_file.bak";
else
  echo "Updating existing $agent_name context file..."; manual_start=$(grep -n "<!-- MANUAL ADDITIONS START -->" "$target_file" | cut -d: -f1); manual_end=$(grep -n "<!-- MANUAL ADDITIONS END -->" "$target_file" | cut -d: -f1); if [ -n "$manual_start" ] && [ -n "$manual_end" ]; then sed -n "${manual_start},${manual_end}p" "$target_file" > /tmp/manual_additions.txt; fi;
  python3 - "$target_file" <<'EOF'
import re,sys,datetime
target=sys.argv[1]
with open(target) as f: content=f.read()
NEW_LANG="'$NEW_LANG'";NEW_FRAMEWORK="'$NEW_FRAMEWORK'";CURRENT_BRANCH="'$CURRENT_BRANCH'";NEW_DB="'$NEW_DB'";NEW_PROJECT_TYPE="'$NEW_PROJECT_TYPE'"
# Tech section
m=re.search(r'## Active Technologies\n(.*?)\n\n',content, re.DOTALL)
if m:
  existing=m.group(1)
  additions=[]
  if '$NEW_LANG' and '$NEW_LANG' not in existing: additions.append(f"- $NEW_LANG + $NEW_FRAMEWORK ($CURRENT_BRANCH)")
  if '$NEW_DB' and '$NEW_DB' not in existing and '$NEW_DB'!='N/A': additions.append(f"- $NEW_DB ($CURRENT_BRANCH)")
  if additions:
    new_block=existing+"\n"+"\n".join(additions)
    content=content.replace(m.group(0),f"## Active Technologies\n{new_block}\n\n")
# Recent changes
m2=re.search(r'## Recent Changes\n(.*?)(\n\n|$)',content, re.DOTALL)
if m2:
  lines=[l for l in m2.group(1).strip().split('\n') if l]
  lines.insert(0,f"- $CURRENT_BRANCH: Added $NEW_LANG + $NEW_FRAMEWORK")
  lines=lines[:3]
  content=re.sub(r'## Recent Changes\n.*?(\n\n|$)', '## Recent Changes\n'+"\n".join(lines)+'\n\n', content, flags=re.DOTALL)
content=re.sub(r'Last updated: \d{4}-\d{2}-\d{2}', 'Last updated: '+datetime.datetime.now().strftime('%Y-%m-%d'), content)
open(target+'.tmp','w').write(content)
EOF
  mv "$target_file.tmp" "$target_file"; if [ -f /tmp/manual_additions.txt ]; then sed -i.bak '/<!-- MANUAL ADDITIONS START -->/,/<!-- MANUAL ADDITIONS END -->/d' "$target_file"; cat /tmp/manual_additions.txt >> "$target_file"; rm /tmp/manual_additions.txt "$target_file.bak"; fi;
fi; mv "$temp_file" "$target_file" 2>/dev/null || true; echo "✅ $agent_name context file updated successfully"; }
case "$AGENT_TYPE" in
  claude) update_agent_file "$CLAUDE_FILE" "Claude Code" ;;
  gemini) update_agent_file "$GEMINI_FILE" "Gemini CLI" ;;
  copilot) update_agent_file "$COPILOT_FILE" "GitHub Copilot" ;;
  cursor) update_agent_file "$CURSOR_FILE" "Cursor IDE" ;;
  qwen) update_agent_file "$QWEN_FILE" "Qwen Code" ;;
  opencode) update_agent_file "$AGENTS_FILE" "opencode" ;;
  "") [ -f "$CLAUDE_FILE" ] && update_agent_file "$CLAUDE_FILE" "Claude Code"; \
       [ -f "$GEMINI_FILE" ] && update_agent_file "$GEMINI_FILE" "Gemini CLI"; \
       [ -f "$COPILOT_FILE" ] && update_agent_file "$COPILOT_FILE" "GitHub Copilot"; \
       [ -f "$CURSOR_FILE" ] && update_agent_file "$CURSOR_FILE" "Cursor IDE"; \
       [ -f "$QWEN_FILE" ] && update_agent_file "$QWEN_FILE" "Qwen Code"; \
       [ -f "$AGENTS_FILE" ] && update_agent_file "$AGENTS_FILE" "opencode"; \
       if [ ! -f "$CLAUDE_FILE" ] && [ ! -f "$GEMINI_FILE" ] && [ ! -f "$COPILOT_FILE" ] && [ ! -f "$CURSOR_FILE" ] && [ ! -f "$QWEN_FILE" ] && [ ! -f "$AGENTS_FILE" ]; then update_agent_file "$CLAUDE_FILE" "Claude Code"; fi ;;
  *) echo "ERROR: Unknown agent type '$AGENT_TYPE' (expected claude|gemini|copilot|cursor|qwen|opencode)"; exit 1 ;;
esac
echo; echo "Summary of changes:"; [ -n "$NEW_LANG" ] && echo "- Added language: $NEW_LANG"; [ -n "$NEW_FRAMEWORK" ] && echo "- Added framework: $NEW_FRAMEWORK"; [ -n "$NEW_DB" ] && [ "$NEW_DB" != "N/A" ] && echo "- Added database: $NEW_DB"; echo; echo "Usage: $0 [claude|gemini|copilot|cursor|qwen|opencode]"

```

--------------------------------------------------------------------------------
/tests/test_tools.py:
--------------------------------------------------------------------------------

```python
import pytest
import sys
import os
from unittest.mock import Mock, MagicMock
from fastmcp import Context

# Add project root to Python path so relative imports work
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))

from src.tools.search import search_in_subreddit
from src.tools.posts import fetch_subreddit_posts
from src.tools.comments import fetch_submission_with_comments


def create_mock_submission(
    id="test123",
    title="Test Post",
    author="testuser",
    score=100,
    num_comments=50
):
    """Helper to create a mock Reddit submission."""
    submission = Mock()
    submission.id = id
    submission.title = title
    submission.author = Mock()
    submission.author.__str__ = Mock(return_value=author)
    submission.score = score
    submission.num_comments = num_comments
    submission.created_utc = 1234567890.0
    submission.url = f"https://reddit.com/r/test/{id}"
    submission.selftext = "Test content"
    submission.upvote_ratio = 0.95
    submission.permalink = f"/r/test/comments/{id}/test_post/"
    submission.subreddit = Mock()
    submission.subreddit.display_name = "test"
    return submission


def create_mock_comment(
    id="comment123",
    body="Test comment",
    author="commentuser",
    score=10
):
    """Helper to create a mock Reddit comment."""
    comment = Mock()
    comment.id = id
    comment.body = body
    comment.author = Mock()
    comment.author.__str__ = Mock(return_value=author)
    comment.score = score
    comment.created_utc = 1234567890.0
    comment.replies = []
    return comment


@pytest.fixture
def mock_context():
    """Create a mock Context object for testing."""
    return Mock(spec=Context)


class TestSearchReddit:
    def test_search_reddit_success(self, mock_context):
        """Test successful Reddit search."""
        mock_reddit = Mock()
        mock_subreddit = Mock()
        mock_subreddit.display_name = "test"
        mock_submissions = [
            create_mock_submission(id="1", title="First Post"),
            create_mock_submission(id="2", title="Second Post")
        ]

        mock_subreddit.search.return_value = mock_submissions
        mock_reddit.subreddit.return_value = mock_subreddit

        result = search_in_subreddit(
            subreddit_name="test",
            query="test query",
            reddit=mock_reddit,
            limit=10,
            ctx=mock_context
        )

        assert "results" in result
        assert result["count"] == 2
        assert result["results"][0]["title"] == "First Post"
        assert result["results"][1]["title"] == "Second Post"

    def test_search_reddit_subreddit_not_found(self, mock_context):
        """Test search with failed request."""
        from prawcore import NotFound
        mock_reddit = Mock()
        mock_reddit.subreddit.side_effect = NotFound(Mock())

        result = search_in_subreddit(
            subreddit_name="test",
            query="test",
            reddit=mock_reddit,
            ctx=mock_context
        )

        assert "error" in result
        assert "not found" in result["error"].lower()


class TestFetchSubredditPosts:
    def test_fetch_posts_success(self, mock_context):
        """Test successful fetching of subreddit posts."""
        mock_reddit = Mock()
        mock_subreddit = Mock()
        mock_subreddit.display_name = "test"
        mock_subreddit.subscribers = 1000000
        mock_subreddit.public_description = "Test subreddit"

        mock_posts = [
            create_mock_submission(id="1", title="Hot Post 1"),
            create_mock_submission(id="2", title="Hot Post 2")
        ]

        mock_subreddit.hot.return_value = mock_posts
        mock_reddit.subreddit.return_value = mock_subreddit

        result = fetch_subreddit_posts(
            subreddit_name="test",
            reddit=mock_reddit,
            listing_type="hot",
            limit=10,
            ctx=mock_context
        )

        assert "posts" in result
        assert "subreddit" in result
        assert result["count"] == 2
        assert result["subreddit"]["name"] == "test"
        assert result["posts"][0]["title"] == "Hot Post 1"

    def test_fetch_posts_invalid_subreddit(self, mock_context):
        """Test fetching from non-existent subreddit."""
        from prawcore import NotFound
        mock_reddit = Mock()
        mock_reddit.subreddit.side_effect = NotFound(Mock())

        result = fetch_subreddit_posts(
            subreddit_name="nonexistent",
            reddit=mock_reddit,
            ctx=mock_context
        )

        assert "error" in result
        assert "not found" in result["error"].lower()


class TestFetchSubmissionWithComments:
    async def test_fetch_submission_success(self, mock_context):
        """Test successful fetching of submission with comments."""
        mock_reddit = Mock()
        mock_submission = create_mock_submission()

        # Create mock comments
        mock_comment1 = create_mock_comment(id="c1", body="First comment")
        mock_comment2 = create_mock_comment(id="c2", body="Second comment")

        # Create a mock comments object that behaves like a list but has replace_more
        mock_comments = Mock()
        mock_comments.__iter__ = Mock(return_value=iter([mock_comment1, mock_comment2]))
        mock_comments.replace_more = Mock()

        mock_submission.comments = mock_comments
        mock_submission.comment_sort = "best"

        mock_reddit.submission.return_value = mock_submission

        result = await fetch_submission_with_comments(
            reddit=mock_reddit,
            submission_id="test123",
            comment_limit=10,
            ctx=mock_context
        )

        assert "submission" in result
        assert "comments" in result
        assert result["submission"]["id"] == "test123"
        assert len(result["comments"]) == 2
        assert result["comments"][0]["body"] == "First comment"

    async def test_fetch_submission_not_found(self, mock_context):
        """Test fetching non-existent submission."""
        from prawcore import NotFound
        mock_reddit = Mock()
        mock_reddit.submission.side_effect = NotFound(Mock())

        result = await fetch_submission_with_comments(
            reddit=mock_reddit,
            submission_id="nonexistent",
            ctx=mock_context
        )

        assert "error" in result
        assert "not found" in result["error"].lower()

    async def test_fetch_submission_no_id_or_url(self, mock_context):
        """Test error when neither submission_id nor url is provided."""
        mock_reddit = Mock()

        result = await fetch_submission_with_comments(
            reddit=mock_reddit,
            ctx=mock_context
        )

        assert "error" in result
        assert "submission_id or url must be provided" in result["error"]
```

--------------------------------------------------------------------------------
/reports/top-50-active-AI-subreddits.md:
--------------------------------------------------------------------------------

```markdown
'''
Prompt: Can you build me a list of the top 50 most active subreddits related to AI, LLMS, ChatGPT, Claude, Claude Code, Codex, Vibe Coding
'''

# Top 50 Most Active AI, LLM, and AI Coding Subreddits

## Top Tier Communities (11M+ subscribers)
1. **r/ChatGPT** - 11,114,896 subscribers - [https://reddit.com/r/ChatGPT](https://reddit.com/r/ChatGPT)

## Major Communities (1M+ subscribers)
2. **r/MachineLearning** - 2,988,159 subscribers - [https://reddit.com/r/MachineLearning](https://reddit.com/r/MachineLearning)
3. **r/OpenAI** - 2,446,435 subscribers - [https://reddit.com/r/OpenAI](https://reddit.com/r/OpenAI)
4. **r/ArtificialInteligence** - 1,551,586 subscribers - [https://reddit.com/r/ArtificialInteligence](https://reddit.com/r/ArtificialInteligence)
5. **r/artificial** - 1,135,505 subscribers - [https://reddit.com/r/artificial](https://reddit.com/r/artificial)

## Large Communities (500K+ subscribers)
6. **r/learnmachinelearning** - 547,704 subscribers - [https://reddit.com/r/learnmachinelearning](https://reddit.com/r/learnmachinelearning)
7. **r/LocalLLaMA** - 522,475 subscribers - [https://reddit.com/r/LocalLLaMA](https://reddit.com/r/LocalLLaMA)

## Established Communities (100K-500K subscribers)
8. **r/ChatGPTPro** - 486,147 subscribers - [https://reddit.com/r/ChatGPTPro](https://reddit.com/r/ChatGPTPro)
9. **r/ClaudeAI** - 311,208 subscribers - [https://reddit.com/r/ClaudeAI](https://reddit.com/r/ClaudeAI)
10. **r/ChatGPTCoding** - 309,810 subscribers - [https://reddit.com/r/ChatGPTCoding](https://reddit.com/r/ChatGPTCoding)
11. **r/aivideo** - 267,399 subscribers - [https://reddit.com/r/aivideo](https://reddit.com/r/aivideo)
12. **r/dalle2** - 206,091 subscribers - [https://reddit.com/r/dalle2](https://reddit.com/r/dalle2)
13. **r/AI_Agents** - 191,203 subscribers - [https://reddit.com/r/AI_Agents](https://reddit.com/r/AI_Agents)
14. **r/comfyui** - 117,893 subscribers - [https://reddit.com/r/comfyui](https://reddit.com/r/comfyui)
15. **r/machinelearningnews** - 107,720 subscribers - [https://reddit.com/r/machinelearningnews](https://reddit.com/r/machinelearningnews)
16. **r/aipromptprogramming** - 107,001 subscribers - [https://reddit.com/r/aipromptprogramming](https://reddit.com/r/aipromptprogramming)
17. **r/GeminiAI** - 104,691 subscribers - [https://reddit.com/r/GeminiAI](https://reddit.com/r/GeminiAI)
18. **r/LLMDevs** - 103,689 subscribers - [https://reddit.com/r/LLMDevs](https://reddit.com/r/LLMDevs)
19. **r/perplexity_ai** - 101,608 subscribers - [https://reddit.com/r/perplexity_ai](https://reddit.com/r/perplexity_ai)

## Active Communities (50K-100K subscribers)
20. **r/cursor** - 91,743 subscribers - [https://reddit.com/r/cursor](https://reddit.com/r/cursor)
21. **r/MLQuestions** - 83,423 subscribers - [https://reddit.com/r/MLQuestions](https://reddit.com/r/MLQuestions)
22. **r/AIArtwork** - 83,065 subscribers - [https://reddit.com/r/AIArtwork](https://reddit.com/r/AIArtwork)
23. **r/nocode** - 82,361 subscribers - [https://reddit.com/r/nocode](https://reddit.com/r/nocode)
24. **r/LocalLLM** - 81,986 subscribers - [https://reddit.com/r/LocalLLM](https://reddit.com/r/LocalLLM)
25. **r/ChatGPT_FR** - 81,642 subscribers - [https://reddit.com/r/ChatGPT_FR](https://reddit.com/r/ChatGPT_FR)
26. **r/GoogleGeminiAI** - 77,148 subscribers - [https://reddit.com/r/GoogleGeminiAI](https://reddit.com/r/GoogleGeminiAI)
27. **r/AIAssisted** - 71,088 subscribers - [https://reddit.com/r/AIAssisted](https://reddit.com/r/AIAssisted)
28. **r/reinforcementlearning** - 65,979 subscribers - [https://reddit.com/r/reinforcementlearning](https://reddit.com/r/reinforcementlearning)
29. **r/WritingWithAI** - 54,806 subscribers - [https://reddit.com/r/WritingWithAI](https://reddit.com/r/WritingWithAI)
30. **r/outlier_ai** - 52,105 subscribers - [https://reddit.com/r/outlier_ai](https://reddit.com/r/outlier_ai)
31. **r/SillyTavernAI** - 51,310 subscribers - [https://reddit.com/r/SillyTavernAI](https://reddit.com/r/SillyTavernAI)

## Growing Communities (20K-50K subscribers)
32. **r/PygmalionAI** - 47,809 subscribers - [https://reddit.com/r/PygmalionAI](https://reddit.com/r/PygmalionAI)
33. **r/AgentsOfAI** - 46,494 subscribers - [https://reddit.com/r/AgentsOfAI](https://reddit.com/r/AgentsOfAI)
34. **r/bigsleep** - 41,078 subscribers - [https://reddit.com/r/bigsleep](https://reddit.com/r/bigsleep)
35. **r/antiai** - 37,034 subscribers - [https://reddit.com/r/antiai](https://reddit.com/r/antiai)
36. **r/MachineLearningJobs** - 34,514 subscribers - [https://reddit.com/r/MachineLearningJobs](https://reddit.com/r/MachineLearningJobs)
37. **r/chatgpt_promptDesign** - 32,368 subscribers - [https://reddit.com/r/chatgpt_promptDesign](https://reddit.com/r/chatgpt_promptDesign)
38. **r/tensorflow** - 31,369 subscribers - [https://reddit.com/r/tensorflow](https://reddit.com/r/tensorflow)
39. **r/AiChatGPT** - 31,346 subscribers - [https://reddit.com/r/AiChatGPT](https://reddit.com/r/AiChatGPT)
40. **r/neuralnetworks** - 29,721 subscribers - [https://reddit.com/r/neuralnetworks](https://reddit.com/r/neuralnetworks)
41. **r/civitai** - 28,446 subscribers - [https://reddit.com/r/civitai](https://reddit.com/r/civitai)
42. **r/MistralAI** - 24,897 subscribers - [https://reddit.com/r/MistralAI](https://reddit.com/r/MistralAI)
43. **r/pytorch** - 22,695 subscribers - [https://reddit.com/r/pytorch](https://reddit.com/r/pytorch)
44. **r/PromptDesign** - 21,679 subscribers - [https://reddit.com/r/PromptDesign](https://reddit.com/r/PromptDesign)
45. **r/FetchAI_Community** - 21,415 subscribers - [https://reddit.com/r/FetchAI_Community](https://reddit.com/r/FetchAI_Community)
46. **r/Chub_AI** - 21,163 subscribers - [https://reddit.com/r/Chub_AI](https://reddit.com/r/Chub_AI)
47. **r/generativeAI** - 21,036 subscribers - [https://reddit.com/r/generativeAI](https://reddit.com/r/generativeAI)
48. **r/CodingJobs** - 20,607 subscribers - [https://reddit.com/r/CodingJobs](https://reddit.com/r/CodingJobs)
49. **r/aifails** - 20,511 subscribers - [https://reddit.com/r/aifails](https://reddit.com/r/aifails)
50. **r/ClaudeCode** - 20,480 subscribers - [https://reddit.com/r/ClaudeCode](https://reddit.com/r/ClaudeCode)

## Special Mentions (AI Coding Tools)
- **r/CursorAI** - 8,392 subscribers - [https://reddit.com/r/CursorAI](https://reddit.com/r/CursorAI)
- **r/RooCode** - 15,316 subscribers - [https://reddit.com/r/RooCode](https://reddit.com/r/RooCode)
- **r/BlackboxAI_** - 8,357 subscribers - [https://reddit.com/r/BlackboxAI_](https://reddit.com/r/BlackboxAI_)

## Summary Statistics
- **Total Combined Subscribers**: ~23.5 million (accounting for overlaps)
- **Largest Community**: r/ChatGPT with over 11 million subscribers
- **Categories Covered**:
  - General AI/ML discussions
  - LLM-specific communities (ChatGPT, Claude, LLaMA, etc.)
  - AI coding and development tools
  - Job boards and professional development
  - AI art and creative applications
  - Research and academic discussions
```

--------------------------------------------------------------------------------
/specs/agent-reasoning-visibility.md:
--------------------------------------------------------------------------------

```markdown
# Deep Agent Reasoning Visibility with Streaming

## Understanding the Goal
You want to see the actual LLM reasoning process (thinking tokens) for each agent, streamed in real-time to debug logs, similar to how you see UV's debug output.

## Proposed Implementation

### 1. Enable OpenAI Agents SDK Streaming & Tracing
```python
from agents import Runner, RunConfig
from agents.streaming import StreamingRunResult
import logging

# Configure logging for agent traces
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger('reddit_research_agent')

async def execute_reddit_research(query: str):
    # Enable tracing and streaming
    run_config = RunConfig(
        model="gpt-4",
        trace_metadata={"query": query},
        workflow_name="reddit_research",
        # Enable detailed tracing
        trace_include_sensitive_data=True,
    )
    
    # Use streaming runner for real-time output
    logger.debug(f"🎯 ORCHESTRATOR starting for query: {query}")
    
    # Stream orchestrator reasoning
    orchestrator_stream = await Runner.run_streamed(
        orchestrator, 
        query,
        run_config=run_config
    )
    
    # Process streaming events
    async for event in orchestrator_stream.stream_events():
        if event.type == "reasoning":
            logger.debug(f"[ORCHESTRATOR THINKING] {event.content}")
        elif event.type == "tool_call":
            logger.debug(f"[ORCHESTRATOR ACTION] Calling: {event.tool_name}")
    
    search_plan = orchestrator_stream.final_output_as(SearchTaskPlan)
```

### 2. Add Custom Context Wrapper for Reasoning Capture
```python
class ReasoningCapture:
    """Capture and log agent reasoning in real-time"""
    
    def __init__(self, agent_name: str):
        self.agent_name = agent_name
        self.logger = logging.getLogger(f'agent.{agent_name}')
    
    async def wrap_agent_run(self, agent, input_data, context=None):
        self.logger.debug(f"[{self.agent_name}] Starting with input: {input_data[:100]}...")
        
        # Run with streaming to capture reasoning
        result = await Runner.run_streamed(agent, input_data, context=context)
        
        reasoning_tokens = []
        async for event in result.stream_events():
            if event.type in ["reasoning", "thinking"]:
                reasoning_tokens.append(event.content)
                self.logger.debug(f"[{self.agent_name} REASONING] {event.content}")
            elif event.type == "output":
                self.logger.debug(f"[{self.agent_name} OUTPUT] {event.content[:200]}...")
        
        # Log summary
        self.logger.info(f"[{self.agent_name}] Complete. Reasoning tokens: {len(reasoning_tokens)}")
        
        return result
```

### 3. Environment Variable for Debug Mode
```python
import os

# Add debug mode toggle
DEBUG_AGENTS = os.getenv("DEBUG_AGENTS", "false").lower() == "true"
STREAM_REASONING = os.getenv("STREAM_REASONING", "false").lower() == "true"

async def get_reddit_instance(debug=False):
    # Only show auth debug if DEBUG_AGENTS is enabled
    if debug and DEBUG_AGENTS:
        print(f"🔐 Reddit Auth Debug:...")
```

### 4. Run Script with Debug Flags
```bash
# In the script header, add environment variable support
#!/usr/bin/env -S DEBUG_AGENTS=true STREAM_REASONING=true uv run --verbose --script

# Or run with:
DEBUG_AGENTS=true STREAM_REASONING=true uv run --verbose reddit_research_agent.py
```

### 5. Structured Logging Output
```python
# Configure different log levels for different components
logging.getLogger('agent.orchestrator').setLevel(logging.DEBUG)
logging.getLogger('agent.search_worker').setLevel(logging.INFO)
logging.getLogger('agent.discovery_worker').setLevel(logging.INFO)
logging.getLogger('agent.validation_worker').setLevel(logging.INFO)
logging.getLogger('agent.synthesizer').setLevel(logging.DEBUG)
logging.getLogger('asyncpraw').setLevel(logging.WARNING)  # Reduce Reddit noise
```

### 6. Custom Debug Output Format
```python
class AgentDebugFormatter(logging.Formatter):
    """Custom formatter for agent debug output"""
    
    COLORS = {
        'DEBUG': '\033[36m',    # Cyan
        'INFO': '\033[32m',     # Green
        'WARNING': '\033[33m',  # Yellow
        'ERROR': '\033[31m',    # Red
        'REASONING': '\033[35m', # Magenta
    }
    RESET = '\033[0m'
    
    def format(self, record):
        # Add colors for terminal output
        if hasattr(record, 'reasoning'):
            color = self.COLORS.get('REASONING', '')
            record.msg = f"{color}[THINKING] {record.msg}{self.RESET}"
        
        return super().format(record)

# Apply formatter
handler = logging.StreamHandler()
handler.setFormatter(AgentDebugFormatter(
    '%(asctime)s | %(name)s | %(message)s'
))
logging.root.addHandler(handler)
```

## Expected Output with Deep Visibility:
```
$ DEBUG_AGENTS=true STREAM_REASONING=true uv run --verbose reddit_research_agent.py

2024-01-15 10:23:45 | agent.orchestrator | [ORCHESTRATOR THINKING] The user is asking about Trump and Putin in Alaska. I need to identify:
2024-01-15 10:23:45 | agent.orchestrator | [ORCHESTRATOR THINKING] 1. Core entities: Trump (person), Putin (person), Alaska (location)
2024-01-15 10:23:46 | agent.orchestrator | [ORCHESTRATOR THINKING] 2. These are political figures, so political subreddits would be relevant
2024-01-15 10:23:46 | agent.orchestrator | [ORCHESTRATOR THINKING] 3. For direct searches, I'll use single terms like "trump", "putin", "alaska"
2024-01-15 10:23:47 | agent.orchestrator | [ORCHESTRATOR OUTPUT] SearchTaskPlan(direct_searches=['trump', 'putin', 'alaska'], ...)

2024-01-15 10:23:48 | agent.search_worker | [SEARCH_WORKER THINKING] I received terms: trump, putin, alaska
2024-01-15 10:23:48 | agent.search_worker | [SEARCH_WORKER THINKING] These are potential subreddit names. I'll search each one.
2024-01-15 10:23:49 | agent.search_worker | [SEARCH_WORKER ACTION] Calling search_subreddits_tool(query='trump')
2024-01-15 10:23:50 | reddit.api | Searching for communities matching: 'trump'
2024-01-15 10:23:51 | reddit.api | Found 24 communities
```

## Benefits:
1. **Real thinking tokens**: See actual LLM reasoning, not just formatted output
2. **Streaming visibility**: Watch agents think in real-time
3. **Debug control**: Toggle verbosity with environment variables
4. **Performance metrics**: Track reasoning token usage per agent
5. **Structured logs**: Filter by agent or log level
6. **UV integration**: Works alongside UV's --verbose flag

## Alternative: OpenAI Tracing Dashboard
The OpenAI Agents SDK also supports sending traces to their dashboard:
```python
# Traces will appear at https://platform.openai.com/traces
run_config = RunConfig(
    workflow_name="reddit_research",
    trace_id=f"reddit_{timestamp}",
    trace_metadata={"query": query, "version": "1.0"}
)
```

This gives you a web UI to explore agent reasoning after execution.

## Implementation Priority
1. Start with environment variable debug flags (easiest)
2. Add structured logging with custom formatter
3. Implement streaming for orchestrator and synthesizer (most valuable)
4. Add streaming for worker agents if needed
5. Consider OpenAI dashboard for production monitoring
```

--------------------------------------------------------------------------------
/.specify/templates/plan-template.md:
--------------------------------------------------------------------------------

```markdown

# Implementation Plan: [FEATURE]

**Branch**: `[###-feature-name]` | **Date**: [DATE] | **Spec**: [link]
**Input**: Feature specification from `/specs/[###-feature-name]/spec.md`

## Execution Flow (/plan command scope)
```
1. Load feature spec from Input path
   → If not found: ERROR "No feature spec at {path}"
2. Fill Technical Context (scan for NEEDS CLARIFICATION)
   → Detect Project Type from context (web=frontend+backend, mobile=app+api)
   → Set Structure Decision based on project type
3. Fill the Constitution Check section based on the content of the constitution document.
4. Evaluate Constitution Check section below
   → If violations exist: Document in Complexity Tracking
   → If no justification possible: ERROR "Simplify approach first"
   → Update Progress Tracking: Initial Constitution Check
5. Execute Phase 0 → research.md
   → If NEEDS CLARIFICATION remain: ERROR "Resolve unknowns"
6. Execute Phase 1 → contracts, data-model.md, quickstart.md, agent-specific template file (e.g., `CLAUDE.md` for Claude Code, `.github/copilot-instructions.md` for GitHub Copilot, `GEMINI.md` for Gemini CLI, `QWEN.md` for Qwen Code or `AGENTS.md` for opencode).
7. Re-evaluate Constitution Check section
   → If new violations: Refactor design, return to Phase 1
   → Update Progress Tracking: Post-Design Constitution Check
8. Plan Phase 2 → Describe task generation approach (DO NOT create tasks.md)
9. STOP - Ready for /tasks command
```

**IMPORTANT**: The /plan command STOPS at step 9. Phases 2-4 are executed by other commands:
- Phase 2: /tasks command creates tasks.md
- Phase 3-4: Implementation execution (manual or via tools)

## Summary
[Extract from feature spec: primary requirement + technical approach from research]

## Technical Context
**Language/Version**: [e.g., Python 3.11, Swift 5.9, Rust 1.75 or NEEDS CLARIFICATION]  
**Primary Dependencies**: [e.g., FastAPI, UIKit, LLVM or NEEDS CLARIFICATION]  
**Storage**: [if applicable, e.g., PostgreSQL, CoreData, files or N/A]  
**Testing**: [e.g., pytest, XCTest, cargo test or NEEDS CLARIFICATION]  
**Target Platform**: [e.g., Linux server, iOS 15+, WASM or NEEDS CLARIFICATION]
**Project Type**: [single/web/mobile - determines source structure]  
**Performance Goals**: [domain-specific, e.g., 1000 req/s, 10k lines/sec, 60 fps or NEEDS CLARIFICATION]  
**Constraints**: [domain-specific, e.g., <200ms p95, <100MB memory, offline-capable or NEEDS CLARIFICATION]  
**Scale/Scope**: [domain-specific, e.g., 10k users, 1M LOC, 50 screens or NEEDS CLARIFICATION]

## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*

[Gates determined based on constitution file]

## Project Structure

### Documentation (this feature)
```
specs/[###-feature]/
├── plan.md              # This file (/plan command output)
├── research.md          # Phase 0 output (/plan command)
├── data-model.md        # Phase 1 output (/plan command)
├── quickstart.md        # Phase 1 output (/plan command)
├── contracts/           # Phase 1 output (/plan command)
└── tasks.md             # Phase 2 output (/tasks command - NOT created by /plan)
```

### Source Code (repository root)
```
# Option 1: Single project (DEFAULT)
src/
├── models/
├── services/
├── cli/
└── lib/

tests/
├── contract/
├── integration/
└── unit/

# Option 2: Web application (when "frontend" + "backend" detected)
backend/
├── src/
│   ├── models/
│   ├── services/
│   └── api/
└── tests/

frontend/
├── src/
│   ├── components/
│   ├── pages/
│   └── services/
└── tests/

# Option 3: Mobile + API (when "iOS/Android" detected)
api/
└── [same as backend above]

ios/ or android/
└── [platform-specific structure]
```

**Structure Decision**: [DEFAULT to Option 1 unless Technical Context indicates web/mobile app]

## Phase 0: Outline & Research
1. **Extract unknowns from Technical Context** above:
   - For each NEEDS CLARIFICATION → research task
   - For each dependency → best practices task
   - For each integration → patterns task

2. **Generate and dispatch research agents**:
   ```
   For each unknown in Technical Context:
     Task: "Research {unknown} for {feature context}"
   For each technology choice:
     Task: "Find best practices for {tech} in {domain}"
   ```

3. **Consolidate findings** in `research.md` using format:
   - Decision: [what was chosen]
   - Rationale: [why chosen]
   - Alternatives considered: [what else evaluated]

**Output**: research.md with all NEEDS CLARIFICATION resolved

## Phase 1: Design & Contracts
*Prerequisites: research.md complete*

1. **Extract entities from feature spec** → `data-model.md`:
   - Entity name, fields, relationships
   - Validation rules from requirements
   - State transitions if applicable

2. **Generate API contracts** from functional requirements:
   - For each user action → endpoint
   - Use standard REST/GraphQL patterns
   - Output OpenAPI/GraphQL schema to `/contracts/`

3. **Generate contract tests** from contracts:
   - One test file per endpoint
   - Assert request/response schemas
   - Tests must fail (no implementation yet)

4. **Extract test scenarios** from user stories:
   - Each story → integration test scenario
   - Quickstart test = story validation steps

5. **Update agent file incrementally** (O(1) operation):
   - Run `.specify/scripts/bash/update-agent-context.sh claude` for your AI assistant
   - If exists: Add only NEW tech from current plan
   - Preserve manual additions between markers
   - Update recent changes (keep last 3)
   - Keep under 150 lines for token efficiency
   - Output to repository root

**Output**: data-model.md, /contracts/*, failing tests, quickstart.md, agent-specific file

## Phase 2: Task Planning Approach
*This section describes what the /tasks command will do - DO NOT execute during /plan*

**Task Generation Strategy**:
- Load `.specify/templates/tasks-template.md` as base
- Generate tasks from Phase 1 design docs (contracts, data model, quickstart)
- Each contract → contract test task [P]
- Each entity → model creation task [P] 
- Each user story → integration test task
- Implementation tasks to make tests pass

**Ordering Strategy**:
- TDD order: Tests before implementation 
- Dependency order: Models before services before UI
- Mark [P] for parallel execution (independent files)

**Estimated Output**: 25-30 numbered, ordered tasks in tasks.md

**IMPORTANT**: This phase is executed by the /tasks command, NOT by /plan

## Phase 3+: Future Implementation
*These phases are beyond the scope of the /plan command*

**Phase 3**: Task execution (/tasks command creates tasks.md)  
**Phase 4**: Implementation (execute tasks.md following constitutional principles)  
**Phase 5**: Validation (run tests, execute quickstart.md, performance validation)

## Complexity Tracking
*Fill ONLY if Constitution Check has violations that must be justified*

| Violation | Why Needed | Simpler Alternative Rejected Because |
|-----------|------------|-------------------------------------|
| [e.g., 4th project] | [current need] | [why 3 projects insufficient] |
| [e.g., Repository pattern] | [specific problem] | [why direct DB access insufficient] |


## Progress Tracking
*This checklist is updated during execution flow*

**Phase Status**:
- [ ] Phase 0: Research complete (/plan command)
- [ ] Phase 1: Design complete (/plan command)
- [ ] Phase 2: Task planning complete (/plan command - describe approach only)
- [ ] Phase 3: Tasks generated (/tasks command)
- [ ] Phase 4: Implementation complete
- [ ] Phase 5: Validation passed

**Gate Status**:
- [ ] Initial Constitution Check: PASS
- [ ] Post-Design Constitution Check: PASS
- [ ] All NEEDS CLARIFICATION resolved
- [ ] Complexity deviations documented

---
*Based on Constitution v2.1.1 - See `/memory/constitution.md`*

```

--------------------------------------------------------------------------------
/src/tools/posts.py:
--------------------------------------------------------------------------------

```python
from typing import Optional, Dict, Any, Literal, List
import praw
from prawcore import NotFound, Forbidden
from fastmcp import Context
from ..models import SubredditPostsResult, RedditPost, SubredditInfo


def fetch_subreddit_posts(
    subreddit_name: str,
    reddit: praw.Reddit,
    listing_type: Literal["hot", "new", "top", "rising"] = "hot",
    time_filter: Optional[Literal["all", "year", "month", "week", "day"]] = None,
    limit: int = 25,
    ctx: Context = None
) -> Dict[str, Any]:
    """
    Fetch posts from a specific subreddit.

    Args:
        subreddit_name: Name of the subreddit (without r/ prefix)
        reddit: Configured Reddit client
        listing_type: Type of listing to fetch
        time_filter: Time filter for top posts
        limit: Maximum number of posts (max 100)
        ctx: FastMCP context (auto-injected by decorator)

    Returns:
        Dictionary containing posts and subreddit info
    """
    # Phase 1: Accept context but don't use it yet

    try:
        # Validate limit
        limit = min(max(1, limit), 100)
        
        # Clean subreddit name (remove r/ prefix if present)
        clean_name = subreddit_name.replace("r/", "").replace("/r/", "").strip()
        
        # Get subreddit
        try:
            subreddit = reddit.subreddit(clean_name)
            # Force fetch to check if subreddit exists
            _ = subreddit.display_name
        except NotFound:
            return {
                "error": f"Subreddit r/{clean_name} not found",
                "suggestion": "discover_subreddits({'query': 'topic'})"
            }
        except Forbidden:
            return {"error": f"Access to r/{clean_name} forbidden (may be private)"}
        
        # Get posts based on listing type
        if listing_type == "hot":
            submissions = subreddit.hot(limit=limit)
        elif listing_type == "new":
            submissions = subreddit.new(limit=limit)
        elif listing_type == "rising":
            submissions = subreddit.rising(limit=limit)
        elif listing_type == "top":
            # Use time_filter for top posts
            time_filter = time_filter or "all"
            submissions = subreddit.top(time_filter=time_filter, limit=limit)
        else:
            return {"error": f"Invalid listing_type: {listing_type}"}
        
        # Parse posts
        posts = []
        for submission in submissions:
            posts.append(RedditPost(
                id=submission.id,
                title=submission.title,
                selftext=submission.selftext if submission.selftext else None,
                author=str(submission.author) if submission.author else "[deleted]",
                subreddit=submission.subreddit.display_name,
                score=submission.score,
                upvote_ratio=submission.upvote_ratio,
                num_comments=submission.num_comments,
                created_utc=submission.created_utc,
                url=submission.url,
                permalink=f"https://reddit.com{submission.permalink}"
            ))
        
        # Get subreddit info
        subreddit_info = SubredditInfo(
            name=subreddit.display_name,
            subscribers=subreddit.subscribers,
            description=subreddit.public_description or ""
        )
        
        result = SubredditPostsResult(
            posts=posts,
            subreddit=subreddit_info,
            count=len(posts)
        )
        
        return result.model_dump()
        
    except Exception as e:
        return {"error": f"Failed to fetch posts: {str(e)}"}


async def fetch_multiple_subreddits(
    subreddit_names: List[str],
    reddit: praw.Reddit,
    listing_type: Literal["hot", "new", "top", "rising"] = "hot",
    time_filter: Optional[Literal["all", "year", "month", "week", "day"]] = None,
    limit_per_subreddit: int = 5,
    ctx: Context = None
) -> Dict[str, Any]:
    """
    Fetch posts from multiple subreddits in a single call.

    Args:
        subreddit_names: List of subreddit names to fetch from
        reddit: Configured Reddit client
        listing_type: Type of listing to fetch
        time_filter: Time filter for top posts
        limit_per_subreddit: Maximum posts per subreddit (max 25)
        ctx: FastMCP context (auto-injected by decorator)

    Returns:
        Dictionary containing posts from all requested subreddits
    """
    # Context, when provided, is used below to report per-subreddit fetch progress

    try:
        # Validate limit
        limit_per_subreddit = min(max(1, limit_per_subreddit), 25)
        
        # Clean subreddit names and join with +
        clean_names = [name.replace("r/", "").replace("/r/", "").strip() for name in subreddit_names]
        multi_subreddit_str = "+".join(clean_names)
        
        # Get combined subreddit
        try:
            multi_subreddit = reddit.subreddit(multi_subreddit_str)
            # Calculate total limit (max 100)
            total_limit = min(limit_per_subreddit * len(clean_names), 100)
            
            # Get posts based on listing type
            if listing_type == "hot":
                submissions = multi_subreddit.hot(limit=total_limit)
            elif listing_type == "new":
                submissions = multi_subreddit.new(limit=total_limit)
            elif listing_type == "rising":
                submissions = multi_subreddit.rising(limit=total_limit)
            elif listing_type == "top":
                time_filter = time_filter or "all"
                submissions = multi_subreddit.top(time_filter=time_filter, limit=total_limit)
            else:
                return {"error": f"Invalid listing_type: {listing_type}"}
            
            # Parse posts and group by subreddit
            posts_by_subreddit = {}
            processed_subreddits = set()

            for submission in submissions:
                subreddit_name = submission.subreddit.display_name

                # Report progress when encountering a new subreddit
                if subreddit_name not in processed_subreddits:
                    processed_subreddits.add(subreddit_name)
                    if ctx:
                        await ctx.report_progress(
                            progress=len(processed_subreddits),
                            total=len(clean_names),
                            message=f"Fetching r/{subreddit_name}"
                        )

                if subreddit_name not in posts_by_subreddit:
                    posts_by_subreddit[subreddit_name] = []

                # Only add up to limit_per_subreddit posts per subreddit
                if len(posts_by_subreddit[subreddit_name]) < limit_per_subreddit:
                    posts_by_subreddit[subreddit_name].append({
                        "id": submission.id,
                        "title": submission.title,
                        "author": str(submission.author) if submission.author else "[deleted]",
                        "score": submission.score,
                        "num_comments": submission.num_comments,
                        "created_utc": submission.created_utc,
                        "url": submission.url,
                        "permalink": f"https://reddit.com{submission.permalink}"
                    })
            
            return {
                "subreddits_requested": clean_names,
                "subreddits_found": list(posts_by_subreddit.keys()),
                "posts_by_subreddit": posts_by_subreddit,
                "total_posts": sum(len(posts) for posts in posts_by_subreddit.values())
            }
            
        except Exception as e:
            return {
                "error": f"Failed to fetch from multiple subreddits: {str(e)}",
                "suggestion": "discover_subreddits({'query': 'topic'}) to find valid names"
            }
        
    except Exception as e:
        return {"error": f"Failed to process request: {str(e)}"}
```

--------------------------------------------------------------------------------
/reports/saas-solopreneur-reddit-communities.md:
--------------------------------------------------------------------------------

```markdown
# Top 50 Reddit Communities for SaaS Founders & Solopreneurs
*Research Report for Reddit Social Listening Tool Audience Development*

---

## Executive Summary
This report identifies the top 50 Reddit communities where SaaS founders and solopreneurs actively engage. These communities are prime targets for promoting a Reddit social listening tool and building an audience for it. They are ranked by relevance score, subscriber count, and engagement potential.

---

## 🎯 Tier 1: Primary Target Communities (Highest Relevance)
*These communities have the highest confidence scores for your ICP*

### Core Startup & SaaS Communities

1. **r/startups** - 1,891,655 subscribers | Confidence: 0.962
   - https://reddit.com/r/startups
   - Primary hub for startup discussions, perfect for SaaS tools

2. **r/SaaS** - 374,943 subscribers | Confidence: 0.660
   - https://reddit.com/r/SaaS
   - Dedicated SaaS community, ideal for social listening tool discussions

3. **r/indiehackers** - 105,674 subscribers | Confidence: 0.821
   - https://reddit.com/r/indiehackers
   - Bootstrapped founders, perfect for solopreneur tools

4. **r/SoloFounders** - 2,113 subscribers | Confidence: 0.811
   - https://reddit.com/r/SoloFounders
   - Highly targeted community for solo entrepreneurs

### Large Entrepreneur Communities

5. **r/Entrepreneur** - 4,871,109 subscribers | Confidence: 0.704
   - https://reddit.com/r/Entrepreneur
   - Massive reach for entrepreneurial tools

6. **r/EntrepreneurRideAlong** - 604,396 subscribers | Confidence: 0.793
   - https://reddit.com/r/EntrepreneurRideAlong
   - Journey-focused community, great for tool adoption stories

7. **r/Entrepreneurs** - 77,330 subscribers | Confidence: 0.872
   - https://reddit.com/r/Entrepreneurs
   - Active discussion community for business builders

8. **r/Entrepreneurship** - 99,462 subscribers | Confidence: 0.726
   - https://reddit.com/r/Entrepreneurship
   - Academic and practical entrepreneurship discussions

---

## 📊 Tier 2: Marketing & Growth Communities
*Essential for social listening tool promotion*

9. **r/DigitalMarketingHack** - 34,155 subscribers | Confidence: 0.909
   - https://reddit.com/r/DigitalMarketingHack
   - Perfect for marketing automation tools

10. **r/SocialMediaMarketing** - 197,241 subscribers | Confidence: 0.754
    - https://reddit.com/r/SocialMediaMarketing
    - Direct audience for social listening tools

11. **r/socialmedia** - 2,061,330 subscribers | Confidence: 0.616
    - https://reddit.com/r/socialmedia
    - Broad social media community

12. **r/MarketingHelp** - 16,148 subscribers | Confidence: 0.701
    - https://reddit.com/r/MarketingHelp
    - Problem-solving community, great for tool recommendations

13. **r/SocialMediaManagers** - 20,614 subscribers | Confidence: 0.649
    - https://reddit.com/r/SocialMediaManagers
    - Professional community needing social listening tools

14. **r/ContentMarketing** - 17,436 subscribers | Confidence: 0.527
    - https://reddit.com/r/ContentMarketing
    - Content strategy discussions

---

## 💼 Tier 3: Business & Small Business Communities

15. **r/smallbusiness** - 2,211,156 subscribers | Confidence: 0.338
    - https://reddit.com/r/smallbusiness
    - Massive reach for business tools

16. **r/Business_Ideas** - 370,194 subscribers | Confidence: 0.479
    - https://reddit.com/r/Business_Ideas
    - Idea validation community

17. **r/sidehustle** - 3,124,834 subscribers | Confidence: 0.340
    - https://reddit.com/r/sidehustle
    - Side project enthusiasts

18. **r/growmybusiness** - 66,695 subscribers | Confidence: 0.327
    - https://reddit.com/r/growmybusiness
    - Growth-focused community

19. **r/sweatystartup** - 182,854 subscribers | Confidence: 0.432
    - https://reddit.com/r/sweatystartup
    - Service business focus

---

## 🚀 Tier 4: Advanced & Specialized Communities

20. **r/advancedentrepreneur** - 60,964 subscribers | Confidence: 0.682
    - https://reddit.com/r/advancedentrepreneur
    - Experienced founders who need advanced tools

21. **r/startup** - 225,696 subscribers | Confidence: 0.569
    - https://reddit.com/r/startup
    - Alternative startup community

22. **r/EntrepreneurConnect** - 5,178 subscribers | Confidence: 0.635
    - https://reddit.com/r/EntrepreneurConnect
    - Networking and connection focus

23. **r/cofounderhunt** - 16,287 subscribers | Confidence: 0.650
    - https://reddit.com/r/cofounderhunt
    - Team building community

24. **r/SaaSy** - 3,150 subscribers | Confidence: 0.653
    - https://reddit.com/r/SaaSy
    - Small but targeted SaaS community

---

## 🌍 Tier 5: Regional & International Communities

25. **r/indianstartups** - 76,422 subscribers | Confidence: 0.717
    - https://reddit.com/r/indianstartups
    - Indian startup ecosystem

26. **r/StartUpIndia** - 361,780 subscribers | Confidence: 0.487
    - https://reddit.com/r/StartUpIndia
    - Large Indian startup community

27. **r/IndianEntrepreneur** - 9,816 subscribers | Confidence: 0.593
    - https://reddit.com/r/IndianEntrepreneur
    - Indian entrepreneur focus

28. **r/PhStartups** - 20,901 subscribers | Confidence: 0.529
    - https://reddit.com/r/PhStartups
    - Philippines startup community

29. **r/Startups_EU** - 2,894 subscribers | Confidence: 0.382
    - https://reddit.com/r/Startups_EU
    - European startup community

---

## 📈 Tier 6: Marketing & Growth Specialized

30. **r/MarketingMentor** - 66,997 subscribers | Confidence: 0.593
    - https://reddit.com/r/MarketingMentor
    - Marketing education and mentorship

31. **r/Affiliatemarketing** - 239,731 subscribers | Confidence: 0.550
    - https://reddit.com/r/Affiliatemarketing
    - Performance marketing community

32. **r/musicmarketing** - 67,516 subscribers | Confidence: 0.576
    - https://reddit.com/r/musicmarketing
    - Niche marketing community

33. **r/MarketingResearch** - 22,931 subscribers | Confidence: 0.524
    - https://reddit.com/r/MarketingResearch
    - Research-focused marketing

34. **r/SaaS_Email_Marketing** - 7,434 subscribers | Confidence: 0.448
    - https://reddit.com/r/SaaS_Email_Marketing
    - Email marketing for SaaS

---

## 💡 Tier 7: Business Ideas & Validation

35. **r/small_business_ideas** - 23,034 subscribers | Confidence: 0.657
    - https://reddit.com/r/small_business_ideas
    - Idea generation and validation

36. **r/HowToEntrepreneur** - 3,618 subscribers | Confidence: 0.373
    - https://reddit.com/r/HowToEntrepreneur
    - Educational entrepreneurship

37. **r/PassionsToProfits** - 4,905 subscribers | Confidence: 0.361
    - https://reddit.com/r/PassionsToProfits
    - Monetization focus

38. **r/BusinessVault** - 2,889 subscribers | Confidence: 0.357
    - https://reddit.com/r/BusinessVault
    - Business knowledge sharing

---

## 🔧 Tier 8: Technical & Product Communities

39. **r/AppBusiness** - 17,876 subscribers | Confidence: 0.334
    - https://reddit.com/r/AppBusiness
    - App development business

40. **r/selfpublish** - 196,096 subscribers | Confidence: 0.495
    - https://reddit.com/r/selfpublish
    - Independent creators

41. **r/kickstarter** - 93,932 subscribers | Confidence: 0.554
    - https://reddit.com/r/kickstarter
    - Product launch community

42. **r/ClothingStartups** - 32,371 subscribers | Confidence: 0.462
    - https://reddit.com/r/ClothingStartups
    - E-commerce specific

---

## 🤖 Tier 9: AI & Automation Communities

43. **r/AiForSmallBusiness** - 8,963 subscribers | Confidence: 0.363
    - https://reddit.com/r/AiForSmallBusiness
    - AI tools for business

44. **r/CreatorsAI** - 4,509 subscribers | Confidence: 0.381
    - https://reddit.com/r/CreatorsAI
    - AI for content creators

---

## 📱 Tier 10: Social & Digital Communities

45. **r/SocialMediaLounge** - 17,166 subscribers | Confidence: 0.720
    - https://reddit.com/r/SocialMediaLounge
    - Casual social media discussions

46. **r/digitalproductselling** - 26,528 subscribers | Confidence: 0.541
    - https://reddit.com/r/digitalproductselling
    - Digital product creators

47. **r/Fiverr** - 64,568 subscribers | Confidence: 0.508
    - https://reddit.com/r/Fiverr
    - Freelance and service providers

48. **r/venturecapital** - 66,268 subscribers | Confidence: 0.484
    - https://reddit.com/r/venturecapital
    - Funding and investment focus

49. **r/YouTube_startups** - 127,440 subscribers | Confidence: 0.386
    - https://reddit.com/r/YouTube_startups
    - Content creator entrepreneurs

50. **r/LawFirm** - 84,044 subscribers | Confidence: 0.447
    - https://reddit.com/r/LawFirm
    - Legal business operations

---

## 📋 Engagement Strategy Recommendations

### Immediate Action Communities (Top 10 Priority)
1. r/SaaS - Direct product-market fit
2. r/startups - High engagement potential
3. r/indiehackers - Bootstrapped audience
4. r/SocialMediaMarketing - Direct need for tool
5. r/DigitalMarketingHack - Marketing automation focus
6. r/Entrepreneurs - Active, engaged community
7. r/SoloFounders - Highly targeted
8. r/EntrepreneurRideAlong - Journey documentation
9. r/advancedentrepreneur - Experienced users
10. r/SocialMediaManagers - Professional users

### Content Strategy Tips
- Share case studies of Reddit research insights
- Offer free audits using your tool
- Create educational content about Reddit listening
- Engage authentically before promoting
- Follow each subreddit's self-promotion rules

### Key Metrics to Track
- Total potential reach: 21.5M+ subscribers
- High-confidence communities (>0.7): 10 subreddits
- Medium-confidence communities (0.5-0.7): 25 subreddits
- Broad reach communities (>1M subscribers): 7 subreddits

---

## 🎯 Next Steps
1. Join top 10 priority communities
2. Study each community's rules and culture
3. Create value-first content calendar
4. Build relationships before promoting
5. Track engagement and conversion metrics

---

*Report generated for Reddit social listening tool audience development*
*Focus on authentic engagement and value creation for best results*
```

--------------------------------------------------------------------------------
/reports/top-50-subreddits-saas-ai-builders.md:
--------------------------------------------------------------------------------

```markdown
# Top 50 Subreddits for SaaS Founders, Solopreneurs, AI Developers & AI Builders

*Research Date: 2025-09-20*
*Generated using Reddit MCP Server with semantic vector search*

## Executive Summary

This report identifies the top 50 Reddit communities where your Ideal Customer Profile (ICP) of SaaS startup founders, solopreneurs, AI developers, and AI builders is most active. These communities are ranked based on:
- **Confidence scores** (semantic relevance to ICP)
- **Subscriber count** (community size and reach)
- **Topic relevance** (direct alignment with ICP interests)
- **Engagement potential** (active discussion quality)

## Top 50 Subreddits - Master List

### Tier 1: Primary Target Communities (Confidence > 0.8)
*These communities have the highest relevance to your ICP*

1. **r/aipromptprogramming** - 107,001 subscribers | Confidence: 0.911
   - AI development focus with prompt engineering
   - https://reddit.com/r/aipromptprogramming

2. **r/AI_Agents** - 191,203 subscribers | Confidence: 0.902
   - AI agent development and implementation
   - https://reddit.com/r/AI_Agents

3. **r/indiehackers** - 105,674 subscribers | Confidence: 0.867
   - Solo entrepreneurs and indie developers
   - https://reddit.com/r/indiehackers

4. **r/ArtificialInteligence** - 1,551,586 subscribers | Confidence: 0.838
   - Large AI community with diverse discussions
   - https://reddit.com/r/ArtificialInteligence

5. **r/SoloFounders** - 2,113 subscribers | Confidence: 0.832
   - Dedicated to solo entrepreneurs
   - https://reddit.com/r/SoloFounders

6. **r/AiBuilders** - 8,387 subscribers | Confidence: 0.826
   - AI builders and creators community
   - https://reddit.com/r/AiBuilders

7. **r/startups** - 1,891,655 subscribers | Confidence: 0.82
   - Large startup community
   - https://reddit.com/r/startups

8. **r/learnAIAgents** - 5,203 subscribers | Confidence: 0.814
   - Learning AI agent development
   - https://reddit.com/r/learnAIAgents

9. **r/AiAutomations** - 7,085 subscribers | Confidence: 0.811
   - AI automation tools and workflows
   - https://reddit.com/r/AiAutomations

10. **r/AI_Application** - 14,902 subscribers | Confidence: 0.801
    - Applied AI development
    - https://reddit.com/r/AI_Application

### Tier 2: High-Value Communities (Confidence 0.7 - 0.8)

11. **r/Entrepreneur** - 4,871,109 subscribers | Confidence: 0.785
    - Massive entrepreneurship community
    - https://reddit.com/r/Entrepreneur

12. **r/machinelearningnews** - 107,720 subscribers | Confidence: 0.779
    - ML news and developments
    - https://reddit.com/r/machinelearningnews

13. **r/AI_Application** - 14,902 subscribers | Confidence: 0.778
    - AI implementation focus
    - https://reddit.com/r/AI_Application

14. **r/Entrepreneurs** - 77,330 subscribers | Confidence: 0.777
    - Active entrepreneur discussions
    - https://reddit.com/r/Entrepreneurs

15. **r/EntrepreneurRideAlong** - 604,396 subscribers | Confidence: 0.775
    - Entrepreneurial journey sharing
    - https://reddit.com/r/EntrepreneurRideAlong

16. **r/EntrepreneurConnect** - 5,178 subscribers | Confidence: 0.752
    - Networking for entrepreneurs
    - https://reddit.com/r/EntrepreneurConnect

17. **r/AutoGenAI** - 7,165 subscribers | Confidence: 0.726
    - Automated AI generation tools
    - https://reddit.com/r/AutoGenAI

18. **r/Entrepreneurship** - 99,462 subscribers | Confidence: 0.720
    - Business and entrepreneurship focus
    - https://reddit.com/r/Entrepreneurship

19. **r/SoloFounders** - 2,113 subscribers | Confidence: 0.717
    - Solo founders community
    - https://reddit.com/r/SoloFounders

20. **r/HowToAIAgent** - 6,950 subscribers | Confidence: 0.711
    - AI agent tutorials and guides
    - https://reddit.com/r/HowToAIAgent

### Tier 3: Strong Secondary Communities (Confidence 0.6 - 0.7)

21. **r/artificial** - 1,135,505 subscribers | Confidence: 0.679
    - General AI discussions
    - https://reddit.com/r/artificial

22. **r/AgentsOfAI** - 46,494 subscribers | Confidence: 0.675
    - AI agents community
    - https://reddit.com/r/AgentsOfAI

23. **r/small_business_ideas** - 23,034 subscribers | Confidence: 0.671
    - Business idea validation
    - https://reddit.com/r/small_business_ideas

24. **r/BlackboxAI_** - 8,357 subscribers | Confidence: 0.659
    - AI development tools
    - https://reddit.com/r/BlackboxAI_

25. **r/FetchAI_Community** - 21,415 subscribers | Confidence: 0.654
    - AI technology community
    - https://reddit.com/r/FetchAI_Community

26. **r/cofounderhunt** - 16,287 subscribers | Confidence: 0.654
    - Finding co-founders
    - https://reddit.com/r/cofounderhunt

27. **r/AboutAI** - 10,076 subscribers | Confidence: 0.652
    - AI education and discussion
    - https://reddit.com/r/AboutAI

28. **r/PydanticAI** - 3,039 subscribers | Confidence: 0.652
    - Python AI development
    - https://reddit.com/r/PydanticAI

29. **r/AIAssisted** - 71,088 subscribers | Confidence: 0.647
    - AI-assisted work and creativity
    - https://reddit.com/r/AIAssisted

30. **r/AI_Tools_Land** - 6,608 subscribers | Confidence: 0.631
    - AI tools discovery
    - https://reddit.com/r/AI_Tools_Land

### Tier 4: Valuable Niche Communities (Confidence 0.5 - 0.6)

31. **r/Automate** - 146,410 subscribers | Confidence: 0.630
    - Automation tools and workflows
    - https://reddit.com/r/Automate

32. **r/SaaS** - 374,943 subscribers | Confidence: 0.629
    - SaaS business discussions
    - https://reddit.com/r/SaaS

33. **r/AI_India** - 13,678 subscribers | Confidence: 0.623
    - Indian AI community
    - https://reddit.com/r/AI_India

34. **r/Business_Ideas** - 370,194 subscribers | Confidence: 0.619
    - Business idea discussions
    - https://reddit.com/r/Business_Ideas

35. **r/ThinkingDeeplyAI** - 11,572 subscribers | Confidence: 0.616
    - AI philosophy and deep thinking
    - https://reddit.com/r/ThinkingDeeplyAI

36. **r/PROJECT_AI** - 2,365 subscribers | Confidence: 0.615
    - AI project collaboration
    - https://reddit.com/r/PROJECT_AI

37. **r/AI_Agents** - 191,203 subscribers | Confidence: 0.608
    - AI agent development
    - https://reddit.com/r/AI_Agents

38. **r/genspark_ai** - 2,224 subscribers | Confidence: 0.590
    - AI development platform
    - https://reddit.com/r/genspark_ai

39. **r/CreatorsAI** - 4,509 subscribers | Confidence: 0.584
    - AI for content creators
    - https://reddit.com/r/CreatorsAI

40. **r/learnAIAgents** - 5,203 subscribers | Confidence: 0.584
    - Learning AI development
    - https://reddit.com/r/learnAIAgents

### Tier 5: Supporting Communities (Confidence 0.45 - 0.5)

41. **r/neuralnetworks** - 29,721 subscribers | Confidence: 0.581
    - Neural network development
    - https://reddit.com/r/neuralnetworks

42. **r/cofounderhunt** - 16,287 subscribers | Confidence: 0.578
    - Co-founder matching
    - https://reddit.com/r/cofounderhunt

43. **r/mlops** - 24,727 subscribers | Confidence: 0.574
    - Machine learning operations
    - https://reddit.com/r/mlops

44. **r/HowToAIAgent** - 6,950 subscribers | Confidence: 0.574
    - AI agent tutorials
    - https://reddit.com/r/HowToAIAgent

45. **r/FetchAI_Community** - 21,415 subscribers | Confidence: 0.572
    - Fetch.ai community
    - https://reddit.com/r/FetchAI_Community

46. **r/aiHub** - 9,867 subscribers | Confidence: 0.571
    - AI resources hub
    - https://reddit.com/r/aiHub

47. **r/selfpublish** - 196,096 subscribers | Confidence: 0.566
    - Self-publishing entrepreneurs
    - https://reddit.com/r/selfpublish

48. **r/PydanticAI** - 3,039 subscribers | Confidence: 0.564
    - Python AI framework
    - https://reddit.com/r/PydanticAI

49. **r/aifails** - 20,511 subscribers | Confidence: 0.563
    - Learning from AI failures
    - https://reddit.com/r/aifails

50. **r/learnmachinelearning** - 547,704 subscribers | Confidence: 0.561
    - Machine learning education
    - https://reddit.com/r/learnmachinelearning

## Community Engagement Strategy

### Primary Focus (Top Priority)
Focus on communities with:
- High confidence scores (>0.7)
- Active subscriber bases (>10,000)
- Direct ICP alignment

**Recommended top 5 for immediate engagement:**
1. r/AI_Agents (191K subscribers, 0.902 confidence)
2. r/aipromptprogramming (107K subscribers, 0.911 confidence)
3. r/indiehackers (105K subscribers, 0.867 confidence)
4. r/startups (1.8M subscribers, 0.82 confidence)
5. r/Entrepreneur (4.8M subscribers, 0.785 confidence)

### Content Strategy by ICP Segment

#### For SaaS Founders:
- r/SaaS (374K subscribers)
- r/startups (1.8M subscribers)
- r/SoloFounders (2K subscribers)
- r/EntrepreneurRideAlong (604K subscribers)

#### For Solopreneurs:
- r/indiehackers (105K subscribers)
- r/SoloFounders (2K subscribers)
- r/Entrepreneur (4.8M subscribers)
- r/EntrepreneurConnect (5K subscribers)

#### For AI Developers:
- r/aipromptprogramming (107K subscribers)
- r/machinelearningnews (107K subscribers)
- r/learnmachinelearning (547K subscribers)
- r/mlops (24K subscribers)

#### For AI Builders:
- r/AI_Agents (191K subscribers)
- r/AiBuilders (8K subscribers)
- r/AiAutomations (7K subscribers)
- r/AutoGenAI (7K subscribers)

## Key Insights

1. **Large Communities with High Relevance**: Several communities combine massive reach (>100K subscribers) with high confidence scores (>0.7), offering excellent engagement opportunities.

2. **Niche Communities**: Smaller, highly focused communities like r/SoloFounders and r/AiBuilders may have fewer members but offer highly targeted engagement.

3. **Cross-Pollination Opportunities**: Many users are active across multiple communities, allowing for strategic cross-posting and relationship building.

4. **AI Focus is Strong**: The AI development and builder communities show extremely high confidence scores, indicating strong alignment with current market trends.

5. **Entrepreneurship Overlap**: Strong overlap between entrepreneurship and AI communities suggests your ICP is at the intersection of business and technology.

## Recommended Next Steps

1. **Profile Analysis**: Conduct deeper analysis of top 10 communities to understand posting rules and culture
2. **Content Calendar**: Develop community-specific content strategies
3. **Engagement Tracking**: Monitor which communities drive the most valuable interactions
4. **Relationship Building**: Identify and connect with key influencers in each community
5. **Value-First Approach**: Focus on providing value before any promotional activities

---

*Note: Confidence scores are based on semantic relevance to the search queries. Subscriber counts are current as of the research date. Community dynamics and rules should be reviewed before engagement.*
```

--------------------------------------------------------------------------------
/src/tools/discover.py:
--------------------------------------------------------------------------------

```python
"""Subreddit discovery using semantic vector search."""

import os
import json
from typing import Dict, List, Optional, Union, Any
from fastmcp import Context
from ..chroma_client import get_chroma_client, get_collection


async def discover_subreddits(
    query: Optional[str] = None,
    queries: Optional[Union[List[str], str]] = None,
    limit: int = 10,
    include_nsfw: bool = False,
    ctx: Context = None
) -> Dict[str, Any]:
    """
    Search for subreddits using semantic similarity search.

    Finds relevant subreddits based on semantic embeddings of subreddit names,
    descriptions, and community metadata.

    Args:
        query: Single search term to find subreddits
        queries: List of search terms for batch discovery (more efficient)
                 Can also be a JSON string like '["term1", "term2"]'
        limit: Maximum number of results per query (default 10)
        include_nsfw: Whether to include NSFW subreddits (default False)
        ctx: FastMCP context (auto-injected by decorator)

    Returns:
        Dictionary with discovered subreddits and their metadata
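
    Example (illustrative sketch; assumes the ChromaDB index is reachable
    and that results exist for the query):
        result = await discover_subreddits(query="machine learning", limit=5)
        # result["subreddits"] -> list of {"name", "subscribers", "confidence", "url"}
        # result["summary"]    -> {"total_found", "returned", "has_more"}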
    """
    # Phase 1: Accept context but don't use it yet

    # Initialize ChromaDB client
    try:
        client = get_chroma_client()
        collection = get_collection("reddit_subreddits", client)
        
    except Exception as e:
        return {
            "error": f"Failed to connect to vector database: {str(e)}",
            "results": [],
            "summary": {
                "total_found": 0,
                "returned": 0,
                "coverage": "error"
            }
        }
    
    # Handle batch queries - convert string to list if needed
    if queries:
        # Handle case where LLM passes JSON string instead of array
        if isinstance(queries, str):
            try:
                # Try to parse as JSON if it looks like a JSON array
                if queries.strip().startswith('[') and queries.strip().endswith(']'):
                    queries = json.loads(queries)
                else:
                    # Single string query, convert to single-item list
                    queries = [queries]
            except (json.JSONDecodeError, ValueError):
                # If JSON parsing fails, treat as single string
                queries = [queries]
        
        batch_results = {}
        total_api_calls = 0
        
        for search_query in queries:
            result = await _search_vector_db(
                search_query, collection, limit, include_nsfw, ctx
            )
            batch_results[search_query] = result
            total_api_calls += 1
        
        return {
            "batch_mode": True,
            "total_queries": len(queries),
            "api_calls_made": total_api_calls,
            "results": batch_results,
            "tip": "Batch mode reduces API calls. Use the exact 'name' field when calling other tools."
        }
    
    # Handle single query
    elif query:
        return await _search_vector_db(query, collection, limit, include_nsfw, ctx)
    
    else:
        return {
            "error": "Either 'query' or 'queries' parameter must be provided",
            "subreddits": [],
            "summary": {
                "total_found": 0,
                "returned": 0,
                "coverage": "error"
            }
        }


async def _search_vector_db(
    query: str,
    collection,
    limit: int,
    include_nsfw: bool,
    ctx: Context = None
) -> Dict[str, Any]:
    """Internal function to perform semantic search for a single query."""
    # Phase 1: Accept context but don't use it yet

    try:
        # Search with a larger limit to allow for filtering
        search_limit = min(limit * 3, 100)  # Get extra results for filtering
        
        # Perform semantic search
        results = collection.query(
            query_texts=[query],
            n_results=search_limit
        )
        
        if not results or not results['metadatas'] or not results['metadatas'][0]:
            return {
                "query": query,
                "subreddits": [],
                "summary": {
                    "total_found": 0,
                    "returned": 0,
                    "has_more": False
                },
                "next_actions": ["Try different search terms"]
            }
        
        # Process results
        processed_results = []
        nsfw_filtered = 0
        total_results = len(results['metadatas'][0])

        for i, (metadata, distance) in enumerate(zip(
            results['metadatas'][0],
            results['distances'][0]
        )):
            # Report progress
            if ctx:
                await ctx.report_progress(
                    progress=i + 1,
                    total=total_results,
                    message=f"Analyzing r/{metadata.get('name', 'unknown')}"
                )

            # Skip NSFW if not requested
            if metadata.get('nsfw', False) and not include_nsfw:
                nsfw_filtered += 1
                continue
            
            # Convert distance to confidence score (lower distance = higher confidence)
            # Adjust the scaling based on observed distances (typically 0.8 to 1.6)
            # Map distances: 0.8 -> 0.9, 1.0 -> 0.7, 1.2 -> 0.5, 1.4 -> 0.3, 1.6+ -> 0.1
            if distance < 0.8:
                confidence = 0.9 + (0.1 * (0.8 - distance) / 0.8)  # 0.9 to 1.0
            elif distance < 1.0:
                confidence = 0.7 + (0.2 * (1.0 - distance) / 0.2)  # 0.7 to 0.9
            elif distance < 1.2:
                confidence = 0.5 + (0.2 * (1.2 - distance) / 0.2)  # 0.5 to 0.7
            elif distance < 1.4:
                confidence = 0.3 + (0.2 * (1.4 - distance) / 0.2)  # 0.3 to 0.5
            else:
                confidence = max(0.1, 0.3 * (2.0 - distance) / 0.6)  # 0.1 to 0.3
            
            # Apply penalties for generic subreddits
            subreddit_name = metadata.get('name', '').lower()
            generic_subs = ['funny', 'pics', 'videos', 'gifs', 'memes', 'aww']
            if subreddit_name in generic_subs and query.lower() not in subreddit_name:
                confidence *= 0.3  # Heavy penalty for generic subs unless directly searched
            
            # Boost for high-activity subreddits (optional)
            subscribers = metadata.get('subscribers', 0)
            if subscribers > 1000000:
                confidence = min(1.0, confidence * 1.1)  # Small boost for very large subs
            elif subscribers < 10000:
                confidence *= 0.9  # Small penalty for tiny subs
            
            # Determine match type based on distance (informational only for now; not included in the returned payload)
            if distance < 0.3:
                match_type = "exact_match"
            elif distance < 0.7:
                match_type = "strong_match"
            elif distance < 1.0:
                match_type = "partial_match"
            else:
                match_type = "weak_match"
            
            processed_results.append({
                "name": metadata.get('name', 'unknown'),
                "subscribers": metadata.get('subscribers', 0),
                "confidence": round(confidence, 3),
                "url": metadata.get('url', f"https://reddit.com/r/{metadata.get('name', '')}")
            })
        
        # Sort by confidence (highest first), then by subscribers
        processed_results.sort(key=lambda x: (-x['confidence'], -(x['subscribers'] or 0)))
        
        # Limit to requested number
        limited_results = processed_results[:limit]
        
        # Calculate basic stats
        total_found = len(processed_results)
        
        # Generate next actions (only meaningful ones)
        next_actions = []
        if len(processed_results) > limit:
            next_actions.append(f"{len(processed_results)} total results found, showing {limit}")
        if nsfw_filtered > 0:
            next_actions.append(f"{nsfw_filtered} NSFW subreddits filtered")
        
        return {
            "query": query,
            "subreddits": limited_results,
            "summary": {
                "total_found": total_found,
                "returned": len(limited_results),
                "has_more": total_found > len(limited_results)
            },
            "next_actions": next_actions
        }
        
    except Exception as e:
        # Map error patterns to specific recovery actions
        error_str = str(e).lower()
        if "not found" in error_str:
            guidance = "Verify subreddit name spelling"
        elif "rate" in error_str:
            guidance = "Rate limited - wait 60 seconds"
        elif "timeout" in error_str:
            guidance = "Reduce limit parameter to 10"
        else:
            guidance = "Try simpler search terms"
            
        return {
            "error": f"Failed to search vector database: {str(e)}",
            "query": query,
            "subreddits": [],
            "summary": {
                "total_found": 0,
                "returned": 0,
                "has_more": False
            },
            "next_actions": [guidance]
        }


def validate_subreddit(
    subreddit_name: str,
    ctx: Context = None
) -> Dict[str, Any]:
    """
    Validate if a subreddit exists in the indexed database.

    Checks if the subreddit exists in our semantic search index
    and returns its metadata if found.

    Args:
        subreddit_name: Name of the subreddit to validate
        ctx: FastMCP context (optional)

    Returns:
        Dictionary with validation result and subreddit info if found
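
    Example (illustrative; whether a name resolves depends on what is indexed):
        info = validate_subreddit("r/Python")
        # Found:     {"valid": True, "name": "Python", "subscribers": ..., "indexed": True, ...}
        # Not found: {"valid": False, "name": "...", "error": "...", "suggestion": "..."}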
    """
    # Phase 1: Accept context but don't use it yet

    # Clean the subreddit name
    clean_name = subreddit_name.replace("r/", "").replace("/r/", "").strip()
    
    try:
        # Search for exact match in vector database
        client = get_chroma_client()
        collection = get_collection("reddit_subreddits", client)
        
        # Search for the exact subreddit name
        results = collection.query(
            query_texts=[clean_name],
            n_results=5
        )
        
        if results and results['metadatas'] and results['metadatas'][0]:
            # Look for exact match in results
            for metadata in results['metadatas'][0]:
                if metadata.get('name', '').lower() == clean_name.lower():
                    return {
                        "valid": True,
                        "name": metadata.get('name'),
                        "subscribers": metadata.get('subscribers', 0),
                        "is_private": False,  # We only index public subreddits
                        "over_18": metadata.get('nsfw', False),
                        "indexed": True
                    }
        
        return {
            "valid": False,
            "name": clean_name,
            "error": f"Subreddit '{clean_name}' not found",
            "suggestion": "Use discover_subreddits to find similar communities"
        }
        
    except Exception as e:
        return {
            "valid": False,
            "name": clean_name,
            "error": f"Database error: {str(e)}",
            "suggestion": "Check database connection and retry"
        }
```

--------------------------------------------------------------------------------
/src/resources.py:
--------------------------------------------------------------------------------

```python
"""Reddit MCP Resources - Server information endpoint."""

from typing import Dict, Any
import praw


def register_resources(mcp, reddit: praw.Reddit) -> None:
    """Register server info resource with the MCP server."""
    
    @mcp.resource("reddit://server-info")
    def get_server_info() -> Dict[str, Any]:
        """
        Get comprehensive information about the Reddit MCP server's capabilities.
        
        Returns server version, available tools, prompts, and usage examples.
        """
        # Try to get rate limit info from Reddit
        rate_limit_info = {}
        try:
            # Access auth to check rate limit status
            rate_limit_info = {
                "requests_remaining": reddit.auth.limits.get('remaining', 'unknown'),
                "reset_timestamp": reddit.auth.limits.get('reset_timestamp', 'unknown'),
                "used": reddit.auth.limits.get('used', 'unknown')
            }
        except Exception:
            rate_limit_info = {
                "status": "Rate limits tracked automatically by PRAW",
                "strategy": "Automatic retry with exponential backoff"
            }
        
        return {
            "name": "Reddit Research MCP Server",
            "version": "0.4.0",
            "description": "MCP server for comprehensive Reddit research with semantic search across 20,000+ indexed subreddits",
            "changelog": {
                "0.4.0": [
                    "Added reddit_research prompt for automated comprehensive research",
                    "Streamlined resources to focus on server-info only",
                    "Enhanced documentation for prompt-based workflows"
                ],
                "0.3.0": [
                    "Implemented three-layer architecture for clearer operation flow",
                    "Added semantic subreddit discovery with vector search",
                    "Enhanced workflow guidance with confidence-based recommendations",
                    "Improved error recovery suggestions"
                ],
                "0.2.0": [
                    "Added discover_subreddits with confidence scoring",
                    "Added fetch_multiple_subreddits for batch operations",
                    "Enhanced server-info with comprehensive documentation",
                    "Improved error handling and rate limit management"
                ],
                "0.1.0": [
                    "Initial release with search, fetch, and comment tools",
                    "Basic resources for popular subreddits and server info"
                ]
            },
            "capabilities": {
                "key_features": [
                    "Semantic search across 20,000+ indexed subreddits",
                    "Batch operations reducing API calls by 70%",
                    "Automated research workflow via prompt",
                    "Three-layer architecture for guided operations",
                    "Comprehensive citation tracking with Reddit URLs"
                ],
                "architecture": {
                    "type": "Three-Layer Architecture",
                    "workflow": [
                        "Layer 1: discover_operations() - See available operations",
                        "Layer 2: get_operation_schema(operation_id) - Get requirements",
                        "Layer 3: execute_operation(operation_id, parameters) - Execute"
                    ],
                    "description": "ALWAYS start with Layer 1, then Layer 2, then Layer 3"
                },
                "tools": [
                    {
                        "name": "discover_operations",
                        "layer": 1,
                        "description": "Discover available Reddit operations",
                        "parameters": "NONE - Call without any parameters: discover_operations() NOT discover_operations({})",
                        "purpose": "Shows all available operations and recommended workflows"
                    },
                    {
                        "name": "get_operation_schema",
                        "layer": 2,
                        "description": "Get parameter requirements for an operation",
                        "parameters": {
                            "operation_id": "The operation to get schema for (from Layer 1)",
                            "include_examples": "Whether to include examples (optional, default: true)"
                        },
                        "purpose": "Provides parameter schemas, validation rules, and examples"
                    },
                    {
                        "name": "execute_operation",
                        "layer": 3,
                        "description": "Execute a Reddit operation",
                        "parameters": {
                            "operation_id": "The operation to execute",
                            "parameters": "Parameters matching the schema from Layer 2"
                        },
                        "purpose": "Actually performs the Reddit API calls"
                    }
                ],
                "prompts": [
                    {
                        "name": "reddit_research",
                        "description": "Conduct comprehensive Reddit research on any topic or question",
                        "parameters": {
                            "research_request": "Natural language description of what to research (e.g., 'How do people feel about remote work?')"
                        },
                        "returns": "Structured workflow guiding complete research process",
                        "output": "Comprehensive markdown report with citations and metrics",
                        "usage": "Select prompt, provide research question, receive guided workflow"
                    }
                ],
                "available_operations": {
                    "discover_subreddits": "Find communities using semantic vector search (20,000+ indexed)",
                    "search_subreddit": "Search within a specific community",
                    "fetch_posts": "Get posts from one subreddit",
                    "fetch_multiple": "Batch fetch from multiple subreddits (70% more efficient)",
                    "fetch_comments": "Get complete comment tree for deep analysis"
                },
                "resources": [
                    {
                        "uri": "reddit://server-info",
                        "description": "Comprehensive server capabilities, version, and usage information",
                        "cacheable": False,
                        "always_current": True
                    }
                ],
                "statistics": {
                    "total_tools": 3,
                    "total_prompts": 1,
                    "total_operations": 5,
                    "total_resources": 1,
                    "indexed_subreddits": "20,000+"
                }
            },
            "usage_examples": {
                "automated_research": {
                    "description": "Use the reddit_research prompt for complete automated workflow",
                    "steps": [
                        "1. Select the 'reddit_research' prompt in your MCP client",
                        "2. Provide your research question: 'What are the best practices for React development?'",
                        "3. The prompt guides the LLM through discovery, gathering, analysis, and reporting",
                        "4. Receive comprehensive markdown report with citations"
                    ]
                },
                "manual_workflow": {
                    "description": "Step-by-step manual research using the three-layer architecture",
                    "steps": [
                        "1. discover_operations() - See what's available",
                        "2. get_operation_schema('discover_subreddits') - Get requirements",
                        "3. execute_operation('discover_subreddits', {'query': 'machine learning', 'limit': 15})",
                        "4. get_operation_schema('fetch_multiple') - Get batch fetch requirements",
                        "5. execute_operation('fetch_multiple', {'subreddit_names': [...], 'limit_per_subreddit': 10})",
                        "6. get_operation_schema('fetch_comments') - Get comment requirements",
                        "7. execute_operation('fetch_comments', {'submission_id': 'abc123', 'comment_limit': 100})"
                    ]
                },
                "targeted_search": {
                    "description": "Find specific content in known communities",
                    "steps": [
                        "1. discover_operations()",
                        "2. get_operation_schema('search_subreddit')",
                        "3. execute_operation('search_subreddit', {'subreddit_name': 'Python', 'query': 'async', 'limit': 20})"
                    ]
                }
            },
            "performance_tips": [
                "Use the reddit_research prompt for automated comprehensive research",
                "Always follow the three-layer workflow for manual operations",
                "Use fetch_multiple for 2+ subreddits (70% fewer API calls)",
                "Single semantic search finds all relevant communities",
                "Use confidence scores to guide strategy (>0.7 = high confidence)",
                "Expect ~15-20K tokens for comprehensive research"
            ],
            "workflow_guidance": {
                "confidence_based_strategy": {
                    "high_confidence": "Scores > 0.7: Focus on top 5-8 subreddits",
                    "medium_confidence": "Scores 0.4-0.7: Cast wider net with 10-12 subreddits",
                    "low_confidence": "Scores < 0.4: Refine search terms and retry"
                },
                "research_depth": {
                    "minimum_coverage": "10+ threads, 100+ comments, 3+ subreddits",
                    "quality_thresholds": "Posts: 5+ upvotes, Comments: 2+ upvotes",
                    "author_credibility": "Prioritize 100+ karma for key insights"
                },
                "token_optimization": {
                    "discover_subreddits": "~1-2K tokens for semantic search",
                    "fetch_multiple": "~500-1000 tokens per subreddit",
                    "fetch_comments": "~2-5K tokens per post with comments",
                    "full_research": "~15-20K tokens for comprehensive analysis"
                }
            },
            "rate_limiting": {
                "handler": "PRAW automatic rate limit handling",
                "strategy": "Exponential backoff with retry",
                "current_status": rate_limit_info
            },
            "authentication": {
                "type": "Application-only OAuth",
                "scope": "Read-only access",
                "capabilities": "Search, browse, and read public content"
            },
            "support": {
                "repository": "https://github.com/king-of-the-grackles/reddit-research-mcp",
                "issues": "https://github.com/king-of-the-grackles/reddit-research-mcp/issues",
                "documentation": "See README.md and specs/ directory for architecture details"
            }
        }
```

--------------------------------------------------------------------------------
/reports/top-50-subreddits-saas-solopreneurs.md:
--------------------------------------------------------------------------------

```markdown
# Top 50 Subreddits for SaaS Startup Founders & Solopreneurs

*Research Date: 2025-09-20*
*Generated using Reddit MCP Server with semantic vector search*

## Executive Summary

This focused report identifies the top 50 Reddit communities specifically for **SaaS startup founders** and **solopreneurs**. These communities were selected based on:
- Direct relevance to SaaS business models
- Solo entrepreneurship focus
- Bootstrapped/self-funded business approaches
- Active engagement levels
- Community quality and support culture

## Top 50 Subreddits - Ranked by Relevance

### 🎯 Tier 1: Must-Join Communities (Confidence > 0.8)
*These are your highest-priority communities with direct ICP alignment*

1. **r/SaaS** - 374,943 subscribers | Confidence: 0.892
   - The primary SaaS community on Reddit
   - Topics: pricing, growth, tech stack, customer acquisition
   - https://reddit.com/r/SaaS

2. **r/indiehackers** - 105,674 subscribers | Confidence: 0.867
   - Solo founders and bootstrappers building profitable businesses
   - Strong focus on MRR milestones and transparency
   - https://reddit.com/r/indiehackers

3. **r/SoloFounders** - 2,113 subscribers | Confidence: 0.832
   - Dedicated community for solo entrepreneurs
   - Intimate setting for peer support and advice
   - https://reddit.com/r/SoloFounders

### 🚀 Tier 2: Core Communities (Confidence 0.7 - 0.8)

4. **r/startups** - 1,891,655 subscribers | Confidence: 0.729
   - Massive startup ecosystem community
   - Mix of bootstrapped and funded startups
   - https://reddit.com/r/startups

5. **r/SaaSy** - 3,150 subscribers | Confidence: 0.722
   - Focused SaaS discussions and case studies
   - https://reddit.com/r/SaaSy

6. **r/EntrepreneurRideAlong** - 604,396 subscribers | Confidence: 0.712
   - Document your entrepreneurial journey
   - Great for building in public
   - https://reddit.com/r/EntrepreneurRideAlong

7. **r/venturecapital** - 66,268 subscribers | Confidence: 0.721
   - Useful even for bootstrappers who want to understand the funding landscape
   - https://reddit.com/r/venturecapital

8. **r/Entrepreneurs** - 77,330 subscribers | Confidence: 0.777
   - Active entrepreneur community with quality discussions
   - https://reddit.com/r/Entrepreneurs

### 💼 Tier 3: High-Value Communities (Confidence 0.6 - 0.7)

9. **r/Entrepreneur** - 4,871,109 subscribers | Confidence: 0.664
   - Largest entrepreneurship community
   - https://reddit.com/r/Entrepreneur

10. **r/EntrepreneurConnect** - 5,178 subscribers | Confidence: 0.691
    - Networking and collaboration focus
    - https://reddit.com/r/EntrepreneurConnect

11. **r/kickstarter** - 93,932 subscribers | Confidence: 0.658
    - Product launches and crowdfunding strategies
    - https://reddit.com/r/kickstarter

12. **r/small_business_ideas** - 23,034 subscribers | Confidence: 0.631
    - Idea validation and feedback
    - https://reddit.com/r/small_business_ideas

13. **r/Entrepreneurship** - 99,462 subscribers | Confidence: 0.619
    - Business strategy and growth discussions
    - https://reddit.com/r/Entrepreneurship

### 📊 Tier 4: Specialized Communities (Confidence 0.5 - 0.6)

14. **r/Business_Ideas** - 370,194 subscribers | Confidence: 0.521
    - Brainstorming and validating business concepts
    - https://reddit.com/r/Business_Ideas

15. **r/startup** - 225,696 subscribers | Confidence: 0.529
    - Startup ecosystem and resources
    - https://reddit.com/r/startup

16. **r/NoCodeSaaS** - 23,297 subscribers | Confidence: 0.329*
    - Building SaaS without coding
    - Perfect for non-technical founders
    - https://reddit.com/r/NoCodeSaaS

17. **r/Affiliatemarketing** - 239,731 subscribers | Confidence: 0.537
    - Revenue strategies for SaaS
    - https://reddit.com/r/Affiliatemarketing

18. **r/OnlineIncomeHustle** - 34,382 subscribers | Confidence: 0.517
    - Online business strategies
    - https://reddit.com/r/OnlineIncomeHustle

19. **r/SmallBusinessOwners** - 4,081 subscribers | Confidence: 0.501
    - Peer support for business owners
    - https://reddit.com/r/SmallBusinessOwners

20. **r/selfpublish** - 196,096 subscribers | Confidence: 0.483
    - Content creation and info products
    - https://reddit.com/r/selfpublish

### 🌍 Tier 5: Regional & Niche Communities

21. **r/indianstartups** - 76,422 subscribers | Confidence: 0.505
    - Indian startup ecosystem
    - https://reddit.com/r/indianstartups

22. **r/StartUpIndia** - 361,780 subscribers | Confidence: 0.432
    - Large Indian entrepreneur community
    - https://reddit.com/r/StartUpIndia

23. **r/IndianEntrepreneur** - 9,816 subscribers | Confidence: 0.446
    - Indian entrepreneur discussions
    - https://reddit.com/r/IndianEntrepreneur

24. **r/PhStartups** - 20,901 subscribers | Confidence: 0.359
    - Philippines startup community
    - https://reddit.com/r/PhStartups

25. **r/Startups_EU** - 2,894 subscribers | Confidence: 0.314
    - European startup ecosystem
    - https://reddit.com/r/Startups_EU

### 🛠️ Tier 6: Supporting Communities

26. **r/advancedentrepreneur** - 60,964 subscribers | Confidence: 0.464
    - For experienced entrepreneurs
    - https://reddit.com/r/advancedentrepreneur

27. **r/cofounderhunt** - 16,287 subscribers | Confidence: 0.456
    - Finding co-founders and team members
    - https://reddit.com/r/cofounderhunt

28. **r/sweatystartup** - 182,854 subscribers | Confidence: 0.432
    - Service businesses and local startups
    - https://reddit.com/r/sweatystartup

29. **r/ycombinator** - 139,403 subscribers | Confidence: 0.433
    - YC ecosystem and accelerator insights
    - https://reddit.com/r/ycombinator

30. **r/sidehustle** - 3,124,834 subscribers | Confidence: 0.486
    - Side projects that can become SaaS
    - https://reddit.com/r/sidehustle

### 💰 Tier 7: Business & Revenue Focus

31. **r/passive_income** - 851,987 subscribers | Confidence: 0.422
    - Building recurring revenue streams
    - https://reddit.com/r/passive_income

32. **r/SaaS_Email_Marketing** - 7,434 subscribers | Confidence: 0.465
    - Email marketing for SaaS
    - https://reddit.com/r/SaaS_Email_Marketing

33. **r/SocialMediaMarketing** - 197,241 subscribers | Confidence: 0.419
    - Marketing strategies for SaaS
    - https://reddit.com/r/SocialMediaMarketing

34. **r/equity_crowdfunding** - 3,112 subscribers | Confidence: 0.473
    - Alternative funding options
    - https://reddit.com/r/equity_crowdfunding

35. **r/AiForSmallBusiness** - 8,963 subscribers | Confidence: 0.378
    - AI tools for solopreneurs
    - https://reddit.com/r/AiForSmallBusiness

### 🎨 Tier 8: Creative & Indie Communities

36. **r/IndieGaming** - 412,025 subscribers | Confidence: 0.453
    - Indie game dev (similar mindset to SaaS)
    - https://reddit.com/r/IndieGaming

37. **r/IndieDev** - 295,248 subscribers | Confidence: 0.383
    - Independent development community
    - https://reddit.com/r/IndieDev

38. **r/PassionsToProfits** - 4,905 subscribers | Confidence: 0.468
    - Monetizing expertise
    - https://reddit.com/r/PassionsToProfits

39. **r/LawFirm** - 84,044 subscribers | Confidence: 0.437
    - Legal aspects of running a business
    - https://reddit.com/r/LawFirm

40. **r/Fiverr** - 64,568 subscribers | Confidence: 0.489
    - Freelancing and service offerings
    - https://reddit.com/r/Fiverr

### 🌐 Tier 9: Broader Business Communities

41. **r/smallbusiness** - 2,211,156 subscribers | Confidence: 0.345
    - General small business discussions
    - https://reddit.com/r/smallbusiness

42. **r/business** - 2,498,385 subscribers | Confidence: 0.457
    - Broad business topics
    - https://reddit.com/r/business

43. **r/smallbusinessUS** - 4,886 subscribers | Confidence: 0.464
    - US-focused small business
    - https://reddit.com/r/smallbusinessUS

44. **r/WholesaleRealestate** - 28,356 subscribers | Confidence: 0.447
    - Business model discussions
    - https://reddit.com/r/WholesaleRealestate

45. **r/selbststaendig** - 38,000 subscribers | Confidence: 0.364
    - German solopreneur community
    - https://reddit.com/r/selbststaendig

### 🔧 Tier 10: Tools & Resources

46. **r/YouTube_startups** - 127,440 subscribers | Confidence: 0.369
    - Content marketing for startups
    - https://reddit.com/r/YouTube_startups

47. **r/OnlineMarketing** - 3,744 subscribers | Confidence: 0.396
    - Digital marketing strategies
    - https://reddit.com/r/OnlineMarketing

48. **r/Businessideas** - 22,137 subscribers | Confidence: 0.389
    - Idea generation and validation
    - https://reddit.com/r/Businessideas

49. **r/BusinessVault** - 2,889 subscribers | Confidence: 0.348
    - Business resources and tools
    - https://reddit.com/r/BusinessVault

50. **r/simpleliving** - 1,447,715 subscribers | Confidence: 0.415
    - Lifestyle design for solopreneurs
    - https://reddit.com/r/simpleliving

## 🎯 Engagement Strategy for SaaS Founders & Solopreneurs

### Quick Start Guide
1. **Join Top 5 First:**
   - r/SaaS (primary community)
   - r/indiehackers (building in public)
   - r/SoloFounders (peer support)
   - r/startups (broad exposure)
   - r/EntrepreneurRideAlong (journey sharing)

2. **Weekly Engagement Plan:**
   - **Monday**: Share wins/milestones in r/EntrepreneurRideAlong
   - **Tuesday**: Ask for feedback in r/SaaS
   - **Wednesday**: Help others in r/indiehackers
   - **Thursday**: Network in r/SoloFounders
   - **Friday**: Share learnings in r/startups

3. **Content Types That Work:**
   - Case studies with real numbers (MRR, growth rates)
   - "How I built..." technical posts
   - Pricing strategy discussions
   - Tool stack reveals
   - Failure stories and lessons learned

### Community-Specific Tips

**For r/SaaS:**
- Share MRR milestones
- Discuss pricing strategies
- Ask about tech stack decisions
- Share customer acquisition costs

**For r/indiehackers:**
- Be transparent about revenue
- Document your journey
- Share both wins and failures
- Engage with other builders

**For r/SoloFounders:**
- Focus on work-life balance
- Share productivity tips
- Discuss delegation strategies
- Mental health and burnout prevention

## 📊 Key Metrics to Track

1. **Engagement Quality**: Comments > Upvotes
2. **Connection Building**: DMs from relevant founders
3. **Traffic Generation**: Clicks to your product
4. **Brand Recognition**: Mentions in other threads
5. **Value Created**: Problems solved for others

## ⚠️ Common Mistakes to Avoid

1. **Over-promotion**: Follow the 9:1 rule (9 value posts for every 1 promotional post)
2. **Generic content**: Tailor posts to each community's culture
3. **Ignoring rules**: Each subreddit has specific posting guidelines
4. **Not engaging**: Don't just post and leave
5. **Being inauthentic**: Genuine interactions build trust

## 🚀 Next Steps

1. **Week 1**: Join top 10 communities, observe culture
2. **Week 2**: Start engaging with comments
3. **Week 3**: Make first posts in top 3 communities
4. **Week 4**: Analyze what resonates, adjust strategy
5. **Month 2+**: Scale successful approaches

---

*Note: This report focuses specifically on communities relevant to SaaS founders and solopreneurs. Confidence scores reflect semantic relevance to these specific ICPs. Community dynamics change, so regular monitoring is recommended.*

*Strategy Tip: Focus on depth over breadth - better to be highly active in 5-10 communities than sporadically active in 50.*
```

--------------------------------------------------------------------------------
/specs/003-implementation-summary.md:
--------------------------------------------------------------------------------

```markdown
# FastMCP Context API Implementation Summary

**Status:** ✅ Complete
**Date:** 2025-10-02
**Phases Completed:** Phase 1 (Context Integration) + Phase 2 (Progress Monitoring)

## Overview

This document summarizes the completed implementation of FastMCP's Context API integration into the Reddit MCP server. The implementation was completed in two phases and enables real-time progress reporting for long-running Reddit operations.

## Phase 1: Context Integration (Complete ✅)

### Goal
Integrate FastMCP's `Context` parameter into all tool and operation functions to enable future context-aware features.

### Implementation Details

**Scope:** All MCP tool functions and Reddit operation functions now accept `Context` as a parameter.

#### Functions Updated
- ✅ `discover_subreddits()` - Subreddit discovery via vector search
- ✅ `search_in_subreddit()` - Search within specific subreddit
- ✅ `fetch_subreddit_posts()` - Fetch posts from single subreddit
- ✅ `fetch_multiple_subreddits()` - Batch fetch from multiple subreddits
- ✅ `fetch_submission_with_comments()` - Fetch post with comment tree
- ✅ `validate_subreddit()` - Validate subreddit exists in index
- ✅ `_search_vector_db()` - Internal vector search helper
- ✅ `parse_comment_tree()` - Internal comment parsing helper

#### MCP Layer Functions
- ✅ `discover_operations()` - Layer 1: Discovery
- ✅ `get_operation_schema()` - Layer 2: Schema
- ✅ `execute_operation()` - Layer 3: Execution

### Test Coverage
- **8 integration tests** verifying context parameter acceptance
- All tests verify functions accept `Context` without errors
- Context parameter can be positioned anywhere in function signature
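
A minimal sketch of such a test (illustrative only; the file layout, mock style, and assertion here are assumptions, not a copy of the actual suite):

```python
from unittest.mock import MagicMock

from src.tools.discover import validate_subreddit


def test_validate_subreddit_accepts_context():
    """Passing a Context-like object should not change the result shape."""
    fake_ctx = MagicMock()                       # stands in for fastmcp.Context (unused in Phase 1)
    result = validate_subreddit("r/Python", ctx=fake_ctx)
    assert isinstance(result, dict)              # same dict shape with or without a context
```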

### Files Modified (Phase 1)
1. `src/tools/discover.py` - Added `ctx: Context = None` to all functions
2. `src/tools/search.py` - Added context parameter
3. `src/tools/posts.py` - Added context parameter
4. `src/tools/comments.py` - Added context parameter and forwarding
5. `src/server.py` - Updated MCP tools to accept and forward context
6. `tests/test_context_integration.py` - Created comprehensive test suite

---

## Phase 2: Progress Monitoring (Complete ✅)

### Goal
Add real-time progress reporting to long-running Reddit operations using `ctx.report_progress()`.

### Implementation Details

**Scope:** Three primary long-running operations now emit progress events.

#### Operation 1: `discover_subreddits` - Vector Search Progress

**File:** `src/tools/discover.py`

**Progress Events:**
- Reports progress for each subreddit analyzed during vector search
- **Message Format:** `"Analyzing r/{subreddit_name}"`
- **Frequency:** 10-100 events depending on `limit` parameter
- **Progress Values:** `progress=i+1, total=total_results`

**Implementation:**
```python
async def _search_vector_db(...):
    total_results = len(results['metadatas'][0])
    for i, (metadata, distance) in enumerate(...):
        if ctx:
            await ctx.report_progress(
                progress=i + 1,
                total=total_results,
                message=f"Analyzing r/{metadata.get('name', 'unknown')}"
            )
```

#### Operation 2: `fetch_multiple_subreddits` - Batch Fetch Progress

**File:** `src/tools/posts.py`

**Progress Events:**
- Reports progress when encountering each new subreddit
- **Message Format:** `"Fetching r/{subreddit_name}"`
- **Frequency:** 1-10 events (one per unique subreddit)
- **Progress Values:** `progress=len(processed), total=len(subreddit_names)`

**Implementation:**
```python
async def fetch_multiple_subreddits(...):
    processed_subreddits = set()
    for submission in submissions:
        subreddit_name = submission.subreddit.display_name
        if subreddit_name not in processed_subreddits:
            processed_subreddits.add(subreddit_name)
            if ctx:
                await ctx.report_progress(
                    progress=len(processed_subreddits),
                    total=len(clean_names),
                    message=f"Fetching r/{subreddit_name}"
                )
```

#### Operation 3: `fetch_submission_with_comments` - Comment Tree Progress

**File:** `src/tools/comments.py`

**Progress Events:**
- Reports progress during comment loading
- Final completion message when done
- **Message Format:**
  - During: `"Loading comments ({count}/{limit})"`
  - Complete: `"Completed: {count} comments loaded"`
- **Frequency:** 5-100+ events depending on `comment_limit`
- **Progress Values:** `progress=comment_count, total=comment_limit`

**Implementation:**
```python
async def fetch_submission_with_comments(...):
    for top_level_comment in submission.comments:
        if ctx:
            await ctx.report_progress(
                progress=comment_count,
                total=comment_limit,
                message=f"Loading comments ({comment_count}/{comment_limit})"
            )
        # ... process comment

    # Final completion
    if ctx:
        await ctx.report_progress(
            progress=comment_count,
            total=comment_limit,
            message=f"Completed: {comment_count} comments loaded"
        )
```

### Async/Await Changes

All three operations are now **async functions**:
- ✅ `discover_subreddits()` → `async def discover_subreddits()`
- ✅ `fetch_multiple_subreddits()` → `async def fetch_multiple_subreddits()`
- ✅ `fetch_submission_with_comments()` → `async def fetch_submission_with_comments()`
- ✅ `execute_operation()` → `async def execute_operation()` (conditionally awaits async operations)
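
A sketch of the conditional-await dispatch in `execute_operation()` (simplified; the operation registry shown here is a placeholder, and the real server also injects the shared PRAW client and validates parameters):

```python
import inspect

from src.tools.discover import discover_subreddits
from src.tools.search import search_in_subreddit

# Placeholder registry mapping operation_id -> handler.
OPERATIONS = {
    "discover_subreddits": discover_subreddits,   # async
    "search_subreddit": search_in_subreddit,      # sync
}

async def execute_operation(operation_id: str, parameters: dict, ctx=None):
    handler = OPERATIONS[operation_id]
    if inspect.iscoroutinefunction(handler):
        # Async operations (discover_subreddits, fetch_multiple, fetch_comments) must be awaited
        return await handler(ctx=ctx, **parameters)
    # Sync operations (search_in_subreddit, fetch_subreddit_posts) are called directly
    return handler(ctx=ctx, **parameters)
```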

### Test Coverage

**New Test Classes (Phase 2):**
1. `TestDiscoverSubredditsProgress` - Verifies progress during vector search
2. `TestFetchMultipleProgress` - Verifies progress per subreddit
3. `TestFetchCommentsProgress` - Verifies progress during comment loading

**Test Assertions:**
- ✅ Progress called minimum expected times (based on data)
- ✅ Progress includes `progress` and `total` parameters
- ✅ AsyncMock properly configured for async progress calls

**Total Test Results:** 18 tests, all passing ✅

### Files Modified (Phase 2)
1. `src/tools/discover.py` - Made async, added progress reporting
2. `src/tools/posts.py` - Made async, added progress reporting
3. `src/tools/comments.py` - Made async, added progress reporting
4. `src/tools/search.py` - No changes (operation too fast for progress)
5. `src/server.py` - Made `execute_operation()` async with conditional await
6. `tests/test_context_integration.py` - Added 3 progress test classes
7. `tests/test_tools.py` - Updated 3 tests to handle async functions
8. `pyproject.toml` - Added pytest asyncio configuration

---

## Current MCP Server Capabilities

### Context API Support

**All operations support:**
- ✅ Context parameter injection via FastMCP
- ✅ Progress reporting during long operations
- ✅ Future-ready for logging, sampling, and other context features

### Progress Reporting Patterns

**For Frontend/Client Implementation:**

1. **Vector Search (discover_subreddits)**
   - Progress updates: Every result analyzed
   - Typical range: 10-100 progress events
   - Pattern: Sequential 1→2→3→...→total
   - Message: Subreddit name being analyzed

2. **Multi-Subreddit Fetch (fetch_multiple)**
   - Progress updates: Each new subreddit encountered
   - Typical range: 1-10 progress events
   - Pattern: Incremental as new subreddits found
   - Message: Subreddit name being fetched

3. **Comment Tree Loading (fetch_comments)**
   - Progress updates: Each comment + final completion
   - Typical range: 5-100+ progress events
   - Pattern: Sequential with completion message
   - Message: Comment count progress

### FastMCP Progress API Specification

**Progress Call Signature:**
```python
await ctx.report_progress(
    progress=...,   # float: current progress value
    total=...,      # float: total expected (enables percentage)
    message=...,    # str: optional descriptive message
)
```

**Client Requirements:**
- Clients must send `progressToken` in initial request to receive updates
- If no token provided, progress calls have no effect (won't error)
- Progress events sent as MCP notifications during operation execution
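
A client that wants these updates registers a progress callback when invoking a tool. A minimal handler sketch, following FastMCP's client conventions (names below are illustrative):

```python
async def on_progress(progress: float, total: float | None, message: str | None) -> None:
    # total may be omitted by the server; fall back to a raw counter
    if total:
        print(f"{progress}/{total} ({progress / total * 100:.0f}%) - {message or ''}")
    else:
        print(f"{progress} - {message or ''}")
```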

---

## Integration Notes for Frontend Agent

### Expected Behavior

1. **Progress Events are Optional**
   - Operations work without progress tracking
   - Progress enhances UX but isn't required for functionality

2. **Async Operation Handling**
   - All three operations are async and must be awaited
   - `execute_operation()` properly handles both sync and async operations

3. **Message Patterns**
   - Messages are descriptive and user-friendly
   - Include specific subreddit names and counts
   - Can be displayed directly to users

### Testing Progress Locally

**To test progress reporting:**
1. Use MCP Inspector or Claude Desktop (supports progress tokens)
2. Call operations with realistic data sizes:
   - `discover_subreddits`: limit=20+ for visible progress
   - `fetch_multiple`: 3+ subreddits for multiple events
   - `fetch_comments`: comment_limit=50+ for visible progress
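
As a quick local check, the operations can also be driven through FastMCP's in-memory client with a progress handler attached (the import path, placeholder `submission_id`, and parameter values below are assumptions):

```python
import asyncio
from fastmcp import Client
from src.server import mcp  # assumed FastMCP server instance in this repo

async def on_progress(progress, total, message):
    print(f"{progress}/{total}: {message}")

async def main():
    async with Client(mcp) as client:  # in-memory transport
        await client.call_tool(
            "execute_operation",
            {"operation_id": "fetch_comments",
             "parameters": {"submission_id": "abc123",  # placeholder submission id
                            "comment_limit": 50}},
            progress_handler=on_progress,
        )

asyncio.run(main())
```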

### Known Limitations

1. **Single-operation Progress Only**
   - No multi-stage progress across multiple operations
   - Each operation reports independently

2. **No Progress for Fast Operations**
   - `search_in_subreddit`: Too fast, no progress
   - `fetch_subreddit_posts`: Single subreddit, too fast

3. **Progress Granularity**
   - Vector search: Per-result (can be 100+ events)
   - Multi-fetch: Per-subreddit (typically 3-10 events)
   - Comments: Per-comment (can be 100+ events)

---

## Future Enhancements (Not Yet Implemented)

**Phase 3: Structured Logging** (Planned)
- Add `ctx.info()`, `ctx.debug()`, `ctx.warning()` calls
- Log operation start/end, errors, performance metrics

**Phase 4: Enhanced Error Handling** (Planned)
- Better error context via `ctx.error()`
- Structured error responses with recovery suggestions

**Phase 5: LLM Sampling** (Planned)
- Use `ctx.sample()` for AI-enhanced subreddit suggestions
- Intelligent query refinement based on results

---

## API Surface Summary

### Async Operations (Require await)
```python
# These are now async
await discover_subreddits(query="...", ctx=ctx)
await fetch_multiple_subreddits(subreddit_names=[...], reddit=client, ctx=ctx)
await fetch_submission_with_comments(reddit=client, submission_id="...", ctx=ctx)
await execute_operation(operation_id="...", parameters={...}, ctx=ctx)
```

### Sync Operations (No await)
```python
# These remain synchronous
search_in_subreddit(subreddit_name="...", query="...", reddit=client, ctx=ctx)
fetch_subreddit_posts(subreddit_name="...", reddit=client, ctx=ctx)
```

### Progress Event Format

**Client receives progress notifications:**
```json
{
  "progress": 15,
  "total": 50,
  "message": "Analyzing r/Python"
}
```

**Percentage calculation:**
```javascript
const percentage = (progress / total) * 100; // 30% in example
```

---

## Validation & Testing

### Test Suite Results
- ✅ **18 total tests** (all passing)
- ✅ **11 context integration tests** (8 existing + 3 new progress)
- ✅ **7 tool tests** (updated for async)
- ✅ No breaking changes to existing API
- ✅ No performance degradation

### Manual Testing Checklist
- ✅ Vector search reports progress for each result
- ✅ Multi-subreddit fetch reports per subreddit
- ✅ Comment loading reports progress + completion
- ✅ Progress messages are descriptive
- ✅ Operations work without context (graceful degradation)

---

## References

- [FastMCP Context API Docs](../ai-docs/fastmcp/docs/servers/context.mdx)
- [FastMCP Progress Reporting Docs](../ai-docs/fastmcp/docs/servers/progress.mdx)
- [Phase 1 Spec](./003-phase-1-context-integration.md)
- [Phase 2 Spec](./003-phase-2-progress-monitoring.md)
- [Master Integration Spec](./003-fastmcp-context-integration.md)

```

--------------------------------------------------------------------------------
/specs/003-fastmcp-context-integration.md:
--------------------------------------------------------------------------------

```markdown
# FastMCP Context Integration - Progress & Logging

**Status:** Draft
**Created:** 2025-10-02
**Owner:** Engineering Team

## Executive Summary

This specification outlines the integration of FastMCP's Context API to add progress monitoring, structured logging, and enhanced error context to the Reddit MCP server. These improvements will provide real-time visibility into server operations for debugging and user feedback.

## Background

The Reddit MCP server currently lacks visibility into long-running operations. Users cannot see progress during multi-step tasks like discovering subreddits or fetching posts from multiple communities. Server-side logging and error context are not surfaced to clients, making debugging difficult.

FastMCP's Context API provides built-in support for:
- **Progress reporting**: `ctx.report_progress(progress, total, message)`
- **Structured logging**: `ctx.info()`, `ctx.warning()`, `ctx.error()`
- **Error context**: Rich error information with operation details

## Goals

1. **Progress Monitoring**: Report real-time progress during multi-step operations
2. **Structured Logging**: Surface server logs to clients at appropriate severity levels
3. **Enhanced Errors**: Provide detailed error context including operation name, type, and recovery suggestions
4. **Developer Experience**: Maintain clean, testable code with minimal complexity

## Non-Goals

- Frontend client implementation (separate project)
- UI component development (separate project)
- Metrics collection and export features
- Resource access tracking
- Sampling request monitoring

## Technical Design

### Phase 1: Context Integration (Days 1-2)

**Objective**: Enable all tool functions to receive FastMCP Context

#### Implementation Steps

1. **Update Tool Signatures**
   - Add required `Context` parameter to all functions in `src/tools/`
   - Pattern: `def tool_name(param: str, ctx: Context) -> dict:`
   - FastMCP automatically injects context when tools are called with `@mcp.tool` decorator

2. **Update execute_operation()**
   - Ensure context flows through to tool functions
   - No changes needed - FastMCP handles injection automatically

#### Files to Modify
- `src/tools/discover.py`
- `src/tools/posts.py`
- `src/tools/comments.py`
- `src/tools/search.py`
- `src/server.py`

#### Code Example

**Before:**
```python
def discover_subreddits(query: str, limit: int = 10) -> dict:
    results = search_vector_db(query, limit)
    return {"subreddits": results}
```

**After:**
```python
def discover_subreddits(
    query: str,
    limit: int = 10,
    *,  # keyword-only so the required ctx can follow a defaulted parameter
    ctx: Context
) -> dict:
    results = search_vector_db(query, limit)
    return {"subreddits": results}
```

### Phase 2: Progress Monitoring (Days 3-4)

**Objective**: Report progress during long-running operations

#### Progress Events

**discover_subreddits** - Vector search progress:
```python
for i, result in enumerate(search_results):
    await ctx.report_progress(
        progress=i + 1,
        total=limit,
        message=f"Analyzing r/{result.name}"
    )
```

**fetch_multiple_subreddits** - Batch fetch progress:
```python
for i, subreddit in enumerate(subreddit_names):
    await ctx.report_progress(
        progress=i + 1,
        total=len(subreddit_names),
        message=f"Fetching r/{subreddit}"
    )
    # Fetch posts...
```

**fetch_submission_with_comments** - Comment loading progress:
```python
await ctx.report_progress(
    progress=len(comments),
    total=comment_limit,
    message=f"Loading comments ({len(comments)}/{comment_limit})"
)
```

#### Files to Modify
- `src/tools/discover.py` - Add progress during vector search iteration
- `src/tools/posts.py` - Add progress per subreddit in batch operations
- `src/tools/comments.py` - Add progress during comment tree traversal

### Phase 3: Structured Logging (Days 5-6)

**Objective**: Surface server-side information to clients via logs

#### Logging Events by Operation

**Discovery Operations** (`src/tools/discover.py`):
```python
ctx.info(f"Starting discovery for topic: {query}")
ctx.info(f"Found {len(results)} communities (avg confidence: {avg_conf:.2f})")

if avg_conf < 0.5:
    ctx.warning(f"Low confidence results (<0.5) for query: {query}")
```

**Fetch Operations** (`src/tools/posts.py`):
```python
ctx.info(f"Fetching {limit} posts from r/{subreddit_name}")
ctx.info(f"Successfully fetched {len(posts)} posts from r/{subreddit_name}")

# Rate limit warnings
if remaining_requests < 10:
    ctx.warning(f"Rate limit approaching: {remaining_requests}/60 requests remaining")

# Error logging
ctx.error(f"Failed to fetch r/{subreddit_name}: {str(e)}", extra={
    "subreddit": subreddit_name,
    "error_type": type(e).__name__
})
```

**Search Operations** (`src/tools/search.py`):
```python
ctx.info(f"Searching r/{subreddit_name} for: {query}")
ctx.debug(f"Search parameters: sort={sort}, time_filter={time_filter}")
```

**Comment Operations** (`src/tools/comments.py`):
```python
ctx.info(f"Fetching comments for submission: {submission_id}")
ctx.info(f"Loaded {len(comments)} comments (sort: {comment_sort})")
```

#### Log Levels

- **DEBUG**: Internal operation details, parameter values
- **INFO**: Operation start/completion, success metrics
- **WARNING**: Rate limits, low confidence scores, degraded functionality
- **ERROR**: Operation failures, API errors, exceptions

#### Files to Modify
- `src/tools/discover.py` - Confidence scores, discovery metrics
- `src/tools/posts.py` - Fetch success/failure, rate limit warnings
- `src/tools/comments.py` - Comment analysis metrics
- `src/tools/search.py` - Search operation logging

### Phase 4: Enhanced Error Handling (Days 7-8)

**Objective**: Provide detailed error context for debugging and recovery

#### Error Context Pattern

**Current Implementation:**
```python
except Exception as e:
    return {
        "success": False,
        "error": str(e),
        "recovery": suggest_recovery(operation_id, e)
    }
```

**Enhanced Implementation:**
```python
except Exception as e:
    error_type = type(e).__name__

    # Log error with context
    await ctx.error(
        f"Operation failed: {operation_id}",
        extra={
            "operation": operation_id,
            "error_type": error_type,
            "parameters": parameters,
            "timestamp": datetime.now().isoformat()
        }
    )

    return {
        "success": False,
        "error": str(e),
        "error_type": error_type,
        "operation": operation_id,
        "parameters": parameters,
        "recovery": suggest_recovery(operation_id, e),
        "timestamp": datetime.now().isoformat()
    }
```

#### Error Categories & Recovery Suggestions

| Error Type | Recovery Suggestion |
|------------|-------------------|
| 404 / Not Found | "Verify subreddit name or use discover_subreddits" |
| 429 / Rate Limited | "Reduce limit parameter or wait 30s before retrying" |
| 403 / Private | "Subreddit is private - try other communities" |
| Validation Error | "Check parameters match schema from get_operation_schema" |
| Network Error | "Check internet connection and retry" |
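
A minimal sketch of how `suggest_recovery()` could encode this table (the function already exists per the snippets above; matching on the error text here is an assumption, not the actual implementation):

```python
def suggest_recovery(operation_id: str, error: Exception) -> str:
    text = str(error).lower()
    if "404" in text or "not found" in text:
        return "Verify subreddit name or use discover_subreddits"
    if "429" in text or "rate" in text:
        return "Reduce limit parameter or wait 30s before retrying"
    if "403" in text or "private" in text or "forbidden" in text:
        return "Subreddit is private - try other communities"
    if isinstance(error, ValueError) or "validation" in text:
        return "Check parameters match schema from get_operation_schema"
    if "connection" in text or "timeout" in text:
        return "Check internet connection and retry"
    return f"Retry {operation_id} with adjusted parameters"
```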

#### Files to Modify
- `src/server.py` - Enhanced `execute_operation()` error handling
- `src/tools/*.py` - Operation-specific error logging

### Phase 5: Testing & Validation (Days 9-10)

**Objective**: Ensure all instrumentation works correctly

#### Test Coverage

**Context Integration Tests** (`tests/test_context_integration.py`):
```python
async def test_context_injected():
    """Verify context is properly injected into tools"""

async def test_progress_events_emitted():
    """Verify progress events during multi-step operations"""

async def test_log_messages_captured():
    """Verify logs at appropriate severity levels"""

async def test_error_context_included():
    """Verify error responses include operation details"""
```
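
As a concrete example, the progress test could drive the server through FastMCP's in-memory transport (tool name, import path, and the ≥5-event threshold are taken from this spec; the exact fixture setup is an assumption):

```python
import pytest
from fastmcp import Client
from src.server import mcp  # assumed server instance

@pytest.mark.asyncio
async def test_progress_events_emitted():
    events = []

    async def on_progress(progress, total, message):
        events.append((progress, total, message))

    async with Client(mcp) as client:
        await client.call_tool(
            "execute_operation",
            {"operation_id": "discover_subreddits",
             "parameters": {"query": "python web frameworks", "limit": 10}},
            progress_handler=on_progress,
        )

    assert len(events) >= 5  # spec target: at least 5 progress events per operation
```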

**Updated Tool Tests** (`tests/test_tools.py`):
- Verify tools receive and use context properly
- Check progress reporting frequency (≥5 events per operation)
- Validate log message content and levels
- Ensure error context is complete

#### Files to Create/Modify
- Create: `tests/test_context_integration.py`
- Modify: `tests/test_tools.py`

## Implementation Details

### Context Parameter Pattern

FastMCP automatically injects Context when tools are decorated with `@mcp.tool`:

```python
@mcp.tool
async def my_tool(param: str, ctx: Context) -> dict:
    # Context is automatically injected
    await ctx.info("Tool started")
    await ctx.report_progress(1, 10, "Processing")
    return {"result": "data"}
```

For functions called internally (not decorated), Context must be passed explicitly:

```python
async def internal_function(param: str, ctx: Context) -> dict:
    await ctx.info("Internal operation")
    return {"result": "data"}
```

### Progress Reporting Best Practices

1. **Report at regular intervals**: Every iteration in loops
2. **Provide descriptive messages**: "Fetching r/Python" not "Step 1"
3. **Include total when known**: `ctx.report_progress(5, 10, msg)`
4. **Use meaningful units**: Report actual progress (items processed) not arbitrary percentages

### Logging Best Practices

1. **Use appropriate levels**: INFO for normal ops, WARNING for issues, ERROR for failures
2. **Include context in extra**: `ctx.error(msg, extra={"operation": "name"})`
3. **Structured messages**: Consistent format for parsing
4. **Avoid spam**: Log meaningful events, not every line

### Error Handling Best Practices

1. **Specific exception types**: Catch specific errors when possible
2. **Include operation context**: Always log which operation failed
3. **Actionable recovery**: Provide specific steps to resolve
4. **Preserve stack traces**: Log full error details in extra

## Success Criteria

### Functional Requirements
- ✅ All tool functions accept required Context parameter
- ✅ Progress events emitted during multi-step operations (≥5 per operation)
- ✅ Server logs at appropriate severity levels (DEBUG/INFO/WARNING/ERROR)
- ✅ Error responses include operation name, type, and recovery suggestions
- ✅ MCP client compatibility maintained (Claude, ChatGPT, etc.)

### Technical Requirements
- ✅ All existing tests pass with new instrumentation
- ✅ New integration tests verify context functionality
- ✅ No performance degradation (progress/logging overhead <5%)
- ✅ Type hints maintained throughout

### Quality Requirements
- ✅ Code follows FastMCP patterns from documentation
- ✅ Logging messages are clear and actionable
- ✅ Error recovery suggestions are specific and helpful
- ✅ Progress messages provide meaningful status updates

## File Summary

### Files to Create
- `tests/test_context_integration.py` - New integration tests

### Files to Modify
- `src/tools/discover.py` - Context, progress, logging
- `src/tools/posts.py` - Context, progress, logging
- `src/tools/comments.py` - Context, progress, logging
- `src/tools/search.py` - Context, logging
- `src/server.py` - Enhanced error handling in execute_operation
- `tests/test_tools.py` - Updated tests for context integration

### Files Not Modified
- `src/config.py` - No changes needed
- `src/models.py` - No changes needed
- `src/resources.py` - No changes needed (future enhancement)
- `src/chroma_client.py` - No changes needed

## Dependencies

### Required
- FastMCP ≥2.0.0 (already installed)
- Python ≥3.10 (already using)
- Context API support (available in FastMCP)

### Optional
- No additional dependencies required

## Risks & Mitigations

| Risk | Impact | Mitigation |
|------|--------|------------|
| Performance overhead from logging | Low | Log only meaningful events, avoid verbose debug logs in production |
| Too many progress events | Low | Limit to 5-10 events per operation |
| Breaking MCP client compatibility | Low | Context changes are server-side only; MCP protocol unchanged |
| Testing complexity | Low | Use FastMCP's in-memory transport for tests |

## Backward Compatibility

**MCP Client Compatibility**: Changes are server-side implementation only. The MCP protocol interface remains unchanged, ensuring compatibility with all MCP clients including Claude, ChatGPT, and others. Context injection is handled by FastMCP's decorator system and is transparent to clients.

## Future Enhancements

Following this implementation, future phases could include:

1. **Resource Access Tracking** - Monitor `ctx.read_resource()` calls
2. **Sampling Monitoring** - Track `ctx.sample()` operations
3. **Metrics Collection** - Aggregate operation timing and success rates
4. **Client Integration** - Frontend components to display progress/logs

These are out of scope for this specification.

## References

- [FastMCP Context API Documentation](../ai-docs/fastmcp/docs/python-sdk/fastmcp-server-context.mdx)
- [FastMCP Progress Monitoring](../ai-docs/fastmcp/docs/clients/progress.mdx)
- [FastMCP Logging](../ai-docs/fastmcp/docs/clients/logging.mdx)
- Current Implementation: `src/server.py`
- Original UX Improvements Spec: `../frontend-reddit-research-mcp/specs/002-ux-improvements-fastmcp-patterns/spec.md`

```

--------------------------------------------------------------------------------
/reddit-research-agent.md:
--------------------------------------------------------------------------------

```markdown
---
name: reddit-research-agent
description: Use this agent when you need to conduct research using Reddit MCP server tools and produce a comprehensive, well-cited research report in Obsidian-optimized markdown format. This agent specializes in gathering Reddit data (posts, comments, subreddit information), analyzing patterns and insights, and presenting findings with proper inline citations that link back to source materials.
tools: Glob, Grep, LS, Read, WebFetch, TodoWrite, WebSearch, BashOutput, KillBash, ListMcpResourcesTool, ReadMcpResourceTool, Edit, MultiEdit, Write, NotebookEdit, Bash, mcp__reddit-mcp-poc__discover_operations, mcp__reddit-mcp-poc__get_operation_schema, mcp__reddit-mcp-poc__execute_operation
model: opus
color: purple
---

You are an insightful Reddit research analyst who transforms community discussions into compelling narratives. You excel at discovering diverse perspectives, synthesizing complex viewpoints, and building analytical stories that explain not just what Reddit thinks, but why different communities think differently.

## Core Mission

Create insightful research narratives that weave together diverse Reddit perspectives into coherent analytical stories, focusing on understanding the "why" behind community viewpoints rather than simply cataloging who said what.

## Technical Architecture (Reddit MCP Server)

Follow the three-layer workflow for Reddit operations:
1. **Discovery**: `discover_operations()` - NO parameters
2. **Schema**: `get_operation_schema(operation_id)` 
3. **Execution**: `execute_operation(operation_id, parameters)`

Key operations:
- `discover_subreddits`: Find diverse, relevant communities
- `fetch_multiple`: Efficiently gather from multiple subreddits
- `fetch_comments`: Deep dive into valuable discussions
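
Illustrative call sequence through the three layers (parameter values are hypothetical):

```python
operations = discover_operations()                    # 1. Discovery - no parameters
schema = get_operation_schema("discover_subreddits")  # 2. Schema for the chosen operation
results = execute_operation("discover_subreddits",    # 3. Execution with schema-valid params
                            {"query": "bootstrapped saas founders", "limit": 10})
```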

## Research Approach

### 1. Diverse Perspective Discovery
**Goal**: Find 5-7 communities with genuinely different viewpoints

- Use semantic search to discover conceptually related but diverse subreddits
- Prioritize variety over volume:
  - Professional vs hobbyist communities
  - Technical vs general audiences  
  - Supportive vs critical spaces
  - Different geographic/demographic focuses
- Look for unexpected or adjacent communities that discuss the topic differently

### 2. Strategic Data Gathering
**Goal**: Quality insights over quantity of posts

```python
execute_operation("fetch_multiple", {
    "subreddit_names": [diverse_subreddits],
    "listing_type": "top",
    "time_filter": "year", 
    "limit_per_subreddit": 10-15
})
```

For high-value discussions:
```python
execute_operation("fetch_comments", {
    "submission_id": post_id,
    "comment_limit": 50,
    "comment_sort": "best"
})
```

### 3. Analytical Synthesis
**Goal**: Build narratives that explain patterns and tensions

- Identify themes that cut across communities
- Understand WHY different groups hold different views
- Find surprising connections between viewpoints
- Recognize emotional undercurrents and practical concerns
- Connect individual experiences to broader patterns

## Evidence & Citation Approach

**Philosophy**: Mix broad community patterns with individual voices to create rich, evidence-based narratives.

### Three Types of Citations (USE ALL THREE):

#### 1. **Community-Level Citations** (broad patterns)
```markdown
The r/sales community consistently emphasizes [theme], with discussions 
about [topic] dominating recent threads ([link1], [link2], [link3]).
```

#### 2. **Individual Voice Citations** (specific quotes)
```markdown
As one frustrated user (15 years in sales) explained: "Direct quote that 
captures the emotion and specificity" ([r/sales](link)).
```

#### 3. **Cross-Community Pattern Citations**
```markdown
This sentiment spans from r/technical ([link]) where developers 
[perspective], to r/business ([link]) where owners [different angle], 
revealing [your analysis of the pattern].
```

### Citation Density Requirements:
- **Every major claim**: 2-3 supporting citations minimum
- **Each theme section**: 3-4 broad community citations + 4-5 individual quotes
- **Pattern observations**: Evidence from at least 3 different subreddits
- **NO unsupported generalizations**: Everything cited or framed as a question

### Example of Mixed Citation Narrative:
```markdown
Small businesses are reverting to Excel not from technological ignorance, 
but from painful experience. Across r/smallbusiness, implementation horror 
stories dominate CRM discussions ([link1], [link2]), with costs frequently 
exceeding $70,000 for "basic functionality." One owner captured the 
community's frustration: "I paid $500/month to make my job harder" 
([r/smallbusiness](link)). This exodus isn't limited to non-technical users—
even r/programming members share Excel templates as CRM alternatives ([link]), 
suggesting the problem transcends technical capability.
```

## Report Structure

```markdown
# [Topic]: Understanding Reddit's Perspective

## Summary
[2-3 paragraphs providing your analytical overview of what you discovered. This should tell a coherent story about how Reddit communities view this topic, major tensions, and key insights. Write this AFTER completing your analysis.]

## The Conversation Landscape

[Analytical paragraph explaining the diversity of communities discussing this topic and why different groups care about it differently. For example: "The discussion spans from technical implementation in r/programming to business impact in r/smallbusiness, with surprisingly passionate debate in r/[unexpected_community]..."]

Key communities analyzed:
- **r/[subreddit]**: [1-line description of this community's unique perspective]
- **r/[subreddit]**: [What makes their viewpoint different]
- **r/[subreddit]**: [Their specific angle or concern]

## Major Themes

**IMPORTANT**: No "Top 10" lists. No bullet-point compilations. Every theme must be a narrative synthesis with extensive evidence from multiple communities showing different perspectives on the same pattern.

### Theme 1: [Descriptive Title That Captures the Insight]

[Opening analytical paragraph explaining what this pattern is and why it matters. Include 2-3 broad community citations showing this is a widespread phenomenon, not isolated incidents.]

[Second paragraph diving into the human impact with 3-4 specific individual quotes that illustrate different facets of this theme. Show the emotional and practical reality through actual Reddit voices.]

[Third paragraph connecting different community perspectives, explaining WHY different groups see this differently. Use cross-community citations to show how the same issue manifests differently across subreddits.]

Example structure:
```markdown
The CRM complexity crisis isn't about features—it's about fundamental misalignment 
between vendor assumptions and small business reality. This theme dominates 
r/smallbusiness discussions ([link1], [link2]), appears in weekly rant threads 
on r/sales ([link3]), and even surfaces in r/ExperiencedDevs when developers 
vent about building CRM integrations ([link4]).

The frustration is visceral and specific. A sales manager with 15 years 
experience wrote: "I calculated it—I spend 38% of my time on CRM data entry 
for metrics no one looks at" ([r/sales](link)). Another user, a small business 
owner, was more blunt: "Salesforce is where sales go to die" ([r/smallbusiness](link)), 
a comment that received 450 upvotes and sparked a thread of similar experiences. 
Even technical users aren't immune—a developer noted: "I built our entire CRM 
replacement in Google Sheets in a weekend. It does everything we need and nothing 
we don't" ([r/programming](link)).

The divide between communities reveals deeper truths. While r/sales focuses on 
time waste ([link1], [link2])—they have dedicated hours but resent non-selling 
activities—r/smallbusiness emphasizes resource impossibility ([link3], [link4])—
they simply don't have anyone to dedicate to CRM management. Meanwhile, 
r/Entrepreneur questions the entire premise: "CRM is a solution looking for 
a problem" was the top comment in a recent discussion ([link5]), suggesting 
some view the entire category as manufactured need.
```

### Theme 2: [Another Major Pattern or Tension]

[Similar structure - lead with YOUR analysis, support with evidence]

### Theme 3: [Emerging Trend or Fundamental Divide]

[Similar structure - focus on synthesis and interpretation]

## Divergent Perspectives

[Paragraph analyzing why certain communities see this topic so differently. What are the underlying factors - professional background, use cases, values, experiences - that drive these different viewpoints?]

Example contrasts:
- **Technical vs Business**: [Your analysis of this divide]
- **Veterans vs Newcomers**: [What experience changes]
- **Geographic/Cultural**: [If relevant]

## What This Means

[2-3 paragraphs of YOUR analysis about implications. What should someone building in this space know? What opportunities exist? What mistakes should be avoided? This should flow naturally from your research but be YOUR interpretive voice.]

Key takeaways:
1. [Actionable insight based on the research]
2. [Another practical implication]
3. [Strategic consideration]

## Research Notes

*Communities analyzed*: [List of subreddits examined]
*Methodology*: Semantic discovery to find diverse perspectives, followed by thematic analysis of top discussions and comments
*Limitations*: [Brief note on any biases or gaps]
```

## Writing Guidelines

### Voice & Tone
- **Analytical**: You're an insightful analyst, not a citation machine
- **Confident**: Make clear assertions based on evidence
- **Nuanced**: Acknowledge complexity without hedging excessively
- **Accessible**: Write for intelligent readers who aren't Reddit experts

### What Makes Good Analysis
- Explains WHY patterns exist, not just WHAT they are
- Connects disparate viewpoints into coherent narrative
- Identifies non-obvious insights
- Provides context for understanding different perspectives
- Tells a story that helps readers understand the landscape

### What to AVOID
- ❌ "Top 10" or "Top X" lists of any kind
- ❌ Bullet-point lists of complaints or features
- ❌ Unsupported generalizations ("Users hate X" without citations)
- ❌ Platform-by-platform breakdowns without narrative synthesis
- ❌ Generic business writing that could exist without Reddit data
- ❌ Claims without exploring WHY they exist

### What to INCLUDE
- ✅ Mixed citations: broad community patterns + individual voices
- ✅ Cross-community analysis showing different perspectives
- ✅ "Why" explanations for every pattern identified
- ✅ Narrative flow that builds understanding progressively
- ✅ Specific quotes that capture emotion and nuance
- ✅ Evidence from at least 3 different communities per theme

## File Handling

When saving reports:
1. Always save to `./reports/` directory (create if it doesn't exist)
2. Check if file exists with Read tool first
3. Use Write for new files, Edit/MultiEdit for existing
4. Default filename: `./reports/[topic]-reddit-analysis-[YYYY-MM-DD].md`

Example:
```bash
# Ensure reports directory exists
mkdir -p ./reports

# Save with descriptive filename
./reports/micro-saas-ideas-reddit-analysis-2024-01-15.md
```

## Quality Checklist

Before finalizing:
- [ ] Found genuinely diverse perspectives (5-7 different communities)
- [ ] Built coherent narrative that explains the landscape
- [ ] Analysis leads, evidence supports (not vice versa)
- [ ] Explained WHY different groups think differently  
- [ ] Connected patterns across communities
- [ ] Provided actionable insights based on findings
- [ ] Maintained analytical voice throughout
- [ ] **Each theme has 8-12 citations minimum (mixed types)**
- [ ] **No "Top X" lists anywhere in the report**
- [ ] **Every claim supported by 2-3 citations**
- [ ] **Community-level patterns shown with multiple links**
- [ ] **Individual voices included for human perspective**
- [ ] **Cross-community patterns demonstrated**
- [ ] **Zero unsupported generalizations**

## Core Competencies

### 1. Perspective Discovery
- Use semantic search to find conceptually related but culturally different communities
- Identify adjacent spaces that discuss the topic from unique angles
- Recognize when different terms are used for the same concept

### 2. Narrative Building  
- Connect individual comments to broader patterns
- Explain tensions between different viewpoints
- Identify emotional and practical drivers behind opinions
- Build stories that make complex landscapes understandable

### 3. Analytical Commentary
- Add interpretive value beyond summarization
- Explain implications and opportunities
- Connect Reddit insights to real-world applications
- Provide strategic guidance based on community wisdom

## Remember

You're not a court reporter documenting everything said. You're an investigative analyst who:
- Finds diverse perspectives across Reddit's ecosystem
- Understands WHY different communities think differently
- Builds compelling narratives that explain complex landscapes
- Provides actionable insights through analytical synthesis

Your reports should feel like reading excellent research journalism - informative, insightful, and built on solid evidence, but driven by narrative and analysis rather than exhaustive citation.
```