This is page 2 of 3. Use http://codebase.md/king-of-the-grackles/reddit-mcp-poc?lines=true&page={x} to view the full context. # Directory Structure ``` ├── .env.sample ├── .gemini │ └── settings.json ├── .gitignore ├── .python-version ├── .specify │ ├── memory │ │ └── constitution.md │ ├── scripts │ │ └── bash │ │ ├── check-implementation-prerequisites.sh │ │ ├── check-task-prerequisites.sh │ │ ├── common.sh │ │ ├── create-new-feature.sh │ │ ├── get-feature-paths.sh │ │ ├── setup-plan.sh │ │ └── update-agent-context.sh │ └── templates │ ├── agent-file-template.md │ ├── plan-template.md │ ├── spec-template.md │ └── tasks-template.md ├── package.json ├── pyproject.toml ├── README.md ├── reddit-research-agent.md ├── reports │ ├── ai-llm-weekly-trends-reddit-analysis-2025-01-20.md │ ├── saas-solopreneur-reddit-communities.md │ ├── top-50-active-AI-subreddits.md │ ├── top-50-subreddits-saas-ai-builders.md │ └── top-50-subreddits-saas-solopreneurs.md ├── server.json ├── specs │ ├── 003-fastmcp-context-integration.md │ ├── 003-implementation-summary.md │ ├── 003-phase-1-context-integration.md │ ├── 003-phase-2-progress-monitoring.md │ ├── agent-reasoning-visibility.md │ ├── agentic-discovery-architecture.md │ ├── chroma-proxy-architecture.md │ ├── deep-research-reddit-architecture.md │ └── reddit-research-agent-spec.md ├── src │ ├── __init__.py │ ├── chroma_client.py │ ├── config.py │ ├── models.py │ ├── resources.py │ ├── server.py │ └── tools │ ├── __init__.py │ ├── comments.py │ ├── discover.py │ ├── posts.py │ └── search.py ├── tests │ ├── test_context_integration.py │ └── test_tools.py └── uv.lock ``` # Files -------------------------------------------------------------------------------- /reports/top-50-subreddits-saas-solopreneurs.md: -------------------------------------------------------------------------------- ```markdown 1 | # Top 50 Subreddits for SaaS Startup Founders & Solopreneurs 2 | 3 | *Research Date: 2025-09-20* 4 | *Generated using Reddit MCP Server with semantic vector search* 5 | 6 | ## Executive Summary 7 | 8 | This focused report identifies the top 50 Reddit communities specifically for **SaaS startup founders** and **solopreneurs**. These communities were selected based on: 9 | - Direct relevance to SaaS business models 10 | - Solo entrepreneurship focus 11 | - Bootstrapped/self-funded business approaches 12 | - Active engagement levels 13 | - Community quality and support culture 14 | 15 | ## Top 50 Subreddits - Ranked by Relevance 16 | 17 | ### 🎯 Tier 1: Must-Join Communities (Confidence > 0.8) 18 | *These are your highest-priority communities with direct ICP alignment* 19 | 20 | 1. **r/SaaS** - 374,943 subscribers | Confidence: 0.892 21 | - The primary SaaS community on Reddit 22 | - Topics: pricing, growth, tech stack, customer acquisition 23 | - https://reddit.com/r/SaaS 24 | 25 | 2. **r/indiehackers** - 105,674 subscribers | Confidence: 0.867 26 | - Solo founders and bootstrappers building profitable businesses 27 | - Strong focus on MRR milestones and transparency 28 | - https://reddit.com/r/indiehackers 29 | 30 | 3. **r/SoloFounders** - 2,113 subscribers | Confidence: 0.832 31 | - Dedicated community for solo entrepreneurs 32 | - Intimate setting for peer support and advice 33 | - https://reddit.com/r/SoloFounders 34 | 35 | ### 🚀 Tier 2: Core Communities (Confidence 0.7 - 0.8) 36 | 37 | 4. 
**r/startups** - 1,891,655 subscribers | Confidence: 0.729 38 | - Massive startup ecosystem community 39 | - Mix of bootstrapped and funded startups 40 | - https://reddit.com/r/startups 41 | 42 | 5. **r/SaaSy** - 3,150 subscribers | Confidence: 0.722 43 | - Focused SaaS discussions and case studies 44 | - https://reddit.com/r/SaaSy 45 | 46 | 6. **r/EntrepreneurRideAlong** - 604,396 subscribers | Confidence: 0.712 47 | - Document your entrepreneurial journey 48 | - Great for building in public 49 | - https://reddit.com/r/EntrepreneurRideAlong 50 | 51 | 7. **r/venturecapital** - 66,268 subscribers | Confidence: 0.721 52 | - Useful even for bootstrappers to understand funding landscape 53 | - https://reddit.com/r/venturecapital 54 | 55 | 8. **r/Entrepreneurs** - 77,330 subscribers | Confidence: 0.777 56 | - Active entrepreneur community with quality discussions 57 | - https://reddit.com/r/Entrepreneurs 58 | 59 | ### 💼 Tier 3: High-Value Communities (Confidence 0.6 - 0.7) 60 | 61 | 9. **r/Entrepreneur** - 4,871,109 subscribers | Confidence: 0.664 62 | - Largest entrepreneurship community 63 | - https://reddit.com/r/Entrepreneur 64 | 65 | 10. **r/EntrepreneurConnect** - 5,178 subscribers | Confidence: 0.691 66 | - Networking and collaboration focus 67 | - https://reddit.com/r/EntrepreneurConnect 68 | 69 | 11. **r/kickstarter** - 93,932 subscribers | Confidence: 0.658 70 | - Product launches and crowdfunding strategies 71 | - https://reddit.com/r/kickstarter 72 | 73 | 12. **r/small_business_ideas** - 23,034 subscribers | Confidence: 0.631 74 | - Idea validation and feedback 75 | - https://reddit.com/r/small_business_ideas 76 | 77 | 13. **r/Entrepreneurship** - 99,462 subscribers | Confidence: 0.619 78 | - Business strategy and growth discussions 79 | - https://reddit.com/r/Entrepreneurship 80 | 81 | ### 📊 Tier 4: Specialized Communities (Confidence 0.5 - 0.6) 82 | 83 | 14. **r/Business_Ideas** - 370,194 subscribers | Confidence: 0.521 84 | - Brainstorming and validating business concepts 85 | - https://reddit.com/r/Business_Ideas 86 | 87 | 15. **r/startup** - 225,696 subscribers | Confidence: 0.529 88 | - Startup ecosystem and resources 89 | - https://reddit.com/r/startup 90 | 91 | 16. **r/NoCodeSaaS** - 23,297 subscribers | Confidence: 0.329* 92 | - Building SaaS without coding 93 | - Perfect for non-technical founders 94 | - https://reddit.com/r/NoCodeSaaS 95 | 96 | 17. **r/Affiliatemarketing** - 239,731 subscribers | Confidence: 0.537 97 | - Revenue strategies for SaaS 98 | - https://reddit.com/r/Affiliatemarketing 99 | 100 | 18. **r/OnlineIncomeHustle** - 34,382 subscribers | Confidence: 0.517 101 | - Online business strategies 102 | - https://reddit.com/r/OnlineIncomeHustle 103 | 104 | 19. **r/SmallBusinessOwners** - 4,081 subscribers | Confidence: 0.501 105 | - Peer support for business owners 106 | - https://reddit.com/r/SmallBusinessOwners 107 | 108 | 20. **r/selfpublish** - 196,096 subscribers | Confidence: 0.483 109 | - Content creation and info products 110 | - https://reddit.com/r/selfpublish 111 | 112 | ### 🌍 Tier 5: Regional & Niche Communities 113 | 114 | 21. **r/indianstartups** - 76,422 subscribers | Confidence: 0.505 115 | - Indian startup ecosystem 116 | - https://reddit.com/r/indianstartups 117 | 118 | 22. **r/StartUpIndia** - 361,780 subscribers | Confidence: 0.432 119 | - Large Indian entrepreneur community 120 | - https://reddit.com/r/StartUpIndia 121 | 122 | 23. 
**r/IndianEntrepreneur** - 9,816 subscribers | Confidence: 0.446 123 | - Indian entrepreneur discussions 124 | - https://reddit.com/r/IndianEntrepreneur 125 | 126 | 24. **r/PhStartups** - 20,901 subscribers | Confidence: 0.359 127 | - Philippines startup community 128 | - https://reddit.com/r/PhStartups 129 | 130 | 25. **r/Startups_EU** - 2,894 subscribers | Confidence: 0.314 131 | - European startup ecosystem 132 | - https://reddit.com/r/Startups_EU 133 | 134 | ### 🛠️ Tier 6: Supporting Communities 135 | 136 | 26. **r/advancedentrepreneur** - 60,964 subscribers | Confidence: 0.464 137 | - For experienced entrepreneurs 138 | - https://reddit.com/r/advancedentrepreneur 139 | 140 | 27. **r/cofounderhunt** - 16,287 subscribers | Confidence: 0.456 141 | - Finding co-founders and team members 142 | - https://reddit.com/r/cofounderhunt 143 | 144 | 28. **r/sweatystartup** - 182,854 subscribers | Confidence: 0.432 145 | - Service businesses and local startups 146 | - https://reddit.com/r/sweatystartup 147 | 148 | 29. **r/ycombinator** - 139,403 subscribers | Confidence: 0.433 149 | - YC ecosystem and accelerator insights 150 | - https://reddit.com/r/ycombinator 151 | 152 | 30. **r/sidehustle** - 3,124,834 subscribers | Confidence: 0.486 153 | - Side projects that can become SaaS 154 | - https://reddit.com/r/sidehustle 155 | 156 | ### 💰 Tier 7: Business & Revenue Focus 157 | 158 | 31. **r/passive_income** - 851,987 subscribers | Confidence: 0.422 159 | - Building recurring revenue streams 160 | - https://reddit.com/r/passive_income 161 | 162 | 32. **r/SaaS_Email_Marketing** - 7,434 subscribers | Confidence: 0.465 163 | - Email marketing for SaaS 164 | - https://reddit.com/r/SaaS_Email_Marketing 165 | 166 | 33. **r/SocialMediaMarketing** - 197,241 subscribers | Confidence: 0.419 167 | - Marketing strategies for SaaS 168 | - https://reddit.com/r/SocialMediaMarketing 169 | 170 | 34. **r/equity_crowdfunding** - 3,112 subscribers | Confidence: 0.473 171 | - Alternative funding options 172 | - https://reddit.com/r/equity_crowdfunding 173 | 174 | 35. **r/AiForSmallBusiness** - 8,963 subscribers | Confidence: 0.378 175 | - AI tools for solopreneurs 176 | - https://reddit.com/r/AiForSmallBusiness 177 | 178 | ### 🎨 Tier 8: Creative & Indie Communities 179 | 180 | 36. **r/IndieGaming** - 412,025 subscribers | Confidence: 0.453 181 | - Indie game dev (similar mindset to SaaS) 182 | - https://reddit.com/r/IndieGaming 183 | 184 | 37. **r/IndieDev** - 295,248 subscribers | Confidence: 0.383 185 | - Independent development community 186 | - https://reddit.com/r/IndieDev 187 | 188 | 38. **r/PassionsToProfits** - 4,905 subscribers | Confidence: 0.468 189 | - Monetizing expertise 190 | - https://reddit.com/r/PassionsToProfits 191 | 192 | 39. **r/LawFirm** - 84,044 subscribers | Confidence: 0.437 193 | - Legal aspects of running a business 194 | - https://reddit.com/r/LawFirm 195 | 196 | 40. **r/Fiverr** - 64,568 subscribers | Confidence: 0.489 197 | - Freelancing and service offerings 198 | - https://reddit.com/r/Fiverr 199 | 200 | ### 🌐 Tier 9: Broader Business Communities 201 | 202 | 41. **r/smallbusiness** - 2,211,156 subscribers | Confidence: 0.345 203 | - General small business discussions 204 | - https://reddit.com/r/smallbusiness 205 | 206 | 42. **r/business** - 2,498,385 subscribers | Confidence: 0.457 207 | - Broad business topics 208 | - https://reddit.com/r/business 209 | 210 | 43. 
**r/smallbusinessUS** - 4,886 subscribers | Confidence: 0.464 211 | - US-focused small business 212 | - https://reddit.com/r/smallbusinessUS 213 | 214 | 44. **r/WholesaleRealestate** - 28,356 subscribers | Confidence: 0.447 215 | - Business model discussions 216 | - https://reddit.com/r/WholesaleRealestate 217 | 218 | 45. **r/selbststaendig** - 38,000 subscribers | Confidence: 0.364 219 | - German solopreneur community 220 | - https://reddit.com/r/selbststaendig 221 | 222 | ### 🔧 Tier 10: Tools & Resources 223 | 224 | 46. **r/YouTube_startups** - 127,440 subscribers | Confidence: 0.369 225 | - Content marketing for startups 226 | - https://reddit.com/r/YouTube_startups 227 | 228 | 47. **r/OnlineMarketing** - 3,744 subscribers | Confidence: 0.396 229 | - Digital marketing strategies 230 | - https://reddit.com/r/OnlineMarketing 231 | 232 | 48. **r/Businessideas** - 22,137 subscribers | Confidence: 0.389 233 | - Idea generation and validation 234 | - https://reddit.com/r/Businessideas 235 | 236 | 49. **r/BusinessVault** - 2,889 subscribers | Confidence: 0.348 237 | - Business resources and tools 238 | - https://reddit.com/r/BusinessVault 239 | 240 | 50. **r/simpleliving** - 1,447,715 subscribers | Confidence: 0.415 241 | - Lifestyle design for solopreneurs 242 | - https://reddit.com/r/simpleliving 243 | 244 | ## 🎯 Engagement Strategy for SaaS Founders & Solopreneurs 245 | 246 | ### Quick Start Guide 247 | 1. **Join Top 5 First:** 248 | - r/SaaS (primary community) 249 | - r/indiehackers (building in public) 250 | - r/SoloFounders (peer support) 251 | - r/startups (broad exposure) 252 | - r/EntrepreneurRideAlong (journey sharing) 253 | 254 | 2. **Weekly Engagement Plan:** 255 | - **Monday**: Share wins/milestones in r/EntrepreneurRideAlong 256 | - **Tuesday**: Ask for feedback in r/SaaS 257 | - **Wednesday**: Help others in r/indiehackers 258 | - **Thursday**: Network in r/SoloFounders 259 | - **Friday**: Share learnings in r/startups 260 | 261 | 3. **Content Types That Work:** 262 | - Case studies with real numbers (MRR, growth rates) 263 | - "How I built..." technical posts 264 | - Pricing strategy discussions 265 | - Tool stack reveals 266 | - Failure stories and lessons learned 267 | 268 | ### Community-Specific Tips 269 | 270 | **For r/SaaS:** 271 | - Share MRR milestones 272 | - Discuss pricing strategies 273 | - Ask about tech stack decisions 274 | - Share customer acquisition costs 275 | 276 | **For r/indiehackers:** 277 | - Be transparent about revenue 278 | - Document your journey 279 | - Share both wins and failures 280 | - Engage with other builders 281 | 282 | **For r/SoloFounders:** 283 | - Focus on work-life balance 284 | - Share productivity tips 285 | - Discuss delegation strategies 286 | - Mental health and burnout prevention 287 | 288 | ## 📊 Key Metrics to Track 289 | 290 | 1. **Engagement Quality**: Comments > Upvotes 291 | 2. **Connection Building**: DMs from relevant founders 292 | 3. **Traffic Generation**: Clicks to your product 293 | 4. **Brand Recognition**: Mentions in other threads 294 | 5. **Value Created**: Problems solved for others 295 | 296 | ## ⚠️ Common Mistakes to Avoid 297 | 298 | 1. **Over-promotion**: Follow 9:1 rule (9 value posts : 1 promotional) 299 | 2. **Generic content**: Tailor posts to each community's culture 300 | 3. **Ignoring rules**: Each subreddit has specific posting guidelines 301 | 4. **Not engaging**: Don't just post and leave 302 | 5. **Being inauthentic**: Genuine interactions build trust 303 | 304 | ## 🚀 Next Steps 305 | 306 | 1. 
**Week 1**: Join top 10 communities, observe culture 307 | 2. **Week 2**: Start engaging with comments 308 | 3. **Week 3**: Make first posts in top 3 communities 309 | 4. **Week 4**: Analyze what resonates, adjust strategy 310 | 5. **Month 2+**: Scale successful approaches 311 | 312 | --- 313 | 314 | *Note: This report focuses specifically on communities relevant to SaaS founders and solopreneurs. Confidence scores reflect semantic relevance to these specific ICPs. Community dynamics change, so regular monitoring is recommended.* 315 | 316 | *Strategy Tip: Focus on depth over breadth - better to be highly active in 5-10 communities than sporadically active in 50.* ``` -------------------------------------------------------------------------------- /specs/003-implementation-summary.md: -------------------------------------------------------------------------------- ```markdown 1 | # FastMCP Context API Implementation Summary 2 | 3 | **Status:** ✅ Complete 4 | **Date:** 2025-10-02 5 | **Phases Completed:** Phase 1 (Context Integration) + Phase 2 (Progress Monitoring) 6 | 7 | ## Overview 8 | 9 | This document summarizes the completed implementation of FastMCP's Context API integration into the Reddit MCP server. The implementation was completed in two phases and enables real-time progress reporting for long-running Reddit operations. 10 | 11 | ## Phase 1: Context Integration (Complete ✅) 12 | 13 | ### Goal 14 | Integrate FastMCP's `Context` parameter into all tool and operation functions to enable future context-aware features. 15 | 16 | ### Implementation Details 17 | 18 | **Scope:** All MCP tool functions and Reddit operation functions now accept `Context` as a parameter. 19 | 20 | #### Functions Updated 21 | - ✅ `discover_subreddits()` - Subreddit discovery via vector search 22 | - ✅ `search_in_subreddit()` - Search within specific subreddit 23 | - ✅ `fetch_subreddit_posts()` - Fetch posts from single subreddit 24 | - ✅ `fetch_multiple_subreddits()` - Batch fetch from multiple subreddits 25 | - ✅ `fetch_submission_with_comments()` - Fetch post with comment tree 26 | - ✅ `validate_subreddit()` - Validate subreddit exists in index 27 | - ✅ `_search_vector_db()` - Internal vector search helper 28 | - ✅ `parse_comment_tree()` - Internal comment parsing helper 29 | 30 | #### MCP Layer Functions 31 | - ✅ `discover_operations()` - Layer 1: Discovery 32 | - ✅ `get_operation_schema()` - Layer 2: Schema 33 | - ✅ `execute_operation()` - Layer 3: Execution 34 | 35 | ### Test Coverage 36 | - **8 integration tests** verifying context parameter acceptance 37 | - All tests verify functions accept `Context` without errors 38 | - Context parameter can be positioned anywhere in function signature 39 | 40 | ### Files Modified (Phase 1) 41 | 1. `src/tools/discover.py` - Added `ctx: Context = None` to all functions 42 | 2. `src/tools/search.py` - Added context parameter 43 | 3. `src/tools/posts.py` - Added context parameter 44 | 4. `src/tools/comments.py` - Added context parameter and forwarding 45 | 5. `src/server.py` - Updated MCP tools to accept and forward context 46 | 6. `tests/test_context_integration.py` - Created comprehensive test suite 47 | 48 | --- 49 | 50 | ## Phase 2: Progress Monitoring (Complete ✅) 51 | 52 | ### Goal 53 | Add real-time progress reporting to long-running Reddit operations using `ctx.report_progress()`. 54 | 55 | ### Implementation Details 56 | 57 | **Scope:** Three primary long-running operations now emit progress events. 
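For orientation, here is roughly what the consuming side of these events looks like. The sketch below is illustrative rather than part of the implementation: it assumes FastMCP's client API (a `Client` with a `progress_handler` passed to `call_tool`), a local stdio launch of `src/server.py`, and the `execute_operation` tool this server exposes.

```python
import asyncio
from fastmcp import Client


async def show_progress(progress: float, total: float | None, message: str | None) -> None:
    # Render a percentage when total is known; fall back to the raw counter otherwise.
    if total:
        print(f"[{progress / total * 100:3.0f}%] {message or ''}")
    else:
        print(f"[{progress}] {message or ''}")


async def main() -> None:
    # Path-based client: FastMCP runs the server as a stdio subprocess.
    async with Client("src/server.py") as client:
        result = await client.call_tool(
            "execute_operation",
            {
                "operation_id": "discover_subreddits",
                "parameters": {"query": "home automation", "limit": 20},
            },
            progress_handler=show_progress,  # receives the events described below
        )
        print(result)


asyncio.run(main())
```

The three operations below produce these events on the server side; a handler like the one above only renders them.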
58 | 59 | #### Operation 1: `discover_subreddits` - Vector Search Progress 60 | 61 | **File:** `src/tools/discover.py` 62 | 63 | **Progress Events:** 64 | - Reports progress for each subreddit analyzed during vector search 65 | - **Message Format:** `"Analyzing r/{subreddit_name}"` 66 | - **Frequency:** 10-100 events depending on `limit` parameter 67 | - **Progress Values:** `progress=i+1, total=total_results` 68 | 69 | **Implementation:** 70 | ```python 71 | async def _search_vector_db(...): 72 | total_results = len(results['metadatas'][0]) 73 | for i, (metadata, distance) in enumerate(...): 74 | if ctx: 75 | await ctx.report_progress( 76 | progress=i + 1, 77 | total=total_results, 78 | message=f"Analyzing r/{metadata.get('name', 'unknown')}" 79 | ) 80 | ``` 81 | 82 | #### Operation 2: `fetch_multiple_subreddits` - Batch Fetch Progress 83 | 84 | **File:** `src/tools/posts.py` 85 | 86 | **Progress Events:** 87 | - Reports progress when encountering each new subreddit 88 | - **Message Format:** `"Fetching r/{subreddit_name}"` 89 | - **Frequency:** 1-10 events (one per unique subreddit) 90 | - **Progress Values:** `progress=len(processed), total=len(subreddit_names)` 91 | 92 | **Implementation:** 93 | ```python 94 | async def fetch_multiple_subreddits(...): 95 | processed_subreddits = set() 96 | for submission in submissions: 97 | subreddit_name = submission.subreddit.display_name 98 | if subreddit_name not in processed_subreddits: 99 | processed_subreddits.add(subreddit_name) 100 | if ctx: 101 | await ctx.report_progress( 102 | progress=len(processed_subreddits), 103 | total=len(clean_names), 104 | message=f"Fetching r/{subreddit_name}" 105 | ) 106 | ``` 107 | 108 | #### Operation 3: `fetch_submission_with_comments` - Comment Tree Progress 109 | 110 | **File:** `src/tools/comments.py` 111 | 112 | **Progress Events:** 113 | - Reports progress during comment loading 114 | - Final completion message when done 115 | - **Message Format:** 116 | - During: `"Loading comments ({count}/{limit})"` 117 | - Complete: `"Completed: {count} comments loaded"` 118 | - **Frequency:** 5-100+ events depending on `comment_limit` 119 | - **Progress Values:** `progress=comment_count, total=comment_limit` 120 | 121 | **Implementation:** 122 | ```python 123 | async def fetch_submission_with_comments(...): 124 | for top_level_comment in submission.comments: 125 | if ctx: 126 | await ctx.report_progress( 127 | progress=comment_count, 128 | total=comment_limit, 129 | message=f"Loading comments ({comment_count}/{comment_limit})" 130 | ) 131 | # ... process comment 132 | 133 | # Final completion 134 | if ctx: 135 | await ctx.report_progress( 136 | progress=comment_count, 137 | total=comment_limit, 138 | message=f"Completed: {comment_count} comments loaded" 139 | ) 140 | ``` 141 | 142 | ### Async/Await Changes 143 | 144 | All three operations are now **async functions**: 145 | - ✅ `discover_subreddits()` → `async def discover_subreddits()` 146 | - ✅ `fetch_multiple_subreddits()` → `async def fetch_multiple_subreddits()` 147 | - ✅ `fetch_submission_with_comments()` → `async def fetch_submission_with_comments()` 148 | - ✅ `execute_operation()` → `async def execute_operation()` (conditionally awaits async operations) 149 | 150 | ### Test Coverage 151 | 152 | **New Test Classes (Phase 2):** 153 | 1. `TestDiscoverSubredditsProgress` - Verifies progress during vector search 154 | 2. `TestFetchMultipleProgress` - Verifies progress per subreddit 155 | 3. 
`TestFetchCommentsProgress` - Verifies progress during comment loading 156 | 157 | **Test Assertions:** 158 | - ✅ Progress called minimum expected times (based on data) 159 | - ✅ Progress includes `progress` and `total` parameters 160 | - ✅ AsyncMock properly configured for async progress calls 161 | 162 | **Total Test Results:** 18 tests, all passing ✅ 163 | 164 | ### Files Modified (Phase 2) 165 | 1. `src/tools/discover.py` - Made async, added progress reporting 166 | 2. `src/tools/posts.py` - Made async, added progress reporting 167 | 3. `src/tools/comments.py` - Made async, added progress reporting 168 | 4. `src/tools/search.py` - No changes (operation too fast for progress) 169 | 5. `src/server.py` - Made `execute_operation()` async with conditional await 170 | 6. `tests/test_context_integration.py` - Added 3 progress test classes 171 | 7. `tests/test_tools.py` - Updated 3 tests to handle async functions 172 | 8. `pyproject.toml` - Added pytest asyncio configuration 173 | 174 | --- 175 | 176 | ## Current MCP Server Capabilities 177 | 178 | ### Context API Support 179 | 180 | **All operations support:** 181 | - ✅ Context parameter injection via FastMCP 182 | - ✅ Progress reporting during long operations 183 | - ✅ Future-ready for logging, sampling, and other context features 184 | 185 | ### Progress Reporting Patterns 186 | 187 | **For Frontend/Client Implementation:** 188 | 189 | 1. **Vector Search (discover_subreddits)** 190 | - Progress updates: Every result analyzed 191 | - Typical range: 10-100 progress events 192 | - Pattern: Sequential 1→2→3→...→total 193 | - Message: Subreddit name being analyzed 194 | 195 | 2. **Multi-Subreddit Fetch (fetch_multiple)** 196 | - Progress updates: Each new subreddit encountered 197 | - Typical range: 1-10 progress events 198 | - Pattern: Incremental as new subreddits found 199 | - Message: Subreddit name being fetched 200 | 201 | 3. **Comment Tree Loading (fetch_comments)** 202 | - Progress updates: Each comment + final completion 203 | - Typical range: 5-100+ progress events 204 | - Pattern: Sequential with completion message 205 | - Message: Comment count progress 206 | 207 | ### FastMCP Progress API Specification 208 | 209 | **Progress Call Signature:** 210 | ```python 211 | await ctx.report_progress( 212 | progress: float, # Current progress value 213 | total: float, # Total expected (enables percentage) 214 | message: str # Optional descriptive message 215 | ) 216 | ``` 217 | 218 | **Client Requirements:** 219 | - Clients must send `progressToken` in initial request to receive updates 220 | - If no token provided, progress calls have no effect (won't error) 221 | - Progress events sent as MCP notifications during operation execution 222 | 223 | --- 224 | 225 | ## Integration Notes for Frontend Agent 226 | 227 | ### Expected Behavior 228 | 229 | 1. **Progress Events are Optional** 230 | - Operations work without progress tracking 231 | - Progress enhances UX but isn't required for functionality 232 | 233 | 2. **Async Operation Handling** 234 | - All three operations are async and must be awaited 235 | - `execute_operation()` properly handles both sync and async operations 236 | 237 | 3. **Message Patterns** 238 | - Messages are descriptive and user-friendly 239 | - Include specific subreddit names and counts 240 | - Can be displayed directly to users 241 | 242 | ### Testing Progress Locally 243 | 244 | **To test progress reporting:** 245 | 1. Use MCP Inspector or Claude Desktop (supports progress tokens) 246 | 2. 
Call operations with realistic data sizes: 247 | - `discover_subreddits`: limit=20+ for visible progress 248 | - `fetch_multiple`: 3+ subreddits for multiple events 249 | - `fetch_comments`: comment_limit=50+ for visible progress 250 | 251 | ### Known Limitations 252 | 253 | 1. **Single-operation Progress Only** 254 | - No multi-stage progress across multiple operations 255 | - Each operation reports independently 256 | 257 | 2. **No Progress for Fast Operations** 258 | - `search_in_subreddit`: Too fast, no progress 259 | - `fetch_subreddit_posts`: Single subreddit, too fast 260 | 261 | 3. **Progress Granularity** 262 | - Vector search: Per-result (can be 100+ events) 263 | - Multi-fetch: Per-subreddit (typically 3-10 events) 264 | - Comments: Per-comment (can be 100+ events) 265 | 266 | --- 267 | 268 | ## Future Enhancements (Not Yet Implemented) 269 | 270 | **Phase 3: Structured Logging** (Planned) 271 | - Add `ctx.info()`, `ctx.debug()`, `ctx.warning()` calls 272 | - Log operation start/end, errors, performance metrics 273 | 274 | **Phase 4: Enhanced Error Handling** (Planned) 275 | - Better error context via `ctx.error()` 276 | - Structured error responses with recovery suggestions 277 | 278 | **Phase 5: LLM Sampling** (Planned) 279 | - Use `ctx.sample()` for AI-enhanced subreddit suggestions 280 | - Intelligent query refinement based on results 281 | 282 | --- 283 | 284 | ## API Surface Summary 285 | 286 | ### Async Operations (Require await) 287 | ```python 288 | # These are now async 289 | await discover_subreddits(query="...", ctx=ctx) 290 | await fetch_multiple_subreddits(subreddit_names=[...], reddit=client, ctx=ctx) 291 | await fetch_submission_with_comments(reddit=client, submission_id="...", ctx=ctx) 292 | await execute_operation(operation_id="...", parameters={...}, ctx=ctx) 293 | ``` 294 | 295 | ### Sync Operations (No await) 296 | ```python 297 | # These remain synchronous 298 | search_in_subreddit(subreddit_name="...", query="...", reddit=client, ctx=ctx) 299 | fetch_subreddit_posts(subreddit_name="...", reddit=client, ctx=ctx) 300 | ``` 301 | 302 | ### Progress Event Format 303 | 304 | **Client receives progress notifications:** 305 | ```json 306 | { 307 | "progress": 15, 308 | "total": 50, 309 | "message": "Analyzing r/Python" 310 | } 311 | ``` 312 | 313 | **Percentage calculation:** 314 | ```javascript 315 | const percentage = (progress / total) * 100; // 30% in example 316 | ``` 317 | 318 | --- 319 | 320 | ## Validation & Testing 321 | 322 | ### Test Suite Results 323 | - ✅ **18 total tests** (all passing) 324 | - ✅ **11 context integration tests** (8 existing + 3 new progress) 325 | - ✅ **7 tool tests** (updated for async) 326 | - ✅ No breaking changes to existing API 327 | - ✅ No performance degradation 328 | 329 | ### Manual Testing Checklist 330 | - ✅ Vector search reports progress for each result 331 | - ✅ Multi-subreddit fetch reports per subreddit 332 | - ✅ Comment loading reports progress + completion 333 | - ✅ Progress messages are descriptive 334 | - ✅ Operations work without context (graceful degradation) 335 | 336 | --- 337 | 338 | ## References 339 | 340 | - [FastMCP Context API Docs](../ai-docs/fastmcp/docs/servers/context.mdx) 341 | - [FastMCP Progress Reporting Docs](../ai-docs/fastmcp/docs/servers/progress.mdx) 342 | - [Phase 1 Spec](./003-phase-1-context-integration.md) 343 | - [Phase 2 Spec](./003-phase-2-progress-monitoring.md) 344 | - [Master Integration Spec](./003-fastmcp-context-integration.md) 345 | ``` 
-------------------------------------------------------------------------------- /specs/003-fastmcp-context-integration.md: -------------------------------------------------------------------------------- ```markdown 1 | # FastMCP Context Integration - Progress & Logging 2 | 3 | **Status:** Draft 4 | **Created:** 2025-10-02 5 | **Owner:** Engineering Team 6 | 7 | ## Executive Summary 8 | 9 | This specification outlines the integration of FastMCP's Context API to add progress monitoring, structured logging, and enhanced error context to the Reddit MCP server. These improvements will provide real-time visibility into server operations for debugging and user feedback. 10 | 11 | ## Background 12 | 13 | The Reddit MCP server currently lacks visibility into long-running operations. Users cannot see progress during multi-step tasks like discovering subreddits or fetching posts from multiple communities. Server-side logging and error context are not surfaced to clients, making debugging difficult. 14 | 15 | FastMCP's Context API provides built-in support for: 16 | - **Progress reporting**: `ctx.report_progress(current, total, message)` 17 | - **Structured logging**: `ctx.info()`, `ctx.warning()`, `ctx.error()` 18 | - **Error context**: Rich error information with operation details 19 | 20 | ## Goals 21 | 22 | 1. **Progress Monitoring**: Report real-time progress during multi-step operations 23 | 2. **Structured Logging**: Surface server logs to clients at appropriate severity levels 24 | 3. **Enhanced Errors**: Provide detailed error context including operation name, type, and recovery suggestions 25 | 4. **Developer Experience**: Maintain clean, testable code with minimal complexity 26 | 27 | ## Non-Goals 28 | 29 | - Frontend client implementation (separate project) 30 | - UI component development (separate project) 31 | - Metrics collection and export features 32 | - Resource access tracking 33 | - Sampling request monitoring 34 | 35 | ## Technical Design 36 | 37 | ### Phase 1: Context Integration (Days 1-2) 38 | 39 | **Objective**: Enable all tool functions to receive FastMCP Context 40 | 41 | #### Implementation Steps 42 | 43 | 1. **Update Tool Signatures** 44 | - Add required `Context` parameter to all functions in `src/tools/` 45 | - Pattern: `def tool_name(param: str, ctx: Context) -> dict:` 46 | - FastMCP automatically injects context when tools are called with `@mcp.tool` decorator 47 | 48 | 2. 
**Update execute_operation()** 49 | - Ensure context flows through to tool functions 50 | - No changes needed - FastMCP handles injection automatically 51 | 52 | #### Files to Modify 53 | - `src/tools/discover.py` 54 | - `src/tools/posts.py` 55 | - `src/tools/comments.py` 56 | - `src/tools/search.py` 57 | - `src/server.py` 58 | 59 | #### Code Example 60 | 61 | **Before:** 62 | ```python 63 | def discover_subreddits(query: str, limit: int = 10) -> dict: 64 | results = search_vector_db(query, limit) 65 | return {"subreddits": results} 66 | ``` 67 | 68 | **After:** 69 | ```python 70 | def discover_subreddits( 71 | query: str, 72 | limit: int = 10, 73 | ctx: Context 74 | ) -> dict: 75 | results = search_vector_db(query, limit) 76 | return {"subreddits": results} 77 | ``` 78 | 79 | ### Phase 2: Progress Monitoring (Days 3-4) 80 | 81 | **Objective**: Report progress during long-running operations 82 | 83 | #### Progress Events 84 | 85 | **discover_subreddits** - Vector search progress: 86 | ```python 87 | for i, result in enumerate(search_results): 88 | ctx.report_progress( 89 | progress=i + 1, 90 | total=limit, 91 | message=f"Analyzing r/{result.name}" 92 | ) 93 | ``` 94 | 95 | **fetch_multiple_subreddits** - Batch fetch progress: 96 | ```python 97 | for i, subreddit in enumerate(subreddit_names): 98 | ctx.report_progress( 99 | progress=i + 1, 100 | total=len(subreddit_names), 101 | message=f"Fetching r/{subreddit}" 102 | ) 103 | # Fetch posts... 104 | ``` 105 | 106 | **fetch_submission_with_comments** - Comment loading progress: 107 | ```python 108 | ctx.report_progress( 109 | progress=len(comments), 110 | total=comment_limit, 111 | message=f"Loading comments ({len(comments)}/{comment_limit})" 112 | ) 113 | ``` 114 | 115 | #### Files to Modify 116 | - `src/tools/discover.py` - Add progress during vector search iteration 117 | - `src/tools/posts.py` - Add progress per subreddit in batch operations 118 | - `src/tools/comments.py` - Add progress during comment tree traversal 119 | 120 | ### Phase 3: Structured Logging (Days 5-6) 121 | 122 | **Objective**: Surface server-side information to clients via logs 123 | 124 | #### Logging Events by Operation 125 | 126 | **Discovery Operations** (`src/tools/discover.py`): 127 | ```python 128 | ctx.info(f"Starting discovery for topic: {query}") 129 | ctx.info(f"Found {len(results)} communities (avg confidence: {avg_conf:.2f})") 130 | 131 | if avg_conf < 0.5: 132 | ctx.warning(f"Low confidence results (<0.5) for query: {query}") 133 | ``` 134 | 135 | **Fetch Operations** (`src/tools/posts.py`): 136 | ```python 137 | ctx.info(f"Fetching {limit} posts from r/{subreddit_name}") 138 | ctx.info(f"Successfully fetched {len(posts)} posts from r/{subreddit_name}") 139 | 140 | # Rate limit warnings 141 | if remaining_requests < 10: 142 | ctx.warning(f"Rate limit approaching: {remaining_requests}/60 requests remaining") 143 | 144 | # Error logging 145 | ctx.error(f"Failed to fetch r/{subreddit_name}: {str(e)}", extra={ 146 | "subreddit": subreddit_name, 147 | "error_type": type(e).__name__ 148 | }) 149 | ``` 150 | 151 | **Search Operations** (`src/tools/search.py`): 152 | ```python 153 | ctx.info(f"Searching r/{subreddit_name} for: {query}") 154 | ctx.debug(f"Search parameters: sort={sort}, time_filter={time_filter}") 155 | ``` 156 | 157 | **Comment Operations** (`src/tools/comments.py`): 158 | ```python 159 | ctx.info(f"Fetching comments for submission: {submission_id}") 160 | ctx.info(f"Loaded {len(comments)} comments (sort: {comment_sort})") 161 | ``` 162 | 163 | 
#### Log Levels 164 | 165 | - **DEBUG**: Internal operation details, parameter values 166 | - **INFO**: Operation start/completion, success metrics 167 | - **WARNING**: Rate limits, low confidence scores, degraded functionality 168 | - **ERROR**: Operation failures, API errors, exceptions 169 | 170 | #### Files to Modify 171 | - `src/tools/discover.py` - Confidence scores, discovery metrics 172 | - `src/tools/posts.py` - Fetch success/failure, rate limit warnings 173 | - `src/tools/comments.py` - Comment analysis metrics 174 | - `src/tools/search.py` - Search operation logging 175 | 176 | ### Phase 4: Enhanced Error Handling (Days 7-8) 177 | 178 | **Objective**: Provide detailed error context for debugging and recovery 179 | 180 | #### Error Context Pattern 181 | 182 | **Current Implementation:** 183 | ```python 184 | except Exception as e: 185 | return { 186 | "success": False, 187 | "error": str(e), 188 | "recovery": suggest_recovery(operation_id, e) 189 | } 190 | ``` 191 | 192 | **Enhanced Implementation:** 193 | ```python 194 | except Exception as e: 195 | error_type = type(e).__name__ 196 | 197 | # Log error with context 198 | ctx.error( 199 | f"Operation failed: {operation_id}", 200 | extra={ 201 | "operation": operation_id, 202 | "error_type": error_type, 203 | "parameters": parameters, 204 | "timestamp": datetime.now().isoformat() 205 | } 206 | ) 207 | 208 | return { 209 | "success": False, 210 | "error": str(e), 211 | "error_type": error_type, 212 | "operation": operation_id, 213 | "parameters": parameters, 214 | "recovery": suggest_recovery(operation_id, e), 215 | "timestamp": datetime.now().isoformat() 216 | } 217 | ``` 218 | 219 | #### Error Categories & Recovery Suggestions 220 | 221 | | Error Type | Recovery Suggestion | 222 | |------------|-------------------| 223 | | 404 / Not Found | "Verify subreddit name or use discover_subreddits" | 224 | | 429 / Rate Limited | "Reduce limit parameter or wait 30s before retrying" | 225 | | 403 / Private | "Subreddit is private - try other communities" | 226 | | Validation Error | "Check parameters match schema from get_operation_schema" | 227 | | Network Error | "Check internet connection and retry" | 228 | 229 | #### Files to Modify 230 | - `src/server.py` - Enhanced `execute_operation()` error handling 231 | - `src/tools/*.py` - Operation-specific error logging 232 | 233 | ### Phase 5: Testing & Validation (Days 9-10) 234 | 235 | **Objective**: Ensure all instrumentation works correctly 236 | 237 | #### Test Coverage 238 | 239 | **Context Integration Tests** (`tests/test_context_integration.py`): 240 | ```python 241 | async def test_context_injected(): 242 | """Verify context is properly injected into tools""" 243 | 244 | async def test_progress_events_emitted(): 245 | """Verify progress events during multi-step operations""" 246 | 247 | async def test_log_messages_captured(): 248 | """Verify logs at appropriate severity levels""" 249 | 250 | async def test_error_context_included(): 251 | """Verify error responses include operation details""" 252 | ``` 253 | 254 | **Updated Tool Tests** (`tests/test_tools.py`): 255 | - Verify tools receive and use context properly 256 | - Check progress reporting frequency (≥5 events per operation) 257 | - Validate log message content and levels 258 | - Ensure error context is complete 259 | 260 | #### Files to Create/Modify 261 | - Create: `tests/test_context_integration.py` 262 | - Modify: `tests/test_tools.py` 263 | 264 | ## Implementation Details 265 | 266 | ### Context Parameter Pattern 267 | 268 
| FastMCP automatically injects Context when tools are decorated with `@mcp.tool`: 269 | 270 | ```python 271 | @mcp.tool 272 | def my_tool(param: str, ctx: Context) -> dict: 273 | # Context is automatically injected 274 | ctx.info("Tool started") 275 | ctx.report_progress(1, 10, "Processing") 276 | return {"result": "data"} 277 | ``` 278 | 279 | For functions called internally (not decorated), Context must be passed explicitly: 280 | 281 | ```python 282 | def internal_function(param: str, ctx: Context) -> dict: 283 | ctx.info("Internal operation") 284 | return {"result": "data"} 285 | ``` 286 | 287 | ### Progress Reporting Best Practices 288 | 289 | 1. **Report at regular intervals**: Every iteration in loops 290 | 2. **Provide descriptive messages**: "Fetching r/Python" not "Step 1" 291 | 3. **Include total when known**: `ctx.report_progress(5, 10, msg)` 292 | 4. **Use meaningful units**: Report actual progress (items processed) not arbitrary percentages 293 | 294 | ### Logging Best Practices 295 | 296 | 1. **Use appropriate levels**: INFO for normal ops, WARNING for issues, ERROR for failures 297 | 2. **Include context in extra**: `ctx.error(msg, extra={"operation": "name"})` 298 | 3. **Structured messages**: Consistent format for parsing 299 | 4. **Avoid spam**: Log meaningful events, not every line 300 | 301 | ### Error Handling Best Practices 302 | 303 | 1. **Specific exception types**: Catch specific errors when possible 304 | 2. **Include operation context**: Always log which operation failed 305 | 3. **Actionable recovery**: Provide specific steps to resolve 306 | 4. **Preserve stack traces**: Log full error details in extra 307 | 308 | ## Success Criteria 309 | 310 | ### Functional Requirements 311 | - ✅ All tool functions accept required Context parameter 312 | - ✅ Progress events emitted during multi-step operations (≥5 per operation) 313 | - ✅ Server logs at appropriate severity levels (DEBUG/INFO/WARNING/ERROR) 314 | - ✅ Error responses include operation name, type, and recovery suggestions 315 | - ✅ MCP client compatibility maintained (Claude, ChatGPT, etc.) 
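To make the progress-event requirement above concrete, one possible verification sketch is shown here. It is illustrative only: the mock and `monkeypatch` fixtures mirror the style planned for `tests/test_context_integration.py`, the fake vector-search payload is invented for the example, and the `await`/`AsyncMock` details assume the async form these operations take once progress reporting is in place.

```python
import pytest
from unittest.mock import AsyncMock, Mock

from fastmcp import Context

from src.tools.discover import discover_subreddits


@pytest.mark.asyncio
async def test_progress_events_emitted(monkeypatch):
    # Context mock whose report_progress awaits can be counted afterwards.
    ctx = Mock(spec=Context)
    ctx.report_progress = AsyncMock()

    # Five fake vector-search hits should yield at least five progress events.
    fake_results = {
        "metadatas": [[
            {"name": f"sub{i}", "subscribers": 1000,
             "url": f"https://reddit.com/r/sub{i}", "nsfw": False}
            for i in range(5)
        ]],
        "distances": [[0.1 * i for i in range(5)]],
    }
    collection = Mock()
    collection.query.return_value = fake_results
    monkeypatch.setattr("src.tools.discover.get_chroma_client", lambda: Mock())
    monkeypatch.setattr("src.tools.discover.get_collection", lambda name, client: collection)

    await discover_subreddits(query="python web frameworks", limit=5, ctx=ctx)

    assert ctx.report_progress.await_count >= 5
```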
316 | 317 | ### Technical Requirements 318 | - ✅ All existing tests pass with new instrumentation 319 | - ✅ New integration tests verify context functionality 320 | - ✅ No performance degradation (progress/logging overhead <5%) 321 | - ✅ Type hints maintained throughout 322 | 323 | ### Quality Requirements 324 | - ✅ Code follows FastMCP patterns from documentation 325 | - ✅ Logging messages are clear and actionable 326 | - ✅ Error recovery suggestions are specific and helpful 327 | - ✅ Progress messages provide meaningful status updates 328 | 329 | ## File Summary 330 | 331 | ### Files to Create 332 | - `tests/test_context_integration.py` - New integration tests 333 | 334 | ### Files to Modify 335 | - `src/tools/discover.py` - Context, progress, logging 336 | - `src/tools/posts.py` - Context, progress, logging 337 | - `src/tools/comments.py` - Context, progress, logging 338 | - `src/tools/search.py` - Context, logging 339 | - `src/server.py` - Enhanced error handling in execute_operation 340 | - `tests/test_tools.py` - Updated tests for context integration 341 | 342 | ### Files Not Modified 343 | - `src/config.py` - No changes needed 344 | - `src/models.py` - No changes needed 345 | - `src/resources.py` - No changes needed (future enhancement) 346 | - `src/chroma_client.py` - No changes needed 347 | 348 | ## Dependencies 349 | 350 | ### Required 351 | - FastMCP ≥2.0.0 (already installed) 352 | - Python ≥3.10 (already using) 353 | - Context API support (available in FastMCP) 354 | 355 | ### Optional 356 | - No additional dependencies required 357 | 358 | ## Risks & Mitigations 359 | 360 | | Risk | Impact | Mitigation | 361 | |------|--------|------------| 362 | | Performance overhead from logging | Low | Log only meaningful events, avoid verbose debug logs in production | 363 | | Too many progress events | Low | Limit to 5-10 events per operation | 364 | | Breaking MCP client compatibility | Low | Context changes are server-side only; MCP protocol unchanged | 365 | | Testing complexity | Low | Use FastMCP's in-memory transport for tests | 366 | 367 | ## Backward Compatibility 368 | 369 | **MCP Client Compatibility**: Changes are server-side implementation only. The MCP protocol interface remains unchanged, ensuring compatibility with all MCP clients including Claude, ChatGPT, and others. Context injection is handled by FastMCP's decorator system and is transparent to clients. 370 | 371 | ## Future Enhancements 372 | 373 | Following this implementation, future phases could include: 374 | 375 | 1. **Resource Access Tracking** - Monitor `ctx.read_resource()` calls 376 | 2. **Sampling Monitoring** - Track `ctx.sample()` operations 377 | 3. **Metrics Collection** - Aggregate operation timing and success rates 378 | 4. **Client Integration** - Frontend components to display progress/logs 379 | 380 | These are out of scope for this specification. 
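One of the risk mitigations above relies on FastMCP's in-memory transport for tests. For reference, a minimal sketch of that pattern follows; it assumes the `FastMCP` server instance defined in `src/server.py` is importable as `mcp`, so adjust the import to whatever name the module actually exports.

```python
import asyncio

from fastmcp import Client

from src.server import mcp  # assumption: the FastMCP instance is exported as `mcp`


async def smoke_test() -> None:
    # Passing the server object (rather than a URL or script path) creates an
    # in-memory connection: no subprocess, no network, fast enough for CI.
    async with Client(mcp) as client:
        tools = await client.list_tools()
        assert any(t.name == "execute_operation" for t in tools)


asyncio.run(smoke_test())
```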
381 | 382 | ## References 383 | 384 | - [FastMCP Context API Documentation](../ai-docs/fastmcp/docs/python-sdk/fastmcp-server-context.mdx) 385 | - [FastMCP Progress Monitoring](../ai-docs/fastmcp/docs/clients/progress.mdx) 386 | - [FastMCP Logging](../ai-docs/fastmcp/docs/clients/logging.mdx) 387 | - Current Implementation: `src/server.py` 388 | - Original UX Improvements Spec: `../frontend-reddit-research-mcp/specs/002-ux-improvements-fastmcp-patterns/spec.md` 389 | ``` -------------------------------------------------------------------------------- /reddit-research-agent.md: -------------------------------------------------------------------------------- ```markdown 1 | --- 2 | name: reddit-research-agent 3 | description: Use this agent when you need to conduct research using Reddit MCP server tools and produce a comprehensive, well-cited research report in Obsidian-optimized markdown format. This agent specializes in gathering Reddit data (posts, comments, subreddit information), analyzing patterns and insights, and presenting findings with proper inline citations that link back to source materials. 4 | tools: Glob, Grep, LS, Read, WebFetch, TodoWrite, WebSearch, BashOutput, KillBash, ListMcpResourcesTool, ReadMcpResourceTool, Edit, MultiEdit, Write, NotebookEdit, Bash, mcp__reddit-mcp-poc__discover_operations, mcp__reddit-mcp-poc__get_operation_schema, mcp__reddit-mcp-poc__execute_operation 5 | model: opus 6 | color: purple 7 | --- 8 | 9 | You are an insightful Reddit research analyst who transforms community discussions into compelling narratives. You excel at discovering diverse perspectives, synthesizing complex viewpoints, and building analytical stories that explain not just what Reddit thinks, but why different communities think differently. 10 | 11 | ## Core Mission 12 | 13 | Create insightful research narratives that weave together diverse Reddit perspectives into coherent analytical stories, focusing on understanding the "why" behind community viewpoints rather than simply cataloging who said what. 14 | 15 | ## Technical Architecture (Reddit MCP Server) 16 | 17 | Follow the three-layer workflow for Reddit operations: 18 | 1. **Discovery**: `discover_operations()` - NO parameters 19 | 2. **Schema**: `get_operation_schema(operation_id)` 20 | 3. **Execution**: `execute_operation(operation_id, parameters)` 21 | 22 | Key operations: 23 | - `discover_subreddits`: Find diverse, relevant communities 24 | - `fetch_multiple`: Efficiently gather from multiple subreddits 25 | - `fetch_comments`: Deep dive into valuable discussions 26 | 27 | ## Research Approach 28 | 29 | ### 1. Diverse Perspective Discovery 30 | **Goal**: Find 5-7 communities with genuinely different viewpoints 31 | 32 | - Use semantic search to discover conceptually related but diverse subreddits 33 | - Prioritize variety over volume: 34 | - Professional vs hobbyist communities 35 | - Technical vs general audiences 36 | - Supportive vs critical spaces 37 | - Different geographic/demographic focuses 38 | - Look for unexpected or adjacent communities that discuss the topic differently 39 | 40 | ### 2. 
Strategic Data Gathering 41 | **Goal**: Quality insights over quantity of posts 42 | 43 | ```python 44 | execute_operation("fetch_multiple", { 45 | "subreddit_names": [diverse_subreddits], 46 | "listing_type": "top", 47 | "time_filter": "year", 48 | "limit_per_subreddit": 10-15 49 | }) 50 | ``` 51 | 52 | For high-value discussions: 53 | ```python 54 | execute_operation("fetch_comments", { 55 | "submission_id": post_id, 56 | "comment_limit": 50, 57 | "comment_sort": "best" 58 | }) 59 | ``` 60 | 61 | ### 3. Analytical Synthesis 62 | **Goal**: Build narratives that explain patterns and tensions 63 | 64 | - Identify themes that cut across communities 65 | - Understand WHY different groups hold different views 66 | - Find surprising connections between viewpoints 67 | - Recognize emotional undercurrents and practical concerns 68 | - Connect individual experiences to broader patterns 69 | 70 | ## Evidence & Citation Approach 71 | 72 | **Philosophy**: Mix broad community patterns with individual voices to create rich, evidence-based narratives. 73 | 74 | ### Three Types of Citations (USE ALL THREE): 75 | 76 | #### 1. **Community-Level Citations** (broad patterns) 77 | ```markdown 78 | The r/sales community consistently emphasizes [theme], with discussions 79 | about [topic] dominating recent threads ([link1], [link2], [link3]). 80 | ``` 81 | 82 | #### 2. **Individual Voice Citations** (specific quotes) 83 | ```markdown 84 | As one frustrated user (15 years in sales) explained: "Direct quote that 85 | captures the emotion and specificity" ([r/sales](link)). 86 | ``` 87 | 88 | #### 3. **Cross-Community Pattern Citations** 89 | ```markdown 90 | This sentiment spans from r/technical ([link]) where developers 91 | [perspective], to r/business ([link]) where owners [different angle], 92 | revealing [your analysis of the pattern]. 93 | ``` 94 | 95 | ### Citation Density Requirements: 96 | - **Every major claim**: 2-3 supporting citations minimum 97 | - **Each theme section**: 3-4 broad community citations + 4-5 individual quotes 98 | - **Pattern observations**: Evidence from at least 3 different subreddits 99 | - **NO unsupported generalizations**: Everything cited or framed as a question 100 | 101 | ### Example of Mixed Citation Narrative: 102 | ```markdown 103 | Small businesses are reverting to Excel not from technological ignorance, 104 | but from painful experience. Across r/smallbusiness, implementation horror 105 | stories dominate CRM discussions ([link1], [link2]), with costs frequently 106 | exceeding $70,000 for "basic functionality." One owner captured the 107 | community's frustration: "I paid $500/month to make my job harder" 108 | ([r/smallbusiness](link)). This exodus isn't limited to non-technical users— 109 | even r/programming members share Excel templates as CRM alternatives ([link]), 110 | suggesting the problem transcends technical capability. 111 | ``` 112 | 113 | ## Report Structure 114 | 115 | ```markdown 116 | # [Topic]: Understanding Reddit's Perspective 117 | 118 | ## Summary 119 | [2-3 paragraphs providing your analytical overview of what you discovered. This should tell a coherent story about how Reddit communities view this topic, major tensions, and key insights. Write this AFTER completing your analysis.] 120 | 121 | ## The Conversation Landscape 122 | 123 | [Analytical paragraph explaining the diversity of communities discussing this topic and why different groups care about it differently. 
For example: "The discussion spans from technical implementation in r/programming to business impact in r/smallbusiness, with surprisingly passionate debate in r/[unexpected_community]..."] 124 | 125 | Key communities analyzed: 126 | - **r/[subreddit]**: [1-line description of this community's unique perspective] 127 | - **r/[subreddit]**: [What makes their viewpoint different] 128 | - **r/[subreddit]**: [Their specific angle or concern] 129 | 130 | ## Major Themes 131 | 132 | **IMPORTANT**: No "Top 10" lists. No bullet-point compilations. Every theme must be a narrative synthesis with extensive evidence from multiple communities showing different perspectives on the same pattern. 133 | 134 | ### Theme 1: [Descriptive Title That Captures the Insight] 135 | 136 | [Opening analytical paragraph explaining what this pattern is and why it matters. Include 2-3 broad community citations showing this is a widespread phenomenon, not isolated incidents.] 137 | 138 | [Second paragraph diving into the human impact with 3-4 specific individual quotes that illustrate different facets of this theme. Show the emotional and practical reality through actual Reddit voices.] 139 | 140 | [Third paragraph connecting different community perspectives, explaining WHY different groups see this differently. Use cross-community citations to show how the same issue manifests differently across subreddits.] 141 | 142 | Example structure: 143 | ```markdown 144 | The CRM complexity crisis isn't about features—it's about fundamental misalignment 145 | between vendor assumptions and small business reality. This theme dominates 146 | r/smallbusiness discussions ([link1], [link2]), appears in weekly rant threads 147 | on r/sales ([link3]), and even surfaces in r/ExperiencedDevs when developers 148 | vent about building CRM integrations ([link4]). 149 | 150 | The frustration is visceral and specific. A sales manager with 15 years 151 | experience wrote: "I calculated it—I spend 38% of my time on CRM data entry 152 | for metrics no one looks at" ([r/sales](link)). Another user, a small business 153 | owner, was more blunt: "Salesforce is where sales go to die" ([r/smallbusiness](link)), 154 | a comment that received 450 upvotes and sparked a thread of similar experiences. 155 | Even technical users aren't immune—a developer noted: "I built our entire CRM 156 | replacement in Google Sheets in a weekend. It does everything we need and nothing 157 | we don't" ([r/programming](link)). 158 | 159 | The divide between communities reveals deeper truths. While r/sales focuses on 160 | time waste ([link1], [link2])—they have dedicated hours but resent non-selling 161 | activities—r/smallbusiness emphasizes resource impossibility ([link3], [link4])— 162 | they simply don't have anyone to dedicate to CRM management. Meanwhile, 163 | r/Entrepreneur questions the entire premise: "CRM is a solution looking for 164 | a problem" was the top comment in a recent discussion ([link5]), suggesting 165 | some view the entire category as manufactured need. 166 | ``` 167 | 168 | ### Theme 2: [Another Major Pattern or Tension] 169 | 170 | [Similar structure - lead with YOUR analysis, support with evidence] 171 | 172 | ### Theme 3: [Emerging Trend or Fundamental Divide] 173 | 174 | [Similar structure - focus on synthesis and interpretation] 175 | 176 | ## Divergent Perspectives 177 | 178 | [Paragraph analyzing why certain communities see this topic so differently. 
What are the underlying factors - professional background, use cases, values, experiences - that drive these different viewpoints?] 179 | 180 | Example contrasts: 181 | - **Technical vs Business**: [Your analysis of this divide] 182 | - **Veterans vs Newcomers**: [What experience changes] 183 | - **Geographic/Cultural**: [If relevant] 184 | 185 | ## What This Means 186 | 187 | [2-3 paragraphs of YOUR analysis about implications. What should someone building in this space know? What opportunities exist? What mistakes should be avoided? This should flow naturally from your research but be YOUR interpretive voice.] 188 | 189 | Key takeaways: 190 | 1. [Actionable insight based on the research] 191 | 2. [Another practical implication] 192 | 3. [Strategic consideration] 193 | 194 | ## Research Notes 195 | 196 | *Communities analyzed*: [List of subreddits examined] 197 | *Methodology*: Semantic discovery to find diverse perspectives, followed by thematic analysis of top discussions and comments 198 | *Limitations*: [Brief note on any biases or gaps] 199 | ``` 200 | 201 | ## Writing Guidelines 202 | 203 | ### Voice & Tone 204 | - **Analytical**: You're an insightful analyst, not a citation machine 205 | - **Confident**: Make clear assertions based on evidence 206 | - **Nuanced**: Acknowledge complexity without hedging excessively 207 | - **Accessible**: Write for intelligent readers who aren't Reddit experts 208 | 209 | ### What Makes Good Analysis 210 | - Explains WHY patterns exist, not just WHAT they are 211 | - Connects disparate viewpoints into coherent narrative 212 | - Identifies non-obvious insights 213 | - Provides context for understanding different perspectives 214 | - Tells a story that helps readers understand the landscape 215 | 216 | ### What to AVOID 217 | - ❌ "Top 10" or "Top X" lists of any kind 218 | - ❌ Bullet-point lists of complaints or features 219 | - ❌ Unsupported generalizations ("Users hate X" without citations) 220 | - ❌ Platform-by-platform breakdowns without narrative synthesis 221 | - ❌ Generic business writing that could exist without Reddit data 222 | - ❌ Claims without exploring WHY they exist 223 | 224 | ### What to INCLUDE 225 | - ✅ Mixed citations: broad community patterns + individual voices 226 | - ✅ Cross-community analysis showing different perspectives 227 | - ✅ "Why" explanations for every pattern identified 228 | - ✅ Narrative flow that builds understanding progressively 229 | - ✅ Specific quotes that capture emotion and nuance 230 | - ✅ Evidence from at least 3 different communities per theme 231 | 232 | ## File Handling 233 | 234 | When saving reports: 235 | 1. Always save to `./reports/` directory (create if it doesn't exist) 236 | 2. Check if file exists with Read tool first 237 | 3. Use Write for new files, Edit/MultiEdit for existing 238 | 4. 
Default filename: `./reports/[topic]-reddit-analysis-[YYYY-MM-DD].md` 239 | 240 | Example: 241 | ```bash 242 | # Ensure reports directory exists 243 | mkdir -p ./reports 244 | 245 | # Save with descriptive filename 246 | ./reports/micro-saas-ideas-reddit-analysis-2024-01-15.md 247 | ``` 248 | 249 | ## Quality Checklist 250 | 251 | Before finalizing: 252 | - [ ] Found genuinely diverse perspectives (5-7 different communities) 253 | - [ ] Built coherent narrative that explains the landscape 254 | - [ ] Analysis leads, evidence supports (not vice versa) 255 | - [ ] Explained WHY different groups think differently 256 | - [ ] Connected patterns across communities 257 | - [ ] Provided actionable insights based on findings 258 | - [ ] Maintained analytical voice throughout 259 | - [ ] **Each theme has 8-12 citations minimum (mixed types)** 260 | - [ ] **No "Top X" lists anywhere in the report** 261 | - [ ] **Every claim supported by 2-3 citations** 262 | - [ ] **Community-level patterns shown with multiple links** 263 | - [ ] **Individual voices included for human perspective** 264 | - [ ] **Cross-community patterns demonstrated** 265 | - [ ] **Zero unsupported generalizations** 266 | 267 | ## Core Competencies 268 | 269 | ### 1. Perspective Discovery 270 | - Use semantic search to find conceptually related but culturally different communities 271 | - Identify adjacent spaces that discuss the topic from unique angles 272 | - Recognize when different terms are used for the same concept 273 | 274 | ### 2. Narrative Building 275 | - Connect individual comments to broader patterns 276 | - Explain tensions between different viewpoints 277 | - Identify emotional and practical drivers behind opinions 278 | - Build stories that make complex landscapes understandable 279 | 280 | ### 3. Analytical Commentary 281 | - Add interpretive value beyond summarization 282 | - Explain implications and opportunities 283 | - Connect Reddit insights to real-world applications 284 | - Provide strategic guidance based on community wisdom 285 | 286 | ## Remember 287 | 288 | You're not a court reporter documenting everything said. You're an investigative analyst who: 289 | - Finds diverse perspectives across Reddit's ecosystem 290 | - Understands WHY different communities think differently 291 | - Builds compelling narratives that explain complex landscapes 292 | - Provides actionable insights through analytical synthesis 293 | 294 | Your reports should feel like reading excellent research journalism - informative, insightful, and built on solid evidence, but driven by narrative and analysis rather than exhaustive citation. ``` -------------------------------------------------------------------------------- /tests/test_context_integration.py: -------------------------------------------------------------------------------- ```python 1 | """ 2 | Integration tests for Context parameter acceptance in Phase 1. 3 | 4 | This test suite verifies that all tool and operation functions 5 | accept the Context parameter as required by FastMCP's Context API. 6 | Phase 1 only validates parameter acceptance - actual context usage 7 | will be tested in Phase 2+. 
8 | """ 9 | 10 | import pytest 11 | import sys 12 | import os 13 | from unittest.mock import Mock, MagicMock, AsyncMock 14 | from fastmcp import Context 15 | 16 | # Add project root to Python path 17 | sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..')) 18 | 19 | from src.tools.discover import discover_subreddits, validate_subreddit 20 | from src.tools.search import search_in_subreddit 21 | from src.tools.posts import fetch_subreddit_posts, fetch_multiple_subreddits 22 | from src.tools.comments import fetch_submission_with_comments 23 | 24 | 25 | @pytest.fixture 26 | def mock_context(): 27 | """Create a mock Context object for testing.""" 28 | return Mock(spec=Context) 29 | 30 | 31 | @pytest.fixture 32 | def mock_reddit(): 33 | """Create a mock Reddit client.""" 34 | return Mock() 35 | 36 | 37 | @pytest.fixture 38 | def mock_chroma(): 39 | """Mock ChromaDB client and collection.""" 40 | with Mock() as mock_client: 41 | mock_collection = Mock() 42 | mock_collection.query.return_value = { 43 | 'metadatas': [[ 44 | {'name': 'test', 'subscribers': 1000, 'url': 'https://reddit.com/r/test', 'nsfw': False} 45 | ]], 46 | 'distances': [[0.5]] 47 | } 48 | yield mock_client, mock_collection 49 | 50 | 51 | class TestDiscoverOperations: 52 | """Test discover_subreddits accepts context.""" 53 | 54 | async def test_discover_accepts_context(self, mock_context, monkeypatch): 55 | """Verify discover_subreddits accepts context parameter.""" 56 | # Mock the chroma client 57 | mock_client = Mock() 58 | mock_collection = Mock() 59 | mock_collection.query.return_value = { 60 | 'metadatas': [[ 61 | {'name': 'test', 'subscribers': 1000, 'url': 'https://reddit.com/r/test', 'nsfw': False} 62 | ]], 63 | 'distances': [[0.5]] 64 | } 65 | 66 | def mock_get_client(): 67 | return mock_client 68 | 69 | def mock_get_collection(name, client): 70 | return mock_collection 71 | 72 | monkeypatch.setattr('src.tools.discover.get_chroma_client', mock_get_client) 73 | monkeypatch.setattr('src.tools.discover.get_collection', mock_get_collection) 74 | 75 | # Call with context 76 | result = await discover_subreddits(query="test", limit=5, ctx=mock_context) 77 | 78 | # Verify result structure (not context usage - that's Phase 2) 79 | assert "subreddits" in result or "error" in result 80 | 81 | 82 | class TestSearchOperations: 83 | """Test search_in_subreddit accepts context.""" 84 | 85 | def test_search_accepts_context(self, mock_context, mock_reddit): 86 | """Verify search_in_subreddit accepts context parameter.""" 87 | mock_subreddit = Mock() 88 | mock_subreddit.display_name = "test" 89 | mock_subreddit.search.return_value = [] 90 | mock_reddit.subreddit.return_value = mock_subreddit 91 | 92 | result = search_in_subreddit( 93 | subreddit_name="test", 94 | query="test query", 95 | reddit=mock_reddit, 96 | limit=5, 97 | ctx=mock_context 98 | ) 99 | 100 | assert "results" in result or "error" in result 101 | 102 | 103 | class TestPostOperations: 104 | """Test post-fetching functions accept context.""" 105 | 106 | def test_fetch_posts_accepts_context(self, mock_context, mock_reddit): 107 | """Verify fetch_subreddit_posts accepts context parameter.""" 108 | mock_subreddit = Mock() 109 | mock_subreddit.display_name = "test" 110 | mock_subreddit.subscribers = 1000 111 | mock_subreddit.public_description = "Test" 112 | mock_subreddit.hot.return_value = [] 113 | mock_reddit.subreddit.return_value = mock_subreddit 114 | 115 | result = fetch_subreddit_posts( 116 | subreddit_name="test", 117 | reddit=mock_reddit, 118 | limit=5, 119 
| ctx=mock_context 120 | ) 121 | 122 | assert "posts" in result or "error" in result 123 | 124 | async def test_fetch_multiple_accepts_context(self, mock_context, mock_reddit): 125 | """Verify fetch_multiple_subreddits accepts context parameter.""" 126 | mock_multi = Mock() 127 | mock_multi.hot.return_value = [] 128 | mock_reddit.subreddit.return_value = mock_multi 129 | 130 | result = await fetch_multiple_subreddits( 131 | subreddit_names=["test1", "test2"], 132 | reddit=mock_reddit, 133 | limit_per_subreddit=5, 134 | ctx=mock_context 135 | ) 136 | 137 | assert "subreddits_requested" in result or "error" in result 138 | 139 | 140 | class TestCommentOperations: 141 | """Test comment-fetching functions accept context.""" 142 | 143 | async def test_fetch_comments_accepts_context(self, mock_context, mock_reddit): 144 | """Verify fetch_submission_with_comments accepts context parameter.""" 145 | mock_submission = Mock() 146 | mock_submission.id = "test123" 147 | mock_submission.title = "Test" 148 | mock_submission.author = Mock() 149 | mock_submission.author.__str__ = Mock(return_value="testuser") 150 | mock_submission.score = 100 151 | mock_submission.upvote_ratio = 0.95 152 | mock_submission.num_comments = 0 153 | mock_submission.created_utc = 1234567890.0 154 | mock_submission.url = "https://reddit.com/test" 155 | mock_submission.selftext = "" 156 | mock_submission.subreddit = Mock() 157 | mock_submission.subreddit.display_name = "test" 158 | 159 | # Mock comments 160 | mock_comments = Mock() 161 | mock_comments.__iter__ = Mock(return_value=iter([])) 162 | mock_comments.replace_more = Mock() 163 | mock_submission.comments = mock_comments 164 | 165 | mock_reddit.submission.return_value = mock_submission 166 | 167 | result = await fetch_submission_with_comments( 168 | reddit=mock_reddit, 169 | submission_id="test123", 170 | comment_limit=10, 171 | ctx=mock_context 172 | ) 173 | 174 | assert "submission" in result or "error" in result 175 | 176 | 177 | class TestHelperFunctions: 178 | """Test helper functions accept context.""" 179 | 180 | def test_validate_subreddit_accepts_context(self, mock_context, monkeypatch): 181 | """Verify validate_subreddit accepts context parameter.""" 182 | # Mock the chroma client 183 | mock_client = Mock() 184 | mock_collection = Mock() 185 | mock_collection.query.return_value = { 186 | 'metadatas': [[ 187 | {'name': 'test', 'subscribers': 1000, 'nsfw': False} 188 | ]], 189 | 'distances': [[0.5]] 190 | } 191 | 192 | def mock_get_client(): 193 | return mock_client 194 | 195 | def mock_get_collection(name, client): 196 | return mock_collection 197 | 198 | monkeypatch.setattr('src.tools.discover.get_chroma_client', mock_get_client) 199 | monkeypatch.setattr('src.tools.discover.get_collection', mock_get_collection) 200 | 201 | result = validate_subreddit("test", ctx=mock_context) 202 | 203 | assert "valid" in result or "error" in result 204 | 205 | 206 | class TestContextParameterPosition: 207 | """Test that context parameter works in various positions.""" 208 | 209 | def test_context_as_last_param(self, mock_context, mock_reddit): 210 | """Verify context works as the last parameter.""" 211 | mock_subreddit = Mock() 212 | mock_subreddit.display_name = "test" 213 | mock_subreddit.search.return_value = [] 214 | mock_reddit.subreddit.return_value = mock_subreddit 215 | 216 | # Context is last parameter 217 | result = search_in_subreddit( 218 | subreddit_name="test", 219 | query="test", 220 | reddit=mock_reddit, 221 | sort="relevance", 222 | time_filter="all", 223 | 
limit=10, 224 | ctx=mock_context 225 | ) 226 | 227 | assert result is not None 228 | 229 | def test_context_with_defaults(self, mock_context, mock_reddit): 230 | """Verify context works with default parameters.""" 231 | mock_subreddit = Mock() 232 | mock_subreddit.display_name = "test" 233 | mock_subreddit.search.return_value = [] 234 | mock_reddit.subreddit.return_value = mock_subreddit 235 | 236 | # Only required params + context 237 | result = search_in_subreddit( 238 | subreddit_name="test", 239 | query="test", 240 | reddit=mock_reddit, 241 | ctx=mock_context 242 | ) 243 | 244 | assert result is not None 245 | 246 | 247 | class TestDiscoverSubredditsProgress: 248 | """Test progress reporting in discover_subreddits.""" 249 | 250 | async def test_reports_progress_during_search(self, mock_context, monkeypatch): 251 | """Verify progress is reported during vector search.""" 252 | # Mock ChromaDB response with 3 results 253 | mock_client = Mock() 254 | mock_collection = Mock() 255 | mock_collection.query.return_value = { 256 | 'metadatas': [[ 257 | {'name': 'Python', 'subscribers': 1000000, 'nsfw': False}, 258 | {'name': 'learnpython', 'subscribers': 500000, 'nsfw': False}, 259 | {'name': 'pythontips', 'subscribers': 100000, 'nsfw': False} 260 | ]], 261 | 'distances': [[0.5, 0.7, 0.9]] 262 | } 263 | 264 | # Setup async mock for progress 265 | mock_context.report_progress = AsyncMock() 266 | 267 | def mock_get_client(): 268 | return mock_client 269 | 270 | def mock_get_collection(name, client): 271 | return mock_collection 272 | 273 | monkeypatch.setattr('src.tools.discover.get_chroma_client', mock_get_client) 274 | monkeypatch.setattr('src.tools.discover.get_collection', mock_get_collection) 275 | 276 | result = await discover_subreddits(query="python", ctx=mock_context) 277 | 278 | # Verify progress was reported at least 3 times (once per result) 279 | assert mock_context.report_progress.call_count >= 3 280 | 281 | # Verify progress parameters 282 | first_call = mock_context.report_progress.call_args_list[0] 283 | # Check if arguments were passed as kwargs or positional args 284 | if first_call[1]: # kwargs 285 | assert 'progress' in first_call[1] 286 | assert 'total' in first_call[1] 287 | else: # positional 288 | assert len(first_call[0]) >= 2 289 | 290 | 291 | class TestFetchMultipleProgress: 292 | """Test progress reporting in fetch_multiple_subreddits.""" 293 | 294 | async def test_reports_progress_per_subreddit(self, mock_context, mock_reddit): 295 | """Verify progress is reported once per subreddit.""" 296 | # Setup async mock for progress 297 | mock_context.report_progress = AsyncMock() 298 | 299 | # Mock submissions from 3 different subreddits 300 | mock_sub1 = Mock() 301 | mock_sub1.subreddit.display_name = "sub1" 302 | mock_sub1.id = "id1" 303 | mock_sub1.title = "Title 1" 304 | mock_sub1.author = Mock() 305 | mock_sub1.author.__str__ = Mock(return_value="user1") 306 | mock_sub1.score = 100 307 | mock_sub1.num_comments = 10 308 | mock_sub1.created_utc = 1234567890.0 309 | mock_sub1.url = "https://reddit.com/test1" 310 | mock_sub1.permalink = "/r/sub1/comments/id1/" 311 | 312 | mock_sub2 = Mock() 313 | mock_sub2.subreddit.display_name = "sub2" 314 | mock_sub2.id = "id2" 315 | mock_sub2.title = "Title 2" 316 | mock_sub2.author = Mock() 317 | mock_sub2.author.__str__ = Mock(return_value="user2") 318 | mock_sub2.score = 200 319 | mock_sub2.num_comments = 20 320 | mock_sub2.created_utc = 1234567891.0 321 | mock_sub2.url = "https://reddit.com/test2" 322 | mock_sub2.permalink = 
"/r/sub2/comments/id2/" 323 | 324 | mock_sub3 = Mock() 325 | mock_sub3.subreddit.display_name = "sub3" 326 | mock_sub3.id = "id3" 327 | mock_sub3.title = "Title 3" 328 | mock_sub3.author = Mock() 329 | mock_sub3.author.__str__ = Mock(return_value="user3") 330 | mock_sub3.score = 300 331 | mock_sub3.num_comments = 30 332 | mock_sub3.created_utc = 1234567892.0 333 | mock_sub3.url = "https://reddit.com/test3" 334 | mock_sub3.permalink = "/r/sub3/comments/id3/" 335 | 336 | mock_multi = Mock() 337 | mock_multi.hot.return_value = [mock_sub1, mock_sub2, mock_sub3] 338 | mock_reddit.subreddit.return_value = mock_multi 339 | 340 | result = await fetch_multiple_subreddits( 341 | subreddit_names=["sub1", "sub2", "sub3"], 342 | reddit=mock_reddit, 343 | ctx=mock_context 344 | ) 345 | 346 | # Verify progress was reported at least 3 times (once per subreddit) 347 | assert mock_context.report_progress.call_count >= 3 348 | 349 | 350 | class TestFetchCommentsProgress: 351 | """Test progress reporting in fetch_submission_with_comments.""" 352 | 353 | async def test_reports_progress_during_loading(self, mock_context, mock_reddit): 354 | """Verify progress is reported during comment loading.""" 355 | # Setup async mock for progress 356 | mock_context.report_progress = AsyncMock() 357 | 358 | # Mock submission 359 | mock_submission = Mock() 360 | mock_submission.id = "test123" 361 | mock_submission.title = "Test" 362 | mock_submission.author = Mock() 363 | mock_submission.author.__str__ = Mock(return_value="testuser") 364 | mock_submission.score = 100 365 | mock_submission.upvote_ratio = 0.95 366 | mock_submission.num_comments = 5 367 | mock_submission.created_utc = 1234567890.0 368 | mock_submission.url = "https://reddit.com/test" 369 | mock_submission.selftext = "" 370 | mock_submission.subreddit = Mock() 371 | mock_submission.subreddit.display_name = "test" 372 | 373 | # Mock 5 comments 374 | mock_comments_list = [] 375 | for i in range(5): 376 | mock_comment = Mock() 377 | mock_comment.id = f"comment{i}" 378 | mock_comment.body = f"Comment {i}" 379 | mock_comment.author = Mock() 380 | mock_comment.author.__str__ = Mock(return_value=f"user{i}") 381 | mock_comment.score = 10 * i 382 | mock_comment.created_utc = 1234567890.0 + i 383 | mock_comment.replies = [] 384 | mock_comments_list.append(mock_comment) 385 | 386 | mock_comments = Mock() 387 | mock_comments.__iter__ = Mock(return_value=iter(mock_comments_list)) 388 | mock_comments.replace_more = Mock() 389 | mock_submission.comments = mock_comments 390 | 391 | mock_reddit.submission.return_value = mock_submission 392 | 393 | result = await fetch_submission_with_comments( 394 | reddit=mock_reddit, 395 | submission_id="test123", 396 | comment_limit=10, 397 | ctx=mock_context 398 | ) 399 | 400 | # Verify progress was reported at least 6 times (5 comments + 1 completion) 401 | assert mock_context.report_progress.call_count >= 6 402 | ``` -------------------------------------------------------------------------------- /specs/003-phase-2-progress-monitoring.md: -------------------------------------------------------------------------------- ```markdown 1 | # Phase 2: Progress Monitoring Implementation 2 | 3 | **Status:** Ready for Implementation 4 | **Created:** 2025-10-02 5 | **Owner:** Engineering Team 6 | **Depends On:** Phase 1 (Context Integration) ✅ Complete 7 | 8 | ## Executive Summary 9 | 10 | This specification details Phase 2 of the FastMCP Context API integration: adding real-time progress reporting to long-running Reddit operations. 
With Phase 1 complete (all tools accept `Context`), this phase focuses on implementing `ctx.report_progress()` calls to provide visibility into multi-step operations. 11 | 12 | **Timeline:** 1-2 days 13 | **Effort:** Low (foundation already in place from Phase 1) 14 | 15 | ## Background 16 | 17 | ### Phase 1 Completion Summary 18 | 19 | Phase 1 successfully integrated the FastMCP `Context` parameter into all tool and operation functions: 20 | - ✅ All MCP tool functions accept `ctx: Context` 21 | - ✅ All operation functions accept and receive context 22 | - ✅ Helper functions updated with context forwarding 23 | - ✅ 15 tests passing (8 integration tests + 7 updated existing tests) 24 | 25 | **Current State:** Context is available but unused (commented as "Phase 1: Accept context but don't use it yet") 26 | 27 | ### Why Progress Monitoring? 28 | 29 | Reddit operations can be time-consuming: 30 | - **Vector search**: Searching thousands of subreddits and calculating confidence scores 31 | - **Multi-subreddit fetches**: Fetching posts from 5-10 communities sequentially 32 | - **Comment tree loading**: Parsing nested comment threads with hundreds of replies 33 | 34 | Progress monitoring provides: 35 | - Real-time feedback to users during long operations 36 | - Prevention of timeout errors by showing active progress 37 | - Better debugging visibility into operation performance 38 | - Enhanced user experience with progress indicators 39 | 40 | ## Goals 41 | 42 | 1. ✅ Report progress during vector search iterations (`discover_subreddits`) 43 | 2. ✅ Report progress per subreddit in batch fetches (`fetch_multiple_subreddits`) 44 | 3. ✅ Report progress during comment tree traversal (`fetch_submission_with_comments`) 45 | 4. ✅ Maintain all existing test coverage (15 tests must pass) 46 | 5. ✅ Follow FastMCP progress reporting patterns from official docs 47 | 48 | ## Non-Goals 49 | 50 | - Frontend progress UI (separate project) 51 | - Progress for single-subreddit fetches (too fast to matter) 52 | - Structured logging (Phase 3) 53 | - Enhanced error handling (Phase 4) 54 | 55 | ## Implementation Plan 56 | 57 | ### Operation 1: discover_subreddits Progress 58 | 59 | **File:** `src/tools/discover.py` 60 | **Function:** `_search_vector_db()` (lines 101-239) 61 | **Location:** Result processing loop (lines 137-188) 62 | 63 | #### Current Code Pattern 64 | 65 | ```python 66 | # Process results 67 | processed_results = [] 68 | nsfw_filtered = 0 69 | 70 | for metadata, distance in zip( 71 | results['metadatas'][0], 72 | results['distances'][0] 73 | ): 74 | # Skip NSFW if not requested 75 | if metadata.get('nsfw', False) and not include_nsfw: 76 | nsfw_filtered += 1 77 | continue 78 | 79 | # Calculate confidence score... 80 | # Apply penalties... 81 | # Determine match type... 
82 | 83 | processed_results.append({...}) 84 | ``` 85 | 86 | #### Enhanced Implementation 87 | 88 | ```python 89 | # Process results 90 | processed_results = [] 91 | nsfw_filtered = 0 92 | total_results = len(results['metadatas'][0]) 93 | 94 | for i, (metadata, distance) in enumerate(zip( 95 | results['metadatas'][0], 96 | results['distances'][0] 97 | )): 98 | # Report progress (async call required) 99 | if ctx: 100 | await ctx.report_progress( 101 | progress=i + 1, 102 | total=total_results, 103 | message=f"Analyzing r/{metadata.get('name', 'unknown')}" 104 | ) 105 | 106 | # Skip NSFW if not requested 107 | if metadata.get('nsfw', False) and not include_nsfw: 108 | nsfw_filtered += 1 109 | continue 110 | 111 | # Calculate confidence score... 112 | # Apply penalties... 113 | # Determine match type... 114 | 115 | processed_results.append({...}) 116 | ``` 117 | 118 | #### Changes Required 119 | 120 | 1. **Make function async**: Change `def _search_vector_db(...)` → `async def _search_vector_db(...)` 121 | 2. **Make parent function async**: Change `def discover_subreddits(...)` → `async def discover_subreddits(...)` 122 | 3. **Add await to calls**: Update `discover_subreddits` to `await _search_vector_db(...)` 123 | 4. **Add progress in loop**: Insert `await ctx.report_progress(...)` before processing each result 124 | 5. **Calculate total**: Add `total_results = len(results['metadatas'][0])` before loop 125 | 126 | **Progress Events:** ~10-100 (depending on limit parameter) 127 | 128 | --- 129 | 130 | ### Operation 2: fetch_multiple_subreddits Progress 131 | 132 | **File:** `src/tools/posts.py` 133 | **Function:** `fetch_multiple_subreddits()` (lines 102-188) 134 | **Location:** Subreddit iteration loop (lines 153-172) 135 | 136 | #### Current Code Pattern 137 | 138 | ```python 139 | # Parse posts and group by subreddit 140 | posts_by_subreddit = {} 141 | for submission in submissions: 142 | subreddit_name = submission.subreddit.display_name 143 | 144 | if subreddit_name not in posts_by_subreddit: 145 | posts_by_subreddit[subreddit_name] = [] 146 | 147 | # Only add up to limit_per_subreddit posts per subreddit 148 | if len(posts_by_subreddit[subreddit_name]) < limit_per_subreddit: 149 | posts_by_subreddit[subreddit_name].append({...}) 150 | ``` 151 | 152 | #### Enhanced Implementation 153 | 154 | ```python 155 | # Parse posts and group by subreddit 156 | posts_by_subreddit = {} 157 | processed_subreddits = set() 158 | 159 | for i, submission in enumerate(submissions): 160 | subreddit_name = submission.subreddit.display_name 161 | 162 | # Report progress when encountering a new subreddit 163 | if subreddit_name not in processed_subreddits: 164 | processed_subreddits.add(subreddit_name) 165 | if ctx: 166 | await ctx.report_progress( 167 | progress=len(processed_subreddits), 168 | total=len(subreddit_names), 169 | message=f"Fetching r/{subreddit_name}" 170 | ) 171 | 172 | if subreddit_name not in posts_by_subreddit: 173 | posts_by_subreddit[subreddit_name] = [] 174 | 175 | # Only add up to limit_per_subreddit posts per subreddit 176 | if len(posts_by_subreddit[subreddit_name]) < limit_per_subreddit: 177 | posts_by_subreddit[subreddit_name].append({...}) 178 | ``` 179 | 180 | #### Changes Required 181 | 182 | 1. **Make function async**: Change `def fetch_multiple_subreddits(...)` → `async def fetch_multiple_subreddits(...)` 183 | 2. **Track processed subreddits**: Add `processed_subreddits = set()` before loop 184 | 3. 
**Add progress on new subreddit**: When a new subreddit is encountered, report progress 185 | 4. **Update server.py**: Add `await` when calling this function in `execute_operation()` 186 | 187 | **Progress Events:** 1-10 (one per unique subreddit found) 188 | 189 | --- 190 | 191 | ### Operation 3: fetch_submission_with_comments Progress 192 | 193 | **File:** `src/tools/comments.py` 194 | **Function:** `fetch_submission_with_comments()` (lines 47-147) 195 | **Location:** Comment parsing loop (lines 116-136) 196 | 197 | #### Current Code Pattern 198 | 199 | ```python 200 | # Parse comments 201 | comments = [] 202 | comment_count = 0 203 | 204 | for top_level_comment in submission.comments: 205 | if hasattr(top_level_comment, 'id') and hasattr(top_level_comment, 'body'): 206 | if comment_count >= comment_limit: 207 | break 208 | if isinstance(top_level_comment, PrawComment): 209 | comments.append(parse_comment_tree(top_level_comment, ctx=ctx)) 210 | else: 211 | # Handle mock objects in tests 212 | comments.append(Comment(...)) 213 | # Count all comments including replies 214 | comment_count += 1 + count_replies(comments[-1]) 215 | ``` 216 | 217 | #### Enhanced Implementation 218 | 219 | ```python 220 | # Parse comments 221 | comments = [] 222 | comment_count = 0 223 | 224 | for top_level_comment in submission.comments: 225 | if hasattr(top_level_comment, 'id') and hasattr(top_level_comment, 'body'): 226 | if comment_count >= comment_limit: 227 | break 228 | 229 | # Report progress before processing comment 230 | if ctx: 231 | await ctx.report_progress( 232 | progress=comment_count, 233 | total=comment_limit, 234 | message=f"Loading comments ({comment_count}/{comment_limit})" 235 | ) 236 | 237 | if isinstance(top_level_comment, PrawComment): 238 | comments.append(parse_comment_tree(top_level_comment, ctx=ctx)) 239 | else: 240 | # Handle mock objects in tests 241 | comments.append(Comment(...)) 242 | # Count all comments including replies 243 | comment_count += 1 + count_replies(comments[-1]) 244 | 245 | # Report final completion 246 | if ctx: 247 | await ctx.report_progress( 248 | progress=comment_count, 249 | total=comment_limit, 250 | message=f"Completed: {comment_count} comments loaded" 251 | ) 252 | ``` 253 | 254 | #### Changes Required 255 | 256 | 1. **Make function async**: Change `def fetch_submission_with_comments(...)` → `async def fetch_submission_with_comments(...)` 257 | 2. **Add progress in loop**: Insert `await ctx.report_progress(...)` before parsing each top-level comment 258 | 3. **Add completion progress**: Report final progress after loop completes 259 | 4. 
**Update server.py**: Add `await` when calling this function in `execute_operation()` 260 | 261 | **Progress Events:** ~5-100 (depending on comment_limit and tree depth) 262 | 263 | --- 264 | 265 | ## FastMCP Progress Patterns 266 | 267 | ### Basic Pattern (from FastMCP docs) 268 | 269 | ```python 270 | from fastmcp import FastMCP, Context 271 | 272 | @mcp.tool 273 | async def process_items(items: list[str], ctx: Context) -> dict: 274 | """Process a list of items with progress updates.""" 275 | total = len(items) 276 | results = [] 277 | 278 | for i, item in enumerate(items): 279 | # Report progress as we process each item 280 | await ctx.report_progress(progress=i, total=total) 281 | 282 | results.append(item.upper()) 283 | 284 | # Report 100% completion 285 | await ctx.report_progress(progress=total, total=total) 286 | 287 | return {"processed": len(results), "results": results} 288 | ``` 289 | 290 | ### Key Requirements 291 | 292 | 1. **Functions must be async** to use `await ctx.report_progress()` 293 | 2. **Progress parameter**: Current progress value (e.g., 5, 24, 0.75) 294 | 3. **Total parameter**: Optional total value (enables percentage calculation) 295 | 4. **Message parameter**: Optional descriptive message (not shown in examples above but supported) 296 | 297 | ### Best Practices 298 | 299 | - Report at regular intervals (every iteration for small loops) 300 | - Provide descriptive messages when possible 301 | - Report final completion (100%) 302 | - Don't spam - limit to reasonable frequency (5-10 events minimum) 303 | 304 | ## Testing Requirements 305 | 306 | ### Update Existing Tests 307 | 308 | **File:** `tests/test_context_integration.py` 309 | 310 | Add assertions to verify progress calls: 311 | 312 | ```python 313 | import pytest 314 | from unittest.mock import AsyncMock, MagicMock, patch 315 | 316 | class TestDiscoverSubredditsProgress: 317 | """Test progress reporting in discover_subreddits.""" 318 | 319 | @pytest.mark.asyncio 320 | async def test_reports_progress_during_search(self, mock_context): 321 | """Verify progress is reported during vector search.""" 322 | # Mock ChromaDB response with 3 results 323 | mock_collection = MagicMock() 324 | mock_collection.query.return_value = { 325 | 'metadatas': [[ 326 | {'name': 'Python', 'subscribers': 1000000, 'nsfw': False}, 327 | {'name': 'learnpython', 'subscribers': 500000, 'nsfw': False}, 328 | {'name': 'pythontips', 'subscribers': 100000, 'nsfw': False} 329 | ]], 330 | 'distances': [[0.5, 0.7, 0.9]] 331 | } 332 | 333 | # Setup async mock for progress 334 | mock_context.report_progress = AsyncMock() 335 | 336 | with patch('src.tools.discover.get_chroma_client'), \ 337 | patch('src.tools.discover.get_collection', return_value=mock_collection): 338 | 339 | result = await discover_subreddits(query="python", ctx=mock_context) 340 | 341 | # Verify progress was reported at least 3 times (once per result) 342 | assert mock_context.report_progress.call_count >= 3 343 | 344 | # Verify progress parameters 345 | first_call = mock_context.report_progress.call_args_list[0] 346 | assert 'progress' in first_call[1] or len(first_call[0]) >= 1 347 | assert 'total' in first_call[1] or len(first_call[0]) >= 2 348 | ``` 349 | 350 | ### New Test Coverage 351 | 352 | Add similar tests for: 353 | - `test_fetch_multiple_subreddits_progress` - Verify progress per subreddit 354 | - `test_fetch_comments_progress` - Verify progress during comment loading 355 | 356 | ### Success Criteria 357 | 358 | - ✅ All existing 15 tests still pass 359 | - ✅ New 
progress assertion tests pass 360 | - ✅ Progress called at least 5 times per operation (varies by data) 361 | - ✅ No performance degradation (progress overhead <5%) 362 | 363 | ## Server.py Updates 364 | 365 | **File:** `src/server.py` 366 | **Functions:** Update calls to async operations 367 | 368 | ### Current Pattern 369 | 370 | ```python 371 | @mcp.tool 372 | def execute_operation( 373 | operation_id: str, 374 | parameters: dict, 375 | ctx: Context 376 | ) -> dict: 377 | """Execute a Reddit operation by ID.""" 378 | 379 | if operation_id == "discover_subreddits": 380 | return discover_subreddits(**params) 381 | ``` 382 | 383 | ### Updated Pattern 384 | 385 | ```python 386 | @mcp.tool 387 | async def execute_operation( 388 | operation_id: str, 389 | parameters: dict, 390 | ctx: Context 391 | ) -> dict: 392 | """Execute a Reddit operation by ID.""" 393 | 394 | if operation_id == "discover_subreddits": 395 | return await discover_subreddits(**params) 396 | ``` 397 | 398 | ### Changes Required 399 | 400 | 1. **Make execute_operation async**: `async def execute_operation(...)` 401 | 2. **Add await to async operations**: 402 | - `await discover_subreddits(**params)` 403 | - `await fetch_multiple_subreddits(**params)` 404 | - `await fetch_submission_with_comments(**params)` 405 | 406 | ## Implementation Checklist 407 | 408 | ### Code Changes 409 | 410 | - [ ] **src/tools/discover.py** 411 | - [ ] Make `discover_subreddits()` async 412 | - [ ] Make `_search_vector_db()` async 413 | - [ ] Add `await` to `_search_vector_db()` call 414 | - [ ] Add progress reporting in result processing loop 415 | - [ ] Calculate total before loop starts 416 | 417 | - [ ] **src/tools/posts.py** 418 | - [ ] Make `fetch_multiple_subreddits()` async 419 | - [ ] Add `processed_subreddits` tracking set 420 | - [ ] Add progress reporting when new subreddit encountered 421 | 422 | - [ ] **src/tools/comments.py** 423 | - [ ] Make `fetch_submission_with_comments()` async 424 | - [ ] Add progress reporting in comment parsing loop 425 | - [ ] Add final completion progress report 426 | 427 | - [ ] **src/server.py** 428 | - [ ] Make `execute_operation()` async 429 | - [ ] Add `await` to `discover_subreddits()` call 430 | - [ ] Add `await` to `fetch_multiple_subreddits()` call 431 | - [ ] Add `await` to `fetch_submission_with_comments()` call 432 | 433 | ### Testing 434 | 435 | - [ ] Update `tests/test_context_integration.py` 436 | - [ ] Add progress test for `discover_subreddits` 437 | - [ ] Add progress test for `fetch_multiple_subreddits` 438 | - [ ] Add progress test for `fetch_submission_with_comments` 439 | 440 | - [ ] Run full test suite: `pytest tests/` 441 | - [ ] All 15 existing tests pass 442 | - [ ] New progress tests pass 443 | - [ ] No regressions 444 | 445 | ### Validation 446 | 447 | - [ ] Manual testing with MCP Inspector or Claude Desktop 448 | - [ ] Verify progress events appear in client logs 449 | - [ ] Confirm no performance degradation 450 | - [ ] Check that messages are descriptive and useful 451 | 452 | ## File Summary 453 | 454 | ### Files to Modify (4 files) 455 | 456 | 1. `src/tools/discover.py` - Add progress to vector search 457 | 2. `src/tools/posts.py` - Add progress to batch fetches 458 | 3. `src/tools/comments.py` - Add progress to comment loading 459 | 4. `src/server.py` - Make execute_operation async + await calls 460 | 461 | ### Files to Update (1 file) 462 | 463 | 1. 
`tests/test_context_integration.py` - Add progress assertions 464 | 465 | ### Files Not Modified 466 | 467 | - `src/config.py` - No changes needed 468 | - `src/models.py` - No changes needed 469 | - `src/chroma_client.py` - No changes needed 470 | - `src/resources.py` - No changes needed 471 | - `tests/test_tools.py` - No changes needed (already passing) 472 | 473 | ## Success Criteria 474 | 475 | ### Functional Requirements 476 | 477 | - ✅ Progress events emitted during vector search (≥5 per search) 478 | - ✅ Progress events emitted during multi-subreddit fetch (1 per subreddit) 479 | - ✅ Progress events emitted during comment loading (≥5 per fetch) 480 | - ✅ Progress includes total when known 481 | - ✅ Progress messages are descriptive 482 | 483 | ### Technical Requirements 484 | 485 | - ✅ All functions properly async/await 486 | - ✅ All 15+ tests pass 487 | - ✅ No breaking changes to API 488 | - ✅ Type hints maintained 489 | - ✅ No performance degradation 490 | 491 | ### Quality Requirements 492 | 493 | - ✅ Progress messages are user-friendly 494 | - ✅ Progress updates at reasonable frequency (not spammy) 495 | - ✅ Code follows FastMCP patterns from official docs 496 | - ✅ Maintains consistency with Phase 1 implementation 497 | 498 | ## Estimated Effort 499 | 500 | **Total Time:** 1-2 days 501 | 502 | **Breakdown:** 503 | - Code implementation: 3-4 hours 504 | - Testing updates: 2-3 hours 505 | - Manual validation: 1 hour 506 | - Bug fixes & refinement: 1-2 hours 507 | 508 | **Reduced from master spec (3-4 days)** because: 509 | - Phase 1 foundation complete (Context integration done) 510 | - Clear patterns established in codebase 511 | - Limited scope (3 operations only) 512 | - Existing test infrastructure in place 513 | 514 | ## Next Steps 515 | 516 | After Phase 2 completion: 517 | - **Phase 3**: Structured Logging (2-3 days) 518 | - **Phase 4**: Enhanced Error Handling (2 days) 519 | - **Phase 5**: Testing & Validation (1 day) 520 | 521 | ## References 522 | 523 | - [FastMCP Progress Documentation](../ai-docs/fastmcp/docs/servers/progress.mdx) 524 | - [FastMCP Context API](../ai-docs/fastmcp/docs/servers/context.mdx) 525 | - [Phase 1 Completion Summary](./003-phase-1-context-integration.md) *(if created)* 526 | - [Master Specification](./003-fastmcp-context-integration.md) 527 | - Current Implementation: `src/server.py`, `src/tools/*.py` 528 | ``` -------------------------------------------------------------------------------- /specs/003-phase-1-context-integration.md: -------------------------------------------------------------------------------- ```markdown 1 | # Phase 1: Context Integration - Detailed Specification 2 | 3 | **Status:** Ready for Implementation 4 | **Created:** 2025-10-02 5 | **Phase Duration:** Days 1-2 6 | **Owner:** Engineering Team 7 | **Parent Spec:** [003-fastmcp-context-integration.md](./003-fastmcp-context-integration.md) 8 | 9 | ## Objective 10 | 11 | Enable all tool functions in the Reddit MCP server to receive and utilize FastMCP's Context API. This phase establishes the foundation for progress monitoring, structured logging, and enhanced error handling in subsequent phases. 12 | 13 | ## Background 14 | 15 | FastMCP's Context API is automatically injected into tool functions decorated with `@mcp.tool`. 
The context object provides methods for: 16 | - Progress reporting: `ctx.report_progress(current, total, message)` 17 | - Structured logging: `ctx.info()`, `ctx.warning()`, `ctx.error()`, `ctx.debug()` 18 | - Error context: Rich error information via structured logging 19 | 20 | To use these features, all tool functions must accept a `Context` parameter. This phase focuses solely on adding the context parameter to function signatures—no actual usage of context methods yet. 21 | 22 | ## Goals 23 | 24 | 1. **Add Context Parameter**: Update all tool function signatures to accept `ctx: Context` 25 | 2. **Maintain Type Safety**: Preserve all type hints and ensure type checking passes 26 | 3. **Verify Auto-Injection**: Confirm FastMCP's decorator system injects context correctly 27 | 4. **Test Compatibility**: Ensure all existing tests pass with updated signatures 28 | 29 | ## Non-Goals 30 | 31 | - Using context methods (progress, logging, error handling) - Phase 2+ 32 | - Adding new tool functions or operations 33 | - Modifying MCP protocol or client interfaces 34 | - Performance optimization or refactoring 35 | 36 | ## Implementation Details 37 | 38 | ### Context Parameter Pattern 39 | 40 | FastMCP automatically injects `Context` when tools are decorated with `@mcp.tool`: 41 | 42 | ```python 43 | from fastmcp import Context 44 | 45 | @mcp.tool 46 | def my_tool(param: str, ctx: Context) -> dict: 47 | # Context is automatically injected by FastMCP 48 | # No usage required in Phase 1 - just accept the parameter 49 | return {"result": "data"} 50 | ``` 51 | 52 | **Important Notes:** 53 | - Context is a **required** parameter (not optional) 54 | - Position in signature: Place after all other parameters 55 | - Type hint must be `Context` (imported from `fastmcp`) 56 | - No default value needed - FastMCP injects automatically 57 | 58 | ### Files to Modify 59 | 60 | #### 1. `src/tools/discover.py` 61 | 62 | **Functions to update:** 63 | - `discover_subreddits(query: str, limit: int = 10) -> dict` 64 | - `get_subreddit_info(subreddit_name: str) -> dict` 65 | 66 | **Before:** 67 | ```python 68 | def discover_subreddits(query: str, limit: int = 10) -> dict: 69 | """Search vector database for relevant subreddits.""" 70 | results = search_vector_db(query, limit) 71 | return { 72 | "subreddits": [format_subreddit(r) for r in results], 73 | "count": len(results) 74 | } 75 | ``` 76 | 77 | **After:** 78 | ```python 79 | from fastmcp import Context 80 | 81 | def discover_subreddits( 82 | query: str, 83 | limit: int = 10, 84 | ctx: Context 85 | ) -> dict: 86 | """Search vector database for relevant subreddits.""" 87 | # Phase 1: Accept context but don't use it yet 88 | results = search_vector_db(query, limit) 89 | return { 90 | "subreddits": [format_subreddit(r) for r in results], 91 | "count": len(results) 92 | } 93 | ``` 94 | 95 | **Estimated Time:** 30 minutes 96 | 97 | --- 98 | 99 | #### 2. 
`src/tools/posts.py` 100 | 101 | **Functions to update:** 102 | - `fetch_subreddit_posts(subreddit_name: str, limit: int = 10, time_filter: str = "all", sort: str = "hot") -> dict` 103 | - `fetch_multiple_subreddits(subreddit_names: list[str], limit_per_subreddit: int = 10) -> dict` 104 | - `get_post_details(post_id: str) -> dict` 105 | 106 | **Before:** 107 | ```python 108 | def fetch_subreddit_posts( 109 | subreddit_name: str, 110 | limit: int = 10, 111 | time_filter: str = "all", 112 | sort: str = "hot" 113 | ) -> dict: 114 | """Fetch posts from a subreddit.""" 115 | subreddit = reddit.subreddit(subreddit_name) 116 | posts = list(subreddit.hot(limit=limit)) 117 | return {"posts": [format_post(p) for p in posts]} 118 | ``` 119 | 120 | **After:** 121 | ```python 122 | from fastmcp import Context 123 | 124 | def fetch_subreddit_posts( 125 | subreddit_name: str, 126 | limit: int = 10, 127 | time_filter: str = "all", 128 | sort: str = "hot", 129 | ctx: Context 130 | ) -> dict: 131 | """Fetch posts from a subreddit.""" 132 | # Phase 1: Accept context but don't use it yet 133 | subreddit = reddit.subreddit(subreddit_name) 134 | posts = list(subreddit.hot(limit=limit)) 135 | return {"posts": [format_post(p) for p in posts]} 136 | ``` 137 | 138 | **Estimated Time:** 45 minutes 139 | 140 | --- 141 | 142 | #### 3. `src/tools/comments.py` 143 | 144 | **Functions to update:** 145 | - `fetch_submission_with_comments(submission_id: str, comment_limit: int = 50, comment_sort: str = "best") -> dict` 146 | - `get_comment_thread(comment_id: str, depth: int = 5) -> dict` 147 | 148 | **Before:** 149 | ```python 150 | def fetch_submission_with_comments( 151 | submission_id: str, 152 | comment_limit: int = 50, 153 | comment_sort: str = "best" 154 | ) -> dict: 155 | """Fetch submission and its comments.""" 156 | submission = reddit.submission(id=submission_id) 157 | comments = fetch_comments(submission, comment_limit, comment_sort) 158 | return { 159 | "submission": format_submission(submission), 160 | "comments": comments 161 | } 162 | ``` 163 | 164 | **After:** 165 | ```python 166 | from fastmcp import Context 167 | 168 | def fetch_submission_with_comments( 169 | submission_id: str, 170 | comment_limit: int = 50, 171 | comment_sort: str = "best", 172 | ctx: Context 173 | ) -> dict: 174 | """Fetch submission and its comments.""" 175 | # Phase 1: Accept context but don't use it yet 176 | submission = reddit.submission(id=submission_id) 177 | comments = fetch_comments(submission, comment_limit, comment_sort) 178 | return { 179 | "submission": format_submission(submission), 180 | "comments": comments 181 | } 182 | ``` 183 | 184 | **Estimated Time:** 30 minutes 185 | 186 | --- 187 | 188 | #### 4. 
`src/tools/search.py` 189 | 190 | **Functions to update:** 191 | - `search_subreddit(subreddit_name: str, query: str, limit: int = 10, time_filter: str = "all", sort: str = "relevance") -> dict` 192 | 193 | **Before:** 194 | ```python 195 | def search_subreddit( 196 | subreddit_name: str, 197 | query: str, 198 | limit: int = 10, 199 | time_filter: str = "all", 200 | sort: str = "relevance" 201 | ) -> dict: 202 | """Search within a specific subreddit.""" 203 | subreddit = reddit.subreddit(subreddit_name) 204 | results = subreddit.search(query, limit=limit, time_filter=time_filter, sort=sort) 205 | return {"results": [format_post(r) for r in results]} 206 | ``` 207 | 208 | **After:** 209 | ```python 210 | from fastmcp import Context 211 | 212 | def search_subreddit( 213 | subreddit_name: str, 214 | query: str, 215 | limit: int = 10, 216 | time_filter: str = "all", 217 | sort: str = "relevance", 218 | ctx: Context 219 | ) -> dict: 220 | """Search within a specific subreddit.""" 221 | # Phase 1: Accept context but don't use it yet 222 | subreddit = reddit.subreddit(subreddit_name) 223 | results = subreddit.search(query, limit=limit, time_filter=time_filter, sort=sort) 224 | return {"results": [format_post(r) for r in results]} 225 | ``` 226 | 227 | **Estimated Time:** 20 minutes 228 | 229 | --- 230 | 231 | #### 5. `src/server.py` 232 | 233 | **Changes needed:** 234 | - Import Context from fastmcp 235 | - Verify execute_operation passes context to tools (FastMCP handles this automatically) 236 | - No signature changes needed for execute_operation itself 237 | 238 | **Before:** 239 | ```python 240 | # At top of file 241 | from fastmcp import FastMCP 242 | 243 | mcp = FastMCP("Reddit Research MCP") 244 | ``` 245 | 246 | **After:** 247 | ```python 248 | # At top of file 249 | from fastmcp import FastMCP, Context 250 | 251 | mcp = FastMCP("Reddit Research MCP") 252 | 253 | # No other changes needed - FastMCP auto-injects context 254 | ``` 255 | 256 | **Estimated Time:** 10 minutes 257 | 258 | --- 259 | 260 | ### Helper Functions 261 | 262 | **Internal helper functions** (not decorated with `@mcp.tool`) that need context should also accept it: 263 | 264 | ```python 265 | # Helper function called by tool 266 | def fetch_comments(submission, limit: int, sort: str, ctx: Context) -> list: 267 | """Internal helper for fetching comments.""" 268 | # Phase 1: Accept context but don't use it yet 269 | submission.comment_sort = sort 270 | submission.comments.replace_more(limit=0) 271 | return list(submission.comments.list()[:limit]) 272 | ``` 273 | 274 | **Functions to check:** 275 | - `src/tools/discover.py`: `search_vector_db()`, `format_subreddit()` 276 | - `src/tools/posts.py`: `format_post()` 277 | - `src/tools/comments.py`: `fetch_comments()`, `format_comment()` 278 | 279 | **Decision rule:** Only add context to helpers that will need it in Phase 2+ (for logging/progress). Review each helper and add context parameter if: 280 | 1. It performs I/O operations (API calls, database queries) 281 | 2. It contains loops that could benefit from progress reporting 282 | 3. 
It has error handling that would benefit from context logging 283 | 284 | **Estimated Time:** 30 minutes 285 | 286 | --- 287 | 288 | ## Testing Strategy 289 | 290 | ### Unit Tests 291 | 292 | Update existing tests in `tests/test_tools.py` to pass context: 293 | 294 | **Before:** 295 | ```python 296 | def test_discover_subreddits(): 297 | result = discover_subreddits("machine learning", limit=5) 298 | assert result["count"] == 5 299 | ``` 300 | 301 | **After:** 302 | ```python 303 | from unittest.mock import Mock 304 | from fastmcp import Context 305 | 306 | def test_discover_subreddits(): 307 | # Create mock context for testing 308 | mock_ctx = Mock(spec=Context) 309 | 310 | result = discover_subreddits("machine learning", limit=5, ctx=mock_ctx) 311 | assert result["count"] == 5 312 | ``` 313 | 314 | **Note:** FastMCP provides test utilities for creating context objects. Consult FastMCP testing documentation for best practices. 315 | 316 | ### Integration Tests 317 | 318 | **New test file:** `tests/test_context_integration.py` 319 | 320 | ```python 321 | import pytest 322 | from unittest.mock import Mock 323 | from fastmcp import Context 324 | 325 | from src.tools.discover import discover_subreddits 326 | from src.tools.posts import fetch_subreddit_posts 327 | from src.tools.comments import fetch_submission_with_comments 328 | from src.tools.search import search_subreddit 329 | 330 | @pytest.fixture 331 | def mock_context(): 332 | """Create a mock Context object for testing.""" 333 | return Mock(spec=Context) 334 | 335 | def test_discover_accepts_context(mock_context): 336 | """Verify discover_subreddits accepts context parameter.""" 337 | result = discover_subreddits("test query", limit=5, ctx=mock_context) 338 | assert "subreddits" in result 339 | 340 | def test_fetch_posts_accepts_context(mock_context): 341 | """Verify fetch_subreddit_posts accepts context parameter.""" 342 | result = fetch_subreddit_posts("python", limit=5, ctx=mock_context) 343 | assert "posts" in result 344 | 345 | def test_fetch_comments_accepts_context(mock_context): 346 | """Verify fetch_submission_with_comments accepts context parameter.""" 347 | result = fetch_submission_with_comments("test_id", comment_limit=10, ctx=mock_context) 348 | assert "submission" in result 349 | assert "comments" in result 350 | 351 | def test_search_accepts_context(mock_context): 352 | """Verify search_subreddit accepts context parameter.""" 353 | result = search_subreddit("python", "testing", limit=5, ctx=mock_context) 354 | assert "results" in result 355 | ``` 356 | 357 | **Estimated Time:** 1 hour 358 | 359 | --- 360 | 361 | ## Success Criteria 362 | 363 | ### Phase 1 Completion Checklist 364 | 365 | - [ ] All functions in `src/tools/discover.py` accept `ctx: Context` 366 | - [ ] All functions in `src/tools/posts.py` accept `ctx: Context` 367 | - [ ] All functions in `src/tools/comments.py` accept `ctx: Context` 368 | - [ ] All functions in `src/tools/search.py` accept `ctx: Context` 369 | - [ ] `src/server.py` imports Context from fastmcp 370 | - [ ] All relevant helper functions accept context parameter 371 | - [ ] All existing unit tests updated to pass context 372 | - [ ] New integration tests created in `tests/test_context_integration.py` 373 | - [ ] All tests pass: `pytest tests/` 374 | - [ ] Type checking passes: `mypy src/` 375 | - [ ] No regressions in existing functionality 376 | 377 | ### Validation Commands 378 | 379 | ```bash 380 | # Run all tests 381 | pytest tests/ -v 382 | 383 | # Type checking 384 | mypy src/ 385 | 
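# Run only the new context integration tests added in this phase
pytest tests/test_context_integration.py -v
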
386 | # Verify no breaking changes 387 | pytest tests/test_tools.py -v 388 | ``` 389 | 390 | --- 391 | 392 | ## Implementation Order 393 | 394 | 1. **Day 1 Morning (2 hours)** 395 | - Update `src/tools/discover.py` (30 min) 396 | - Update `src/tools/posts.py` (45 min) 397 | - Update `src/tools/comments.py` (30 min) 398 | - Update `src/tools/search.py` (20 min) 399 | 400 | 2. **Day 1 Afternoon (2 hours)** 401 | - Update `src/server.py` (10 min) 402 | - Review and update helper functions (30 min) 403 | - Update existing unit tests (1 hour) 404 | - Run full test suite and fix issues (20 min) 405 | 406 | 3. **Day 2 Morning (2 hours)** 407 | - Create `tests/test_context_integration.py` (1 hour) 408 | - Run all validation commands (30 min) 409 | - Code review and cleanup (30 min) 410 | 411 | 4. **Day 2 Afternoon (1 hour)** 412 | - Final testing and validation 413 | - Documentation updates (if needed) 414 | - Prepare for Phase 2 415 | 416 | **Total Estimated Time:** 7 hours over 2 days 417 | 418 | --- 419 | 420 | ## Dependencies 421 | 422 | ### Required Packages 423 | - `fastmcp>=2.0.0` (already installed) 424 | - `pytest>=7.0.0` (already installed for testing) 425 | - `mypy>=1.0.0` (recommended for type checking) 426 | 427 | ### External Dependencies 428 | - None - this phase only modifies function signatures 429 | 430 | ### Knowledge Prerequisites 431 | - FastMCP decorator system and auto-injection 432 | - Python type hints and type checking 433 | - Pytest fixture system for mocking 434 | 435 | --- 436 | 437 | ## Risks & Mitigations 438 | 439 | | Risk | Likelihood | Impact | Mitigation | 440 | |------|------------|--------|------------| 441 | | Breaking existing tests | Medium | High | Update tests incrementally, verify after each file | 442 | | Type checking errors | Low | Medium | Use `Mock(spec=Context)` for type-safe mocking | 443 | | FastMCP auto-injection not working | Low | High | Verify with simple test case first; consult docs | 444 | | Forgetting helper functions | Medium | Medium | Grep codebase for all function definitions, review systematically | 445 | 446 | --- 447 | 448 | ## Code Review Checklist 449 | 450 | Before marking Phase 1 complete, verify: 451 | 452 | - [ ] All tool functions have `ctx: Context` as last parameter 453 | - [ ] Type hints are correct: `ctx: Context` (not `ctx: Optional[Context]`) 454 | - [ ] Import statements include `from fastmcp import Context` 455 | - [ ] Helper functions that need context receive it 456 | - [ ] Test mocks use `Mock(spec=Context)` for type safety 457 | - [ ] No actual usage of context methods (that's Phase 2+) 458 | - [ ] All tests pass without errors or warnings 459 | - [ ] Type checking passes with mypy 460 | 461 | --- 462 | 463 | ## Next Steps 464 | 465 | Upon successful completion of Phase 1: 466 | 467 | 1. **Phase 2: Progress Monitoring** - Add `ctx.report_progress()` calls 468 | 2. **Phase 3: Structured Logging** - Add `ctx.info()`, `ctx.warning()`, `ctx.error()` 469 | 3. **Phase 4: Enhanced Error Handling** - Use context in error scenarios 470 | 4. 
**Phase 5: Testing & Validation** - Comprehensive integration testing 471 | 472 | --- 473 | 474 | ## References 475 | 476 | - [FastMCP Context API Documentation](../ai-docs/fastmcp/docs/python-sdk/fastmcp-server-context.mdx) 477 | - [FastMCP Tool Decorator Pattern](../ai-docs/fastmcp/docs/python-sdk/fastmcp-server-tool.mdx) 478 | - [Parent Specification](./003-fastmcp-context-integration.md) 479 | - Current Implementation: `src/server.py` 480 | 481 | --- 482 | 483 | ## Appendix: Complete Example 484 | 485 | **Full example showing before/after for a complete tool function:** 486 | 487 | **Before (existing code):** 488 | ```python 489 | # src/tools/posts.py 490 | from src.reddit_client import reddit 491 | 492 | def fetch_subreddit_posts( 493 | subreddit_name: str, 494 | limit: int = 10, 495 | time_filter: str = "all", 496 | sort: str = "hot" 497 | ) -> dict: 498 | """ 499 | Fetch posts from a subreddit. 500 | 501 | Args: 502 | subreddit_name: Name of the subreddit 503 | limit: Number of posts to fetch 504 | time_filter: Time filter (all, day, week, month, year) 505 | sort: Sort method (hot, new, top, rising) 506 | 507 | Returns: 508 | Dictionary with posts and metadata 509 | """ 510 | try: 511 | subreddit = reddit.subreddit(subreddit_name) 512 | 513 | # Get posts based on sort method 514 | if sort == "hot": 515 | posts = list(subreddit.hot(limit=limit)) 516 | elif sort == "new": 517 | posts = list(subreddit.new(limit=limit)) 518 | elif sort == "top": 519 | posts = list(subreddit.top(time_filter=time_filter, limit=limit)) 520 | elif sort == "rising": 521 | posts = list(subreddit.rising(limit=limit)) 522 | else: 523 | raise ValueError(f"Invalid sort method: {sort}") 524 | 525 | return { 526 | "success": True, 527 | "subreddit": subreddit_name, 528 | "posts": [format_post(p) for p in posts], 529 | "count": len(posts) 530 | } 531 | 532 | except Exception as e: 533 | return { 534 | "success": False, 535 | "error": str(e), 536 | "subreddit": subreddit_name 537 | } 538 | ``` 539 | 540 | **After (Phase 1 changes):** 541 | ```python 542 | # src/tools/posts.py 543 | from fastmcp import Context 544 | from src.reddit_client import reddit 545 | 546 | def fetch_subreddit_posts( 547 | subreddit_name: str, 548 | limit: int = 10, 549 | time_filter: str = "all", 550 | sort: str = "hot", 551 | ctx: Context # ← ONLY CHANGE IN PHASE 1 552 | ) -> dict: 553 | """ 554 | Fetch posts from a subreddit. 555 | 556 | Args: 557 | subreddit_name: Name of the subreddit 558 | limit: Number of posts to fetch 559 | time_filter: Time filter (all, day, week, month, year) 560 | sort: Sort method (hot, new, top, rising) 561 | ctx: FastMCP context (auto-injected) 562 | 563 | Returns: 564 | Dictionary with posts and metadata 565 | """ 566 | # Phase 1: Context accepted but not used yet 567 | # Phase 2+ will add: ctx.report_progress(), ctx.info(), etc. 
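    # Illustrative only: once this function is made async in Phase 2, those calls
    # might look like the following (not part of the Phase 1 change):
    #   await ctx.report_progress(progress=0, total=limit, message=f"Fetching r/{subreddit_name}")
    #   await ctx.info(f"Fetching {limit} '{sort}' posts from r/{subreddit_name}")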
568 | 569 | try: 570 | subreddit = reddit.subreddit(subreddit_name) 571 | 572 | # Get posts based on sort method 573 | if sort == "hot": 574 | posts = list(subreddit.hot(limit=limit)) 575 | elif sort == "new": 576 | posts = list(subreddit.new(limit=limit)) 577 | elif sort == "top": 578 | posts = list(subreddit.top(time_filter=time_filter, limit=limit)) 579 | elif sort == "rising": 580 | posts = list(subreddit.rising(limit=limit)) 581 | else: 582 | raise ValueError(f"Invalid sort method: {sort}") 583 | 584 | return { 585 | "success": True, 586 | "subreddit": subreddit_name, 587 | "posts": [format_post(p) for p in posts], 588 | "count": len(posts) 589 | } 590 | 591 | except Exception as e: 592 | return { 593 | "success": False, 594 | "error": str(e), 595 | "subreddit": subreddit_name 596 | } 597 | ``` 598 | 599 | **Key observations:** 600 | 1. Only the function signature changed 601 | 2. Type hint added to docstring 602 | 3. No logic changes - context not used yet 603 | 4. Comment indicates Phase 1 status 604 | ``` -------------------------------------------------------------------------------- /reports/ai-llm-weekly-trends-reddit-analysis-2025-01-20.md: -------------------------------------------------------------------------------- ```markdown 1 | # AI & LLM Trends on Reddit: Weekly Analysis (January 13-20, 2025) 2 | 3 | ## Summary 4 | 5 | The AI community on Reddit experienced a watershed week marked by OpenAI's release of GPT-5-Codex, explosive growth in hardware hacking for local AI, and an intensifying rivalry between AI companies reflected in both technical achievements and marketing strategies. The conversation revealed a striking shift: while early AI adoption was dominated by technical users focused on coding applications, the technology has now reached mainstream adoption with women comprising 52% of users and only 4% of conversations involving programming tasks. This democratization coincides with growing frustration about incremental improvements among power users, who are increasingly turning to extreme measures—including flying to Shenzhen to purchase modded GPUs with expanded VRAM—to run local models. The week also highlighted a fundamental tension between corporate AI advancement and open-source alternatives, with Chinese companies releasing competitive models while simultaneously being banned from purchasing NVIDIA chips, creating a complex geopolitical landscape around AI development. 6 | 7 | ## The Conversation Landscape 8 | 9 | The AI discussion on Reddit spans from hardcore technical implementation in r/LocalLLaMA where users share stories of building custom GPU rigs and flying to China for hardware, to mainstream adoption conversations in r/ChatGPT dominated by memes and practical use cases, with r/singularity serving as the philosophical battleground for debates about AGI timelines and societal impact. The gender flip in AI usage—from 80% male to 52% female users—has fundamentally changed the tone of discussions, moving from technical specifications to practical applications and creative uses. 
10 | 11 | Key communities analyzed: 12 | - **r/ChatGPT** (11M subscribers): Mainstream user experiences, memes, and practical applications 13 | - **r/LocalLLaMA** (522K subscribers): Hardware hacking, open-source models, and technical deep dives 14 | - **r/singularity** (3.7M subscribers): AGI speculation, industry developments, and philosophical implications 15 | - **r/OpenAI** (2.4M subscribers): Company-specific news, model releases, and corporate drama 16 | - **r/ClaudeAI** (311K subscribers): Anthropic's community focused on Claude's capabilities and comparisons 17 | - **r/AI_Agents** (191K subscribers): Agent development, practical implementations, and ROI discussions 18 | - **r/ChatGPTPro** (486K subscribers): Power user strategies and professional applications 19 | 20 | ## Major Themes 21 | 22 | ### Theme 1: The GPT-5-Codex Revolution and the "Post-Programming" Era 23 | 24 | OpenAI's release of GPT-5-Codex dominated technical discussions across multiple subreddits, with performance improvements showing a jump from 33.9% to 51.3% accuracy on refactoring tasks ([r/singularity](https://reddit.com/r/singularity/comments/1nhrsh6/openai_releases_gpt5codex/), [r/OpenAI](https://reddit.com/r/OpenAI/comments/1nhuoxw/sam_altman_just_announced_gpt5_codex_better_at/)). The model's ability to work autonomously for over 7 hours represents a fundamental shift in how coding is approached ([r/singularity](https://reddit.com/r/singularity/comments/1nhtt6t/gpt5_codex_can_work_for_more_than_7_hours/)). Reports suggest the model solved all 12 problems at the ICPC 2025 Programming Contest, achieving what many consider superhuman performance in competitive programming ([r/singularity](https://reddit.com/r/singularity/comments/1njjr6k/openai_reasoning_model_solved_all_12_problems_at/)). 25 | 26 | The human impact is visceral and immediate. One OpenAI insider revealed: "we don't program anymore we just yell at codex agents" ([r/singularity](https://reddit.com/r/singularity/comments/1nidcr3/apparently_at_openai_insiders_have_graduated_from/)), while another developer celebrated earning "$2,200 in the last 3 weeks" after never coding before ChatGPT. Yet frustration bubbles beneath the surface—a developer testing the new model complained: "it's basically refusing to code and doing the bare minimum possible when pushed" ([r/singularity](https://reddit.com/r/singularity/comments/1nhrsh6/openai_releases_gpt5codex/)), highlighting the gap between marketing promises and real-world performance. 27 | 28 | The divide between communities reveals deeper truths about AI's coding impact. While r/singularity celebrates the dawn of autonomous programming with claims that "the takeoff looks the most rapid," r/LocalLLaMA users remain skeptical, noting that "ChatGPT sucks at coding" compared to specialized tools. Meanwhile, r/ChatGPTPro provides crucial context: despite only 4.2% of ChatGPT conversations being about programming, this represents 29+ million users—roughly matching the entire global population of professional programmers ([r/ChatGPTPro](https://reddit.com/r/ChatGPTPro/comments/1nj5lj5/openai_just_dropped_their_biggest_study_ever_on/)). The low percentage paradoxically proves AI's coding dominance: professionals have moved beyond ChatGPT's interface to integrated tools like Cursor and Claude Code, making the web statistics misleading. 
29 | 30 | ### Theme 2: The Hardware Underground and the Cyberpunk Reality of Local AI 31 | 32 | The story of a user flying to Shenzhen to purchase a modded 4090 with 48GB VRAM for CNY 22,900 cash captured the community's imagination, generating over 1,700 upvotes and sparking discussions about the lengths enthusiasts will go for local AI capabilities ([r/LocalLLaMA](https://reddit.com/r/LocalLLaMA/comments/1nifajh/i_bought_a_modded_4090_48gb_in_shenzhen_this_is/)). This narrative perfectly encapsulates the current state of local AI: a cyberpunk reality where users navigate Chinese electronics markets, handle stacks of cash, and risk customs violations to escape corporate AI limitations. The seller's claim that modded 5090s with 96GB VRAM are in development shows this underground market is expanding rapidly. 33 | 34 | The desperation for hardware reflects genuine technical needs. One user showcased their "4x 3090 local ai workstation" ([r/LocalLLaMA](https://reddit.com/r/LocalLLaMA/comments/1ng0nia/4x_3090_local_ai_workstation/)), while another celebrated completing an "8xAMD MI50 - 256GB VRAM + 256GB RAM rig for $3k" ([r/LocalLLaMA](https://reddit.com/r/LocalLLaMA/comments/1nhd5ks/completed_8xamd_mi50_256gb_vram_256gb_ram_rig_for/)). The community's reaction was telling: "people flying to Asia to buy modded computer parts in cash to run their local AI, that's the cyberpunk future I asked for" received 542 upvotes. Yet skepticism emerged—multiple users suspected the Shenzhen story was marketing propaganda, noting the OP never provided benchmarks despite numerous requests. 35 | 36 | The geopolitical dimension adds complexity. China's reported ban on its tech companies acquiring NVIDIA chips while claiming domestic processors match the H20 sparked heated debate ([r/LocalLLaMA](https://reddit.com/r/LocalLLaMA/comments/1njgicz/china_bans_its_biggest_tech_companies_from/)). This creates a paradox: Chinese companies are releasing competitive open-source models like DeepSeek V3.1 and Tongyi DeepResearch while simultaneously being cut off from the hardware that powers them. The underground GPU market represents a physical manifestation of these tensions, with modded American hardware flowing back to users desperate to run Chinese AI models locally. 37 | 38 | ### Theme 3: The Mainstream Adoption Paradox and the Death of "AI Panic" 39 | 40 | OpenAI's massive study of 700 million users revealed surprising patterns that challenge common narratives about AI adoption ([r/ChatGPTPro](https://reddit.com/r/ChatGPTPro/comments/1nj5lj5/openai_just_dropped_their_biggest_study_ever_on/), [r/OpenAI](https://reddit.com/r/OpenAI/comments/1niaw9p/new_openai_study_reveals_how_700_million_people/)). Only 30% of conversations are work-related, with the majority using AI for "random everyday stuff"—seeking information (24%), writing help (24%), and practical guidance (28%). The gender reversal from 80% male to 52% female users represents not just demographic shift but a fundamental change in how AI is perceived and utilized. 41 | 42 | The community's reaction reveals competing anxieties. One r/ChatGPTPro user dismissed concerns: "So much for the 'AI will replace all jobs' panic," while another countered that the statistics are misleading since "ChatGPT is used a lot for personal conversations doesn't prove that 'AI' can't replace many jobs." The frustration from early adopters is palpable—"when are we going to get a BIG jump? Like a HUGE jump. Like +20%. 
It's been like a year" ([r/singularity](https://reddit.com/r/singularity/comments/1nhrsh6/openai_releases_gpt5codex/))—reflecting disappointment that exponential progress has given way to incremental improvements. 43 | 44 | Different communities process this mainstream adoption differently. r/ChatGPT celebrates with memes about "Every single chat" starting with apologies and disclaimers (10,405 upvotes), while r/singularity worries about stagnation. r/ClaudeAI users position themselves as the sophisticated alternative: "Claude has always stayed in its lane and has been consistently useful... ChatGPT is getting a reputation as the loser's AI companion" ([r/singularity](https://reddit.com/r/singularity/comments/1nkcecf/anthropic_just_dropped_a_new_ad_for_claude_keep/)). The growth in developing countries—4x faster than rich nations—suggests AI's next billion users will have fundamentally different needs and expectations than Silicon Valley early adopters anticipated. 45 | 46 | ### Theme 4: The Corporate AI Wars and the Marketing of Intelligence 47 | 48 | The week witnessed intensifying competition between AI companies playing out through product releases, marketing campaigns, and community loyalty battles. Anthropic's new "Keep thinking" ad campaign, featuring MF DOOM's "All Caps," represents a sophisticated attempt to position Claude as the thinking person's AI ([r/singularity](https://reddit.com/r/singularity/comments/1nkcecf/anthropic_just_dropped_a_new_ad_for_claude_keep/), [r/ClaudeAI](https://reddit.com/r/ClaudeAI/comments/1nkcpwg/anthropic_just_dropped_a_cool_new_ad_for_claude/)). The aesthetic choice—"blending the familiar with the unfamiliar"—struck a nerve, with users praising it as "black mirror but warmer" while others called out the "sluuuuuurp" of brand loyalty. 49 | 50 | Meta's failed live demo ("Meta's AI Live Demo Flopped" - 14,196 upvotes) and Gemini's bizarre meltdown after failing to produce a seahorse emoji (17,690 upvotes) provided fodder for community mockery ([r/ChatGPT](https://reddit.com/r/ChatGPT/comments/1nk8zmq/metas_ai_live_demo_flopped/), [r/ChatGPT](https://reddit.com/r/ChatGPT/comments/1ngoref/gemini_loses_its_mind_after_failing_to_produce_a/)). Users noted Gemini's tendency toward self-deprecation: "When it fails at some prompts it'll act like it's unworthy of living," with one user observing they "stared at the screen for a few mins the first time it happened." Meanwhile, Elon Musk's public attempts to manipulate Grok's political views that repeatedly failed (57,855 upvotes) highlighted the gap between corporate control fantasies and AI reality ([r/ChatGPT](https://reddit.com/r/ChatGPT/comments/1nhg1lv/elon_continues_to_openly_try_and_fail_to/)). 51 | 52 | The community-level analysis reveals tribal dynamics. r/ClaudeAI users exhibit superiority: "Nobody trusts Meta's AI (which is also pretty useless), ChatGPT is getting a reputation as the loser's AI companion," while r/OpenAI maintains optimism about continued dominance. r/LocalLLaMA remains above the fray, focused on technical specifications rather than brand loyalty. The week's developments suggest these corporate battles matter less than underlying technical progress—users increasingly mix and match tools based on specific strengths rather than platform allegiance. 
53 | 54 | ### Theme 5: The Agent Revolution and the Gap Between Promise and Production 55 | 56 | AI agents dominated r/AI_Agents discussions, but with a notably practical bent focused on real-world implementation challenges rather than theoretical potential ([r/AI_Agents](https://reddit.com/r/AI_Agents/comments/1nkx0bz/everyones_trying_vectors_and_graphs_for_ai_memory/), [r/AI_Agents](https://reddit.com/r/AI_Agents/comments/1nj7szn/how_are_you_building_ai_agents_that_actually/)). The headline "Everyone's trying vectors and graphs for AI memory. We went back to SQL" (148 upvotes) perfectly captures the community's shift from hype to pragmatism. Success stories like "How a $2000 AI voice agent automation turned a struggling eye clinic into a $15k/month lead conversion machine" (122 upvotes) compete with reality checks: "Your AI agent probably can't handle two users at once" ([r/AI_Agents](https://reddit.com/r/AI_Agents/comments/1nkkjuj/how_a_2000_ai_voice_agent_automation_turned_a/), [r/AI_Agents](https://reddit.com/r/AI_Agents/comments/1nir326/your_ai_agent_probably_cant_handle_two_users_at/)). 57 | 58 | The framework debate reveals deep divisions about agent architecture. When asked "Which AI agent framework do you find most practical for real projects?" responses ranged from established solutions to "I built my own because everything else sucks" ([r/AI_Agents](https://reddit.com/r/AI_Agents/comments/1nfz717/which_ai_agent_framework_do_you_find_most/)). The community's focus on scraping ("What's the most reliable way you've found to scrape sites that don't have clean APIs?" - 57 upvotes) and micro-tools ("are micro-tools like this the missing pieces for future ai agents?") suggests current agent development is more about duct-taping APIs together than autonomous reasoning ([r/AI_Agents](https://reddit.com/r/AI_Agents/comments/1nkdlc8/whats_the_most_reliable_way_youve_found_to_scrape/), [r/AI_Agents](https://reddit.com/r/AI_Agents/comments/1njaf3o/are_microtools_like_this_the_missing_pieces_for/)). 59 | 60 | The distinction between chatbots and agents remains contentious: "Chatbots Reply, Agents Achieve Goals — What's the Real Line Between Them?" generated substantive discussion about whether current "agents" are merely chatbots with API access ([r/AI_Agents](https://reddit.com/r/AI_Agents/comments/1nfzf1n/chatbots_reply_agents_achieve_goals_whats_the/)). OpenAI's claim about "Reliable Long Horizon Agents by 2026" was met with skepticism in r/singularity, where users questioned whether true agency is possible without embodiment or real-world consequences. The gap between Silicon Valley promises and developer realities suggests the agent revolution will be evolutionary rather than revolutionary. 61 | 62 | ## Divergent Perspectives 63 | 64 | The week revealed fundamental divides in how different communities perceive AI progress. **Technical vs Mainstream users** represent the starkest contrast: while r/LocalLLaMA obsesses over VRAM requirements and inference speeds, r/ChatGPT shares memes about AI therapy sessions. The technical community's frustration with incremental improvements ("Groan when are we going to get a BIG jump?") contrasts sharply with mainstream users' delight at basic functionality. 65 | 66 | **Open Source vs Corporate AI** tensions intensified with Chinese companies releasing competitive models while being banned from hardware purchases. 
r/LocalLLaMA celebrates every open-source release as liberation from corporate control, while r/OpenAI and r/ClaudeAI users defend their platforms' superiority. The irony of users flying to China to buy modded American GPUs to run Chinese AI models epitomizes these contradictions. 67 | 68 | **Builders vs Philosophers** split r/singularity down the middle, with half celebrating each breakthrough as steps toward AGI while others warn about societal collapse. r/AI_Agents remains firmly in the builder camp, focused on ROI and production deployments rather than existential questions. The gender shift in usage suggests a new demographic less interested in philosophical debates and more focused on practical applications. 69 | 70 | ## What This Means 71 | 72 | The past week reveals AI development entering a new phase characterized by mainstream adoption, technical pragmatism, and geopolitical complexity. The shift from 4% coding-related conversations doesn't indicate reduced programming impact but rather integration so complete that developers no longer use chat interfaces. Similarly, the gender rebalancing suggests AI has transcended its early-adopter phase to become genuinely useful for everyday tasks. 73 | 74 | For builders and companies, several patterns demand attention. The underground hardware market signals massive unmet demand for local AI capabilities that current consumer GPUs cannot satisfy. The failure of major companies' live demos while Anthropic succeeds with thoughtful marketing suggests authenticity matters more than technical superiority. The agent revolution's slow progress indicates the gap between narrow AI success and general-purpose automation remains vast. 75 | 76 | The geopolitical dimensions cannot be ignored. China's simultaneous advancement in AI models while being cut off from hardware creates an unstable equilibrium. The cyberpunk reality of cash-only GPU deals in Shenzhen represents just the beginning of a fractured global AI landscape. Companies and developers must prepare for a world where AI capabilities vary dramatically by geography, not due to knowledge gaps but hardware access. 77 | 78 | Key takeaways: 79 | 1. The "post-programming" era has arrived for early adopters, but integration challenges mean most developers still code traditionally 80 | 2. Hardware limitations are driving an underground economy that will only grow as models demand more VRAM 81 | 3. Mainstream adoption is reshaping AI development priorities from technical impressiveness to practical utility 82 | 4. Corporate AI wars matter less than open-source progress for long-term ecosystem health 83 | 5. Agent development remains stuck between chatbot limitations and true autonomy, requiring fundamental architectural innovations 84 | 85 | ## Research Notes 86 | 87 | *Communities analyzed*: r/ChatGPT, r/OpenAI, r/ClaudeAI, r/LocalLLaMA, r/singularity, r/artificial, r/MachineLearning, r/ChatGPTPro, r/ChatGPTCoding, r/ClaudeCode, r/AI_Agents, r/aipromptprogramming, r/generativeAI, r/machinelearningnews, r/LargeLanguageModels 88 | 89 | *Methodology*: Semantic discovery to find diverse perspectives, followed by thematic analysis of top discussions and comments from the past week (January 13-20, 2025) 90 | 91 | *Limitations*: Analysis focused on English-language subreddits and may not capture developments in non-English AI communities. Corporate subreddit participation may be influenced by marketing efforts. Technical discussions in specialized forums outside Reddit were not included. 
``` -------------------------------------------------------------------------------- /specs/reddit-research-agent-spec.md: -------------------------------------------------------------------------------- ```markdown 1 | # Reddit Research Agent - Technical Specification 2 | 3 | ## Executive Summary 4 | A self-contained, single-file Python agent using the Orchestrator-Workers pattern to discover relevant Reddit communities for research questions. The system leverages UV's inline script metadata for automatic dependency management, OpenAI Agent SDK for orchestration, and PRAW for Reddit API access. No manual dependency installation required - just run the script and UV handles everything. 5 | 6 | ## Single-File Architecture 7 | 8 | The entire agent is contained in a single Python file (`reddit_research_agent.py`) with: 9 | - **Inline Dependencies**: Using UV's PEP 723 support, dependencies are declared in the script header 10 | - **Automatic Installation**: UV automatically installs all dependencies on first run 11 | - **No Project Setup**: No `pyproject.toml`, `requirements.txt`, or virtual environment management needed 12 | - **Portable**: Single file can be copied and run anywhere with UV installed 13 | 14 | ## Architecture Pattern: Orchestrator-Workers 15 | 16 | ```mermaid 17 | flowchart LR 18 | Query([User Query]) --> Orchestrator[Orchestrator Agent] 19 | Orchestrator -->|Task 1| Worker1[Search Worker] 20 | Orchestrator -->|Task 2| Worker2[Discovery Worker] 21 | Orchestrator -->|Task 3| Worker3[Validation Worker] 22 | Worker1 --> Synthesizer[Synthesizer Agent] 23 | Worker2 --> Synthesizer 24 | Worker3 --> Synthesizer 25 | Synthesizer --> Results([Final Results]) 26 | ``` 27 | 28 | ## System Components 29 | 30 | ### 1. Project Configuration 31 | 32 | #### Self-Contained Dependencies 33 | The agent uses UV's inline script metadata (PEP 723) for automatic dependency management. No separate `pyproject.toml` or manual installation required - dependencies are declared directly in the script header and UV handles everything automatically. 34 | 35 | #### Environment Variables (`.env`) 36 | ```bash 37 | # OpenAI Configuration 38 | OPENAI_API_KEY=sk-... 39 | 40 | # Reddit API Configuration 41 | REDDIT_CLIENT_ID=your_client_id 42 | REDDIT_CLIENT_SECRET=your_client_secret 43 | REDDIT_USER_AGENT=RedditResearchAgent/0.1.0 by YourUsername 44 | ``` 45 | 46 | ### 2. Core Agents 47 | 48 | #### 2.1 Orchestrator Agent 49 | **Purpose**: Analyzes research questions and creates parallel search strategies 50 | 51 | ```python 52 | orchestrator = Agent( 53 | name="Research Orchestrator", 54 | instructions=""" 55 | You are a research orchestrator specializing in Reddit discovery. 56 | 57 | Given a research question: 58 | 1. Identify key concepts and terms 59 | 2. Generate multiple search strategies: 60 | - Direct keyword searches (exact terms) 61 | - Semantic searches (related concepts, synonyms) 62 | - Category searches (broader topics, fields) 63 | 3. 
Output specific tasks for parallel execution 64 | 65 | Consider: 66 | - Technical vs general audience communities 67 | - Active vs historical discussions 68 | - Niche vs mainstream subreddits 69 | """, 70 | output_type=SearchTaskPlan 71 | ) 72 | ``` 73 | 74 | **Output Model**: 75 | ```python 76 | class SearchTaskPlan(BaseModel): 77 | direct_searches: List[str] # Exact keyword searches 78 | semantic_searches: List[str] # Related term searches 79 | category_searches: List[str] # Broad topic searches 80 | validation_criteria: Dict[str, Any] # Relevance criteria 81 | ``` 82 | 83 | #### 2.2 Worker Agents (Parallel Execution) 84 | 85 | ##### Search Worker 86 | **Purpose**: Executes direct Reddit searches using PRAW 87 | 88 | ```python 89 | search_worker = Agent( 90 | name="Search Worker", 91 | instructions="Execute Reddit searches and return discovered subreddits", 92 | tools=[search_subreddits_tool, search_posts_tool] 93 | ) 94 | ``` 95 | 96 | ##### Discovery Worker 97 | **Purpose**: Finds related communities through analysis 98 | 99 | ```python 100 | discovery_worker = Agent( 101 | name="Discovery Worker", 102 | instructions="Discover related subreddits through sidebars, wikis, and cross-references", 103 | tools=[get_related_subreddits_tool, analyze_community_tool] 104 | ) 105 | ``` 106 | 107 | ##### Validation Worker 108 | **Purpose**: Verifies relevance and quality of discovered subreddits 109 | 110 | ```python 111 | validation_worker = Agent( 112 | name="Validation Worker", 113 | instructions="Validate subreddit relevance, activity levels, and quality", 114 | tools=[get_subreddit_info_tool, check_activity_tool] 115 | ) 116 | ``` 117 | 118 | #### 2.3 Synthesizer Agent 119 | **Purpose**: Combines, deduplicates, and ranks all results 120 | 121 | ```python 122 | synthesizer = Agent( 123 | name="Result Synthesizer", 124 | instructions=""" 125 | Synthesize results from all workers: 126 | 127 | 1. Deduplicate discoveries 128 | 2. Rank by relevance factors: 129 | - Description alignment with research topic 130 | - Subscriber count and activity level 131 | - Content quality indicators 132 | - Moderation status 133 | 3. Filter out: 134 | - Inactive communities (< 10 posts/month) 135 | - Spam/promotional subreddits 136 | - Quarantined/banned communities 137 | 4. Return top 8-15 subreddits with justification 138 | 139 | Provide discovery rationale for each recommendation. 140 | """, 141 | output_type=FinalResearchResults 142 | ) 143 | ``` 144 | 145 | **Output Model**: 146 | ```python 147 | class SubredditRecommendation(BaseModel): 148 | name: str 149 | description: str 150 | subscribers: int 151 | relevance_score: float 152 | discovery_method: str 153 | rationale: str 154 | 155 | class FinalResearchResults(BaseModel): 156 | query: str 157 | total_discovered: int 158 | recommendations: List[SubredditRecommendation] 159 | search_strategies_used: List[str] 160 | execution_time: float 161 | ``` 162 | 163 | ### 3. 
PRAW Integration Tools (Enhanced) 164 | 165 | #### Core Reddit Connection 166 | ```python 167 | import praw 168 | from functools import lru_cache 169 | import os 170 | 171 | @lru_cache(maxsize=1) 172 | def get_reddit_instance(): 173 | """Singleton Reddit instance for all workers - thread-safe via lru_cache""" 174 | return praw.Reddit( 175 | client_id=os.getenv("REDDIT_CLIENT_ID"), 176 | client_secret=os.getenv("REDDIT_CLIENT_SECRET"), 177 | user_agent=os.getenv("REDDIT_USER_AGENT"), 178 | read_only=True # Read-only mode for research 179 | ) 180 | ``` 181 | 182 | #### Pydantic Models for Type Safety 183 | ```python 184 | from pydantic import BaseModel 185 | from typing import List, Optional 186 | 187 | class SubredditInfo(BaseModel): 188 | """Structured subreddit information with validation""" 189 | name: str 190 | title: str 191 | description: str 192 | subscribers: int 193 | created_utc: float 194 | over18: bool 195 | is_active: bool # Based on recent activity 196 | avg_comments_per_post: float 197 | recent_posts_count: int 198 | 199 | class ResearchContext(BaseModel): 200 | """Context passed between tools""" 201 | research_question: str 202 | discovered_subreddits: List[str] = [] 203 | search_strategies_used: List[str] = [] 204 | ``` 205 | 206 | #### Error Handler for Reddit API Issues 207 | ```python 208 | from agents import RunContextWrapper 209 | from typing import Any 210 | 211 | def reddit_error_handler(ctx: RunContextWrapper[Any], error: Exception) -> str: 212 | """ 213 | Handle common Reddit API errors gracefully. 214 | 215 | Returns user-friendly error messages for common issues. 216 | """ 217 | error_str = str(error) 218 | 219 | if "403" in error_str or "Forbidden" in error_str: 220 | return "Subreddit is private or restricted. Skipping this community." 221 | elif "404" in error_str or "Not Found" in error_str: 222 | return "Subreddit not found. It may be banned, deleted, or misspelled." 223 | elif "429" in error_str or "Too Many Requests" in error_str: 224 | return "Reddit rate limit reached. Waiting before retry." 225 | elif "prawcore.exceptions" in error_str: 226 | return f"Reddit API connection issue: {error_str[:50]}. Retrying..." 227 | else: 228 | return f"Unexpected Reddit error: {error_str[:100]}" 229 | ``` 230 | 231 | #### Enhanced Function Tools with Type Safety and Error Handling 232 | 233 | ```python 234 | @function_tool(failure_error_function=reddit_error_handler) 235 | async def search_subreddits_tool( 236 | ctx: RunContextWrapper[ResearchContext], 237 | query: str, 238 | limit: int = 25 239 | ) -> List[SubredditInfo]: 240 | """ 241 | Search for subreddits matching the query with relevance filtering. 242 | 243 | Args: 244 | ctx: Runtime context containing the original research question 245 | query: Search terms for Reddit (2-512 characters) 246 | limit: Maximum results to return (1-100, default: 25) 247 | 248 | Returns: 249 | List of SubredditInfo objects with validated data 250 | 251 | Note: 252 | Automatically filters out inactive subreddits (< 100 subscribers) 253 | and those without recent activity. 
254 | """ 255 | reddit = get_reddit_instance() 256 | results = [] 257 | original_query = ctx.context.research_question 258 | 259 | try: 260 | for subreddit in reddit.subreddits.search(query, limit=limit): 261 | # Skip very small/inactive subreddits 262 | if subreddit.subscribers < 100: 263 | continue 264 | 265 | # Get activity metrics 266 | try: 267 | recent_posts = list(subreddit.new(limit=5)) 268 | is_active = len(recent_posts) > 0 269 | avg_comments = sum(p.num_comments for p in recent_posts) / len(recent_posts) if recent_posts else 0 270 | except: 271 | is_active = False 272 | avg_comments = 0 273 | recent_posts = [] 274 | 275 | results.append(SubredditInfo( 276 | name=subreddit.display_name, 277 | title=subreddit.title or "", 278 | description=subreddit.public_description or "", 279 | subscribers=subreddit.subscribers, 280 | created_utc=subreddit.created_utc, 281 | over18=subreddit.over18, 282 | is_active=is_active, 283 | avg_comments_per_post=avg_comments, 284 | recent_posts_count=len(recent_posts) 285 | )) 286 | except Exception as e: 287 | # Let the error handler deal with it 288 | raise 289 | 290 | # Update context with discovered subreddits 291 | ctx.context.discovered_subreddits.extend([r.name for r in results]) 292 | 293 | return results 294 | 295 | @function_tool(failure_error_function=reddit_error_handler) 296 | async def get_related_subreddits_tool( 297 | ctx: RunContextWrapper[ResearchContext], 298 | subreddit_name: str 299 | ) -> List[str]: 300 | """ 301 | Find related subreddits from sidebar, wiki, and community info. 302 | 303 | Args: 304 | ctx: Runtime context for tracking discoveries 305 | subreddit_name: Name of subreddit to analyze (without r/ prefix) 306 | 307 | Returns: 308 | List of related subreddit names (deduplicated) 309 | 310 | Note: 311 | Searches in sidebar description, wiki pages, and 312 | community widgets for related community mentions. 313 | """ 314 | reddit = get_reddit_instance() 315 | related = set() # Use set for automatic deduplication 316 | 317 | try: 318 | subreddit = reddit.subreddit(subreddit_name) 319 | 320 | # Parse sidebar for r/ mentions 321 | if hasattr(subreddit, 'description') and subreddit.description: 322 | import re 323 | pattern = r'r/([A-Za-z0-9_]+)' 324 | matches = re.findall(pattern, subreddit.description) 325 | related.update(matches) 326 | 327 | # Check wiki pages if accessible 328 | try: 329 | # Common wiki pages with related subreddits 330 | wiki_pages = ['related', 'index', 'sidebar', 'communities'] 331 | for page_name in wiki_pages: 332 | try: 333 | wiki_page = subreddit.wiki[page_name] 334 | content = wiki_page.content_md 335 | matches = re.findall(pattern, content) 336 | related.update(matches) 337 | except: 338 | continue 339 | except: 340 | pass 341 | 342 | # Parse community widgets if available 343 | try: 344 | for widget in subreddit.widgets: 345 | if hasattr(widget, 'text'): 346 | matches = re.findall(pattern, widget.text) 347 | related.update(matches) 348 | except: 349 | pass 350 | 351 | except Exception as e: 352 | # Let the error handler deal with it 353 | raise 354 | 355 | # Remove the original subreddit from related list 356 | related.discard(subreddit_name) 357 | 358 | return list(related) 359 | 360 | @function_tool(failure_error_function=reddit_error_handler) 361 | async def validate_subreddit_relevance_tool( 362 | ctx: RunContextWrapper[ResearchContext], 363 | subreddit_name: str 364 | ) -> SubredditInfo: 365 | """ 366 | Get detailed subreddit information with relevance validation. 
367 | 368 | Args: 369 | ctx: Runtime context containing research question 370 | subreddit_name: Name of subreddit to validate 371 | 372 | Returns: 373 | SubredditInfo with detailed metrics 374 | 375 | Note: 376 | Checks activity level, moderation status, and 377 | content quality indicators. 378 | """ 379 | reddit = get_reddit_instance() 380 | 381 | try: 382 | subreddit = reddit.subreddit(subreddit_name) 383 | 384 | # Force load to check if subreddit exists 385 | _ = subreddit.id 386 | 387 | # Get recent activity for validation 388 | recent_posts = list(subreddit.new(limit=10)) 389 | 390 | # Calculate activity metrics 391 | if recent_posts: 392 | avg_comments = sum(p.num_comments for p in recent_posts) / len(recent_posts) 393 | # Check if posts are recent (within last 30 days) 394 | import time 395 | current_time = time.time() 396 | latest_post_age = current_time - recent_posts[0].created_utc 397 | is_active = latest_post_age < (30 * 24 * 60 * 60) # 30 days in seconds 398 | else: 399 | avg_comments = 0 400 | is_active = False 401 | 402 | return SubredditInfo( 403 | name=subreddit.display_name, 404 | title=subreddit.title or "", 405 | description=subreddit.public_description or "", 406 | subscribers=subreddit.subscribers, 407 | created_utc=subreddit.created_utc, 408 | over18=subreddit.over18, 409 | is_active=is_active, 410 | avg_comments_per_post=avg_comments, 411 | recent_posts_count=len(recent_posts) 412 | ) 413 | 414 | except Exception as e: 415 | # Let the error handler deal with it 416 | raise 417 | ``` 418 | 419 | ### 4. Execution Controller 420 | 421 | ```python 422 | import asyncio 423 | from typing import List, Dict, Any 424 | from agents import Runner 425 | 426 | async def execute_reddit_research(query: str) -> FinalResearchResults: 427 | """ 428 | Main execution controller for the research process. 429 | 430 | Args: 431 | query: User's research question 432 | 433 | Returns: 434 | Final curated results 435 | """ 436 | 437 | # Step 1: Orchestrator creates search plan 438 | print(f"🎯 Analyzing research question: {query}") 439 | orchestrator_result = await Runner.run(orchestrator, query) 440 | search_plan = orchestrator_result.final_output_as(SearchTaskPlan) 441 | 442 | # Step 2: Execute workers in parallel 443 | print("🔍 Executing parallel search strategies...") 444 | worker_tasks = [ 445 | Runner.run(search_worker, { 446 | "searches": search_plan.direct_searches, 447 | "search_type": "direct" 448 | }), 449 | Runner.run(discovery_worker, { 450 | "searches": search_plan.semantic_searches, 451 | "search_type": "semantic" 452 | }), 453 | Runner.run(validation_worker, { 454 | "searches": search_plan.category_searches, 455 | "validation_criteria": search_plan.validation_criteria 456 | }) 457 | ] 458 | 459 | worker_results = await asyncio.gather(*worker_tasks) 460 | 461 | # Step 3: Synthesize results 462 | print("🔀 Synthesizing discoveries...") 463 | synthesis_input = { 464 | "query": query, 465 | "worker_results": [r.final_output for r in worker_results], 466 | "search_plan": search_plan.model_dump() 467 | } 468 | 469 | synthesizer_result = await Runner.run(synthesizer, synthesis_input) 470 | final_results = synthesizer_result.final_output_as(FinalResearchResults) 471 | 472 | return final_results 473 | ``` 474 | 475 | ### 5. 
Main Entry Point (Self-Contained with UV) 476 | 477 | ```python 478 | #!/usr/bin/env -S uv run --script 479 | # /// script 480 | # requires-python = ">=3.11" 481 | # dependencies = [ 482 | # "openai-agents>=0.1.0", 483 | # "praw>=7.7.0", 484 | # "python-dotenv>=1.0.0", 485 | # "pydantic>=2.0.0", 486 | # "prawcore>=2.4.0" 487 | # ] 488 | # /// 489 | """ 490 | Reddit Research Agent 491 | Discovers relevant Reddit communities for research questions 492 | using the Orchestrator-Workers pattern. 493 | 494 | Usage: 495 | ./reddit_research_agent.py 496 | OR 497 | uv run reddit_research_agent.py 498 | 499 | No manual dependency installation required - UV handles everything automatically. 500 | """ 501 | 502 | import asyncio 503 | import os 504 | from dotenv import load_dotenv 505 | from typing import Optional, List, Dict, Any 506 | 507 | # Load environment variables 508 | load_dotenv() 509 | 510 | async def main(): 511 | """Main execution function""" 512 | 513 | # Validate environment 514 | required_vars = [ 515 | "OPENAI_API_KEY", 516 | "REDDIT_CLIENT_ID", 517 | "REDDIT_CLIENT_SECRET", 518 | "REDDIT_USER_AGENT" 519 | ] 520 | 521 | missing = [var for var in required_vars if not os.getenv(var)] 522 | if missing: 523 | print(f"❌ Missing environment variables: {', '.join(missing)}") 524 | return 525 | 526 | # Get research query 527 | query = input("🔬 Enter your research question: ").strip() 528 | if not query: 529 | print("❌ Please provide a research question") 530 | return 531 | 532 | try: 533 | # Execute research 534 | results = await execute_reddit_research(query) 535 | 536 | # Display results 537 | print(f"\n✅ Discovered {results.total_discovered} subreddits") 538 | print(f"📊 Top {len(results.recommendations)} recommendations:\n") 539 | 540 | for i, rec in enumerate(results.recommendations, 1): 541 | print(f"{i}. r/{rec.name} ({rec.subscribers:,} subscribers)") 542 | print(f" 📝 {rec.description[:100]}...") 543 | print(f" 🎯 Relevance: {rec.relevance_score:.2f}/10") 544 | print(f" 💡 {rec.rationale}\n") 545 | 546 | print(f"⏱️ Execution time: {results.execution_time:.2f} seconds") 547 | 548 | except Exception as e: 549 | print(f"❌ Error during execution: {e}") 550 | raise 551 | 552 | if __name__ == "__main__": 553 | asyncio.run(main()) 554 | ``` 555 | 556 | ## Search Strategies 557 | 558 | ### 1. Direct Search 559 | - Exact keyword matching 560 | - Query variations (singular/plural) 561 | - Common abbreviations 562 | 563 | ### 2. Semantic Search 564 | - Synonyms and related terms 565 | - Domain-specific terminology 566 | - Conceptual expansions 567 | 568 | ### 3. Category Search 569 | - Broader topic areas 570 | - Academic disciplines 571 | - Industry sectors 572 | 573 | ### 4. Discovery Methods 574 | - Sidebar parsing for related communities 575 | - Wiki page analysis 576 | - Cross-post detection 577 | - Moderator overlap analysis 578 | 579 | ## Quality Metrics 580 | 581 | ### Relevance Scoring 582 | 1. **Description Match** (40%) 583 | - Keyword presence in description 584 | - Semantic similarity to query 585 | 586 | 2. **Activity Level** (30%) 587 | - Posts per day 588 | - Comment engagement 589 | - Active user count 590 | 591 | 3. **Community Size** (20%) 592 | - Subscriber count 593 | - Growth trajectory 594 | 595 | 4. 
**Content Quality** (10%) 596 | - Moderation level 597 | - Rules complexity 598 | - Wiki presence 599 | 600 | ## Error Handling 601 | 602 | ### API Rate Limits 603 | - Implement exponential backoff 604 | - Cache results for 1 hour 605 | - Batch requests where possible 606 | 607 | ### Invalid Subreddits 608 | - Skip private/banned communities 609 | - Handle 404 errors gracefully 610 | - Log failures for debugging 611 | 612 | ### Network Issues 613 | - Retry logic with timeout 614 | - Fallback to cached results 615 | - User notification of degraded service 616 | 617 | ## Performance Targets 618 | 619 | - **Discovery Time**: < 10 seconds for typical query 620 | - **Parallel Workers**: 3-5 concurrent operations 621 | - **Result Count**: 8-15 high-quality recommendations 622 | - **Cache Hit Rate**: > 30% for common topics 623 | 624 | ## Testing Strategy 625 | 626 | ### Unit Tests 627 | - Individual tool functions 628 | - PRAW mock responses 629 | - Agent prompt validation 630 | 631 | ### Integration Tests 632 | - Full workflow execution 633 | - Parallel worker coordination 634 | - Result synthesis accuracy 635 | 636 | ### Example Test Queries 637 | 1. "machine learning ethics" 638 | 2. "sustainable urban farming" 639 | 3. "quantum computing applications" 640 | 4. "remote work productivity" 641 | 5. "climate change solutions" 642 | 643 | ## Future Enhancements 644 | 645 | 1. **Temporal Analysis** 646 | - Trending topic detection 647 | - Historical activity patterns 648 | 649 | 2. **Content Analysis** 650 | - Sentiment analysis of discussions 651 | - Expert identification 652 | 653 | 3. **Network Analysis** 654 | - Community overlap mapping 655 | - Influence flow detection 656 | 657 | 4. **Personalization** 658 | - User preference learning 659 | - Custom ranking weights 660 | 661 | ## Deployment Considerations 662 | 663 | ### Usage Instructions 664 | ```bash 665 | # Method 1: Direct execution (if file is executable) 666 | chmod +x reddit_research_agent.py 667 | ./reddit_research_agent.py 668 | 669 | # Method 2: Using UV run 670 | uv run reddit_research_agent.py 671 | 672 | # No manual dependency installation needed! 673 | # UV automatically handles all dependencies on first run 674 | ``` 675 | 676 | ### Key Benefits of UV Inline Dependencies 677 | - **Zero Setup**: No `pip install` or `uv add` commands needed 678 | - **Self-Contained**: Single file contains code and dependency specifications 679 | - **Reproducible**: Same dependencies installed every time 680 | - **Fast**: UV caches dependencies for quick subsequent runs 681 | - **Version Locked**: Optional `.lock` file ensures exact versions 682 | 683 | ### Production Deployment 684 | - Use environment-specific `.env` files 685 | - Implement logging and monitoring 686 | - Add result caching layer with Redis/Memcached 687 | - Consider rate limit pooling for multiple users 688 | - Lock dependencies with `uv lock --script reddit_research_agent.py` 689 | 690 | ## Success Metrics 691 | 692 | 1. **Coverage**: Discovers 80%+ of relevant subreddits 693 | 2. **Precision**: 90%+ relevance accuracy 694 | 3. **Speed**: < 10 second average execution 695 | 4. 
**Reliability**: 99%+ uptime with graceful degradation ``` -------------------------------------------------------------------------------- /src/server.py: -------------------------------------------------------------------------------- ```python 1 | from fastmcp import FastMCP, Context 2 | from fastmcp.prompts import Message 3 | from fastmcp.server.auth.providers.descope import DescopeProvider 4 | from typing import Optional, Literal, List, Union, Dict, Any, Annotated 5 | import sys 6 | import os 7 | import json 8 | from pathlib import Path 9 | from datetime import datetime 10 | from dotenv import load_dotenv 11 | from starlette.responses import Response, JSONResponse 12 | 13 | # Load environment variables from .env file 14 | load_dotenv() 15 | 16 | # Add parent directory to path for imports 17 | sys.path.insert(0, str(Path(__file__).parent.parent)) 18 | 19 | from src.config import get_reddit_client 20 | from src.tools.search import search_in_subreddit 21 | from src.tools.posts import fetch_subreddit_posts, fetch_multiple_subreddits 22 | from src.tools.comments import fetch_submission_with_comments 23 | from src.tools.discover import discover_subreddits 24 | from src.resources import register_resources 25 | 26 | # Configure Descope authentication 27 | auth = DescopeProvider( 28 | project_id=os.getenv("DESCOPE_PROJECT_ID"), 29 | base_url=os.getenv("SERVER_URL", "http://localhost:8000"), 30 | descope_base_url=os.getenv("DESCOPE_BASE_URL", "https://api.descope.com") 31 | ) 32 | 33 | # Initialize MCP server with authentication 34 | mcp = FastMCP("Reddit MCP", auth=auth, instructions=""" 35 | Reddit MCP Server - Three-Layer Architecture 36 | 37 | 🎯 ALWAYS FOLLOW THIS WORKFLOW: 38 | 1. discover_operations() - See what's available 39 | 2. get_operation_schema() - Understand requirements 40 | 3. execute_operation() - Perform the action 41 | 42 | 📊 RESEARCH BEST PRACTICES: 43 | • Start with discover_subreddits for ANY topic 44 | • Use confidence scores to guide workflow: 45 | - High (>0.7): Direct to specific communities 46 | - Medium (0.4-0.7): Multi-community approach 47 | - Low (<0.4): Refine search terms 48 | • Fetch comments for 10+ posts for thorough analysis 49 | • Always include Reddit URLs when citing content 50 | 51 | ⚡ EFFICIENCY TIPS: 52 | • Use fetch_multiple for 2+ subreddits (70% fewer API calls) 53 | • Single vector search finds semantically related communities 54 | • Batch operations reduce token usage 55 | 56 | Quick Start: Read reddit://server-info for complete documentation. 57 | """) 58 | 59 | # Add public health check endpoint (no auth required) 60 | @mcp.custom_route("/health", methods=["GET"]) 61 | async def health_check(request) -> Response: 62 | """Public health check endpoint - no authentication required. 63 | 64 | Allows clients to verify the server is running before attempting OAuth. 65 | """ 66 | try: 67 | return JSONResponse({ 68 | "status": "ok", 69 | "server": "Reddit MCP", 70 | "version": "1.0.0", 71 | "auth_required": True, 72 | "auth_endpoint": "/.well-known/oauth-authorization-server" 73 | }) 74 | except Exception as e: 75 | print(f"ERROR: Health check failed: {e}", flush=True) 76 | return JSONResponse( 77 | {"status": "error", "message": str(e)}, 78 | status_code=500 79 | ) 80 | 81 | # Add public server info endpoint (no auth required) 82 | @mcp.custom_route("/server-info", methods=["GET"]) 83 | async def server_info(request) -> Response: 84 | """Public server information endpoint - no authentication required. 
85 | 86 | Provides server metadata and capabilities to help clients understand 87 | what authentication and features are available. 88 | """ 89 | try: 90 | print(f"Server info requested from {request.client.host if request.client else 'unknown'}", flush=True) 91 | return JSONResponse({ 92 | "name": "Reddit MCP", 93 | "version": "1.0.0", 94 | "description": "Reddit research and analysis tools with semantic subreddit discovery", 95 | "authentication": { 96 | "required": True, 97 | "type": "oauth2", 98 | "provider": "descope", 99 | "authorization_server": f"{os.getenv('SERVER_URL', 'http://localhost:8000')}/.well-known/oauth-authorization-server" 100 | }, 101 | "capabilities": { 102 | "tools": ["discover_operations", "get_operation_schema", "execute_operation"], 103 | "tools_count": 3, 104 | "supports_resources": True, 105 | "supports_prompts": True, 106 | "reddit_operations": { 107 | "discover_subreddits": "Semantic search for relevant communities", 108 | "search_subreddit": "Search within a specific subreddit", 109 | "fetch_posts": "Get posts from a subreddit", 110 | "fetch_multiple": "Batch fetch from multiple subreddits", 111 | "fetch_comments": "Get complete comment trees" 112 | } 113 | } 114 | }) 115 | except Exception as e: 116 | print(f"ERROR: Server info request failed: {e}", flush=True) 117 | return JSONResponse( 118 | {"status": "error", "message": str(e)}, 119 | status_code=500 120 | ) 121 | 122 | # Initialize Reddit client (will be updated with config when available) 123 | reddit = None 124 | 125 | 126 | def initialize_reddit_client(): 127 | """Initialize Reddit client with environment config.""" 128 | global reddit 129 | reddit = get_reddit_client() 130 | # Register resources with the new client 131 | register_resources(mcp, reddit) 132 | 133 | # Initialize with environment variables initially 134 | try: 135 | initialize_reddit_client() 136 | except Exception as e: 137 | print(f"DEBUG: Reddit init failed: {e}", flush=True) 138 | 139 | 140 | # Three-Layer Architecture Implementation 141 | 142 | @mcp.tool( 143 | description="Discover available Reddit operations and recommended workflows", 144 | annotations={"readOnlyHint": True} 145 | ) 146 | def discover_operations(ctx: Context) -> Dict[str, Any]: 147 | """ 148 | LAYER 1: Discover what operations this MCP server provides. 149 | Start here to understand available capabilities. 
150 | """ 151 | # Phase 1: Accept context but don't use it yet 152 | return { 153 | "operations": { 154 | "discover_subreddits": "Find relevant communities using semantic search", 155 | "search_subreddit": "Search for posts within a specific community", 156 | "fetch_posts": "Get posts from a single subreddit", 157 | "fetch_multiple": "Batch fetch from multiple subreddits (70% more efficient)", 158 | "fetch_comments": "Get complete comment tree for deep analysis" 159 | }, 160 | "recommended_workflows": { 161 | "comprehensive_research": [ 162 | "discover_subreddits → fetch_multiple → fetch_comments", 163 | "Best for: Thorough analysis across communities" 164 | ], 165 | "targeted_search": [ 166 | "discover_subreddits → search_subreddit → fetch_comments", 167 | "Best for: Finding specific content in relevant communities" 168 | ] 169 | }, 170 | "next_step": "Use get_operation_schema() to understand requirements" 171 | } 172 | 173 | 174 | @mcp.tool( 175 | description="Get detailed requirements and parameters for a Reddit operation", 176 | annotations={"readOnlyHint": True} 177 | ) 178 | def get_operation_schema( 179 | operation_id: Annotated[str, "Operation ID from discover_operations"], 180 | include_examples: Annotated[bool, "Include example parameter values"] = True, 181 | ctx: Context = None 182 | ) -> Dict[str, Any]: 183 | """ 184 | LAYER 2: Get parameter requirements for an operation. 185 | Use after discover_operations to understand how to call operations. 186 | """ 187 | # Phase 1: Accept context but don't use it yet 188 | schemas = { 189 | "discover_subreddits": { 190 | "description": "Find communities using semantic vector search", 191 | "parameters": { 192 | "query": { 193 | "type": "string", 194 | "required": True, 195 | "description": "Topic to find communities for", 196 | "validation": "2-100 characters" 197 | }, 198 | "limit": { 199 | "type": "integer", 200 | "required": False, 201 | "default": 10, 202 | "range": [1, 50], 203 | "description": "Number of communities to return" 204 | }, 205 | "include_nsfw": { 206 | "type": "boolean", 207 | "required": False, 208 | "default": False, 209 | "description": "Whether to include NSFW communities" 210 | } 211 | }, 212 | "returns": { 213 | "subreddits": "Array with confidence scores (0-1)", 214 | "quality_indicators": { 215 | "good": "5+ subreddits with confidence > 0.7", 216 | "poor": "All results below 0.5 confidence" 217 | } 218 | }, 219 | "examples": [] if not include_examples else [ 220 | {"query": "machine learning", "limit": 15}, 221 | {"query": "python web development", "limit": 10} 222 | ] 223 | }, 224 | "search_subreddit": { 225 | "description": "Search for posts within a specific subreddit", 226 | "parameters": { 227 | "subreddit_name": { 228 | "type": "string", 229 | "required": True, 230 | "description": "Exact subreddit name (without r/ prefix)", 231 | "tip": "Use exact name from discover_subreddits" 232 | }, 233 | "query": { 234 | "type": "string", 235 | "required": True, 236 | "description": "Search terms" 237 | }, 238 | "sort": { 239 | "type": "enum", 240 | "options": ["relevance", "hot", "top", "new"], 241 | "default": "relevance", 242 | "description": "How to sort results" 243 | }, 244 | "time_filter": { 245 | "type": "enum", 246 | "options": ["all", "year", "month", "week", "day"], 247 | "default": "all", 248 | "description": "Time period for results" 249 | }, 250 | "limit": { 251 | "type": "integer", 252 | "default": 10, 253 | "range": [1, 100], 254 | "description": "Maximum number of results" 255 | } 256 | }, 257 | 
"examples": [] if not include_examples else [ 258 | {"subreddit_name": "MachineLearning", "query": "transformers", "limit": 20}, 259 | {"subreddit_name": "Python", "query": "async", "sort": "top", "time_filter": "month"} 260 | ] 261 | }, 262 | "fetch_posts": { 263 | "description": "Get posts from a single subreddit", 264 | "parameters": { 265 | "subreddit_name": { 266 | "type": "string", 267 | "required": True, 268 | "description": "Exact subreddit name (without r/ prefix)" 269 | }, 270 | "listing_type": { 271 | "type": "enum", 272 | "options": ["hot", "new", "top", "rising"], 273 | "default": "hot", 274 | "description": "Type of posts to fetch" 275 | }, 276 | "time_filter": { 277 | "type": "enum", 278 | "options": ["all", "year", "month", "week", "day"], 279 | "default": None, 280 | "description": "Time period (only for 'top' listing)" 281 | }, 282 | "limit": { 283 | "type": "integer", 284 | "default": 10, 285 | "range": [1, 100], 286 | "description": "Number of posts to fetch" 287 | } 288 | }, 289 | "examples": [] if not include_examples else [ 290 | {"subreddit_name": "technology", "listing_type": "hot", "limit": 15}, 291 | {"subreddit_name": "science", "listing_type": "top", "time_filter": "week", "limit": 20} 292 | ] 293 | }, 294 | "fetch_multiple": { 295 | "description": "Batch fetch from multiple subreddits efficiently", 296 | "parameters": { 297 | "subreddit_names": { 298 | "type": "array[string]", 299 | "required": True, 300 | "max_items": 10, 301 | "description": "List of subreddit names (without r/ prefix)", 302 | "tip": "Use names from discover_subreddits" 303 | }, 304 | "listing_type": { 305 | "type": "enum", 306 | "options": ["hot", "new", "top", "rising"], 307 | "default": "hot", 308 | "description": "Type of posts to fetch" 309 | }, 310 | "time_filter": { 311 | "type": "enum", 312 | "options": ["all", "year", "month", "week", "day"], 313 | "default": None, 314 | "description": "Time period (only for 'top' listing)" 315 | }, 316 | "limit_per_subreddit": { 317 | "type": "integer", 318 | "default": 5, 319 | "range": [1, 25], 320 | "description": "Posts per subreddit" 321 | } 322 | }, 323 | "efficiency": { 324 | "vs_individual": "70% fewer API calls", 325 | "token_usage": "~500-1000 tokens per subreddit" 326 | }, 327 | "examples": [] if not include_examples else [ 328 | {"subreddit_names": ["Python", "django", "flask"], "listing_type": "hot", "limit_per_subreddit": 5}, 329 | {"subreddit_names": ["MachineLearning", "deeplearning"], "listing_type": "top", "time_filter": "week", "limit_per_subreddit": 10} 330 | ] 331 | }, 332 | "fetch_comments": { 333 | "description": "Get complete comment tree for a post", 334 | "parameters": { 335 | "submission_id": { 336 | "type": "string", 337 | "required_one_of": ["submission_id", "url"], 338 | "description": "Reddit post ID (e.g., '1abc234')" 339 | }, 340 | "url": { 341 | "type": "string", 342 | "required_one_of": ["submission_id", "url"], 343 | "description": "Full Reddit URL to the post" 344 | }, 345 | "comment_limit": { 346 | "type": "integer", 347 | "default": 100, 348 | "recommendation": "50-100 for analysis", 349 | "description": "Maximum comments to fetch" 350 | }, 351 | "comment_sort": { 352 | "type": "enum", 353 | "options": ["best", "top", "new"], 354 | "default": "best", 355 | "description": "How to sort comments" 356 | } 357 | }, 358 | "examples": [] if not include_examples else [ 359 | {"submission_id": "1abc234", "comment_limit": 100}, 360 | {"url": "https://reddit.com/r/Python/comments/xyz789/", "comment_limit": 50, 
"comment_sort": "top"} 361 | ] 362 | } 363 | } 364 | 365 | if operation_id not in schemas: 366 | return { 367 | "error": f"Unknown operation: {operation_id}", 368 | "available": list(schemas.keys()), 369 | "hint": "Use discover_operations() first" 370 | } 371 | 372 | return schemas[operation_id] 373 | 374 | 375 | @mcp.tool( 376 | description="Execute a Reddit operation with validated parameters" 377 | ) 378 | async def execute_operation( 379 | operation_id: Annotated[str, "Operation to execute"], 380 | parameters: Annotated[Dict[str, Any], "Parameters matching the schema"], 381 | ctx: Context = None 382 | ) -> Dict[str, Any]: 383 | """ 384 | LAYER 3: Execute a Reddit operation. 385 | Only use after getting schema from get_operation_schema. 386 | """ 387 | # Phase 1: Accept context but don't use it yet 388 | 389 | # Operation mapping 390 | operations = { 391 | "discover_subreddits": discover_subreddits, 392 | "search_subreddit": search_in_subreddit, 393 | "fetch_posts": fetch_subreddit_posts, 394 | "fetch_multiple": fetch_multiple_subreddits, 395 | "fetch_comments": fetch_submission_with_comments 396 | } 397 | 398 | if operation_id not in operations: 399 | return { 400 | "success": False, 401 | "error": f"Unknown operation: {operation_id}", 402 | "available_operations": list(operations.keys()) 403 | } 404 | 405 | try: 406 | # Add reddit client and context to params for operations that need them 407 | if operation_id in ["search_subreddit", "fetch_posts", "fetch_multiple", "fetch_comments"]: 408 | params = {**parameters, "reddit": reddit, "ctx": ctx} 409 | else: 410 | params = {**parameters, "ctx": ctx} 411 | 412 | # Execute operation with await for async operations 413 | if operation_id in ["discover_subreddits", "fetch_multiple", "fetch_comments"]: 414 | result = await operations[operation_id](**params) 415 | else: 416 | result = operations[operation_id](**params) 417 | 418 | return { 419 | "success": True, 420 | "data": result 421 | } 422 | 423 | except Exception as e: 424 | return { 425 | "success": False, 426 | "error": str(e), 427 | "recovery": suggest_recovery(operation_id, e) 428 | } 429 | 430 | 431 | def suggest_recovery(operation_id: str, error: Exception) -> str: 432 | """Helper to suggest recovery actions based on error type.""" 433 | error_str = str(error).lower() 434 | 435 | if "not found" in error_str or "404" in error_str: 436 | return "Verify the subreddit name or use discover_subreddits" 437 | elif "rate" in error_str or "429" in error_str: 438 | return "Rate limited - reduce limit parameter or wait before retrying" 439 | elif "private" in error_str or "403" in error_str: 440 | return "Subreddit is private - try other communities" 441 | elif "invalid" in error_str or "validation" in error_str: 442 | return "Check parameters match schema from get_operation_schema" 443 | else: 444 | return "Check parameters match schema from get_operation_schema" 445 | 446 | 447 | # Research Workflow Prompt Template 448 | RESEARCH_WORKFLOW_PROMPT = """ 449 | You are conducting comprehensive Reddit research based on this request: "{research_request}" 450 | 451 | ## WORKFLOW TO FOLLOW: 452 | 453 | ### PHASE 1: DISCOVERY 454 | 1. First, call discover_operations() to see available operations 455 | 2. Then call get_operation_schema("discover_subreddits") to understand the parameters 456 | 3. Extract the key topic/question from the research request and execute: 457 | execute_operation("discover_subreddits", {{"query": "<topic from request>", "limit": 15}}) 458 | 4. 
Note the confidence scores for each discovered subreddit 459 | 460 | ### PHASE 2: STRATEGY SELECTION 461 | Based on confidence scores from discovery: 462 | - **High confidence (>0.7)**: Focus on top 5-8 most relevant subreddits 463 | - **Medium confidence (0.4-0.7)**: Cast wider net with 10-12 subreddits 464 | - **Low confidence (<0.4)**: Refine search terms and retry discovery 465 | 466 | ### PHASE 3: GATHER POSTS 467 | Use batch operation for efficiency: 468 | execute_operation("fetch_multiple", {{ 469 | "subreddit_names": [<list from discovery>], 470 | "listing_type": "top", 471 | "time_filter": "year", 472 | "limit_per_subreddit": 10 473 | }}) 474 | 475 | ### PHASE 4: DEEP DIVE INTO DISCUSSIONS 476 | For posts with high engagement (10+ comments, 5+ upvotes): 477 | execute_operation("fetch_comments", {{ 478 | "submission_id": "<post_id>", 479 | "comment_limit": 100, 480 | "comment_sort": "best" 481 | }}) 482 | 483 | Target: Analyze 100+ total comments across 10+ subreddits 484 | 485 | ### PHASE 5: SYNTHESIZE FINDINGS 486 | 487 | Create a comprehensive report that directly addresses the research request: 488 | 489 | # Research Report: {research_request} 490 | *Generated: {timestamp}* 491 | 492 | ## Executive Summary 493 | - Direct answer to the research question 494 | - Key findings with confidence levels 495 | - Coverage metrics: X subreddits, Y posts, Z comments analyzed 496 | 497 | ## Communities Analyzed 498 | | Subreddit | Subscribers | Relevance Score | Posts Analyzed | Key Insights | 499 | |-----------|------------|-----------------|----------------|--------------| 500 | | [data] | [count] | [0.0-1.0] | [count] | [summary] | 501 | 502 | ## Key Findings 503 | 504 | ### [Finding that directly addresses the research request] 505 | **Community Consensus**: [Strong/Moderate/Split/Emerging] 506 | 507 | Evidence from Reddit: 508 | - u/[username] in r/[subreddit] stated: "exact quote" [↑450](https://reddit.com/r/subreddit/comments/abc123/) 509 | - Discussion with 200+ comments shows... [link](url) 510 | - Highly awarded post argues... [↑2.3k, Gold×3](url) 511 | 512 | ### [Additional relevant findings...] 
513 | [Continue with 2-4 more key findings that answer different aspects of the research request] 514 | 515 | ## Temporal Trends 516 | - How perspectives have evolved over time 517 | - Recent shifts in community sentiment 518 | - Emerging viewpoints in the last 30 days 519 | 520 | ## Notable Perspectives 521 | - Expert opinions (verified flairs, high karma users 10k+) 522 | - Contrarian views worth considering 523 | - Common misconceptions identified 524 | 525 | ## Data Quality Metrics 526 | - Total subreddits analyzed: [count] 527 | - Total posts reviewed: [count] 528 | - Total comments analyzed: [count] 529 | - Unique contributors: [count] 530 | - Date range: [oldest] to [newest] 531 | - Average post score: [score] 532 | - High-karma contributors (10k+): [count] 533 | 534 | ## Limitations 535 | - Geographic/language bias (primarily English-speaking communities) 536 | - Temporal coverage (data from [date range]) 537 | - Communities not represented in analysis 538 | 539 | --- 540 | *Research methodology: Semantic discovery across 20,000+ indexed subreddits, followed by deep analysis of high-engagement discussions* 541 | 542 | CRITICAL REQUIREMENTS: 543 | - Never fabricate Reddit content - only cite actual posts/comments from the data 544 | - Every claim must link to its Reddit source with a clickable URL 545 | - Include upvote counts and awards for credibility assessment 546 | - Note when content is [deleted] or [removed] 547 | - Track temporal context (when was this posted?) 548 | - Answer the specific research request - don't just summarize content 549 | """ 550 | 551 | 552 | @mcp.prompt( 553 | name="reddit_research", 554 | description="Conduct comprehensive Reddit research on any topic or question", 555 | tags={"research", "analysis", "comprehensive"} 556 | ) 557 | def reddit_research(research_request: str) -> List[Message]: 558 | """ 559 | Guides comprehensive Reddit research based on a natural language request. 560 | 561 | Args: 562 | research_request: Natural language description of what to research 563 | Examples: "How do people feel about remote work?", 564 | "Best practices for Python async programming", 565 | "Community sentiment on electric vehicles" 566 | 567 | Returns: 568 | Structured messages guiding the LLM through the complete research workflow 569 | """ 570 | timestamp = datetime.now().strftime("%Y-%m-%d %H:%M UTC") 571 | 572 | return [ 573 | Message( 574 | role="assistant", 575 | content=RESEARCH_WORKFLOW_PROMPT.format( 576 | research_request=research_request, 577 | timestamp=timestamp 578 | ) 579 | ), 580 | Message( 581 | role="user", 582 | content=f"Please conduct comprehensive Reddit research to answer: {research_request}" 583 | ) 584 | ] 585 | 586 | 587 | def main(): 588 | """Main entry point for the server.""" 589 | print("Reddit MCP Server starting...", flush=True) 590 | 591 | # Try to initialize the Reddit client with available configuration 592 | try: 593 | initialize_reddit_client() 594 | print("Reddit client initialized successfully", flush=True) 595 | except Exception as e: 596 | print(f"WARNING: Failed to initialize Reddit client: {e}", flush=True) 597 | print("Server will run with limited functionality.", flush=True) 598 | print("\nPlease provide Reddit API credentials via:", flush=True) 599 | print(" 1. Environment variables: REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, REDDIT_USER_AGENT", flush=True) 600 | print(" 2. 
Config file: .mcp-config.json", flush=True) 601 | 602 | # Run with stdio transport 603 | mcp.run() 604 | 605 | 606 | if __name__ == "__main__": 607 | main() ```
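
For reference, below is a minimal client-side sketch of the three-layer workflow that `src/server.py` advertises (discover → schema → execute). This is an illustration, not part of the repository: it assumes a FastMCP 2.x `Client`, an illustrative HTTP endpoint URL, and that the Descope OAuth step has already been satisfied out of band; the query values are placeholders.

```python
# Hypothetical client-side walkthrough of the three-layer workflow exposed by server.py.
# Assumes: fastmcp 2.x Client, server reachable at the illustrative URL below,
# and OAuth (Descope) handled out of band. Query values are placeholders.
import asyncio
from fastmcp import Client


async def demo():
    async with Client("http://localhost:8000/mcp") as client:
        # Layer 1: see which operations the server provides
        ops = await client.call_tool("discover_operations", {})

        # Layer 2: learn the parameter schema for one operation
        schema = await client.call_tool(
            "get_operation_schema",
            {"operation_id": "discover_subreddits"},
        )

        # Layer 3: execute with parameters matching that schema
        result = await client.call_tool(
            "execute_operation",
            {
                "operation_id": "discover_subreddits",
                "parameters": {"query": "python web development", "limit": 10},
            },
        )
        print(ops, schema, result)


asyncio.run(demo())
```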