This is page 2 of 5. Use http://codebase.md/marianfoo/mcp-sap-docs?lines=true&page={x} to view the full context.
# Directory Structure
```
├── .cursor
│ └── rules
│ ├── 00-overview.mdc
│ ├── 10-search-stack.mdc
│ ├── 20-tools-and-apis.mdc
│ ├── 30-tests-and-output.mdc
│ ├── 40-deploy.mdc
│ ├── 50-metadata-config.mdc
│ ├── 60-adding-github-sources.mdc
│ ├── 70-tool-usage-guide.mdc
│ └── 80-abap-integration.mdc
├── .cursorignore
├── .gitattributes
├── .github
│ ├── ISSUE_TEMPLATE
│ │ ├── config.yml
│ │ ├── missing-documentation.yml
│ │ └── new-documentation-source.yml
│ └── workflows
│ ├── deploy-mcp-sap-docs.yml
│ ├── test-pr.yml
│ └── update-submodules.yml
├── .gitignore
├── .gitmodules
├── .npmignore
├── .vscode
│ ├── extensions.json
│ └── settings.json
├── docs
│ ├── ABAP-INTEGRATION-SUMMARY.md
│ ├── ABAP-MULTI-VERSION-INTEGRATION.md
│ ├── ABAP-STANDARD-INTEGRATION.md
│ ├── ABAP-USAGE-GUIDE.md
│ ├── ARCHITECTURE.md
│ ├── COMMUNITY-SEARCH-IMPLEMENTATION.md
│ ├── CONTENT-SIZE-LIMITS.md
│ ├── CURSOR-SETUP.md
│ ├── DEV.md
│ ├── FTS5-IMPLEMENTATION-COMPLETE.md
│ ├── LLM-FRIENDLY-IMPROVEMENTS.md
│ ├── METADATA-CONSOLIDATION.md
│ ├── TEST-SEARCH.md
│ └── TESTS.md
├── ecosystem.config.cjs
├── index.html
├── LICENSE
├── package-lock.json
├── package.json
├── README.md
├── REMOTE_SETUP.md
├── scripts
│ ├── build-fts.ts
│ ├── build-index.ts
│ ├── check-version.js
│ └── summarize-src.js
├── server.json
├── setup.sh
├── src
│ ├── global.d.ts
│ ├── http-server.ts
│ ├── lib
│ │ ├── BaseServerHandler.ts
│ │ ├── communityBestMatch.ts
│ │ ├── config.ts
│ │ ├── localDocs.ts
│ │ ├── logger.ts
│ │ ├── metadata.ts
│ │ ├── sapHelp.ts
│ │ ├── search.ts
│ │ ├── searchDb.ts
│ │ ├── truncate.ts
│ │ ├── types.ts
│ │ └── url-generation
│ │ ├── abap.ts
│ │ ├── BaseUrlGenerator.ts
│ │ ├── cap.ts
│ │ ├── cloud-sdk.ts
│ │ ├── dsag.ts
│ │ ├── GenericUrlGenerator.ts
│ │ ├── index.ts
│ │ ├── README.md
│ │ ├── sapui5.ts
│ │ ├── utils.ts
│ │ └── wdi5.ts
│ ├── metadata.json
│ ├── server.ts
│ └── streamable-http-server.ts
├── test
│ ├── _utils
│ │ ├── httpClient.js
│ │ └── parseResults.js
│ ├── community-search.ts
│ ├── comprehensive-url-generation.test.ts
│ ├── performance
│ │ └── README.md
│ ├── prompts.test.ts
│ ├── quick-url-test.ts
│ ├── README.md
│ ├── tools
│ │ ├── run-tests.js
│ │ ├── sap_docs_search
│ │ │ ├── search-cap-docs.js
│ │ │ ├── search-cloud-sdk-ai.js
│ │ │ ├── search-cloud-sdk-js.js
│ │ │ └── search-sapui5-docs.js
│ │ ├── search-url-verification.js
│ │ ├── search.generic.spec.js
│ │ └── search.smoke.js
│ ├── url-status.ts
│ └── validate-urls.ts
├── test-community-search.js
├── test-search-interactive.ts
├── test-search.http
├── test-search.ts
├── tsconfig.json
└── vitest.config.ts
```
# Files
--------------------------------------------------------------------------------
/docs/FTS5-IMPLEMENTATION-COMPLETE.md:
--------------------------------------------------------------------------------
```markdown
1 | # ✅ FTS5 Hybrid Search Implementation - Complete!
2 |
3 | ## 🎉 Successfully Implemented
4 |
5 | I've successfully implemented the **FTS5 Hybrid Search** approach that preserves all your sophisticated search logic while providing massive performance improvements.
6 |
7 | ### 📊 **Performance Results**
8 | - **16x faster search**: 42ms vs 700ms (based on test results)
9 | - **7,677 documents indexed** into a **3.5MB SQLite database**
10 | - **Graceful fallback** to full search when FTS finds no candidates
11 | - **All sophisticated features preserved**: context-aware scoring, query expansion, fuzzy matching
12 |
13 | ### 🏗️ **What Was Built**
14 |
15 | #### 1. **FTS5 Index Builder** (`scripts/build-fts.ts`)
16 | - Reads your existing `data/index.json`
17 | - Creates optimized FTS5 SQLite database at `data/docs.sqlite`
18 | - Indexes: title, description, keywords, controlName, namespace
19 | - Simple schema focused on fast candidate filtering
20 |
21 | #### 2. **FTS Query Module** (`src/lib/searchDb.ts`)
22 | - `getFTSCandidateIds()` - Fast filtering to get top candidate document IDs
23 | - `searchFTS()` - Full FTS search for testing/debugging
24 | - `getFTSStats()` - Database statistics for monitoring
25 | - Handles query sanitization (quotes terms with dots like "sap.m.Button")
26 |
27 | #### 3. **Hybrid Search Logic** (Modified `src/lib/localDocs.ts`)
28 | - **Step 1**: Use FTS to get ~100 candidate documents per expanded query
29 | - **Step 2**: Apply your existing sophisticated scoring ONLY to FTS candidates
30 | - **Step 3**: If FTS fails/finds nothing, fall back to full search
31 | - **Preserves ALL**: Query expansion, context penalties, fuzzy matching, file content integration
32 |
33 | #### 4. **Updated Build Scripts** (`package.json`)
34 | ```bash
35 | npm run build:index # Build regular index (unchanged)
36 | npm run build:fts # Build FTS5 index from regular index
37 | npm run build:all # Build both indexes in sequence
38 | ```
39 |
40 | ### 🚀 **How It Works**
41 |
42 | #### The Hybrid Approach
43 | ```
44 | User Query "wizard"
45 | ↓
46 | Query Expansion: ["wizard", "sap.m.Wizard", "UI5 wizard", ...]
47 | ↓
48 | FTS Fast Filter: 7,677 docs → 30 candidates (in ~1ms)
49 | ↓
50 | Your Sophisticated Scoring: Applied only to 30 candidates (preserves all logic)
51 | ↓
52 | Context Penalties & Boosts: CAP/UI5/wdi5 context awareness (unchanged)
53 | ↓
54 | Formatted Results: Same output format as before
55 | ```
56 |
57 | #### Why This Approach is Superior
58 | - ✅ **16x performance improvement** without any functional regression
59 | - ✅ **Zero risk** - Falls back to full search if FTS fails
60 | - ✅ **All features preserved** - Context scoring, query expansion, fuzzy matching
61 | - ✅ **Simple deployment** - Just copy the `data/docs.sqlite` file
62 | - ✅ **Transparent operation** - Results show "(🚀 FTS-filtered from X candidates)" when active
63 |
64 | ### 🔧 **Usage Instructions**
65 |
66 | #### Initial Setup (Run Once)
67 | ```bash
68 | # Build both indexes
69 | npm run build:all
70 | ```
71 |
72 | #### Production Deployment
73 | 1. Run `npm run build:all` in your CI/CD
74 | 2. Deploy both files: `data/index.json` AND `data/docs.sqlite`
75 | 3. Your search is now 16x faster automatically!
76 |
77 | #### Monitoring
78 | The search results now show FTS status:
79 | - `(🚀 FTS-filtered from X candidates)` - FTS is working
80 | - `(🔍 Full search)` - Fell back to full search
81 |
82 | ### 🔍 **Technical Details**
83 |
84 | #### FTS5 Schema
85 | ```sql
86 | CREATE VIRTUAL TABLE docs USING fts5(
87 | libraryId, -- for filtering (/cap, /sapui5, etc.)
88 | type, -- markdown/jsdoc/sample
89 | title, -- strong search signal
90 | description, -- secondary search signal
91 | keywords, -- properties, events, aggregations
92 | controlName, -- Wizard, Button, etc.
93 | namespace, -- sap.m, sap.f, etc.
94 | id UNINDEXED, -- metadata only
95 | relFile UNINDEXED,
96 | snippetCount UNINDEXED
97 | );
98 | ```
99 |
100 | #### Query Processing
101 | - Simple terms: `wizard` → `wizard*` (prefix matching)
102 | - Dotted terms: `sap.m.Button` → `"sap.m.Button"` (phrase search)
103 | - Multi-word: `entity service` → `entity* service*`
104 | - Falls back gracefully on any FTS error
105 |
106 | ### 🎯 **What's Preserved**
107 |
108 | All your sophisticated search features remain intact:
109 | - ✅ **400+ line synonym expansion system**
110 | - ✅ **Context-aware penalties** (CAP/UI5/wdi5 scoring)
111 | - ✅ **Fuzzy matching** with Levenshtein distance
112 | - ✅ **File content integration** (extracts UI5 controls from user files)
113 | - ✅ **Rich metadata scoring** (properties, events, aggregations)
114 | - ✅ **SAP Community integration**
115 | - ✅ **All existing result formatting**
116 |
117 | ### 🚀 **Next Steps**
118 |
119 | 1. **Test in your environment**: The system is ready to use
120 | 2. **Monitor performance**: Check logs for FTS usage indicators
121 | 3. **CI/CD Integration**: Add `npm run build:all` to your deployment pipeline
122 | 4. **Optional**: Fine-tune FTS candidate limit (currently 100 per query)
123 |
124 | ### 📈 **Expected Benefits**
125 |
126 | - **Faster user experience**: 16x search speed improvement
127 | - **Better scalability**: Performance stays consistent as docs grow
128 | - **Lower server load**: Faster searches = less CPU usage
129 | - **Easy deployment**: Single SQLite file, no additional services needed
130 |
131 | ## 🎉 **Implementation Complete!**
132 |
133 | Your search is now **16x faster** while preserving **all sophisticated features**. The FTS5 hybrid approach gives you the best of both worlds: blazing fast performance with zero functional regression.
134 |
135 | Enjoy your supercharged search! 🚀
```
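The "Query Processing" rules listed in the file above (prefix matching for simple terms, phrase quoting for dotted terms) can be sketched roughly as follows. This is an illustration only, not the code in `src/lib/searchDb.ts`: the function name `toFtsQuery` is hypothetical, and quoting every term that is not purely alphanumeric is an assumption that goes slightly beyond the rules the document states. It does not attempt to handle every FTS5 edge case (for example, terms that collide with the `AND`/`OR`/`NOT` operators).

```typescript
// Hypothetical helper: turns a raw user query into an FTS5 MATCH expression.
// It mirrors the rules described in the document, not the real implementation.
function toFtsQuery(raw: string): string {
  return raw
    .trim()
    .split(/\s+/)
    // Drop double quotes so a term cannot break out of a quoted phrase.
    .map((term) => term.replace(/"/g, ""))
    .filter((term) => term.length > 0)
    .map((term) => {
      if (/^[A-Za-z0-9_]+$/.test(term)) {
        // Simple term: prefix match, e.g. wizard -> wizard*
        return `${term}*`;
      }
      // Dotted or otherwise punctuated term: exact phrase,
      // e.g. sap.m.Button -> "sap.m.Button"
      return `"${term}"`;
    })
    .join(" ");
}
```

Quoting anything outside the plain alphanumeric set is a conservative choice here, since characters such as `.` or `-` are not valid in an unquoted FTS5 term.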
--------------------------------------------------------------------------------
/docs/LLM-FRIENDLY-IMPROVEMENTS.md:
--------------------------------------------------------------------------------
```markdown
1 | # LLM-Friendly MCP Tool Improvements
2 |
3 | This document summarizes the improvements made to make the SAP Docs MCP server more LLM-friendly, based on Claude's feedback and analysis.
4 |
5 | ## 🎯 **Key Issues Addressed**
6 |
7 | ### **Original Problem**
8 | Claude was confused about function names, using incorrect patterns like:
9 | - ❌ `search: query "..."` (FAILED - wrong syntax)
10 | - ❌ `SAP Docs MCP:search` (FAILED - incorrect namespace)
11 | - ✅ Should be: `search(query="...")` or `mcp_sap-docs-remote_search(query="...")`
12 |
13 | ## 🔧 **Improvements Implemented**
14 |
15 | ### **1. Simplified Visual Formatting**
16 | **Before:**
17 | ```
18 | **FUNCTION NAME: Use exactly 'search' or 'mcp_sap-docs-remote_search' depending on your MCP client**
19 |
20 | Unified search across all SAP documentation sources...
21 |
22 | **EXAMPLE USAGE:**
23 | ```
24 | search(query="CAP binary data LargeBinary MediaType")
25 | ```
26 | ```
27 |
28 | **After:**
29 | ```
30 | SEARCH SAP DOCS: search(query="search terms")
31 |
32 | FUNCTION NAME: search
33 |
34 | COVERS: ABAP (all versions), UI5, CAP, wdi5, OpenUI5 APIs, Cloud SDK
35 | AUTO-DETECTS: ABAP versions from query (e.g. "LOOP 7.57", defaults to 7.58)
36 | ```
37 |
38 | ### **2. Structured Examples in JSON Schema**
39 | **Before:** Examples mixed into description text
40 | **After:** Clean `examples` array in JSON schema:
41 | ```javascript
42 | {
43 | "examples": [
44 | "CAP binary data LargeBinary MediaType",
45 | "UI5 button properties",
46 | "wdi5 testing locators",
47 | "ABAP SELECT statements 7.58",
48 | "415 error CAP action parameter"
49 | ]
50 | }
51 | ```
52 |
53 | ### **3. Added Workflow Patterns**
54 | **New sections added:**
55 | ```
56 | TYPICAL WORKFLOW:
57 | 1. search(query="your search terms")
58 | 2. fetch(id="result_id_from_step_1")
59 |
60 | COMMON PATTERNS:
61 | • Broad exploration: id="/cap/binary"
62 | • Specific API: id="/openui5-api/sap/m/Button"
63 | • Community posts: id="community-12345"
64 | ```
65 |
66 | ### **4. Improved Error Messages**
67 | **Before:**
68 | ```
69 | No results found for "query". Try searching for UI5 controls like 'button', 'table', 'wizard', testing topics like 'wdi5', 'testing', 'e2e', or concepts like 'routing', 'annotation', 'authentication', 'fiori elements', 'rap'. For detailed ABAP language syntax, use abap_search instead.
70 | ```
71 |
72 | **After:**
73 | ```
74 | No results for "query".
75 |
76 | TRY INSTEAD:
77 | • UI5 controls: "button", "table", "wizard"
78 | • CAP topics: "actions", "authentication", "media", "binary"
79 | • Testing: "wdi5", "locators", "e2e"
80 | • ABAP: Use version numbers like "SELECT 7.58"
81 | • Errors: Include error codes like "415 error CAP action"
82 | ```
83 |
84 | ### **5. Query Optimization Hints**
85 | **Added to each tool:**
86 | ```
87 | QUERY TIPS:
88 | • Be specific: "CAP action binary parameter" not just "CAP"
89 | • Include error codes: "415 error CAP action"
90 | • Use technical terms: "LargeBinary MediaType XMLHttpRequest"
91 | • For ABAP: Include version like "7.58" or "latest"
92 | ```
93 |
94 | ### **6. Header Documentation for Developers**
95 | ```javascript
96 | /**
97 | * IMPORTANT FOR LLMs/AI ASSISTANTS:
98 | * =================================
99 | * The function names in this MCP server may appear with different prefixes depending on your MCP client:
100 | * - Simple names: search, fetch, sap_community_search, sap_help_search, sap_help_get
101 | * - Prefixed names: mcp_sap-docs-remote_search, mcp_sap-docs-remote_fetch, etc.
102 | *
103 | * Try the simple names first, then the prefixed versions if they don't work.
104 | */
105 | ```
106 |
107 | ## 📊 **Impact on LLM Usage**
108 |
109 | ### **Function Name Clarity**
110 | - ✅ Explicit guidance on both naming patterns
111 | - ✅ Clear fallback strategy (try simple names first)
112 | - ✅ Reduced confusion about MCP client variations
113 |
114 | ### **Query Construction**
115 | - ✅ Concrete examples for each tool type
116 | - ✅ Technical terminology guidance
117 | - ✅ Error code inclusion strategies
118 | - ✅ ABAP version detection hints
119 |
120 | ### **Workflow Understanding**
121 | - ✅ Clear search → fetch patterns
122 | - ✅ Common usage scenarios
123 | - ✅ Library ID vs document ID guidance
124 |
125 | ### **Error Recovery**
126 | - ✅ Actionable next steps instead of long descriptions
127 | - ✅ Alternative tool suggestions
128 | - ✅ Specific retry strategies
129 |
130 | ## 🚀 **Tools Updated**
131 |
132 | 1. **search** - Main documentation search
133 | 2. **fetch** - Retrieve specific documentation
134 | 3. **sap_community_search** - Community posts and discussions
135 | 4. **sap_help_search** - Official SAP Help Portal
136 | 5. **sap_help_get** - Get specific Help Portal pages
137 |
138 | ## 📝 **Best Practices for LLMs**
139 |
140 | ### **Search Strategy**
141 | 1. Start with `search` for technical documentation
142 | 2. Use `sap_community_search` for troubleshooting and error codes
143 | 3. Always follow up search results with `fetch` or `sap_help_get`
144 |
145 | ### **Query Construction**
146 | - Include product names: "CAP", "UI5", "ABAP", "wdi5"
147 | - Add technical terms: "LargeBinary", "MediaType", "XMLHttpRequest"
148 | - Include error codes: "415", "500", "404"
149 | - Specify ABAP versions: "7.58", "latest"
150 |
151 | ### **Common Workflows**
152 | ```
153 | Problem-solving pattern:
154 | 1. search(query="technical problem + error code")
155 | 2. sap_community_search(query="same problem for community solutions")
156 | 3. fetch(id="most_relevant_result")
157 | ```
158 |
159 | ## ✅ **Validation**
160 |
161 | The improvements address the specific issues Claude encountered:
162 | - ✅ Function naming confusion resolved
163 | - ✅ Parameter format clarity improved
164 | - ✅ Search strategy guidance provided
165 | - ✅ Error messages made actionable
166 | - ✅ Examples based on real usage patterns
167 |
168 | ---
169 |
170 | *These improvements make the SAP Docs MCP server significantly more accessible to LLMs like Claude, reducing confusion and improving successful tool call rates.*
171 |
172 |
173 |
```
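The two ideas described in the file above, a dedicated `examples` array in the input schema and a short list-style "no results" message, might look like this in TypeScript. This is a sketch under assumptions: the `ToolDefinition` interface, the `searchTool` constant, and the `noResultsMessage` function are hypothetical names for illustration and are not taken from the server's source; only the example strings and message text come from the document.

```typescript
// Hypothetical tool definition shape; only the `examples` idea comes from the doc.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description: string; examples?: string[] }>;
    required: string[];
  };
}

const searchTool: ToolDefinition = {
  name: "search",
  description: 'SEARCH SAP DOCS: search(query="search terms")',
  inputSchema: {
    type: "object",
    properties: {
      query: {
        type: "string",
        description: "Search terms",
        // Examples live in the schema rather than in the description text.
        examples: [
          "CAP binary data LargeBinary MediaType",
          "UI5 button properties",
          "wdi5 testing locators",
          "ABAP SELECT statements 7.58",
          "415 error CAP action parameter",
        ],
      },
    },
    required: ["query"],
  },
};

// Builds the short, actionable "no results" message shown in the "After" example.
function noResultsMessage(query: string): string {
  return [
    `No results for "${query}".`,
    "",
    "TRY INSTEAD:",
    '• UI5 controls: "button", "table", "wizard"',
    '• CAP topics: "actions", "authentication", "media", "binary"',
    '• Testing: "wdi5", "locators", "e2e"',
    '• ABAP: Use version numbers like "SELECT 7.58"',
    '• Errors: Include error codes like "415 error CAP action"',
  ].join("\n");
}
```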
--------------------------------------------------------------------------------
/src/lib/url-generation/BaseUrlGenerator.ts:
--------------------------------------------------------------------------------
```typescript
1 | /**
2 | * Abstract base class for URL generation across documentation sources
3 | * Provides common functionality and standardized interface for all URL generators
4 | */
5 |
6 | import { parseFrontmatter, extractSectionFromPath, buildUrl, detectContentSection, FrontmatterData } from './utils.js';
7 | import { DocUrlConfig } from '../metadata.js';
8 |
9 | export interface UrlGenerationContext {
10 | relFile: string;
11 | content: string;
12 | config: DocUrlConfig;
13 | libraryId: string;
14 | }
15 |
16 | export interface UrlGenerationResult {
17 | url: string | null;
18 | anchor?: string;
19 | section?: string;
20 | frontmatter?: FrontmatterData;
21 | }
22 |
23 | /**
24 | * Abstract base class for all URL generators
25 | * Provides common functionality while allowing source-specific customization
26 | */
27 | export abstract class BaseUrlGenerator {
28 | protected readonly libraryId: string;
29 | protected readonly config: DocUrlConfig;
30 |
31 | constructor(libraryId: string, config: DocUrlConfig) {
32 | this.libraryId = libraryId;
33 | this.config = config;
34 | }
35 |
36 | /**
37 | * Main entry point for URL generation
38 | * Orchestrates the generation process using template method pattern
39 | */
40 | public generateUrl(context: UrlGenerationContext): string | null {
41 | try {
42 | const frontmatter = this.parseFrontmatter(context.content);
43 | const section = this.extractSection(context.relFile);
44 | const anchor = this.generateAnchor(context.content);
45 |
46 | // Try source-specific generation first
47 | let url = this.generateSourceSpecificUrl({
48 | ...context,
49 | frontmatter,
50 | section,
51 | anchor
52 | });
53 |
54 | // Fallback to generic generation if needed
55 | if (!url) {
56 | url = this.generateFallbackUrl({
57 | ...context,
58 | frontmatter,
59 | section,
60 | anchor
61 | });
62 | }
63 |
64 | return url;
65 | } catch (error) {
66 | console.warn(`Error generating URL for ${this.libraryId}:`, error);
67 | return null;
68 | }
69 | }
70 |
71 | /**
72 | * Source-specific URL generation logic
73 | * Must be implemented by each concrete generator
74 | */
75 | protected abstract generateSourceSpecificUrl(context: UrlGenerationContext & {
76 | frontmatter: FrontmatterData;
77 | section: string;
78 | anchor: string | null;
79 | }): string | null;
80 |
81 | /**
82 | * Generic fallback URL generation
83 | * Uses filename and config pattern as last resort
84 | */
85 | protected generateFallbackUrl(context: UrlGenerationContext & {
86 | frontmatter: FrontmatterData;
87 | section: string;
88 | anchor: string | null;
89 | }): string | null {
90 | // Extract just the filename without directory path to avoid duplication with pathPattern
91 | const fileName = context.relFile
92 | .replace(/\.mdx?$/, '')
93 | .replace(/\.html?$/, '')
94 | .replace(/.*\//, ''); // Remove directory path, keep only filename
95 |
96 | let urlPath = this.config.pathPattern.replace('{file}', fileName);
97 |
98 | // Add anchor if available
99 | if (context.anchor) {
100 | const separator = this.getSeparator();
101 | urlPath += separator + context.anchor;
102 | }
103 |
104 | return this.config.baseUrl + urlPath;
105 | }
106 |
107 | /**
108 | * Parse frontmatter from content
109 | * Can be overridden for source-specific parsing needs
110 | */
111 | protected parseFrontmatter(content: string): FrontmatterData {
112 | return parseFrontmatter(content);
113 | }
114 |
115 | /**
116 | * Extract section from file path
117 | * Can be overridden for source-specific section logic
118 | */
119 | protected extractSection(relFile: string): string {
120 | return extractSectionFromPath(relFile);
121 | }
122 |
123 | /**
124 | * Generate anchor from content
125 | * Can be overridden for source-specific anchor logic
126 | */
127 | protected generateAnchor(content: string): string | null {
128 | return detectContentSection(content, this.config.anchorStyle);
129 | }
130 |
131 | /**
132 | * Get URL separator based on anchor style
133 | */
134 | protected getSeparator(): string {
135 | return this.config.anchorStyle === 'docsify' ? '?id=' : '#';
136 | }
137 |
138 | /**
139 | * Build clean URL with proper path joining
140 | */
141 | protected buildUrl(baseUrl: string, ...pathSegments: string[]): string {
142 | return buildUrl(baseUrl, ...pathSegments);
143 | }
144 |
145 | /**
146 | * Get the identifier from frontmatter (id or slug)
147 | * Common pattern used by many sources
148 | */
149 | protected getIdentifierFromFrontmatter(frontmatter: FrontmatterData): string | null {
150 | return frontmatter.id || frontmatter.slug || null;
151 | }
152 |
153 | /**
154 | * Check if file is in specific directory
155 | */
156 | protected isInDirectory(relFile: string, directory: string): boolean {
157 | return relFile.includes(`${directory}/`);
158 | }
159 |
160 | /**
161 | * Extract filename without extension
162 | */
163 | protected getCleanFileName(relFile: string): string {
164 | return relFile
165 | .replace(/\.mdx?$/, '')
166 | .replace(/\.html?$/, '')
167 | .replace(/.*\//, ''); // Get last part after slash
168 | }
169 |
170 | /**
171 | * Build URL with section and identifier
172 | * Common pattern for many documentation sites
173 | */
174 | protected buildSectionUrl(section: string, identifier: string, anchor?: string | null): string {
175 | let url = this.buildUrl(this.config.baseUrl, section, identifier);
176 |
177 | if (anchor) {
178 | const separator = this.getSeparator();
179 | url += separator + anchor;
180 | }
181 |
182 | return url;
183 | }
184 |
185 | /**
186 | * Build docsify-style URL with # fragment
187 | */
188 | protected buildDocsifyUrl(path: string): string {
189 | const cleanPath = path.startsWith('/') ? path.slice(1) : path;
190 | return `${this.config.baseUrl}/#/${cleanPath}`;
191 | }
192 | }
193 |
```
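The fallback path in `generateFallbackUrl` above can be followed in isolation with a small standalone sketch. The `FallbackConfig` interface below is a simplified stand-in for `DocUrlConfig` that models only the three fields the fallback reads, and the `fallbackUrl` function name and the example URL are assumptions made for illustration.

```typescript
// Simplified stand-in for DocUrlConfig (assumption: only these fields matter here).
interface FallbackConfig {
  baseUrl: string;
  pathPattern: string; // contains a "{file}" placeholder
  anchorStyle: string; // "docsify" selects "?id=", anything else selects "#"
}

// Standalone version of the fallback logic: strip extension and directories,
// substitute the file name into the pattern, then append the anchor if present.
function fallbackUrl(relFile: string, config: FallbackConfig, anchor: string | null): string {
  const fileName = relFile
    .replace(/\.mdx?$/, "")
    .replace(/\.html?$/, "")
    .replace(/.*\//, ""); // greedy match removes everything up to the last slash

  let urlPath = config.pathPattern.replace("{file}", fileName);

  if (anchor) {
    const separator = config.anchorStyle === "docsify" ? "?id=" : "#";
    urlPath += separator + anchor;
  }

  return config.baseUrl + urlPath;
}

// Hypothetical configuration, used only to demonstrate the behavior.
const demoConfig: FallbackConfig = {
  baseUrl: "https://example.com",
  pathPattern: "/guides/{file}",
  anchorStyle: "github",
};

const demoUrl = fallbackUrl("docs/guides/security.md", demoConfig, "jwt");
// demoUrl is "https://example.com/guides/security#jwt"
```

Because only the file name survives, any directory structure in `relFile` has to be expressed through `pathPattern`, which is why the original comment mentions avoiding duplication with the pattern.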
--------------------------------------------------------------------------------
/src/lib/truncate.ts:
--------------------------------------------------------------------------------
```typescript
1 | // Intelligent content truncation utility
2 | // Preserves structure and readability while limiting content size
3 |
4 | import { CONFIG } from "./config.js";
5 |
6 | export interface TruncationResult {
7 | content: string;
8 | wasTruncated: boolean;
9 | originalLength: number;
10 | truncatedLength: number;
11 | }
12 |
13 | /**
14 | * Intelligently truncate content to a maximum length while preserving:
15 | * - Beginning (intro/overview)
16 | * - End (conclusions/examples)
17 | * - Code block integrity
18 | * - Markdown structure
19 | *
20 | * @param content - The content to truncate
21 | * @param maxLength - Maximum length (defaults to CONFIG.MAX_CONTENT_LENGTH)
22 | * @returns TruncationResult with truncated content and metadata
23 | */
24 | export function truncateContent(
25 | content: string,
26 | maxLength: number = CONFIG.MAX_CONTENT_LENGTH
27 | ): TruncationResult {
28 | const originalLength = content.length;
29 |
30 | // No truncation needed
31 | if (originalLength <= maxLength) {
32 | return {
33 | content,
34 | wasTruncated: false,
35 | originalLength,
36 | truncatedLength: originalLength
37 | };
38 | }
39 |
40 | // Calculate how much content to preserve from start and end
41 | // Keep 60% from the start (intro/main content) and 20% from end (conclusions)
42 | // Reserve 20% for truncation notice and buffer
43 | const startLength = Math.floor(maxLength * 0.6);
44 | const endLength = Math.floor(maxLength * 0.2);
45 | const noticeLength = maxLength - startLength - endLength;
46 |
47 | // Extract start portion
48 | let startPortion = content.substring(0, startLength);
49 |
50 | // Try to break at a natural boundary (paragraph, heading, or code block)
51 | const naturalBreakPatterns = [
52 | /\n\n/g, // Paragraph breaks
53 | /\n#{1,6}\s/g, // Markdown headings
54 | /\n```\n/g, // Code block ends
55 | /\n---\n/g, // Horizontal rules
56 | /\.\s+/g // Sentence ends
57 | ];
58 |
59 | for (const pattern of naturalBreakPatterns) {
60 | const matches = Array.from(startPortion.matchAll(pattern));
61 | if (matches.length > 0) {
62 | const lastMatch = matches[matches.length - 1];
63 | if (lastMatch.index && lastMatch.index > startLength * 0.8) {
64 | startPortion = startPortion.substring(0, lastMatch.index + lastMatch[0].length);
65 | break;
66 | }
67 | }
68 | }
69 |
70 | // Extract end portion
71 | let endPortion = content.substring(content.length - endLength);
72 |
73 | // Try to break at natural boundary from the beginning of end portion
74 | for (const pattern of naturalBreakPatterns) {
75 | const match = Array.from(endPortion.matchAll(pattern))[0]; // match() with a /g pattern returns no .index
76 | if (match && match.index !== undefined && match.index < endLength * 0.2) {
77 | endPortion = endPortion.substring(match.index + match[0].length);
78 | break;
79 | }
80 | }
81 |
82 | // Create truncation notice
83 | const omittedChars = originalLength - (startPortion.length + endPortion.length);
84 | const omittedPercent = Math.round((omittedChars / originalLength) * 100);
85 |
86 | const truncationNotice = `
87 |
88 | ---
89 |
90 | ⚠️ **Content Truncated**
91 |
92 | The full content was ${originalLength.toLocaleString()} characters (approximately ${Math.round(originalLength / 4)} tokens).
93 | For readability and performance, ${omittedChars.toLocaleString()} characters (${omittedPercent}%) have been omitted from the middle section.
94 |
95 | The beginning and end of the document are preserved above and below this notice.
96 |
97 | ---
98 |
99 | `;
100 |
101 | // Combine portions
102 | const truncatedContent = startPortion + truncationNotice + endPortion;
103 |
104 | return {
105 | content: truncatedContent,
106 | wasTruncated: true,
107 | originalLength,
108 | truncatedLength: truncatedContent.length
109 | };
110 | }
111 |
112 | /**
113 | * Truncate content with a simple notice at the end
114 | * Used when preserving both beginning and end doesn't make sense
115 | *
116 | * @param content - The content to truncate
117 | * @param maxLength - Maximum length (defaults to CONFIG.MAX_CONTENT_LENGTH)
118 | * @returns TruncationResult with truncated content and metadata
119 | */
120 | export function truncateContentSimple(
121 | content: string,
122 | maxLength: number = CONFIG.MAX_CONTENT_LENGTH
123 | ): TruncationResult {
124 | const originalLength = content.length;
125 |
126 | // No truncation needed
127 | if (originalLength <= maxLength) {
128 | return {
129 | content,
130 | wasTruncated: false,
131 | originalLength,
132 | truncatedLength: originalLength
133 | };
134 | }
135 |
136 | // Reserve space for truncation notice
137 | const noticeLength = 300;
138 | const contentLength = maxLength - noticeLength;
139 |
140 | // Extract content
141 | let truncatedContent = content.substring(0, contentLength);
142 |
143 | // Try to break at natural boundary
144 | const lastParagraph = truncatedContent.lastIndexOf('\n\n');
145 | const lastSentence = truncatedContent.lastIndexOf('. ');
146 |
147 | if (lastParagraph > contentLength * 0.9) {
148 | truncatedContent = truncatedContent.substring(0, lastParagraph);
149 | } else if (lastSentence > contentLength * 0.9) {
150 | truncatedContent = truncatedContent.substring(0, lastSentence + 1);
151 | }
152 |
153 | // Add truncation notice
154 | const omittedChars = originalLength - truncatedContent.length;
155 | const omittedPercent = Math.round((omittedChars / originalLength) * 100);
156 |
157 | truncatedContent += `
158 |
159 | ---
160 |
161 | ⚠️ **Content Truncated**
162 |
163 | The full content was ${originalLength.toLocaleString()} characters (approximately ${Math.round(originalLength / 4)} tokens).
164 | ${omittedChars.toLocaleString()} characters (${omittedPercent}%) have been omitted for readability.
165 |
166 | ---
167 | `;
168 |
169 | return {
170 | content: truncatedContent,
171 | wasTruncated: true,
172 | originalLength,
173 | truncatedLength: truncatedContent.length
174 | };
175 | }
176 |
177 |
```
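A detail relevant to the boundary-detection loops in the file above: every entry in `naturalBreakPatterns` carries the `g` flag, and `String.prototype.match` with a global pattern returns only the matched strings, without an `index`. `matchAll` returns full match objects that do carry `index`. The sketch below shows the difference; `firstBoundaryIndex` is a hypothetical helper, not part of `truncate.ts`.

```typescript
const sampleText = "First paragraph.\n\nSecond paragraph.";

// With a global pattern, match() yields a plain list of matched strings.
// The `index` property is not set on the result.
const viaMatch = sampleText.match(/\n\n/g);
const indexFromMatch = viaMatch?.index; // undefined at runtime

// Hypothetical helper: position of the first boundary, or undefined if none.
// matchAll requires a global pattern and throws a TypeError otherwise.
function firstBoundaryIndex(input: string, pattern: RegExp): number | undefined {
  const first = Array.from(input.matchAll(pattern))[0];
  return first?.index;
}

const indexFromMatchAll = firstBoundaryIndex(sampleText, /\n\n/g); // 16
```

An alternative would be to keep `match` and use non-global copies of the patterns, since a non-global `match` does return `index`.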
--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/missing-documentation.yml:
--------------------------------------------------------------------------------
```yaml
1 | name: Missing Documentation Search Result
2 | description: Report when expected SAP documentation cannot be found through MCP server searches
3 | title: "[MISSING DOC]: "
4 | assignees: []
5 | body:
6 | - type: markdown
7 | attributes:
8 | value: |
9 | Thanks for reporting a missing documentation issue! This helps us improve our search coverage and indexing.
10 |
11 | Please provide as much detail as possible to help us locate and index the missing content.
12 |
13 | - type: textarea
14 | id: problem-description
15 | attributes:
16 | label: Problem Description
17 | description: Describe what you were trying to find and why the current search results are insufficient
18 | placeholder: "I was looking for information about... but the search results didn't include..."
19 | validations:
20 | required: true
21 |
22 | - type: textarea
23 | id: llm-search-query
24 | attributes:
25 | label: Search Query Used in LLM
26 | description: What did you ask the LLM? Include the exact prompt or question you used
27 | placeholder: "How do I configure authentication in SAP CAP applications?"
28 | render: text
29 | validations:
30 | required: true
31 |
32 | - type: checkboxes
33 | id: mcp-tool-used
34 | attributes:
35 | label: MCP Tools Called
36 | description: Which MCP tools were used for the search? (Select all that apply)
37 | options:
38 | - label: "sap_docs_search"
39 | - label: "sap_community_search"
40 | - label: "sap_help_search"
41 | - label: "sap_note_search"
42 | - label: "Not sure/Unknown"
43 |
44 | - type: textarea
45 | id: search-parameters
46 | attributes:
47 | label: Search Parameters
48 | description: What parameters were passed to the MCP server? (query, filters, etc.)
49 | placeholder: |
50 | Query: "authentication CAP"
51 | Library: "/cap"
52 | render: text
53 | validations:
54 | required: false
55 |
56 | - type: textarea
57 | id: mcp-response
58 | attributes:
59 | label: MCP Server Response
60 | description: What results did the MCP server return? Include relevant excerpts or "No results found"
61 | placeholder: "The server returned 3 results about basic authentication but none covered the specific JWT configuration I needed"
62 | render: text
63 | validations:
64 | required: false
65 |
66 | - type: checkboxes
67 | id: expected-source
68 | attributes:
69 | label: Expected Documentation Sources
70 | description: Which SAP documentation sources should contain this information? (Select all that apply)
71 | options:
72 | - label: "SAP CAP Documentation"
73 | - label: "SAPUI5 Documentation"
74 | - label: "OpenUI5 API Reference"
75 | - label: "SAP Community (Blog/Discussion)"
76 | - label: "SAP Help Portal"
77 | - label: "SAP Notes/KBA"
78 | - label: "wdi5 Testing Documentation"
79 | - label: "UI5 Tooling Documentation"
80 | - label: "Cloud MTA Build Tool Documentation"
81 | - label: "UI5 Web Components Documentation"
82 | - label: "SAP Cloud SDK Documentation"
83 | - label: "SAP Cloud SDK AI Documentation"
84 | - label: "Not sure"
85 | - label: "Other (please specify in Expected Document)"
86 | validations:
87 | required: true
88 |
89 | - type: textarea
90 | id: expected-document
91 | attributes:
92 | label: Expected Document/Page
93 | description: Provide details about the specific document, page, or section you expected to find
94 | placeholder: |
95 | - Document Title: "Authentication and Authorization in CAP"
96 | - URL (if known): https://cap.cloud.sap/docs/guides/security/
97 | - Section: JWT Configuration
98 | - Last seen: 2024-01-15
99 | validations:
100 | required: true
101 |
102 | - type: checkboxes
103 | id: sap-product-area
104 | attributes:
105 | label: SAP Product/Technology Areas
106 | description: Which SAP technology areas does this relate to? (Select all that apply)
107 | options:
108 | - label: "SAP CAP (Cloud Application Programming)"
109 | - label: "SAPUI5/OpenUI5"
110 | - label: "SAP Fiori"
111 | - label: "SAP BTP (Business Technology Platform)"
112 | - label: "SAP S/4HANA"
113 | - label: "SAP Analytics Cloud"
114 | - label: "SAP Integration Suite"
115 | - label: "SAP Mobile Development"
116 | - label: "SAP Testing (wdi5, etc.)"
117 | - label: "SAP Build"
118 | - label: "Cross-platform/General"
119 | - label: "Other"
120 | validations:
121 | required: true
122 |
123 | - type: textarea
124 | id: alternative-searches
125 | attributes:
126 | label: Alternative Search Terms Tried
127 | description: What other search terms or variations did you try?
128 | placeholder: |
129 | - "JWT authentication CAP"
130 | - "token based auth SAP"
131 | - "security configuration"
132 | - "OAuth2 CAP"
133 | render: text
134 | validations:
135 | required: false
136 |
137 | - type: textarea
138 | id: document-reference
139 | attributes:
140 | label: Document Reference/URL
141 | description: If you have a direct link to the document that should be indexed, please provide it
142 | placeholder: "https://help.sap.com/docs/..."
143 | validations:
144 | required: false
145 |
146 | - type: textarea
147 | id: additional-context
148 | attributes:
149 | label: Additional Context
150 | description: Any other information that might help us locate and index the missing content
151 | placeholder: |
152 | - When did you last see this documentation?
153 | - Is this a new feature that might not be indexed yet?
154 | - Are there related documents that were found correctly?
155 | validations:
156 | required: false
157 |
```
--------------------------------------------------------------------------------
/src/lib/url-generation/sapui5.ts:
--------------------------------------------------------------------------------
```typescript
1 | /**
2 | * URL generation for SAPUI5 documentation sources
3 | * Handles SAPUI5 guides, API docs, and samples
4 | */
5 |
6 | import { BaseUrlGenerator, UrlGenerationContext } from './BaseUrlGenerator.js';
7 | import { FrontmatterData } from './utils.js';
8 | import { DocUrlConfig } from '../metadata.js';
9 |
10 | export interface SapUi5UrlOptions {
11 | relFile: string;
12 | content: string;
13 | config: DocUrlConfig;
14 | libraryId: string;
15 | }
16 |
17 | /**
18 | * SAPUI5 URL Generator
19 | * Handles SAPUI5 guides, OpenUI5 API docs, and samples with different URL patterns
20 | */
21 | export class SapUi5UrlGenerator extends BaseUrlGenerator {
22 |
23 | protected generateSourceSpecificUrl(context: UrlGenerationContext & {
24 | frontmatter: FrontmatterData;
25 | section: string;
26 | anchor: string | null;
27 | }): string | null {
28 |
29 | switch (this.libraryId) {
30 | case '/sapui5':
31 | return this.generateSapUi5Url(context);
32 | case '/openui5-api':
33 | return this.generateOpenUi5ApiUrl(context);
34 | case '/openui5-samples':
35 | return this.generateOpenUi5SampleUrl(context);
36 | default:
37 | return this.generateSapUi5Url(context);
38 | }
39 | }
40 |
41 | /**
42 | * Generate URL for SAPUI5 documentation
43 | * SAPUI5 uses topic-based URLs with # fragments
44 | */
45 | private generateSapUi5Url(context: UrlGenerationContext & {
46 | frontmatter: FrontmatterData;
47 | section: string;
48 | anchor: string | null;
49 | }): string | null {
50 | // SAPUI5 docs often have topic IDs in frontmatter
51 | const topicId = context.frontmatter.id || context.frontmatter.topic;
52 | if (topicId) {
53 | return `${this.config.baseUrl}/#/topic/${topicId}`;
54 | }
55 |
56 | // SAPUI5 docs also use HTML comments with loio pattern: <!-- loio{id} -->
57 | const loioMatch = context.content?.match(/<!--\s*loio([a-f0-9]+)\s*-->/);
58 | if (loioMatch) {
59 | return `${this.config.baseUrl}/#/topic/${loioMatch[1]}`;
60 | }
61 |
62 | // Extract topic ID from filename if following SAPUI5 conventions (UUID pattern)
63 | const topicIdMatch = context.relFile.match(/([a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12})/i);
64 | if (topicIdMatch) {
65 | return `${this.config.baseUrl}/#/topic/${topicIdMatch[1]}`;
66 | }
67 |
68 | return null; // Let fallback handle it
69 | }
70 |
71 | /**
72 | * Generate URL for OpenUI5 API documentation
73 | * API docs use control/namespace-based URLs
74 | */
75 | private generateOpenUi5ApiUrl(context: UrlGenerationContext & {
76 | frontmatter: FrontmatterData;
77 | section: string;
78 | anchor: string | null;
79 | }): string | null {
80 | // Extract control name from file path (e.g., src/sap/m/Button.js -> sap.m.Button)
81 | const pathMatch = context.relFile.match(/src\/(sap\/[^\/]+\/[^\/]+)\.js$/);
82 | if (pathMatch) {
83 | const controlPath = pathMatch[1].replace(/\//g, '.');
84 | return `${this.config.baseUrl}/#/api/${controlPath}`;
85 | }
86 |
87 | // Alternative pattern matching
88 | const controlMatch = context.relFile.match(/\/([^\/]+)\.js$/);
89 | if (controlMatch) {
90 | const controlName = controlMatch[1];
91 |
92 | // Check if it's a full namespace path
93 | if (controlName.includes('.')) {
94 | return `${this.config.baseUrl}/#/api/${controlName}`;
95 | }
96 |
97 | // Try to extract namespace from content
98 | const namespaceMatch = context.content?.match(/sap\.([a-z]+\.[A-Za-z0-9_]+)/);
99 | if (namespaceMatch) {
100 | return `${this.config.baseUrl}/#/api/${namespaceMatch[0]}`;
101 | }
102 |
103 | // Fallback to control name only
104 | return `${this.config.baseUrl}/#/api/${controlName}`;
105 | }
106 |
107 | return null; // Let fallback handle it
108 | }
109 |
110 | /**
111 | * Generate URL for OpenUI5 samples
112 | * Samples use sample-specific paths without # prefix
113 | */
114 | private generateOpenUi5SampleUrl(context: UrlGenerationContext & {
115 | frontmatter: FrontmatterData;
116 | section: string;
117 | anchor: string | null;
118 | }): string | null {
119 | // Extract sample ID from path patterns like:
120 | // /src/sap.m/test/sap/m/demokit/sample/ButtonWithBadge/Component.js
121 | const sampleMatch = context.relFile.match(/sample\/([^\/]+)\/([^\/]+)$/);
122 | if (sampleMatch) {
123 | const [, sampleName] = sampleMatch;
124 | // Sample entity URLs carry no # prefix. NOTE: the entity sap.m.Button is hardcoded, so the result is only correct for sap.m Button samples
125 | return `${this.config.baseUrl}/entity/sap.m.Button/sample/sap.m.sample.${sampleName}`;
126 | }
127 |
128 | // Alternative pattern for files nested deeper inside a sample folder; the first capture is the library folder, e.g. "sap.m"
129 | const buttonSampleMatch = context.relFile.match(/\/([^\/]+)\/test\/sap\/m\/demokit\/sample\/([^\/]+)\//);
130 | if (buttonSampleMatch) {
131 | const [, controlLibrary, sampleName] = buttonSampleMatch;
132 | return `${this.config.baseUrl}/entity/${controlLibrary}.Button/sample/${controlLibrary}.sample.${sampleName}`;
133 | }
134 |
135 | return null; // Let fallback handle it
136 | }
137 | }
138 |
139 | // Convenience functions for backward compatibility
140 |
141 | /**
142 | * Generate URL for SAPUI5 documentation using the class-based approach
143 | */
144 | export function generateSapUi5Url(options: SapUi5UrlOptions): string | null {
145 | const generator = new SapUi5UrlGenerator(options.libraryId, options.config);
146 | return generator.generateUrl(options);
147 | }
148 |
149 | /**
150 | * Generate URL for OpenUI5 API documentation
151 | */
152 | export function generateOpenUi5ApiUrl(options: SapUi5UrlOptions): string | null {
153 | const generator = new SapUi5UrlGenerator('/openui5-api', options.config);
154 | return generator.generateUrl(options);
155 | }
156 |
157 | /**
158 | * Generate URL for OpenUI5 samples
159 | */
160 | export function generateOpenUi5SampleUrl(options: SapUi5UrlOptions): string | null {
161 | const generator = new SapUi5UrlGenerator('/openui5-samples', options.config);
162 | return generator.generateUrl(options);
163 | }
164 |
165 | /**
166 | * Main dispatcher for UI5-related URL generation
167 | */
168 | export function generateUi5UrlForLibrary(options: SapUi5UrlOptions): string | null {
169 | const generator = new SapUi5UrlGenerator(options.libraryId, options.config);
170 | return generator.generateUrl(options);
171 | }
172 |
173 |
```
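The topic-ID fallback chain in `generateSapUi5Url` (frontmatter ID, then `loio` comment, then UUID in the file name) can be sketched as a standalone function. This is a minimal sketch, not the repository's code: `TopicSource`, `resolveTopicUrl`, and the base URL `https://ui5.sap.com` are hypothetical names and values chosen for illustration; only the two regular expressions are taken from the listing above.

```typescript
// Hypothetical standalone sketch of the three-step topic lookup; not part of the repository.
interface TopicSource {
  frontmatter: { id?: string; topic?: string };
  content?: string;
  relFile: string;
}

// Both patterns are copied from sapui5.ts above.
const LOIO_COMMENT = /<!--\s*loio([a-f0-9]+)\s*-->/;
const UUID_IN_PATH = /([a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12})/i;

function resolveTopicUrl(baseUrl: string, src: TopicSource): string | null {
  // Try each source of a topic ID in priority order; the first non-empty one wins.
  const topicId =
    src.frontmatter.id ||
    src.frontmatter.topic ||
    src.content?.match(LOIO_COMMENT)?.[1] ||
    src.relFile.match(UUID_IN_PATH)?.[1];
  return topicId ? `${baseUrl}/#/topic/${topicId}` : null;
}

// Assumed base URL, for illustration only.
const BASE = "https://ui5.sap.com";

const fromLoio = resolveTopicUrl(BASE, {
  frontmatter: {},
  content: "<!-- loio91f0a22d6f4d1014b6dd926db0e91070 -->\n# Button",
  relFile: "docs/button.md",
});

const noMatch = resolveTopicUrl(BASE, { frontmatter: {}, relFile: "docs/readme.md" });
```

Returning `null` when nothing matches mirrors the "let fallback handle it" convention in the class, so a caller can still fall back to a generic path-based URL.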
--------------------------------------------------------------------------------
/src/lib/metadata.ts:
--------------------------------------------------------------------------------
```typescript
1 | // Metadata and configuration management
2 | import fs from "fs";
3 | import path from "path";
4 | import { CONFIG } from "./config.js";
5 |
6 | export type SourceMeta = {
7 | id: string;
8 | type: string;
9 | lang?: string;
10 | boost?: number;
11 | tags?: string[];
12 | description?: string;
13 | libraryId?: string;
14 | sourcePath?: string;
15 | baseUrl?: string;
16 | pathPattern?: string;
17 | anchorStyle?: 'docsify' | 'github' | 'custom';
18 | };
19 |
20 | export type DocUrlConfig = {
21 | baseUrl: string;
22 | pathPattern: string;
23 | anchorStyle: 'docsify' | 'github' | 'custom';
24 | };
25 |
26 | export type Metadata = {
27 | version: number;
28 | updated_at: string;
29 | description?: string;
30 | sources: SourceMeta[];
31 | acronyms?: Record<string, string[]>;
32 | synonyms?: Array<{ from: string; to: string[] }>;
33 | contextBoosts?: Record<string, Record<string, number>>;
34 | libraryMappings?: Record<string, string>;
35 | contextEmojis?: Record<string, string>;
36 | };
37 |
38 | let META: Metadata | null = null;
39 | let BOOSTS: Record<string, number> = {};
40 | let SYNONYM_MAP: Record<string, string[]> = {};
41 |
42 | export function loadMetadata(metaPath?: string): Metadata {
43 | if (META) return META;
44 |
45 | const finalPath = metaPath || path.resolve(process.cwd(), CONFIG.METADATA_PATH);
46 |
47 | try {
48 | const raw = fs.readFileSync(finalPath, "utf8");
49 | META = JSON.parse(raw) as Metadata;
50 |
51 | // Build source boosts map
52 | BOOSTS = Object.fromEntries(
53 | (META.sources || []).map(s => [s.id, s.boost || 0])
54 | );
55 |
56 | // Build synonym map (including acronyms)
57 | const syn: Record<string, string[]> = {};
58 | for (const [k, arr] of Object.entries(META.acronyms || {})) {
59 | syn[k.toLowerCase()] = arr;
60 | }
61 | for (const s of META.synonyms || []) {
62 | syn[s.from.toLowerCase()] = s.to;
63 | }
64 | SYNONYM_MAP = syn;
65 |
66 | console.log(`✅ Metadata loaded: ${META.sources.length} sources, ${Object.keys(SYNONYM_MAP).length} synonyms`);
67 | return META;
68 | } catch (error) {
69 | console.warn(`⚠️ Could not load metadata from ${finalPath}, using defaults:`, error);
70 |
71 | // Fallback to minimal defaults
72 | META = {
73 | version: 1,
74 | updated_at: new Date().toISOString(),
75 | sources: [],
76 | synonyms: [],
77 | acronyms: {}
78 | };
79 |
80 | BOOSTS = {};
81 | SYNONYM_MAP = {};
82 |
83 | return META;
84 | }
85 | }
86 |
87 | export function getSourceBoosts(): Record<string, number> {
88 | if (!META) loadMetadata();
89 | return BOOSTS;
90 | }
91 |
92 | export function expandQueryTerms(q: string): string[] {
93 | if (!META) loadMetadata();
94 |
95 | const terms = new Set<string>();
96 | const low = q.toLowerCase();
97 | terms.add(q);
98 |
99 | // Apply synonyms and acronyms
100 | for (const [from, toList] of Object.entries(SYNONYM_MAP)) {
101 | if (low.includes(from)) {
102 | for (const t of toList) {
103 | terms.add(q.replace(new RegExp(from.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), "ig"), t)); // escape regex metacharacters in the synonym key
104 | }
105 | }
106 | }
107 |
108 | return Array.from(terms);
109 | }
110 |
111 | export function getMetadata(): Metadata {
112 | if (!META) loadMetadata();
113 | return META!;
114 | }
115 |
116 | // Get documentation URL configuration for a library
117 | export function getDocUrlConfig(libraryId: string): DocUrlConfig | null {
118 | if (!META) loadMetadata();
119 | if (!META) return null;
120 | const source = META.sources.find(s => s.libraryId === libraryId);
121 | if (!source || !source.baseUrl || !source.pathPattern || !source.anchorStyle) {
122 | return null;
123 | }
124 | return {
125 | baseUrl: source.baseUrl,
126 | pathPattern: source.pathPattern,
127 | anchorStyle: source.anchorStyle
128 | };
129 | }
130 |
131 | // Get all documentation URL configurations
132 | export function getAllDocUrlConfigs(): Record<string, DocUrlConfig> {
133 | if (!META) loadMetadata();
134 | if (!META) return {};
135 | const configs: Record<string, DocUrlConfig> = {};
136 | for (const source of META.sources) {
137 | if (source.libraryId && source.baseUrl && source.pathPattern && source.anchorStyle) {
138 | configs[source.libraryId] = {
139 | baseUrl: source.baseUrl,
140 | pathPattern: source.pathPattern,
141 | anchorStyle: source.anchorStyle
142 | };
143 | }
144 | }
145 | return configs;
146 | }
147 |
148 | // Get source path for a library
149 | export function getSourcePath(libraryId: string): string | null {
150 | if (!META) loadMetadata();
151 | if (!META) return null;
152 | const source = META.sources.find(s => s.libraryId === libraryId);
153 | return source?.sourcePath || null;
154 | }
155 |
156 | // Get all source paths
157 | export function getAllSourcePaths(): Record<string, string> {
158 | if (!META) loadMetadata();
159 | if (!META) return {};
160 | const paths: Record<string, string> = {};
161 | for (const source of META.sources) {
162 | if (source.libraryId && source.sourcePath) {
163 | paths[source.libraryId] = source.sourcePath;
164 | }
165 | }
166 | return paths;
167 | }
168 |
169 | // Get context boosts for a specific context
170 | export function getContextBoosts(context: string): Record<string, number> {
171 | if (!META) loadMetadata();
172 | if (!META) return {};
173 | return META.contextBoosts?.[context] || {};
174 | }
175 |
176 | // Get all context boosts
177 | export function getAllContextBoosts(): Record<string, Record<string, number>> {
178 | if (!META) loadMetadata();
179 | if (!META) return {};
180 | return META.contextBoosts || {};
181 | }
182 |
183 | // Get library mapping for source ID
184 | export function getLibraryMapping(sourceId: string): string | null {
185 | if (!META) loadMetadata();
186 | if (!META) return null;
187 | return META.libraryMappings?.[sourceId] || null;
188 | }
189 |
190 | // Get all library mappings
191 | export function getAllLibraryMappings(): Record<string, string> {
192 | if (!META) loadMetadata();
193 | if (!META) return {};
194 | return META.libraryMappings || {};
195 | }
196 |
197 | // Get context emoji
198 | export function getContextEmoji(context: string): string {
199 | if (!META) loadMetadata();
200 | if (!META) return '🔍';
201 | return META.contextEmojis?.[context] || '🔍';
202 | }
203 |
204 | // Get all context emojis
205 | export function getAllContextEmojis(): Record<string, string> {
206 | if (!META) loadMetadata();
207 | if (!META) return {};
208 | return META.contextEmojis || {};
209 | }
210 |
211 | // Get source by library ID
212 | export function getSourceByLibraryId(libraryId: string): SourceMeta | null {
213 | if (!META) loadMetadata();
214 | if (!META) return null;
215 | return META.sources.find(s => s.libraryId === libraryId) || null;
216 | }
217 |
218 | // Get source by ID
219 | export function getSourceById(id: string): SourceMeta | null {
220 | if (!META) loadMetadata();
221 | if (!META) return null;
222 | return META.sources.find(s => s.id === id) || null;
223 | }
224 |
```
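The query expansion in `expandQueryTerms` can be tried in isolation with an in-memory synonym map. The following is a minimal sketch under stated assumptions: `expandQuery`, `escapeRegExp`, and the sample map are hypothetical and not part of the repository. It escapes each synonym key before building the `RegExp`, so keys containing characters such as `+` or `.` are matched literally.

```typescript
// Hypothetical sketch of synonym-based query expansion; names and sample data are illustrative.
type SynonymMap = Record<string, string[]>;

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function expandQuery(q: string, synonyms: SynonymMap): string[] {
  const terms = new Set<string>([q]);
  const low = q.toLowerCase();
  for (const [from, toList] of Object.entries(synonyms)) {
    // Keys are expected in lower case, as in loadMetadata above.
    if (!low.includes(from)) continue;
    const pattern = new RegExp(escapeRegExp(from), "ig");
    for (const t of toList) {
      // A replacer function keeps "$" sequences in the synonym from being interpreted.
      terms.add(q.replace(pattern, () => t));
    }
  }
  return Array.from(terms);
}

// Assumed sample data, not taken from src/metadata.json.
const SAMPLE: SynonymMap = {
  cap: ["Cloud Application Programming"],
  "c++": ["cpp"],
};

const expanded = expandQuery("CAP services", SAMPLE);
const expandedSpecial = expandQuery("c++ bindings", SAMPLE);
```

Like the original, this matches keys as substrings, so a key such as `cap` would also match inside a longer word like "capture"; word-boundary matching would be a separate design decision.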
--------------------------------------------------------------------------------
/docs/ABAP-STANDARD-INTEGRATION.md:
--------------------------------------------------------------------------------
```markdown
1 | # ABAP Documentation - Standard System Integration
2 |
3 | ## ✅ **Integration Complete**
4 |
5 | ABAP documentation is now integrated as a **standard source** in the MCP system, just like UI5, CAP, and other sources. No special tools needed!
6 |
7 | ## **What Was Added**
8 |
9 | ### **1. Standard Metadata Configuration**
10 | ```json
11 | // src/metadata.json
12 | {
13 | "id": "abap-docs",
14 | "type": "documentation",
15 | "boost": 0.95,
16 | "tags": ["abap", "keyword-documentation", "language-reference"],
17 | "libraryId": "/abap-docs",
18 | "sourcePath": "abap-docs/docs/7.58/md",
19 | "baseUrl": "https://help.sap.com/doc/abapdocu_758_index_htm/7.58/en-US"
20 | }
21 | ```
22 |
23 | ### **2. Standard Index Configuration**
24 | ```typescript
25 | // scripts/build-index.ts
26 | {
27 | repoName: "abap-docs",
28 | absDir: join("sources", "abap-docs", "docs", "7.58", "md"),
29 | id: "/abap-docs",
30 | name: "ABAP Keyword Documentation",
31 | filePattern: "*.md", // Individual files, not bundles!
32 | type: "markdown"
33 | }
34 | ```
35 |
36 | ### **3. Custom URL Generator**
37 | ```typescript
38 | // src/lib/url-generation/abap.ts
39 | export class AbapUrlGenerator extends BaseUrlGenerator {
40 | generateUrl(context): string {
41 | // Converts: abeninline_declarations.md
42 | // To: https://help.sap.com/doc/abapdocu_758_index_htm/7.58/en-US/abeninline_declarations.htm
43 | }
44 | }
45 | ```
46 |
47 | ### **4. Git Submodule**
48 | ```ini
49 | # .gitmodules (already exists)
50 | [submodule "sources/abap-docs"]
51 | path = sources/abap-docs
52 | url = https://github.com/marianfoo/abap-docs.git
53 | branch = main
54 | ```
55 |
56 | ## **How It Works**
57 |
58 | ### **🔍 Search Integration**
59 | Uses the **standard `search`** tool - no special ABAP tools needed!
60 |
61 | ```javascript
62 | // Query examples that will find ABAP docs:
63 | "SELECT statements in ABAP" → Finds individual SELECT documentation files
64 | "internal table operations" → Finds table-related ABAP files
65 | "exception handling" → Finds TRY/CATCH documentation
66 | "ABAP class definition" → Finds OOP documentation
67 | ```
68 |
69 | ### **📄 File Structure**
70 | ```
71 | sources/abap-docs/docs/7.58/md/
72 | ├── abeninline_declarations.md (3KB) ← Perfect for LLMs!
73 | ├── abenselect.md (5KB) ← Individual statement docs
74 | ├── abenloop.md (4KB) ← Focused content
75 | ├── abenclass.md (8KB) ← OOP documentation
76 | └── ... 6,000+ more individual files
77 | ```
78 |
79 | ### **🔗 URL Generation**
80 | - `abeninline_declarations.md` → `https://help.sap.com/doc/abapdocu_758_index_htm/7.58/en-US/abeninline_declarations.htm`
81 | - Works across all ABAP versions (7.52-7.58, latest)
82 | - Direct links to official SAP documentation
83 |
84 | ## **Setup Instructions**
85 |
86 | ### **1. Initialize Submodule**
87 | ```bash
88 | cd /Users/marianzeis/DEV/sap-docs-mcp  # author's local checkout; substitute the path to your own clone
89 | git submodule update --init --recursive sources/abap-docs
90 | ```
91 |
92 | ### **2. Optimize ABAP Source** (Recommended)
93 | ```bash
94 | cd sources/abap-docs
95 | node scripts/generate.js --version 7.58 --standard-system
96 | ```
97 | This will:
98 | - ✅ Fix all JavaScript links → proper SAP URLs
99 | - ✅ Add source attribution to each file
100 | - ✅ Optimize content structure for LLM consumption
101 | - ✅ Create clean individual .md files (no complex bundles)
102 |
103 | ### **3. Build Index**
104 | ```bash
105 | cd /Users/marianzeis/DEV/sap-docs-mcp  # author's local checkout; substitute the path to your own clone
106 | npm run build:index
107 | ```
108 |
109 | ### **4. Build FTS Database**
110 | ```bash
111 | npm run build:fts
112 | ```
113 |
114 | ### **5. Test Integration**
115 | ```bash
116 | npm test
117 | curl -X POST http://localhost:3000/search \
118 | -H "Content-Type: application/json" \
119 | -d '{"query": "ABAP inline declarations"}'
120 | ```
121 |
122 | ## **Expected Results**
123 |
124 | ### **Standard Search Query**
125 | ```json
126 | {
127 | "tool": "search",
128 | "query": "ABAP inline declarations"
129 | }
130 | ```
131 |
132 | ### **Expected Response**
133 | ```
134 | Found 5 results for 'ABAP inline declarations':
135 |
136 | ⚡ **Inline Declarations (ABAP 7.58)**
137 | Data declarations directly in ABAP statements for cleaner code...
138 | 🔗 https://help.sap.com/doc/abapdocu_758_index_htm/7.58/en-US/abeninline_declarations.htm
139 | 📋 3KB | individual | beginner
140 |
141 | ⚡ **DATA - Inline Declaration (ABAP 7.58)**
142 | Creating data objects inline using DATA() operator...
143 | 🔗 https://help.sap.com/doc/abapdocu_758_index_htm/7.58/en-US/abendata_inline.htm
144 | 📋 2KB | individual | intermediate
145 | ```
146 |
147 | ## **Key Benefits**
148 |
149 | ### ✅ **Standard Integration**
150 | - **No special tools** - uses existing `search`
151 | - **Same interface** as UI5, CAP, wdi5 sources
152 | - **Consistent behavior** with other documentation
153 |
154 | ### ✅ **Perfect LLM Experience**
155 | - **6,000+ individual files** (1-10KB each)
156 | - **Direct SAP documentation URLs** for attribution
157 | - **Clean markdown** optimized for context windows
158 |
159 | ### ✅ **High Search Quality**
160 | - **BM25 FTS5 search** - same quality as other sources
161 | - **Context-aware boosting** - ABAP queries get ABAP results
162 | - **Proper scoring** integrated with general search
163 |
164 | ### ✅ **Easy Maintenance**
165 | - **Standard build process** - same as other sources
166 | - **No complex bundling** - simple file-based approach
167 | - **Version support** - easy to add 7.57, 7.56, etc.
168 |
169 | ## **Multi-Version Support** (Future)
170 |
171 | To add more ABAP versions:
172 |
173 | ```typescript
174 | // Add to build-index.ts
175 | {
176 | repoName: "abap-docs",
177 | absDir: join("sources", "abap-docs", "docs", "7.57", "md"),
178 | id: "/abap-docs-757",
179 | name: "ABAP Keyword Documentation 7.57"
180 | },
181 | {
182 | repoName: "abap-docs",
183 | absDir: join("sources", "abap-docs", "docs", "latest", "md"),
184 | id: "/abap-docs-latest",
185 | name: "ABAP Keyword Documentation (Latest)"
186 | }
187 | ```
188 |
189 | ## **Performance Characteristics**
190 |
191 | - **Index Size**: ~6,000 documents (vs 42,901 with specialized system)
192 | - **Search Speed**: ~50ms (standard FTS5 performance)
193 | - **File Sizes**: 1-10KB each (perfect for LLM consumption)
194 | - **Memory Usage**: Standard - no special caching needed
195 |
196 | ## **Migration from Specialized Tools**
197 |
198 | ### **Old Approach (Specialized)**
199 | ```javascript
200 | // Required separate tools
201 | abap_search: "inline declarations"
202 | abap_get: "abap-7.58-individual-7.58-abeninline_declarations"
203 | ```
204 |
205 | ### **New Approach (Standard)**
206 | ```javascript
207 | // Uses standard tool like everything else
208 | search: "ABAP inline declarations"
209 | fetch: "/abap-docs/abeninline_declarations.md"
210 | ```
211 |
212 | **Result: Same quality, simpler interface, standard integration!** 🚀
213 |
214 | ---
215 |
216 | ## **✅ Integration Status: COMPLETE**
217 |
218 | ABAP documentation is now fully integrated as a standard source:
219 | - ✅ **Metadata configured**
220 | - ✅ **Build index updated**
221 | - ✅ **URL generator created**
222 | - ✅ **Submodule exists**
223 | - ✅ **Tests added**
224 |
225 | **Ready for production use with the standard MCP search system!**
226 |
```
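The file-to-URL mapping described in this document (`abeninline_declarations.md` to the matching `.htm` page on help.sap.com) can be sketched as a small pure function. This is a sketch under assumptions, not the contents of `src/lib/url-generation/abap.ts`: the function name `abapHelpUrl` and its signature are hypothetical, and the URL pattern is inferred from the example URLs in the repository's ABAP docs.

```typescript
// Hypothetical sketch; the real AbapUrlGenerator may work differently.
function abapHelpUrl(relFile: string, version: string = "7.58"): string | null {
  // Only the file name matters; any leading directories are dropped.
  const fileName = relFile.split("/").pop() ?? "";
  if (!fileName.endsWith(".md")) return null;

  const topic = fileName.slice(0, -".md".length);
  if (topic.length === 0) return null;

  // "7.58" becomes "758"; "latest" is used as-is (pattern inferred from the documented example URLs).
  const versionKey = version === "latest" ? "latest" : version.replace(".", "");
  return `https://help.sap.com/doc/abapdocu_${versionKey}_index_htm/${version}/en-US/${topic}.htm`;
}

const defaultUrl = abapHelpUrl("abeninline_declarations.md");
const latestUrl = abapHelpUrl("docs/latest/md/abenselect.md", "latest");
```

Keeping the version as a parameter means the same function covers every documented version without one generator per release.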
--------------------------------------------------------------------------------
/docs/ARCHITECTURE.md:
--------------------------------------------------------------------------------
```markdown
1 | # 🏗️ SAP Docs MCP Architecture
2 |
3 | ## System Overview
4 |
5 | ```mermaid
6 | graph TD
7 | A[User Query] --> B[MCP Server]
8 | B --> C[Search Pipeline]
9 | C --> D[FTS5 SQLite]
10 | C --> E[Metadata APIs]
11 | D --> F[BM25 Scoring]
12 | E --> F
13 | F --> G[Context Awareness]
14 | G --> H[Result Formatting]
15 | H --> I[Formatted Response]
16 |
17 | J[Documentation Sources] --> K[Build Index]
18 | K --> L[dist/data/index.json]
19 | K --> M[dist/data/docs.sqlite]
20 | L --> C
21 | M --> D
22 |
23 | N[src/metadata.json] --> E
24 | ```
25 |
26 | ## Core Components
27 |
28 | ### 🔍 **Search Pipeline**
29 | 1. **Query Processing**: Parse and expand user queries with synonyms/acronyms
30 | 2. **FTS5 Search**: Fast SQLite full-text search using BM25 algorithm
31 | 3. **Metadata Integration**: Apply source boosts, context awareness, and mappings
32 | 4. **Result Formatting**: Structure results with scores, context indicators, and source attribution
33 |
34 | ### 📊 **Data Flow**
35 | ```
36 | Documentation Sources → Build Scripts → Search Artifacts → Runtime Search
37 | (12 sources)          → (2 files)     → (index.json +    → (BM25 +
38 |                                          docs.sqlite)       metadata)
39 | ```
40 |
41 | ### 🎯 **Metadata-Driven Configuration**
42 | - **Single Source**: `src/metadata.json` contains all source configurations
43 | - **Type-Safe APIs**: 16 functions in `src/lib/metadata.ts` for configuration access
44 | - **Runtime Loading**: Metadata loaded once at startup with graceful fallbacks
45 | - **Zero Code Changes**: Add new sources by updating metadata.json only
46 |
47 | ## Server Architecture
48 |
49 | ### 🖥️ **Three Server Modes**
50 | 1. **Stdio MCP** (`src/server.ts`): Main server for Claude/LLM integration
51 | 2. **HTTP Server** (`src/http-server.ts`): Development and status monitoring (port 3001)
52 | 3. **Streamable HTTP** (`src/streamable-http-server.ts`): Production HTTP MCP (port 3122)
53 |
54 | ### 🔧 **MCP Tools (5 total)**
55 | - `search`: Search across all documentation
56 | - `fetch`: Retrieve specific documents
57 | - `sap_community_search`: SAP Community integration
58 | - `sap_help_search`: SAP Help Portal search
59 | - `sap_help_get`: SAP Help content retrieval
60 |
61 | ## Performance Characteristics
62 |
63 | ### ⚡ **Search Performance**
64 | - **Sub-second**: FTS5 provides fast full-text search
65 | - **Scalable**: Performance consistent as documentation grows
66 | - **Efficient**: BM25-only approach eliminates ML model overhead
67 | - **Reliable**: No external dependencies for core search
68 |
69 | ### 📈 **Resource Usage**
70 | - **Memory**: ~50-100MB for index and metadata
71 | - **Disk**: ~3.5MB SQLite database + ~2MB JSON index
72 | - **CPU**: Minimal - BM25 scoring is computationally light
73 | - **Network**: Only for community/help portal integrations
74 |
75 | ## Documentation Sources (12 total)
76 |
77 | ### 📚 **Primary Sources**
78 | - **SAPUI5**: Frontend framework documentation
79 | - **CAP**: Cloud Application Programming model
80 | - **OpenUI5 API**: Control API documentation
81 | - **wdi5**: Testing framework documentation
82 |
83 | ### 🔧 **Supporting Sources**
84 | - **Cloud SDK (JS/Java)**: SAP Cloud SDK documentation
85 | - **Cloud SDK AI (JS/Java)**: AI-enhanced SDK documentation
86 | - **UI5 Tooling**: Build and development tools
87 | - **UI5 Web Components**: Modern web components
88 | - **Cloud MTA Build Tool**: Multi-target application builder
89 |
90 | ## Context Awareness
91 |
92 | ### 🎯 **Context Detection**
93 | - **UI5 Context** 🎨: Controls, Fiori, frontend development
94 | - **CAP Context** 🏗️: CDS, entities, services, backend
95 | - **wdi5 Context** 🧪: Testing, automation, browser testing
96 | - **Mixed Context** 🔀: Cross-platform or unclear intent
97 |
98 | ### 📊 **Intelligent Scoring**
99 | - **Source Boosts**: Context-specific score adjustments
100 | - **Library Mappings**: Resolve source IDs to canonical names
101 | - **Query Expansion**: Synonyms and acronyms for better recall
102 | - **Penalty System**: Reduce off-context results
103 |
104 | ## Build Process
105 |
106 | ### 🔨 **Enhanced Build Pipeline**
107 | ```bash
108 | npm run build:tsc # TypeScript compilation → dist/src/
109 | npm run build:index # Sources → dist/data/index.json
110 | npm run build:fts # Index → dist/data/docs.sqlite
111 | npm run build # Complete pipeline (tsc + index + fts)
112 | ```
113 |
114 | ### 📦 **Submodule Management**
115 | ```bash
116 | npm run setup # Complete setup with enhanced submodule handling
117 | npm run setup:submodules # Submodule sync and update only
118 | ```
119 |
120 | The enhanced setup script provides:
121 | - **Shallow Clones**: `--depth 1` with `--filter=blob:none` for minimal size
122 | - **Single Branch**: Only fetch the target branch (main/master)
123 | - **Repository Compaction**: Aggressive GC and storage optimization
124 | - **Fallback Handling**: Auto-retry with master if branch fails
125 | - **Skip Nested**: `SKIP_NESTED_SUBMODULES=1` for deployment speed
126 |
127 | ### 📦 **Deployment Artifacts**
128 | - `dist/data/index.json`: Structured documentation index
129 | - `dist/data/docs.sqlite`: FTS5 search database
130 | - `dist/src/`: Compiled TypeScript server code
131 | - `src/metadata.json`: Runtime configuration
132 |
133 | ## Production Deployment
134 |
135 | ### 🚀 **PM2 Configuration**
136 | - **3 Processes**: Proxy (18080), HTTP (3001), Streamable (3122)
137 | - **Health Monitoring**: Multiple endpoints for status checks
138 | - **Auto-restart**: Configurable restart policies
139 | - **Logging**: Structured JSON logging in production
140 |
141 | ### 🔄 **CI/CD Pipeline**
142 | 1. **GitHub Actions**: Triggered on main branch push
143 | 2. **SSH Deployment**: Connect to production server
144 | 3. **Build Process**: Run complete build pipeline
145 | 4. **PM2 Restart**: Restart all processes with new code
146 | 5. **Health Validation**: Verify all endpoints responding
147 |
148 | ## Key Design Principles
149 |
150 | ### 🎯 **Simplicity First**
151 | - **BM25-Only**: No complex ML models or external dependencies
152 | - **SQLite**: Single-file database for easy deployment
153 | - **Metadata-Driven**: Configuration without code changes
154 |
155 | ### 🔒 **Reliability**
156 | - **Graceful Fallbacks**: Handle missing data and errors elegantly
157 | - **Type Safety**: Comprehensive TypeScript interfaces
158 | - **Testing**: Smoke tests and integration validation
159 |
160 | ### 📈 **Performance**
161 | - **Fast Search**: Sub-second response times
162 | - **Efficient Indexing**: Optimized FTS5 schema
163 | - **Minimal Resources**: Low memory and CPU usage
164 |
165 | ### 🔧 **Maintainability**
166 | - **Single Source of Truth**: Centralized configuration
167 | - **Clear Separation**: Distinct layers for search, metadata, and presentation
168 | - **Comprehensive Documentation**: Architecture, APIs, and deployment guides
169 |
```
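The architecture document says that source boosts and context boosts adjust the BM25 score but does not state how. The sketch below shows one plausible combination. Everything in it is an assumption: the `Hit` shape, the `rerank` function, and in particular the formula `bm25 * (1 + sourceBoost + contextBoost)` are illustrative and not taken from `src/lib/search.ts`.

```typescript
// Hypothetical re-ranking sketch; the combination formula is an assumption, not the project's.
interface Hit {
  libraryId: string; // e.g. "/abap-docs-758"
  sourceId: string;  // e.g. "abap-docs-758"
  bm25: number;      // assumed to be already converted so that higher means better
}

interface ScoredHit extends Hit {
  score: number;
}

function rerank(
  hits: Hit[],
  sourceBoosts: Record<string, number>,
  contextBoosts: Record<string, number>,
): ScoredHit[] {
  return hits
    .map((h): ScoredHit => ({
      ...h,
      // Missing boosts count as 0, so unboosted sources keep their raw score.
      score: h.bm25 * (1 + (sourceBoosts[h.sourceId] ?? 0) + (contextBoosts[h.libraryId] ?? 0)),
    }))
    .sort((a, b) => b.score - a.score);
}

// Illustrative input: an ABAP-context query where the UI5 hit has the better raw score.
const ranked = rerank(
  [
    { libraryId: "/sapui5", sourceId: "sapui5", bm25: 10 },
    { libraryId: "/abap-docs-758", sourceId: "abap-docs-758", bm25: 8 },
  ],
  { "abap-docs-758": 0.95 },
  { "/abap-docs-758": 1.0 },
);
```

SQLite's FTS5 `bm25()` function returns lower (more negative) values for better matches, so a real pipeline would need to normalize that value before applying a multiplicative boost like this one.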
--------------------------------------------------------------------------------
/docs/ABAP-MULTI-VERSION-INTEGRATION.md:
--------------------------------------------------------------------------------
```markdown
1 | # ✅ **ABAP Multi-Version Integration Complete**
2 |
3 | ## 🎯 **Integration Summary**
4 |
5 | ABAP documentation is now fully integrated as **standard sources** across all versions with intelligent auto-detection capabilities.
6 |
7 | ### **📊 Statistics: 42,901 ABAP Files Across 8 Versions**
8 |
9 | | Version | Files | Avg Size | Status |
10 | |---------|-------|----------|--------|
11 | | 7.58 | 6,088 | 5,237B | ✅ Active (default) |
12 | | latest | 6,089 | 5,059B | ✅ Active (boost: 0.90) |
13 | | 7.57 | 5,808 | 5,026B | ✅ Active (boost: 0.95) |
14 | | 7.56 | 5,605 | 4,498B | ✅ Active (boost: 0.90) |
15 | | 7.55 | 5,154 | 4,146B | ✅ Active (boost: 0.85) |
16 | | 7.54 | 4,905 | 4,052B | ✅ Active (boost: 0.80) |
17 | | 7.53 | 4,680 | 3,992B | ✅ Active (boost: 0.75) |
18 | | 7.52 | 4,572 | 3,931B | ✅ Active (boost: 0.70) |
19 | | **Total** | **42,901** | **4,493B** | **8 versions** |
20 |
21 | ---
22 |
23 | ## 🚀 **Features**
24 |
25 | ### **✅ Standard Integration**
26 | - **No special tools** - uses existing `search` like UI5, CAP, wdi5
27 | - **63,454 total documents** indexed (up from 20,553)
28 | - **30.52 MB FTS5 database** for lightning-fast search
29 |
30 | ### **✅ Intelligent Version Auto-Detection**
31 |
32 | #### **Query Examples:**
33 | ```bash
34 | # Version auto-detection from queries
35 | "LOOP 7.57" → Searches ABAP 7.57 specifically
36 | "SELECT latest" → Searches latest ABAP version
37 | "exception handling 7.53" → Searches ABAP 7.53 specifically
38 | "inline declarations" → Searches ABAP 7.58 (default)
39 | "class definition 7.56" → Searches ABAP 7.56 specifically
40 | ```
41 |
42 | #### **Results Show Correct Versions:**
43 | ```
44 | Query: "LOOP 7.57"
45 | ✅ /abap-docs-757/abapcheck_loop (Score: 15.60)
46 | ✅ /abap-docs-757/abapexit_loop (Score: 15.60)
47 | ✅ /abap-docs-757/abenabap_loops (Score: 15.60)
48 |
49 | Query: "SELECT latest"
50 | ✅ /abap-docs-latest/abenfree_selections (Score: 12.19)
51 | ✅ /abap-docs-latest/abenldb_selections (Score: 12.19)
52 | ✅ /abap-docs-latest/abapat_line-selection (Score: 12.10)
53 | ```
54 |
55 | ### **✅ Cross-Source Intelligence**
56 | Finds related content across all SAP sources:
57 |
58 | ```
59 | Query: "exception handling 7.53"
60 | ✅ ABAP 7.53 official docs (/abap-docs-753/)
61 | ✅ Clean ABAP style guides (/sap-styleguides/)
62 | ✅ ABAP cheat sheets (/abap-cheat-sheets/)
63 | ```
64 |
65 | ### **✅ Perfect LLM Experience**
66 | - **Individual files** (1-10KB each) - perfect for context windows
67 | - **Official attribution** - every file links to help.sap.com
68 | - **Clean structure** - optimized markdown for LLM consumption
69 |
70 | ---
71 |
72 | ## 🔧 **Technical Implementation**
73 |
74 | ### **Metadata Configuration (27 Total Sources)**
75 | ```json
76 | {
77 | "sources": [
78 | { "id": "abap-docs-758", "boost": 0.95, "tags": ["abap", "7.58"] },
79 | { "id": "abap-docs-latest", "boost": 0.90, "tags": ["abap", "latest"] },
80 | { "id": "abap-docs-757", "boost": 0.95, "tags": ["abap", "7.57"] },
81 | { "id": "abap-docs-756", "boost": 0.90, "tags": ["abap", "7.56"] },
82 | { "id": "abap-docs-755", "boost": 0.85, "tags": ["abap", "7.55"] },
83 | { "id": "abap-docs-754", "boost": 0.80, "tags": ["abap", "7.54"] },
84 | { "id": "abap-docs-753", "boost": 0.75, "tags": ["abap", "7.53"] },
85 | { "id": "abap-docs-752", "boost": 0.70, "tags": ["abap", "7.52"] }
86 | ]
87 | }
88 | ```
89 |
90 | ### **Context Boosting Strategy**
91 | ```typescript
92 | "ABAP": {
93 | "/abap-docs-758": 1.0, // Highest priority for general ABAP
94 | "/abap-docs-latest": 0.98, // Latest features
95 | "/abap-docs-757": 0.95, // Recent stable
96 | "/abap-docs-756": 0.90, // Stable
97 | // ... decreasing boost for older versions
98 | }
99 | ```
100 |
101 | ### **URL Generation per Version**
102 | ```typescript
103 | // Automatic version-specific URLs
104 | "/abap-docs-757/abenloop.md"
105 | → "https://help.sap.com/doc/abapdocu_757_index_htm/7.57/en-US/abenloop.htm"
106 |
107 | "/abap-docs-latest/abenselect.md"
108 | → "https://help.sap.com/doc/abapdocu_latest_index_htm/latest/en-US/abenselect.htm"
109 | ```
110 |
111 | ---
112 |
113 | ## 🎯 **Usage Patterns**
114 |
115 | ### **Version-Specific Queries**
116 | ```bash
117 | # Search specific ABAP versions
118 | search: "LOOP AT 7.57" # → ABAP 7.57 docs
119 | search: "CDS views latest" # → Latest ABAP docs
120 | search: "class definition 7.53" # → ABAP 7.53 docs
121 | ```
122 |
123 | ### **General ABAP Queries (Default 7.58)**
124 | ```bash
125 | search: "SELECT statements" # → ABAP 7.58 docs
126 | search: "internal tables" # → ABAP 7.58 docs
127 | search: "exception handling" # → ABAP 7.58 docs
128 | ```
129 |
130 | ### **Cross-Source Results**
131 | ```bash
132 | search: "inline declarations"
133 | # Returns:
134 | ✅ Official ABAP docs (version-specific)
135 | ✅ Clean ABAP style guides
136 | ✅ ABAP cheat sheets
137 | ✅ Related UI5/CAP content
138 | ```
139 |
140 | ---
141 |
142 | ## 📈 **Performance & Quality**
143 |
144 | ### **Search Performance**
145 | - **~50ms search time** (standard FTS5 performance)
146 | - **63,454 total documents** in searchable index
147 | - **30.52 MB database** - efficient storage
148 |
149 | ### **Result Quality**
150 | - **Version-aware scoring** - newer versions get slight boost
151 | - **Cross-source intelligence** - finds related content across all sources
152 | - **LLM-optimized** - individual files perfect for context windows
153 |
154 | ### **Content Quality**
155 | - **100% working links** - all JavaScript-based links rewritten as direct help.sap.com URLs
156 | - **Official attribution** - every file includes source documentation link
157 | - **Clean structure** - optimized for LLM consumption
158 |
159 | ---
160 |
161 | ## 🔮 **Benefits of Standard Integration**
162 |
163 | ### **✅ Unified Experience**
164 | - **One search tool** for all SAP development (ABAP + UI5 + CAP + testing)
165 | - **Automatic version detection** - no need to specify versions manually
166 | - **Cross-source results** - finds related content across documentation types
167 |
168 | ### **✅ Technical Excellence**
169 | - **Standard architecture** - same proven system as UI5/CAP sources
170 | - **No special tools** - uses existing infrastructure
171 | - **Easy maintenance** - standard build and deployment process
172 |
173 | ### **✅ Developer Productivity**
174 | - **42,901 individual ABAP files** ready for LLM consumption
175 | - **8 versions supported** with intelligent prioritization
176 | - **Perfect file sizes** (1-10KB) for optimal AI interaction
177 |
178 | ---
179 |
180 | ## 🎉 **Mission Complete: World's Most Comprehensive SAP MCP**
181 |
182 | The SAP Docs MCP now provides:
183 | - ✅ **Complete ABAP coverage** - 8 versions, 42,901+ files
184 | - ✅ **Intelligent version detection** - auto-detects from queries
185 | - ✅ **Unified interface** - one tool for all SAP development
186 | - ✅ **Cross-source intelligence** - finds related content everywhere
187 | - ✅ **LLM-optimized** - perfect file sizes and structure
188 | - ✅ **Production-ready** - standard architecture, full testing
189 |
190 | **The most advanced SAP development documentation system available for LLMs!** 🚀
191 |
```
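
The version auto-detection described in the document above can be sketched as follows. This is a minimal illustration, not the repository's implementation: `detectAbapLibrary`, its return shape, and the stripping of the version token from the query are assumptions made for the example. It relies only on what the document states, namely versions 7.52 through 7.58 plus `latest`, with 7.58 as the default.

```typescript
// Hypothetical helper for illustration; the real detection logic may differ.
const DEFAULT_SUFFIX = "758"; // the document names 7.58 as the default version

function detectAbapLibrary(query: string): { libraryId: string; terms: string } {
  // An explicit version such as "7.57" takes precedence over the word "latest".
  const versionMatch = query.match(/\b7\.(5[2-8])\b/);
  const latestMatch = versionMatch ? null : query.match(/\blatest\b/i);

  const token = versionMatch?.[0] ?? latestMatch?.[0] ?? "";
  const suffix = versionMatch
    ? `7${versionMatch[1]}`
    : latestMatch
      ? "latest"
      : DEFAULT_SUFFIX;

  // Assumption: the version token is removed before the remaining terms are searched.
  const terms = token
    ? query.replace(token, " ").replace(/\s+/g, " ").trim()
    : query.trim();

  return { libraryId: `/abap-docs-${suffix}`, terms };
}

console.log(detectAbapLibrary("LOOP 7.57"));           // libraryId: "/abap-docs-757", terms: "LOOP"
console.log(detectAbapLibrary("SELECT latest"));       // libraryId: "/abap-docs-latest", terms: "SELECT"
console.log(detectAbapLibrary("inline declarations")); // libraryId: "/abap-docs-758"
```

The returned `libraryId` could then be passed as a library filter to the FTS search, which keeps version selection separate from ranking.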
--------------------------------------------------------------------------------
/src/lib/searchDb.ts:
--------------------------------------------------------------------------------
```typescript
1 | import Database from "better-sqlite3";
2 | import path from "path";
3 | import { existsSync, statSync } from "fs";
4 | import { CONFIG } from "./config.js";
5 |
6 | let db: Database.Database | null = null;
7 |
8 | export function openDb(dbPath?: string): Database.Database {
9 | if (!db) {
10 | // Use centralized config path
11 | const defaultPath = path.join(process.cwd(), CONFIG.DB_PATH);
12 | const finalPath = dbPath || defaultPath;
13 |
14 | if (!existsSync(finalPath)) {
15 | throw new Error(`FTS database not found at ${finalPath}. Run 'npm run build:fts' to create it.`);
16 | }
17 |
18 | db = new Database(finalPath, { readonly: true, fileMustExist: true });
19 | // Read-only safe pragmas
20 | db.pragma("query_only = ON");
21 | db.pragma("cache_size = -8000"); // ~8MB page cache
22 | }
23 | return db;
24 | }
25 |
26 | export function closeDb(): void {
27 | if (db) {
28 | db.close();
29 | db = null;
30 | }
31 | }
32 |
33 | type Filters = {
34 | libraries?: string[]; // e.g. ["/cap", "/sapui5"]
35 | types?: string[]; // e.g. ["markdown","jsdoc","sample"]
36 | };
37 |
38 | export type FTSResult = {
39 | id: string;
40 | libraryId: string;
41 | type: string;
42 | title: string;
43 | description: string;
44 | relFile: string;
45 | snippetCount: number;
46 | bm25Score: number;
47 | highlight: string;
48 | };
49 |
50 | export function toMatchQuery(userQuery: string): string {
51 | // Convert user input into FTS syntax with prefix matching:
52 | // keep quoted phrases as-is, append * to bare terms for prefix matching
53 | const terms = userQuery.match(/"[^"]+"|\S+/g) ?? [];
54 | // Very common stopwords that hurt FTS when ANDed together
55 | const stopwords = new Set([
56 | "a","an","the","to","in","on","for","and","or","of","with","from",
57 | "how","what","why","when","where","which","who","whom","does","do","is","are"
58 | ]);
59 |
60 | const cleanTerms = terms.map(t => {
61 | if (t.startsWith('"') && t.endsWith('"')) return t; // phrase query
62 |
63 | // For terms with dots (like sap.m.Button), quote them as phrases
64 | if (t.includes('.')) {
65 | return `"${t.replace(/"/g, '""')}"`; // double embedded quotes so the FTS5 string stays well-formed
66 | }
67 |
68 | // Handle annotation qualifiers with # (like #SpecificationWidthColumnChart)
69 | if (t.startsWith('#') && t.length > 1) {
70 | // Keep the # and treat as exact phrase for better matching
71 | return `"${t.replace(/"/g, '""')}"`;
72 | }
73 |
74 | // Handle compound terms with # (like UI.Chart#Something)
75 | if (t.includes('#') && !t.startsWith('#')) {
76 | // Quote the whole term as a phrase to preserve its structure
77 | return `"${t.replace(/"/g, '""')}"`;
78 | }
79 |
80 | // Sanitize and add prefix matching for simple terms
81 | const clean = t.replace(/[^\w]/g, "").toLowerCase();
82 | if (!clean || stopwords.has(clean)) return "";
83 | return `${clean}*`;
84 | }).filter(Boolean);
85 |
86 | // Use OR logic when configured, or when the query has more than 3 terms, for better recall in BM25-only mode
87 | // FTS5 will still rank documents with more matching terms higher
88 | if (CONFIG.USE_OR_LOGIC || cleanTerms.length > 3) {
89 | return cleanTerms.join(" OR ");
90 | }
91 |
92 | return cleanTerms.join(" ");
93 | }
94 |
95 | /**
96 | * Fast FTS5 candidate filtering
97 | * Returns document IDs that match the query, for use with existing sophisticated scoring
98 | */
99 | export function getFTSCandidateIds(userQuery: string, filters: Filters = {}, limit = 100): string[] {
100 | const database = openDb();
101 | const match = toMatchQuery(userQuery);
102 |
103 | if (!match.trim()) {
104 | return []; // Empty query
105 | }
106 |
107 | // Build WHERE conditions
108 | const conditions = ["docs MATCH ?"];
109 | const params: any[] = [match];
110 |
111 | if (filters.libraries?.length) {
112 | const placeholders = filters.libraries.map(() => "?").join(",");
113 | conditions.push(`libraryId IN (${placeholders})`);
114 | params.push(...filters.libraries);
115 | }
116 |
117 | if (filters.types?.length) {
118 | const placeholders = filters.types.map(() => "?").join(",");
119 | conditions.push(`type IN (${placeholders})`);
120 | params.push(...filters.types);
121 | }
122 |
123 | const sql = `
124 | SELECT id
125 | FROM docs
126 | WHERE ${conditions.join(" AND ")}
127 | ORDER BY bm25(docs)
128 | LIMIT ?
129 | `;
130 |
131 | try {
132 | const stmt = database.prepare(sql);
133 | const rows = stmt.all(...params, limit) as { id: string }[];
134 | return rows.map(r => r.id);
135 | } catch (error) {
136 | console.warn("FTS query failed, falling back to full search:", error);
137 | return []; // Fallback gracefully
138 | }
139 | }
140 |
141 | /**
142 | * Full FTS search with results (for debugging/testing)
143 | */
144 | export function searchFTS(userQuery: string, filters: Filters = {}, limit = 20): FTSResult[] {
145 | const database = openDb();
146 | const match = toMatchQuery(userQuery);
147 |
148 | if (!match.trim()) {
149 | return [];
150 | }
151 |
152 | // Build WHERE conditions
153 | const conditions = ["docs MATCH ?"];
154 | const params: any[] = [match];
155 |
156 | if (filters.libraries?.length) {
157 | const placeholders = filters.libraries.map(() => "?").join(",");
158 | conditions.push(`libraryId IN (${placeholders})`);
159 | params.push(...filters.libraries);
160 | }
161 |
162 | if (filters.types?.length) {
163 | const placeholders = filters.types.map(() => "?").join(",");
164 | conditions.push(`type IN (${placeholders})`);
165 | params.push(...filters.types);
166 | }
167 |
168 | // BM25 weights: title, description, keywords, controlName, namespace
169 | // Higher weight = more important (title and controlName are most important); bm25() returns lower scores for better matches, so ascending ORDER BY is best-first
170 | const sql = `
171 | SELECT
172 | id, libraryId, type, title, description, relFile, snippetCount,
173 | highlight(docs, 2, '<mark>', '</mark>') AS highlight,
174 | bm25(docs, 1.0, 8.0, 2.0, 4.0, 6.0, 3.0) AS bm25Score
175 | FROM docs
176 | WHERE ${conditions.join(" AND ")}
177 | ORDER BY bm25Score
178 | LIMIT ?
179 | `;
180 |
181 | try {
182 | const stmt = database.prepare(sql);
183 | const rows = stmt.all(...params, limit) as any[];
184 |
185 | return rows.map(r => ({
186 | id: r.id,
187 | libraryId: r.libraryId,
188 | type: r.type,
189 | title: r.title,
190 | description: r.description,
191 | relFile: r.relFile,
192 | snippetCount: r.snippetCount,
193 | bm25Score: Number(r.bm25Score),
194 | highlight: r.highlight || r.title
195 | }));
196 | } catch (error) {
197 | console.warn("FTS query failed:", error);
198 | return [];
199 | }
200 | }
201 |
202 | /**
203 | * Get database stats for monitoring
204 | */
205 | export function getFTSStats(): { rowCount: number; dbSize: number; mtime: string } | null {
206 | try {
207 | const database = openDb();
208 | const rowCount = database.prepare("SELECT count(*) as n FROM docs").get() as { n: number };
209 |
210 | const dbPath = path.join(process.cwd(), CONFIG.DB_PATH);
211 | const stats = statSync(dbPath);
212 |
213 | return {
214 | rowCount: rowCount.n,
215 | dbSize: stats.size,
216 | mtime: stats.mtime.toISOString()
217 | };
218 | } catch (error) {
219 | console.warn("Could not get FTS stats:", error);
220 | return null;
221 | }
222 | }
```
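
To make the query rewriting in `toMatchQuery` easier to follow, here is a condensed, self-contained sketch of the same rules with a few worked inputs. `toMatchQuerySketch` and its `useOrLogic` parameter are illustrative names; the parameter stands in for `CONFIG.USE_OR_LOGIC`, whose actual default is not shown in this file.

```typescript
// Condensed restatement of the term rules above, for illustration only.
const STOPWORDS = new Set([
  "a","an","the","to","in","on","for","and","or","of","with","from",
  "how","what","why","when","where","which","who","whom","does","do","is","are"
]);

function toMatchQuerySketch(userQuery: string, useOrLogic = false): string {
  // Quoted phrases are captured whole; everything else is split on whitespace.
  const terms = userQuery.match(/"[^"]+"|\S+/g) ?? [];

  const cleanTerms = terms
    .map(t => {
      if (t.startsWith('"') && t.endsWith('"')) return t; // keep phrase queries as-is
      // Dotted names and # qualifiers are matched as exact phrases.
      if (t.includes(".") || (t.includes("#") && t.length > 1)) return `"${t}"`;
      // Plain words: strip punctuation, drop stopwords, add prefix matching.
      const clean = t.replace(/[^\w]/g, "").toLowerCase();
      return !clean || STOPWORDS.has(clean) ? "" : `${clean}*`;
    })
    .filter(Boolean);

  // OR is used when requested or when more than 3 terms survive; otherwise implicit AND.
  return cleanTerms.join(useOrLogic || cleanTerms.length > 3 ? " OR " : " ");
}

console.log(toMatchQuerySketch("how to use sap.m.Button"));      // use* "sap.m.Button"
console.log(toMatchQuerySketch('"LOOP AT" internal tables'));    // "LOOP AT" internal* tables*
console.log(toMatchQuerySketch("value help dialog filter bar")); // value* OR help* OR dialog* OR filter* OR bar*
```

A query made up only of stopwords yields an empty string, which is why both search functions return early when the match expression is blank.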
--------------------------------------------------------------------------------
/docs/CONTENT-SIZE-LIMITS.md:
--------------------------------------------------------------------------------
```markdown
1 | # Content Size Limits
2 |
3 | ## Overview
4 |
5 | To ensure optimal performance and prevent token overflow in LLM interactions, the server implements intelligent content size limits for SAP Help Portal and Community content retrieval.
6 |
7 | ## Configuration
8 |
9 | ### Maximum Content Length
10 |
11 | **Default: 75,000 characters** (~18,750 tokens)
12 |
13 | This limit is configurable in `/src/lib/config.ts`:
14 |
15 | ```typescript
16 | export const CONFIG = {
17 | // Maximum content length for SAP Help and Community full content retrieval
18 | // Limits help prevent token overflow and keep responses manageable (~18,750 tokens)
19 | MAX_CONTENT_LENGTH: 75000, // 75,000 characters
20 | };
21 | ```
22 |
23 | ## Affected Tools
24 |
25 | The content size limit applies to the following MCP tools:
26 |
27 | ### 1. `sap_help_get`
28 | Retrieves full SAP Help Portal pages. If content exceeds 75,000 characters, it is intelligently truncated while preserving:
29 | - Beginning section (introduction and main content)
30 | - End section (conclusions and examples)
31 | - A clear truncation notice showing what was omitted
32 |
33 | ### 2. `sap_community_search`
34 | Returns full content of top 3 SAP Community posts. Each post is truncated if needed using the same intelligent algorithm.
35 |
36 | ### 3. Community Post Retrieval
37 | Individual community posts fetched via `fetch` tool with `community-*` IDs are also subject to truncation.
38 |
39 | ## Intelligent Truncation Algorithm
40 |
41 | When content exceeds the maximum length, the truncation algorithm:
42 |
43 | ### Preservation Strategy
44 | - **60%** from the beginning (introduction, overview, main content)
45 | - **20%** from the end (conclusions, examples, summaries)
46 | - **20%** reserved for truncation notice and natural break padding
47 |
48 | ### Natural Boundaries
49 | The algorithm attempts to break content at natural points rather than mid-sentence:
50 | 1. Paragraph breaks (`\n\n`)
51 | 2. Markdown headings (`# Heading`)
52 | 3. Code block boundaries (` ```\n`)
53 | 4. Horizontal rules (`---`)
54 | 5. Sentence endings (`. `)
55 |
56 | ### Truncation Notice
57 | A clear notice is inserted showing:
58 | - Original content length in characters
59 | - Approximate original token count (chars ÷ 4)
60 | - Number of omitted characters
61 | - Percentage of content omitted
62 | - Explanation that beginning and end are preserved
63 |
64 | Example truncation notice:
65 | ```markdown
66 | ---
67 |
68 | ⚠️ **Content Truncated**
69 |
70 | The full content was 425,000 characters (approximately 106,250 tokens).
71 | For readability and performance, 350,000 characters (82%) have been omitted from the middle section.
72 |
73 | The beginning and end of the document are preserved above and below this notice.
74 |
75 | ---
76 | ```
77 |
78 | ## Rationale
79 |
80 | ### Why 75,000 Characters?
81 |
82 | 1. **LLM Context Windows**: Fits comfortably in most modern LLM context windows:
83 | - Claude 3.5 Sonnet: 200k tokens (can handle ~800k chars)
84 | - GPT-4 Turbo: 128k tokens (can handle ~512k chars)
85 | - Leaves room for conversation history and system prompts
86 |
87 | 2. **Performance**: Reduces response time and API costs while maintaining comprehensive coverage
88 |
89 | 3. **Readability**: Very long documents (>100k chars) are often better consumed in multiple focused queries
90 |
91 | 4. **Practical Coverage**: 75k characters is sufficient for most documentation pages while preventing extreme cases
92 |
93 | ### Alternative Approaches Considered
94 |
95 | | Approach | Characters | Tokens (approx) | Trade-off |
96 | |----------|-----------|-----------------|-----------|
97 | | Conservative | 50,000 | ~12,500 | Too restrictive for comprehensive docs |
98 | | **Current** | **75,000** | **~18,750** | **Balanced - recommended** |
99 | | Generous | 100,000 | ~25,000 | Risk of slow responses |
100 | | Maximum | 150,000 | ~37,500 | Only for edge cases |
101 |
102 | ## Implementation Details
103 |
104 | ### Source Files
105 |
106 | - **Configuration**: `/src/lib/config.ts` - MAX_CONTENT_LENGTH constant
107 | - **Truncation Logic**: `/src/lib/truncate.ts` - Intelligent truncation implementation
108 | - **SAP Help**: `/src/lib/sapHelp.ts` - Applied in `getSapHelpContent()`
109 | - **Community**: `/src/lib/communityBestMatch.ts` - Applied in post retrieval functions
110 |
111 | ### Functions
112 |
113 | #### `truncateContent(content: string, maxLength?: number): TruncationResult`
114 | Main truncation function with beginning/end preservation.
115 |
116 | **Returns:**
117 | ```typescript
118 | {
119 | content: string; // Truncated content
120 | wasTruncated: boolean; // Whether truncation occurred
121 | originalLength: number; // Original character count
122 | truncatedLength: number; // Final character count
123 | }
124 | ```
125 |
126 | #### `truncateContentSimple(content: string, maxLength?: number): TruncationResult`
127 | Alternative truncation function that only preserves beginning with end notice.
128 |
129 | ## Monitoring and Adjustment
130 |
131 | ### When to Increase Limit
132 |
133 | Consider increasing if:
134 | - Users frequently encounter truncated content
135 | - Average document sizes are near the limit
136 | - LLM context windows have increased significantly
137 |
138 | ### When to Decrease Limit
139 |
140 | Consider decreasing if:
141 | - Response times are too slow
142 | - Token costs are concerning
143 | - Most content doesn't use the available space
144 |
145 | ### Override for Specific Cases
146 |
147 | To override the limit for specific use cases, modify the `truncateContent()` call:
148 |
149 | ```typescript
150 | // Custom limit of 100,000 characters
151 | const truncationResult = truncateContent(fullContent, 100000);
152 | ```
153 |
154 | ## User Experience
155 |
156 | ### Transparent Communication
157 |
158 | When content is truncated, users see:
159 | - Clear visual indicator (⚠️ warning emoji)
160 | - Exact statistics (original length, omitted amount, percentage)
161 | - Explanation of what's preserved
162 | - No disruption to markdown formatting
163 |
164 | ### Best Practices for Users
165 |
166 | 1. **Specific Queries**: Ask focused questions to get relevant sections
167 | 2. **Multiple Requests**: Break very long documents into targeted fetches
168 | 3. **Search First**: Use `sap_help_search` to find specific sections before fetching
169 | 4. **Check URLs**: Visit the provided URLs for complete untruncated content
170 |
171 | ## Future Enhancements
172 |
173 | Potential improvements to consider:
174 |
175 | 1. **Dynamic Limits**: Adjust based on LLM context window
176 | 2. **Sectioned Retrieval**: Fetch specific document sections
177 | 3. **Summary Generation**: Auto-summarize omitted middle sections
178 | 4. **User Preferences**: Allow users to specify their preferred limits
179 | 5. **Compression**: Apply content compression for technical reference material
180 |
181 | ## Testing
182 |
183 | Content size limits are tested in:
184 | - Unit tests for truncation functions
185 | - Integration tests for SAP Help and Community tools
186 | - Manual validation with known large documents
187 |
188 | ## Related Documentation
189 |
190 | - **Architecture**: `/docs/ARCHITECTURE.md` - System overview
191 | - **Tool Descriptions**: `/docs/CURSOR-SETUP.md` - MCP tool documentation
192 | - **Community Search**: `/docs/COMMUNITY-SEARCH-IMPLEMENTATION.md` - Community integration details
193 |
194 |
```
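
The preservation strategy described in the document above (60% from the beginning, 20% from the end, the rest reserved for the notice) can be sketched roughly as below. This is not the code in `/src/lib/truncate.ts`: `truncateMiddleSketch`, the paragraph-break heuristics, and the exact notice wording are assumptions for illustration, and only paragraph breaks are considered as natural boundaries.

```typescript
interface TruncationResult {
  content: string;
  wasTruncated: boolean;
  originalLength: number;
  truncatedLength: number;
}

// Hypothetical sketch of head/tail preservation with a notice in the middle.
function truncateMiddleSketch(content: string, maxLength = 75_000): TruncationResult {
  const originalLength = content.length;
  if (originalLength <= maxLength) {
    return { content, wasTruncated: false, originalLength, truncatedLength: originalLength };
  }

  const headBudget = Math.floor(maxLength * 0.6);
  const tailBudget = Math.floor(maxLength * 0.2);

  // Prefer to end the head at a paragraph break, unless that would discard over half of it.
  const headBreak = content.lastIndexOf("\n\n", headBudget);
  const head = content.slice(0, headBreak > headBudget * 0.5 ? headBreak : headBudget);

  // Prefer to start the tail just after a paragraph break near the start of its budget.
  const tailStart = originalLength - tailBudget;
  const tailBreak = content.indexOf("\n\n", tailStart);
  const useBreak = tailBreak !== -1 && tailBreak < tailStart + tailBudget * 0.5;
  const tail = content.slice(useBreak ? tailBreak + 2 : tailStart);

  const omitted = originalLength - head.length - tail.length;
  const percent = Math.round((omitted / originalLength) * 100);
  const fmt = (n: number) => n.toLocaleString("en-US");

  const notice =
    `\n\n---\n\n⚠️ **Content Truncated**\n\n` +
    `The full content was ${fmt(originalLength)} characters ` +
    `(approximately ${fmt(Math.round(originalLength / 4))} tokens).\n` +
    `${fmt(omitted)} characters (${percent}%) have been omitted from the middle section.\n\n---\n\n`;

  const result = head + notice + tail;
  return { content: result, wasTruncated: true, originalLength, truncatedLength: result.length };
}

const demo = truncateMiddleSketch("x".repeat(425_000));
console.log(demo.wasTruncated, demo.truncatedLength <= 75_000);
```

With the default limit, head and tail together use at most 60,000 characters, so the notice fits comfortably in the remaining budget; very small `maxLength` values would need an explicit check.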
--------------------------------------------------------------------------------
/src/lib/url-generation/utils.ts:
--------------------------------------------------------------------------------
```typescript
1 | /**
2 | * Common utilities for URL generation across different documentation sources
3 | */
4 |
5 | export interface FrontmatterData {
6 | id?: string;
7 | slug?: string;
8 | title?: string;
9 | sidebar_label?: string;
10 | [key: string]: any;
11 | }
12 |
13 | /**
14 | * Extract frontmatter from document content
15 | * Supports YAML frontmatter format used in Markdown/MDX files
16 | */
17 | export function parseFrontmatter(content: string): FrontmatterData {
18 | const frontmatterMatch = content.match(/^---\s*\n([\s\S]*?)\n---/);
19 | if (!frontmatterMatch) {
20 | return {};
21 | }
22 |
23 | const frontmatter = frontmatterMatch[1];
24 | const result: FrontmatterData = {};
25 |
26 | // Parse simple key-value pairs
27 | const lines = frontmatter.split('\n');
28 | let currentKey = '';
29 | let isInArray = false;
30 |
31 | for (const line of lines) {
32 | const trimmedLine = line.trim();
33 |
34 | if (!trimmedLine || trimmedLine.startsWith('#')) {
35 | continue; // Skip empty lines and comments
36 | }
37 |
38 | // Handle array items (lines starting with -)
39 | if (trimmedLine.startsWith('-')) {
40 | if (isInArray && currentKey) {
41 | const arrayValue = trimmedLine.substring(1).trim();
42 | if (!Array.isArray(result[currentKey])) {
43 | result[currentKey] = [];
44 | }
45 | (result[currentKey] as string[]).push(arrayValue);
46 | }
47 | continue;
48 | }
49 |
50 | // Handle key-value pairs
51 | const colonIndex = trimmedLine.indexOf(':');
52 | if (colonIndex !== -1) {
53 | currentKey = trimmedLine.substring(0, colonIndex).trim();
54 | const value = trimmedLine.substring(colonIndex + 1).trim();
55 |
56 | if (value === '') {
57 | // This might be the start of an array
58 | isInArray = true;
59 | result[currentKey] = [];
60 | } else {
61 | isInArray = false;
62 | // Clean up quoted values
63 | result[currentKey] = value.replace(/^["']|["']$/g, '');
64 | }
65 | }
66 | }
67 |
68 | return result;
69 | }
70 |
71 | /**
72 | * Detect the main section/topic from content for anchor generation
73 | */
74 | export function detectContentSection(content: string, anchorStyle: 'docsify' | 'github' | 'custom'): string | null {
75 | // Find the first major heading (## or #) that gives context about the content
76 | const headingMatch = content.match(/^#{1,2}\s+(.+)$/m);
77 | if (!headingMatch) {
78 | return null;
79 | }
80 |
81 | const heading = headingMatch[1].trim();
82 |
83 | // Convert heading to anchor format based on style
84 | switch (anchorStyle) {
85 | case 'docsify':
86 | // Docsify format: lowercase, spaces to hyphens, remove special chars
87 | return heading
88 | .toLowerCase()
89 | .replace(/[^\w\s-]/g, '') // Remove special characters except hyphens
90 | .replace(/\s+/g, '-') // Spaces to hyphens
91 | .replace(/-+/g, '-') // Multiple hyphens to single
92 | .replace(/^-|-$/g, ''); // Remove leading/trailing hyphens
93 |
94 | case 'github':
95 | // GitHub format: lowercase, strip special chars, spaces to hyphens (repeated hyphens are kept)
96 | return heading
97 | .toLowerCase()
98 | .replace(/[^\w\s-]/g, '')
99 | .replace(/\s+/g, '-');
100 |
101 | case 'custom':
102 | default:
103 | // Return as-is for custom handling
104 | return heading;
105 | }
106 | }
107 |
108 | /**
109 | * Determine the section path from file relative path
110 | */
111 | export function extractSectionFromPath(relFile: string): string {
112 | if (relFile.includes('guides/')) {
113 | return '/guides/';
114 | } else if (relFile.includes('features/')) {
115 | return '/features/';
116 | } else if (relFile.includes('tutorials/')) {
117 | return '/tutorials/';
118 | } else if (relFile.includes('environments/')) {
119 | return '/environments/';
120 | } else if (relFile.includes('getting-started/')) {
121 | return '/getting-started/';
122 | } else if (relFile.includes('examples/')) {
123 | return '/examples/';
124 | } else if (relFile.includes('api/')) {
125 | return '/api/';
126 | }
127 | return '';
128 | }
129 |
130 | /**
131 | * Clean filename for URL usage
132 | */
133 | export function cleanFilename(filename: string): string {
134 | return filename
135 | .replace(/\.mdx?$/, '') // Remove .md/.mdx extensions
136 | .replace(/\.html?$/, '') // Remove .html/.htm extensions
137 | .replace(/\s+/g, '-') // Spaces to hyphens
138 | .toLowerCase();
139 | }
140 |
141 | /**
142 | * Build URL with proper path joining
143 | */
144 | export function buildUrl(baseUrl: string, ...pathSegments: string[]): string {
145 | const cleanBase = baseUrl.replace(/\/$/, ''); // Remove trailing slash
146 | const cleanSegments = pathSegments
147 | .filter(segment => segment && segment.trim() !== '') // Remove empty segments
148 | .map(segment => segment.replace(/^\/|\/$/g, '')); // Remove leading/trailing slashes
149 |
150 | if (cleanSegments.length === 0) {
151 | return cleanBase;
152 | }
153 |
154 | return `${cleanBase}/${cleanSegments.join('/')}`;
155 | }
156 |
157 | /**
158 | * Extract library ID from document ID path
159 | * Used for search result URL generation
160 | */
161 | export function extractLibraryIdFromPath(docId: string): string {
162 | if (docId.startsWith('/')) {
163 | const parts = docId.split('/');
164 | return parts.length > 1 ? `/${parts[1]}` : docId;
165 | }
166 | return docId;
167 | }
168 |
169 | /**
170 | * Extract relative file path from document ID
171 | * Used for search result URL generation
172 | */
173 | export function extractRelativeFileFromPath(docId: string): string {
174 | if (docId.includes('/')) {
175 | const parts = docId.split('/');
176 | return parts.length > 2 ? parts.slice(2).join('/') : '';
177 | }
178 | return '';
179 | }
180 |
181 | /**
182 | * Format a single search result with URL generation and excerpt truncation
183 | * Shared utility for consistent search result formatting across servers
184 | */
185 | export function formatSearchResult(
186 | result: any,
187 | excerptLength: number,
188 | urlGenerator?: {
189 | generateDocumentationUrl: (libraryId: string, relFile: string, content: string, config: any) => string | null;
190 | getDocUrlConfig: (libraryId: string) => any;
191 | }
192 | ): string {
193 | // Extract library ID and relative file path to generate URL
194 | const libraryId = result.sourceId ? `/${result.sourceId}` : extractLibraryIdFromPath(result.id);
195 | const relFile = extractRelativeFileFromPath(result.id);
196 |
197 | // Try to generate documentation URL
198 | let urlInfo = '';
199 | if (urlGenerator) {
200 | try {
201 | const config = urlGenerator.getDocUrlConfig && urlGenerator.getDocUrlConfig(libraryId);
202 | if (config && urlGenerator.generateDocumentationUrl) {
203 | const docUrl = urlGenerator.generateDocumentationUrl(libraryId, relFile, result.text || '', config);
204 | if (docUrl) {
205 | urlInfo = `\n 🔗 ${docUrl}`;
206 | }
207 | }
208 | } catch (error) {
209 | // Log a warning and continue without a URL
210 | console.warn(`URL generation failed for ${libraryId}/${relFile}:`, error);
211 | }
212 | }
213 |
214 | return `⭐️ **${result.id}** (Score: ${result.finalScore.toFixed(2)})\n ${(result.text || '').substring(0, excerptLength)}${urlInfo}\n Use in fetch\n`;
215 | }
216 |
217 |
```
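
As a worked example of how a document ID is split and turned into a URL, the sketch below restates the ID-splitting and path-joining logic from the file above in compact form. `splitDocId` and `joinUrl` are illustrative stand-ins for `extractLibraryIdFromPath`, `extractRelativeFileFromPath`, and `buildUrl`; swapping `.md` for `.htm` is an assumption based on the ABAP URL example elsewhere in this repository, and only IDs with a leading slash are considered.

```typescript
// Compact restatement for illustration; see the original helpers for the real behavior.
function splitDocId(docId: string): { libraryId: string; relFile: string } {
  // "/abap-docs-757/abenloop.md" splits into ["", "abap-docs-757", "abenloop.md"]
  const parts = docId.split("/");
  return {
    libraryId: docId.startsWith("/") && parts.length > 1 ? `/${parts[1]}` : docId,
    relFile: parts.length > 2 ? parts.slice(2).join("/") : "",
  };
}

function joinUrl(baseUrl: string, ...segments: string[]): string {
  const base = baseUrl.replace(/\/$/, ""); // drop one trailing slash
  const clean = segments
    .filter(s => s && s.trim() !== "")       // skip empty segments
    .map(s => s.replace(/^\/|\/$/g, ""));    // trim one leading/trailing slash
  return clean.length === 0 ? base : `${base}/${clean.join("/")}`;
}

const ABAP_757_BASE = "https://help.sap.com/doc/abapdocu_757_index_htm/7.57/en-US/";
const { libraryId, relFile } = splitDocId("/abap-docs-757/abenloop.md");
const abapUrl = joinUrl(ABAP_757_BASE, relFile.replace(/\.md$/, ".htm"));

console.log(libraryId); // /abap-docs-757
console.log(abapUrl);   // https://help.sap.com/doc/abapdocu_757_index_htm/7.57/en-US/abenloop.htm
```

Because empty segments are filtered out, a missing section path (for example the empty string returned by `extractSectionFromPath`) does not produce a double slash in the result.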
--------------------------------------------------------------------------------
/docs/COMMUNITY-SEARCH-IMPLEMENTATION.md:
--------------------------------------------------------------------------------
```markdown
1 | # SAP Community Search Implementation
2 |
3 | ## Overview
4 |
5 | SAP Community search now scrapes the HTML search results page instead of querying the LiQL API, so results follow the SAP Community's "Best Match" ranking. Full post content is still retrieved through the LiQL API.
6 |
7 | ## New Implementation Details
8 |
9 | ### 1. HTML Scraping Module (`src/lib/communityBestMatch.ts`)
10 |
11 | **Key Features:**
12 | - Direct HTML scraping of SAP Community search results
13 | - Extracts comprehensive metadata: title, author, publish date, likes, snippet, tags
14 | - Zero external dependencies - uses native Node.js `fetch` and regex parsing
15 | - Respects SAP Community's "Best Match" ranking
16 | - Includes both search and full post retrieval functions
17 |
18 | **Functions:**
19 | - `searchCommunityBestMatch(query, options)` - Search for community posts via HTML scraping
20 | - `getCommunityPostByUrl(url, userAgent)` - Get full post content from URL (fallback method)
21 | - `getCommunityPostsByIds(postIds, userAgent)` - **NEW**: Batch retrieve multiple posts via LiQL API
22 | - `getCommunityPostById(postId, userAgent)` - **NEW**: Single post retrieval via LiQL API
23 | - `searchAndGetTopPosts(query, topN, options)` - **NEW**: Search + batch retrieve in one call
24 |
25 | ### 2. Updated Search Integration (`src/lib/localDocs.ts`)
26 |
27 | **Changes:**
28 | - Replaced LiQL API search calls with HTML scraping
29 | - Enhanced SearchResult interface with new fields (author, likes, tags)
30 | - Improved post ID handling for both legacy and new URL-based formats
31 | - Better error handling and graceful fallbacks
32 |
33 | ### 3. Enhanced Type Definitions (`src/lib/types.ts`)
34 |
35 | **New SearchResult fields:**
36 | - `author?: string` - Post author name
37 | - `likes?: number` - Number of kudos/likes
38 | - `tags?: string[]` - Associated topic tags
39 |
40 | ## Usage
41 |
42 | ### Search Community Posts
43 | ```javascript
44 | import { searchCommunityBestMatch } from './src/lib/communityBestMatch.js';
45 |
46 | const results = await searchCommunityBestMatch('SAPUI5 wizard', {
47 | includeBlogs: true,
48 | limit: 10,
49 | userAgent: 'MyApp/1.0'
50 | });
51 | ```
52 |
53 | ### Batch Retrieve Multiple Posts (Recommended)
54 | ```javascript
55 | import { getCommunityPostsByIds } from './src/lib/communityBestMatch.js';
56 |
57 | // Efficient batch retrieval using LiQL API
58 | const posts = await getCommunityPostsByIds(['13961398', '13446100', '14152848'], 'MyApp/1.0');
59 | // Returns: { '13961398': 'formatted content...', '13446100': 'formatted content...', ... }
60 | ```
61 |
62 | ### Search + Get Top Posts (One-Stop Solution)
63 | ```javascript
64 | import { searchAndGetTopPosts } from './src/lib/communityBestMatch.js';
65 |
66 | // Search and get full content of top 3 posts in one call
67 | const { search, posts } = await searchAndGetTopPosts('odata cache', 3, {
68 | includeBlogs: true,
69 | userAgent: 'MyApp/1.0'
70 | });
71 |
72 | search.forEach((result, index) => {
73 | console.log(`${index + 1}. ${result.title}`);
74 | if (posts[result.postId]) {
75 | console.log(posts[result.postId]); // Full formatted content
76 | }
77 | });
78 | ```
79 |
80 | ### Single Post Retrieval
81 | ```javascript
82 | import { getCommunityPostById } from './src/lib/communityBestMatch.js';
83 |
84 | const content = await getCommunityPostById('13961398', 'MyApp/1.0');
85 | ```
86 |
87 | ### Fallback: Get Full Post Content by URL
88 | ```javascript
89 | import { getCommunityPostByUrl } from './src/lib/communityBestMatch.js';
90 |
91 | const content = await getCommunityPostByUrl(
92 | 'https://community.sap.com/t5/technology-blogs-by-sap/...',
93 | 'MyApp/1.0'
94 | );
95 | ```
96 |
97 | ### Via MCP Server
98 | The community search is exposed as the `sap_community_search` tool, which now **automatically returns the full content** of the top 3 most relevant posts using the efficient LiQL API batch retrieval. Individual posts can also be retrieved using `fetch` with community post IDs.
99 |
100 | **Key Behavior:**
101 | - **`sap_community_search`**: Returns full content of top 3 posts (search + batch retrieval in one call)
102 | - **`fetch`**: Retrieves individual post content by ID
103 |
104 | ## Testing
105 |
106 | ### Run the Test Suite
107 | ```bash
108 | # Run comprehensive test suite (recommended)
109 | npm run test:community
110 |
111 | # Run directly with Node.js (TypeScript support)
112 | node test/community-search.ts
113 | ```
114 |
115 | The **unified test suite** (`test/community-search.ts`) covers:
116 | - **HTML Search Scraping**: Search accuracy, post ID extraction, metadata parsing
117 | - **LiQL API Batch Retrieval**: Efficient multi-post content retrieval
118 | - **Single Post Retrieval**: Individual post fetching via API
119 | - **Convenience Functions**: Combined search + batch retrieval workflow
120 | - **Direct API Testing**: Raw LiQL API validation
121 | - **Known Post Validation**: Testing with specific real posts
122 |
123 | ### Test Features
124 | - **TypeScript**: Modern, type-safe test implementation
125 | - **Comprehensive Coverage**: All functionality tested in one script
126 | - **Organized Structure**: Modular test functions with clear separation
127 | - **Real-time Validation**: Tests against live SAP Community data
128 | - **Error Handling**: Robust error reporting and graceful failures
129 |
130 | ## Benefits of the New Implementation
131 |
132 | 1. **Better Search Results**: Uses SAP Community's native "Best Match" algorithm
133 | 2. **Richer Metadata**: Extracts author, likes, tags, and better snippets
134 | 3. **Efficient Batch Retrieval**: LiQL API for fast bulk post content retrieval
135 | 4. **Hybrid Approach**: HTML scraping for search + API calls for content = best of both worlds
136 | 5. **One-Stop Functions**: `searchAndGetTopPosts()` combines search + retrieval in single call
137 | 6. **Improved Reliability**: Fallback methods for different scenarios
138 | 7. **Real-time Data**: Gets the same results users see on the website
139 |
140 | ## Technical Notes
141 |
142 | ### HTML Parsing Strategy
143 | - Uses regex patterns to extract structured data from Khoros-based SAP Community
144 | - Targets stable CSS classes and HTML structure patterns
145 | - Includes fallback patterns for different page layouts
146 | - Sanitizes and decodes HTML entities properly
147 |
148 | ### Rate Limiting & Respect
149 | - Includes User-Agent identification
150 | - Test script includes delays between requests
151 | - Graceful error handling for HTTP failures
152 | - Respects community guidelines
153 |
154 | ### Post ID Formats
155 | The system now supports two post ID formats:
156 | 1. **Legacy**: `community-postId` (tries to construct URL)
157 | 2. **New**: `community-url-encodedUrl` (direct URL extraction)
158 |
159 | ### Error Handling
160 | - Network failures return empty results instead of crashing
161 | - HTML parsing errors are logged but don't break the search
162 | - Malformed URLs are handled gracefully
163 | - User-Agent can be customized for identification
164 |
165 | ## Future Enhancements
166 |
167 | 1. **Caching**: Add optional caching layer for frequently accessed posts
168 | 2. **Pagination**: Support for multiple result pages
169 | 3. **Advanced Filtering**: Filter by author, date range, or specific tags
170 | 4. **Performance**: Add connection pooling for high-volume usage
171 |
172 | ## Migration Notes
173 |
174 | The new implementation is a **drop-in replacement** for the old LiQL-based approach:
175 | - Same function signatures for `searchCommunity()`
176 | - Same MCP tool interface (`sap_community_search`)
177 | - Enhanced with additional metadata fields
178 | - Backward compatible post ID handling
```
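
The two post ID formats listed under "Post ID Formats" in the file above can be told apart by their prefix. The following is a minimal TypeScript sketch, not the repository's implementation: `parseCommunityId` and `ParsedCommunityId` are hypothetical names, and the sketch assumes the URL part is encoded with `encodeURIComponent`, which the document does not confirm.

```typescript
// Hypothetical result type: either a directly usable URL or a bare legacy post ID.
type ParsedCommunityId =
  | { kind: "url"; url: string }
  | { kind: "legacy"; postId: string };

// Hypothetical helper: classify an ID as `community-url-<encodedUrl>` or `community-<postId>`.
function parseCommunityId(id: string): ParsedCommunityId | null {
  const urlPrefix = "community-url-";
  const legacyPrefix = "community-";

  // Check the longer prefix first, because it also starts with "community-".
  if (id.startsWith(urlPrefix)) {
    try {
      // Assumption: the URL was encoded with encodeURIComponent.
      const url = decodeURIComponent(id.slice(urlPrefix.length));
      return url ? { kind: "url", url } : null;
    } catch {
      // Malformed percent-encoding is treated as an invalid ID instead of throwing.
      return null;
    }
  }

  if (id.startsWith(legacyPrefix)) {
    const postId = id.slice(legacyPrefix.length);
    return postId ? { kind: "legacy", postId } : null;
  }

  return null;
}
```

Returning `null` for malformed input, rather than throwing, matches the "handled gracefully" behavior the error-handling notes describe; a caller would still have to construct the URL for the legacy form.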
--------------------------------------------------------------------------------
/src/lib/logger.ts:
--------------------------------------------------------------------------------
```typescript
1 | // src/lib/logger.ts
2 | // Standard logging utility with configurable levels
3 |
4 | export enum LogLevel {
5 | ERROR = 0,
6 | WARN = 1,
7 | INFO = 2,
8 | DEBUG = 3
9 | }
10 |
11 | export class Logger {
12 | private level: LogLevel;
13 | private enableJson: boolean;
14 | private startTime: number = Date.now();
15 |
16 | constructor() {
17 | // Standard environment-based configuration
18 | const envLevel = process.env.LOG_LEVEL?.toUpperCase() || 'INFO';
19 | this.level = LogLevel[envLevel as keyof typeof LogLevel] ?? LogLevel.INFO;
20 | this.enableJson = process.env.LOG_FORMAT === 'json';
21 |
22 | // Setup global error handlers
23 | this.setupGlobalErrorHandlers();
24 | }
25 |
26 | private shouldLog(level: LogLevel): boolean {
27 | return level <= this.level;
28 | }
29 |
30 | private formatMessage(level: string, message: string, meta?: Record<string, any>): string {
31 | const timestamp = new Date().toISOString();
32 |
33 | if (this.enableJson) {
34 | return JSON.stringify({
35 | timestamp,
36 | level,
37 | message,
38 | ...meta
39 | });
40 | } else {
41 | const metaStr = meta ? ` ${JSON.stringify(meta)}` : '';
42 | return `${timestamp} [${level}] ${message}${metaStr}`;
43 | }
44 | }
45 |
46 | error(message: string, meta?: Record<string, any>): void {
47 | if (this.shouldLog(LogLevel.ERROR)) {
48 | // Always use stderr for MCP stdio compatibility
49 | process.stderr.write(this.formatMessage('ERROR', message, meta) + '\n');
50 | }
51 | }
52 |
53 | warn(message: string, meta?: Record<string, any>): void {
54 | if (this.shouldLog(LogLevel.WARN)) {
55 | // Always use stderr for MCP stdio compatibility
56 | process.stderr.write(this.formatMessage('WARN', message, meta) + '\n');
57 | }
58 | }
59 |
60 | info(message: string, meta?: Record<string, any>): void {
61 | if (this.shouldLog(LogLevel.INFO)) {
62 | // Always use stderr for MCP stdio compatibility
63 | process.stderr.write(this.formatMessage('INFO', message, meta) + '\n');
64 | }
65 | }
66 |
67 | debug(message: string, meta?: Record<string, any>): void {
68 | if (this.shouldLog(LogLevel.DEBUG)) {
69 | // Always use stderr for MCP stdio compatibility
70 | process.stderr.write(this.formatMessage('DEBUG', message, meta) + '\n');
71 | }
72 | }
73 |
74 |
75 |
76 | private sanitizeQuery(query: string): string {
77 | // Basic sanitization for logging
78 | return query
79 | .replace(/\b\d{4,}\b/g, '[NUM]')
80 | .replace(/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g, '[EMAIL]')
81 | .substring(0, 200);
82 | }
83 |
84 | private sanitizeError(error: string): string {
85 | return error
86 | .replace(/\/[^\s]+/g, '[PATH]')
87 | .replace(/\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/g, '[IP]')
88 | .substring(0, 300);
89 | }
90 |
91 | // Setup global error handlers to catch unhandled errors
92 | private setupGlobalErrorHandlers(): void {
93 | // Handle unhandled promise rejections
94 | process.on('unhandledRejection', (reason: any, promise: Promise<any>) => {
95 | this.error('Unhandled Promise Rejection', {
96 | reason: this.sanitizeError(String(reason)),
97 | stack: reason?.stack ? this.sanitizeError(reason.stack) : undefined,
98 | pid: process.pid,
99 | uptime: Date.now() - this.startTime,
100 | timestamp: new Date().toISOString()
101 | });
102 | });
103 |
104 | // Handle uncaught exceptions
105 | process.on('uncaughtException', (error: Error) => {
106 | this.error('Uncaught Exception', {
107 | message: this.sanitizeError(error.message),
108 | stack: this.sanitizeError(error.stack || ''),
109 | name: error.name,
110 | pid: process.pid,
111 | uptime: Date.now() - this.startTime,
112 | timestamp: new Date().toISOString()
113 | });
114 |
115 | // Exit after logging the error
116 | setTimeout(() => process.exit(1), 100);
117 | });
118 |
119 | // Handle warnings (useful for debugging deprecations and other issues)
120 | process.on('warning', (warning: Error) => {
121 | this.warn('Process Warning', {
122 | message: warning.message,
123 | name: warning.name,
124 | stack: warning.stack ? this.sanitizeError(warning.stack) : undefined,
125 | pid: process.pid,
126 | timestamp: new Date().toISOString()
127 | });
128 | });
129 | }
130 |
131 | // Enhanced tool execution logging with timing
132 | logToolStart(tool: string, query: string, clientInfo?: Record<string, any>): { startTime: number; requestId: string } {
133 | const startTime = Date.now();
134 | const requestId = `req_${startTime}_${Math.random().toString(36).substring(2, 11)}`;
135 |
136 | this.info('Tool execution started', {
137 | tool,
138 | query: this.sanitizeQuery(query),
139 | client: clientInfo,
140 | requestId,
141 | timestamp: new Date().toISOString(),
142 | pid: process.pid,
143 | uptime: Date.now() - this.startTime
144 | });
145 |
146 | return { startTime, requestId };
147 | }
148 |
149 | logToolSuccess(tool: string, requestId: string, startTime: number, resultCount?: number, additionalInfo?: Record<string, any>): void {
150 | const duration = Date.now() - startTime;
151 |
152 | this.info('Tool execution completed', {
153 | tool,
154 | requestId,
155 | duration,
156 | resultCount,
157 | ...additionalInfo,
158 | timestamp: new Date().toISOString(),
159 | pid: process.pid
160 | });
161 | }
162 |
163 | logToolError(tool: string, requestId: string, startTime: number, error: any, fallback?: boolean): void {
164 | const duration = Date.now() - startTime;
165 |
166 | this.error('Tool execution failed', {
167 | tool,
168 | requestId,
169 | duration,
170 | error: this.sanitizeError(String(error)),
171 | stack: error?.stack ? this.sanitizeError(error.stack) : undefined,
172 | errorName: error?.name,
173 | fallback: fallback || false,
174 | timestamp: new Date().toISOString(),
175 | pid: process.pid,
176 | uptime: Date.now() - this.startTime
177 | });
178 | }
179 |
180 | // Enhanced request logging with more context
181 | logRequest(tool: string, query: string, clientInfo?: Record<string, any>): void {
182 | this.info('Tool request received', {
183 | tool,
184 | query: this.sanitizeQuery(query),
185 | client: {
186 | ...clientInfo,
187 | userAgent: clientInfo?.headers?.['user-agent'],
188 | contentType: clientInfo?.headers?.['content-type']
189 | },
190 | timestamp: new Date().toISOString(),
191 | pid: process.pid,
192 | uptime: Date.now() - this.startTime
193 | });
194 | }
195 |
196 | // Log transport/connection issues
197 | logTransportEvent(event: string, sessionId?: string, details?: Record<string, any>): void {
198 | this.info('Transport event', {
199 | event,
200 | sessionId,
201 | details,
202 | timestamp: new Date().toISOString(),
203 | pid: process.pid,
204 | uptime: Date.now() - this.startTime
205 | });
206 | }
207 |
208 | // Log memory and performance metrics
209 | logPerformanceMetrics(): void {
210 | const memUsage = process.memoryUsage();
211 | const cpuUsage = process.cpuUsage();
212 |
213 | this.debug('Performance metrics', {
214 | memory: {
215 | rss: Math.round(memUsage.rss / 1024 / 1024),
216 | heapTotal: Math.round(memUsage.heapTotal / 1024 / 1024),
217 | heapUsed: Math.round(memUsage.heapUsed / 1024 / 1024),
218 | external: Math.round(memUsage.external / 1024 / 1024)
219 | },
220 | cpu: {
221 | user: cpuUsage.user,
222 | system: cpuUsage.system
223 | },
224 | uptime: Math.round((Date.now() - this.startTime) / 1000),
225 | pid: process.pid,
226 | timestamp: new Date().toISOString()
227 | });
228 | }
229 | }
230 |
231 | // Export singleton logger instance
232 | export const logger = new Logger();
```
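
The level filtering in `logger.ts` above comes down to two steps: resolve a `LOG_LEVEL`-style string to an enum value, then emit a message only if its level is at or below the configured one. The following standalone sketch mirrors that logic in simplified form; `resolveLevel` and the free-standing `shouldLog` are illustrative names and are not exported by the module.

```typescript
// Same ordering as in logger.ts: lower number means higher severity.
enum LogLevel {
  ERROR = 0,
  WARN = 1,
  INFO = 2,
  DEBUG = 3,
}

// Illustrative helper: map a LOG_LEVEL-style string to a level,
// falling back to INFO when the value is missing or unknown.
function resolveLevel(raw: string | undefined): LogLevel {
  const key = raw?.toUpperCase() || "INFO";
  return LogLevel[key as keyof typeof LogLevel] ?? LogLevel.INFO;
}

// A message is written when its level does not exceed the configured threshold.
function shouldLog(messageLevel: LogLevel, configured: LogLevel): boolean {
  return messageLevel <= configured;
}

const configured = resolveLevel("warn");
console.log(shouldLog(LogLevel.ERROR, configured)); // true
console.log(shouldLog(LogLevel.INFO, configured)); // false
```

With `LOG_LEVEL=WARN`, for example, `error()` and `warn()` calls are written to stderr while `info()` and `debug()` calls are dropped.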
--------------------------------------------------------------------------------
/docs/DEV.md:
--------------------------------------------------------------------------------
```markdown
1 | # 🛠️ Development Guide
2 |
3 | ## Quick Start
4 |
5 | ### 🚀 **Initial Setup**
6 | ```bash
7 | # Clone and install
8 | git clone <repo-url>
9 | cd mcp-sap-docs
10 | npm install
11 |
12 | # Run enhanced setup (submodules + build)
13 | npm run setup
14 |
15 | # Start development server
16 | npm run start:http
17 | ```
18 |
19 | ### 🧪 **Run Tests**
20 | ```bash
21 | npm run test:smoke # Quick validation
22 | npm run test:fast # Skip build, test only
23 | npm run test # Full build + test
24 | ```
25 |
26 | ## Common Commands
27 |
28 | ### 📦 **Build Commands**
29 | ```bash
30 | npm run build:tsc # Compile TypeScript
31 | npm run build:index # Build documentation index
32 | npm run build:fts # Build FTS5 search database
33 | npm run build # Complete build pipeline (tsc + index + fts)
34 | ```
35 |
36 | ### 🖥️ **Server Commands**
37 | ```bash
38 | npm start # MCP stdio server (for Claude)
39 | npm run start:http # HTTP development server (port 3001)
40 | npm run start:streamable # Streamable HTTP server (port 3122)
41 | ```
42 |
43 | ### 🧪 **Test Commands**
44 | ```bash
45 | npm run test:smoke # Quick smoke tests
46 | npm run test:fast # Test without rebuild
47 | npm run test # Full test suite
48 | npm run test:community # SAP Community search tests
49 | npm run inspect # MCP protocol inspector
50 | ```
51 |
52 | ## Environment Variables
53 |
54 | ### 🔧 **Core Configuration**
55 | ```bash
56 | RETURN_K=25 # Number of search results (default: 25)
57 | LOG_LEVEL=INFO # Logging level (ERROR, WARN, INFO, DEBUG)
58 | LOG_FORMAT=json # Log format (json or text)
59 | NODE_ENV=production # Environment mode
60 | ```
61 |
62 | ### 🗄️ **Database & Paths**
63 | ```bash
64 | DB_PATH=dist/data/docs.sqlite # FTS5 database path
65 | METADATA_PATH=src/metadata.json # Metadata configuration path
66 | ```
67 |
68 | ### 🌐 **Server Configuration**
69 | ```bash
70 | PORT=3001 # HTTP server port
71 | MCP_PORT=3122 # Streamable HTTP MCP port
72 | ```
73 |
74 | ## Development Servers
75 |
76 | ### 📡 **1. Stdio MCP Server** (Main)
77 | ```bash
78 | npm run start:stdio
79 | # For Claude/LLM integration via stdio transport
80 | ```
81 |
82 | ### 🌐 **2. HTTP Development Server**
83 | ```bash
84 | npm run start:http
85 | # Access: http://localhost:3001
86 | # Endpoints: /status, /healthz, /readyz, /mcp
87 | ```
88 |
89 | ### 🔄 **3. Streamable HTTP Server**
90 | ```bash
91 | npm run start:streamable
92 | # Access: http://localhost:3122
93 | # Endpoints: /mcp, /health
94 | ```
95 |
96 | ## Where to Change Things
97 |
98 | ### 🔍 **Search Behavior**
99 | - **Query Processing**: `src/lib/searchDb.ts` → `toMatchQuery()`
100 | - **Search Logic**: `src/lib/search.ts` → `search()`
101 | - **Result Formatting**: `src/lib/localDocs.ts` → `searchLibraries()`
102 |
103 | ### ⚙️ **Configuration**
104 | - **Source Settings**: `src/metadata.json` → Add/modify sources
105 | - **Core Config**: `src/lib/config.ts` → System settings
106 | - **Metadata APIs**: `src/lib/metadata.ts` → Configuration access
107 |
108 | ### 🛠️ **MCP Tools**
109 | - **Tool Definitions**: `src/server.ts` → `ListToolsRequestSchema`
110 | - **Tool Handlers**: `src/server.ts` → `CallToolRequestSchema`
111 | - **HTTP Endpoints**: `src/http-server.ts` → `/mcp` handler
112 |
113 | ### 🏗️ **Build Process**
114 | - **Index Building**: `scripts/build-index.ts`
115 | - **FTS Database**: `scripts/build-fts.ts`
116 | - **Source Processing**: Modify build scripts for new source types
117 |
118 | ### 🧪 **Tests**
119 | - **Test Cases**: `test/tools/search/` → Add new test files
120 | - **Test Runner**: `test/tools/run-tests.js` → Modify test execution
121 | - **Output Parsing**: `test/_utils/parseResults.js` → Update format expectations
122 |
123 | ### 🚀 **Deployment**
124 | - **PM2 Config**: `ecosystem.config.cjs` → Process configuration
125 | - **GitHub Actions**: `.github/workflows/deploy-mcp-sap-docs.yml`
126 | - **Setup Script**: `setup.sh` → Deployment automation
127 |
128 | ## Adding New Documentation Sources
129 |
130 | ### 1. **Update Metadata** (`src/metadata.json`)
131 | ```json
132 | {
133 | "id": "new-source",
134 | "type": "documentation",
135 | "libraryId": "/new-source",
136 | "sourcePath": "new-source/docs",
137 | "baseUrl": "https://example.com/docs",
138 | "pathPattern": "/{file}",
139 | "anchorStyle": "github",
140 | "boost": 0.05,
141 | "tags": ["new", "documentation"],
142 | "description": "New documentation source"
143 | }
144 | ```
145 |
146 | ### 2. **Add Context Boosts** (if needed)
147 | ```json
148 | "contextBoosts": {
149 | "New Context": {
150 | "/new-source": 1.0,
151 | "/other-source": 0.3
152 | }
153 | }
154 | ```
155 |
156 | ### 3. **Add Library Mapping** (if needed)
157 | ```json
158 | "libraryMappings": {
159 | "new-source-alias": "new-source"
160 | }
161 | ```
162 |
163 | ### 4. **No Code Changes Required!**
164 | The metadata APIs automatically handle the new source.
165 |
166 | ## Debugging
167 |
168 | ### 🔍 **Search Issues**
169 | ```bash
170 | # Test specific queries
171 | node --input-type=module -e "
172 | import { search } from './dist/src/lib/search.js';
173 | const results = await search('your query');
174 | console.log(JSON.stringify(results, null, 2));
175 | "
176 |
177 | # Check FTS database
178 | sqlite3 dist/data/docs.sqlite "SELECT * FROM docs WHERE docs MATCH 'your query' LIMIT 5;"
179 | ```
180 |
181 | ### 📊 **Metadata Issues**
182 | ```bash
183 | # Test metadata loading
184 | node --input-type=module -e "
185 | import { loadMetadata, getSourceBoosts } from './dist/src/lib/metadata.js';
186 | loadMetadata();
187 | console.log('Boosts:', getSourceBoosts());
188 | "
189 | ```
190 |
191 | ### 🌐 **Server Issues**
192 | ```bash
193 | # Check server health
194 | curl http://localhost:3001/status
195 | curl http://localhost:3122/health
196 |
197 | # Test search endpoint
198 | curl -X POST http://localhost:3001/mcp \
199 | -H "Content-Type: application/json" \
200 | -d '{"role": "user", "content": "wizard"}'
201 | ```
202 |
203 | ## Performance Optimization
204 |
205 | ### ⚡ **Search Performance**
206 | - **FTS5 Tuning**: Modify `scripts/build-fts.ts` for different indexing strategies
207 | - **Query Optimization**: Adjust `toMatchQuery()` in `src/lib/searchDb.ts`
208 | - **Result Limits**: Configure `RETURN_K` environment variable
209 |
210 | ### 💾 **Memory Usage**
211 | - **Index Size**: Monitor `dist/data/` artifact sizes
212 | - **Metadata Loading**: Lazy loading in `src/lib/metadata.ts`
213 | - **Process Monitoring**: Use PM2 monitoring features
214 |
215 | ## Common Issues
216 |
217 | ### ❌ **Build Failures**
218 | ```bash
219 | # Clean and rebuild
220 | rm -rf dist/
221 | npm run build
222 | ```
223 |
224 | ### ❌ **Search Returns No Results**
225 | ```bash
226 | # Check if database exists
227 | ls -la dist/data/docs.sqlite
228 |
229 | # Verify index content
230 | sqlite3 dist/data/docs.sqlite "SELECT COUNT(*) FROM docs;"
231 | ```
232 |
233 | ### ❌ **Metadata Loading Errors**
234 | ```bash
235 | # Validate JSON syntax
236 | node -e "JSON.parse(require('fs').readFileSync('src/metadata.json', 'utf8'))"
237 |
238 | # Check file permissions
239 | ls -la src/metadata.json
240 | ```
241 |
242 | ### ❌ **Server Won't Start**
243 | ```bash
244 | # Check port availability
245 | lsof -i :3001
246 | lsof -i :3122
247 |
248 | # Kill conflicting processes
249 | lsof -ti:3001 | xargs kill -9
250 | ```
251 |
252 | ## Best Practices
253 |
254 | ### 📝 **Code Changes**
255 | 1. **Update Cursor Rules**: Modify `.cursor/rules/` when changing architecture
256 | 2. **Test First**: Run smoke tests before committing
257 | 3. **Metadata Over Code**: Use metadata.json for configuration changes
258 | 4. **Type Safety**: Use metadata APIs, never direct JSON access
259 |
260 | ### 🧪 **Testing**
261 | 1. **Smoke Tests**: Always run before deployment
262 | 2. **Integration Tests**: Test full MCP tool workflows
263 | 3. **Performance Tests**: Monitor search response times
264 | 4. **Output Validation**: Ensure format consistency
265 |
266 | ### 🚀 **Deployment**
267 | 1. **Build Validation**: Ensure all artifacts generated
268 | 2. **Health Checks**: Verify all endpoints after deployment
269 | 3. **Rollback Plan**: Keep previous artifacts for quick rollback
270 | 4. **Monitoring**: Watch logs and performance metrics
271 |
272 | ## Useful Development Tools
273 |
274 | ### 🔧 **VS Code Extensions**
275 | - **REST Client**: Use `test-search.http` for API testing
276 | - **SQLite Viewer**: Inspect FTS5 database content
277 | - **JSON Schema**: Validate metadata.json structure
278 |
279 | ### 📊 **Monitoring**
280 | ```bash
281 | # PM2 monitoring
282 | pm2 monit
283 |
284 | # Log streaming
285 | pm2 logs mcp-sap-http --lines 100
286 |
287 | # Process status
288 | pm2 status
289 | ```
290 |
```
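
The "Adding New Documentation Sources" steps in the guide above depend on a well-formed entry in `src/metadata.json`. As a rough illustration, an entry could be sanity-checked before committing along the following lines. This is a sketch only: `DocSource` copies the fields from the example entry, `checkSource` is a hypothetical helper, and the individual rules (leading `/` on `libraryId`, http(s) `baseUrl`) are assumptions read off that example rather than documented requirements.

```typescript
// Field list taken from the example entry in DEV.md.
interface DocSource {
  id: string;
  type: string;
  libraryId: string;
  sourcePath: string;
  baseUrl: string;
  pathPattern: string;
  anchorStyle: string;
  boost: number;
  tags: string[];
  description: string;
}

// Hypothetical helper: returns a list of problems, empty when the entry looks plausible.
function checkSource(entry: DocSource): string[] {
  const problems: string[] = [];
  if (!entry.id.trim()) problems.push("id must not be empty");
  // Assumption: library IDs start with "/", as in the example ("/new-source").
  if (!entry.libraryId.startsWith("/")) problems.push("libraryId should start with '/'");
  // Assumption: baseUrl is an absolute http(s) URL.
  if (!/^https?:\/\//.test(entry.baseUrl)) problems.push("baseUrl should be an http(s) URL");
  if (!Number.isFinite(entry.boost)) problems.push("boost must be a finite number");
  return problems;
}

// The example entry from the guide.
const exampleSource: DocSource = {
  id: "new-source",
  type: "documentation",
  libraryId: "/new-source",
  sourcePath: "new-source/docs",
  baseUrl: "https://example.com/docs",
  pathPattern: "/{file}",
  anchorStyle: "github",
  boost: 0.05,
  tags: ["new", "documentation"],
  description: "New documentation source",
};

console.log(checkSource(exampleSource)); // []
```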
--------------------------------------------------------------------------------
/docs/TESTS.md:
--------------------------------------------------------------------------------
```markdown
1 | # 🧪 Testing Guide
2 |
3 | ## Test Architecture
4 |
5 | ### 📁 **Test Structure**
6 | ```
7 | test/
8 | ├── tools/
9 | │ ├── run-tests.js # Main test runner
10 | │ ├── search.smoke.js # Quick validation tests
11 | │ └── search/ # Search test cases
12 | │ ├── search-cap-docs.js # CAP documentation tests
13 | │ ├── search-cloud-sdk-js.js # Cloud SDK tests
14 | │ └── search-sapui5-docs.js # UI5 documentation tests
15 | ├── _utils/
16 | │ ├── httpClient.js # HTTP server utilities
17 | │ └── parseResults.js # Output format validation
18 | └── performance/
19 | └── README.md # Performance testing guide
20 | ```
21 |
22 | ## Test Commands
23 |
24 | ### 🚀 **Quick Testing**
25 | ```bash
26 | npm run test:smoke # Fast validation (30 seconds)
27 | npm run test:fast # Skip build, test only (2 minutes)
28 | npm run test # Full build + test (5 minutes)
29 | npm run test:community # SAP Community functionality (1 minute)
30 | npm run inspect # MCP protocol inspector (interactive)
31 | ```
32 |
33 | ### 🎯 **Specific Tests**
34 | ```bash
35 | # Run specific test file
36 | node test/tools/run-tests.js --spec search-cap-docs
37 |
38 | # Run with custom server
39 | node test/tools/run-tests.js --port 3002
40 | ```
41 |
42 | ## Expected Output Format
43 |
44 | ### 📊 **BM25-Only Results**
45 | ```
46 | ⭐️ **<document-id>** (Score: <final-score>)
47 | <description-preview>
48 | Use in fetch
49 | ```
50 |
51 | **Example:**
52 | ```
53 | ⭐️ **/cap/cds/cdl#enums** (Score: 95.42)
54 | Use enums to define a fixed set of values for an element...
55 | Use in fetch
56 | ```
57 |
58 | ### 🎨 **Context Indicators**
59 | - **🎨 UI5 Context**: Frontend, controls, Fiori
60 | - **🏗️ CAP Context**: Backend, CDS, services
61 | - **🧪 wdi5 Context**: Testing, automation
62 | - **🔀 MIXED Context**: Cross-platform queries
63 |
64 | ### 📈 **Result Summary**
65 | ```
66 | Found X results for 'query' 🎨 **UI5 Context**:
67 |
68 | 🔹 **UI5 API Documentation:**
69 | ⭐️ **sap.m.Wizard** (Score: 100.00)
70 | ...
71 |
72 | 💡 **Context**: UI5 query detected. Scores reflect relevance to this context.
73 | ```
74 |
75 | ## Test Data Structure
76 |
77 | ### 🧪 **Test Case Format**
78 | ```javascript
79 | export default [
80 | {
81 | name: 'Test Name',
82 | tool: 'search',
83 | query: 'search term',
84 | expectIncludes: ['/expected/document/id'],
85 | validate: (results) => {
86 | // Custom validation logic
87 | return results.some(r => r.includes('expected content'));
88 | }
89 | }
90 | ];
91 | ```
92 |
93 | ### 📋 **Test Categories**
94 |
95 | #### **CAP Tests** (`search-cap-docs.js`)
96 | - CDS entities and services
97 | - Annotations and aspects
98 | - Query language features
99 | - Database integration
100 |
101 | #### **UI5 Tests** (`search-sapui5-docs.js`)
102 | - UI5 controls and APIs
103 | - Fiori elements
104 | - Data binding and routing
105 | - Chart and visualization components
106 |
107 | #### **Cloud SDK Tests** (`search-cloud-sdk-js.js`)
108 | - SDK getting started guides
109 | - API documentation
110 | - Upgrade and migration guides
111 | - Error handling patterns
112 |
113 | ## Output Validation
114 |
115 | ### 🔍 **Parser Logic** (`parseResults.js`)
116 | ```javascript
117 | // Expected line format
118 | const lineRe = /^⭐️ \*\*(.+?)\*\* \(Score: ([\d.]+)\)/;
119 |
120 | // Parsed result structure (example values)
121 | const exampleResult = {
122 | id: '/document/path',
123 | finalScore: 95.42,
124 | rerankerScore: 0 // Always 0 in BM25-only mode
125 | };
126 | ```
127 |
128 | ### ✅ **Validation Rules**
129 | 1. **Score Format**: Must be numeric with 2 decimal places
130 | 2. **Document ID**: Must start with `/` and contain valid path
131 | 3. **Result Count**: Must respect `RETURN_K` limit (default: 25)
132 | 4. **Context Detection**: Must include appropriate emoji indicator
133 | 5. **Source Attribution**: Must group results by library type
134 |
135 | ## Test Execution Flow
136 |
137 | ### 🔄 **Test Runner Process**
138 | 1. **Server Startup**: Launch HTTP server on test port
139 | 2. **Health Check**: Verify server responds to `/status`
140 | 3. **Test Execution**: Run each test case sequentially
141 | 4. **Result Validation**: Parse and validate output format
142 | 5. **Server Cleanup**: Gracefully shut down test server
143 |
144 | ### 📊 **HTTP Client Utilities**
145 | ```javascript
146 | // Server management
147 | startServerHttp(port) // Launch server
148 | waitForStatus(port) // Wait for ready state
149 | stopServer(childProcess) // Clean shutdown
150 |
151 | // Search operations
152 | docsSearch(query, port) // Execute search query
153 | parseResults(response) // Parse formatted output
154 | ```
155 |
156 | ## Performance Testing
157 |
158 | ### ⏱️ **Response Time Expectations**
159 | - **Simple Queries**: < 100ms (after warm-up)
160 | - **Complex Queries**: < 500ms
161 | - **First Query**: May take longer (index loading)
162 | - **Subsequent Queries**: Should be consistently fast
163 |
164 | ### 📈 **Performance Metrics**
165 | ```javascript
166 | // Timing measurement
167 | const start = Date.now();
168 | const results = await docsSearch(query);
169 | const duration = Date.now() - start;
170 |
171 | // Validation
172 | assert(duration < 1000, `Query too slow: ${duration}ms`);
173 | ```
174 |
175 | ## Smoke Tests
176 |
177 | ### 🚀 **Quick Validation** (`search.smoke.js`)
178 | ```javascript
179 | const SMOKE_QUERIES = [
180 | { q: 'wizard', expect: /wizard|Wizard/i },
181 | { q: 'CAP entity', expect: /entity|Entity/i },
182 | { q: 'wdi5 testing', expect: /test|Test/i }
183 | ];
184 | ```
185 |
186 | ### ✅ **Smoke Test Assertions**
187 | 1. **Results Found**: Each query returns at least one result
188 | 2. **Expected Content**: Results contain expected keywords
189 | 3. **BM25 Mode**: All reranker scores are 0
190 | 4. **Format Compliance**: Output matches expected format
191 | 5. **Server Health**: All endpoints respond correctly
192 |
193 | ## Test Debugging
194 |
195 | ### 🔍 **Debug Failed Tests**
196 | ```bash
197 | # Run single test with verbose output
198 | DEBUG=1 node test/tools/run-tests.js --spec search-cap-docs
199 |
200 | # Check server logs
201 | tail -f logs/test-server.log
202 |
203 | # Validate specific query
204 | curl -X POST http://localhost:43122/mcp \
205 | -H "Content-Type: application/json" \
206 | -d '{"role": "user", "content": "failing query"}'
207 | ```
208 |
209 | ### 📊 **Common Test Failures**
210 |
211 | #### **No Results Found**
212 | - Check if search database exists: `ls -la dist/data/docs.sqlite`
213 | - Verify index content: `sqlite3 dist/data/docs.sqlite "SELECT COUNT(*) FROM docs;"`
214 | - Rebuild search artifacts: `npm run build`
215 |
216 | #### **Wrong Output Format**
217 | - Update parser regex in `parseResults.js`
218 | - Check for extra whitespace or formatting changes
219 | - Validate against expected format examples
220 |
221 | #### **Server Connection Issues**
222 | - Kill existing processes: `lsof -ti:43122 | xargs kill -9`
223 | - Check port availability: `lsof -i :43122`
224 | - Verify server startup logs
225 |
226 | #### **Context Detection Failures**
227 | - Review query expansion in `src/lib/metadata.ts`
228 | - Check context boost configuration in `src/metadata.json`
229 | - Validate context detection logic in `src/lib/localDocs.ts`
230 |
231 | ## Test Maintenance
232 |
233 | ### 🔄 **Updating Tests**
234 | 1. **New Sources**: Add test cases for new documentation sources
235 | 2. **Query Changes**: Update expected results when search logic changes
236 | 3. **Format Updates**: Modify parser when output format evolves
237 | 4. **Performance**: Adjust timing expectations based on system changes
238 |
239 | ### 📝 **Test Documentation**
240 | 1. **Document Changes**: Update test descriptions when modifying logic
241 | 2. **Expected Results**: Keep expectIncludes arrays current
242 | 3. **Validation Logic**: Comment complex validation functions
243 | 4. **Performance Baselines**: Document expected response times
244 |
245 | ### 🎯 **Best Practices**
246 | 1. **Specific Queries**: Use precise search terms for reliable results
247 | 2. **Stable Expectations**: Test against content unlikely to change
248 | 3. **Error Handling**: Include tests for edge cases and failures
249 | 4. **Performance Monitoring**: Track response time trends over time
250 |
251 | ## Integration with CI/CD
252 |
253 | ### 🚀 **GitHub Actions Integration**
254 | ```yaml
255 | - name: Run tests
256 | run: npm run test
257 |
258 | - name: Validate smoke tests
259 | run: npm run test:smoke
260 | ```
261 |
262 | ### 📊 **Test Reporting**
263 | - **Exit Codes**: 0 for success, non-zero for failures
264 | - **Console Output**: Structured test results with timing
265 | - **Error Details**: Specific failure information for debugging
266 | - **Summary Statistics**: Pass/fail counts and performance metrics
267 |
```
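
The parser logic described for `parseResults.js` in the guide above can be sketched as a small line-by-line scan. This is an illustration under assumptions, not the repository's parser: `parseResultLines` and `ParsedResult` are hypothetical names, and the star emoji from the guide's `lineRe` is written as Unicode escapes (U+2B50, with an optional U+FE0F variation selector) so the pattern does not depend on how the emoji is stored in the file.

```typescript
// Result shape as documented in the guide.
interface ParsedResult {
  id: string;
  finalScore: number;
  rerankerScore: number;
}

// Same structure as the guide's lineRe, with the star emoji written as escapes.
const lineRe = /^\u2B50\uFE0F? \*\*(.+?)\*\* \(Score: ([\d.]+)\)/;

// Hypothetical helper: keep only lines that match the result format and ignore
// summary lines, description previews, and "Use in fetch" hints.
function parseResultLines(output: string): ParsedResult[] {
  const results: ParsedResult[] = [];
  for (const line of output.split("\n")) {
    const match = lineRe.exec(line.trim());
    if (!match) continue;
    results.push({
      id: match[1],
      finalScore: parseFloat(match[2]),
      rerankerScore: 0, // always 0 in BM25-only mode, per the guide
    });
  }
  return results;
}
```

Fed the example block from "Expected Output Format", this yields one entry with `id` `/cap/cds/cdl#enums` and `finalScore` `95.42`.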
--------------------------------------------------------------------------------
/docs/ABAP-USAGE-GUIDE.md:
--------------------------------------------------------------------------------
```markdown
1 | # ABAP Documentation Usage Guide
2 |
3 | ## 🎯 **Overview**
4 |
5 | ABAP documentation is now fully integrated into the standard MCP search system with **intelligent version filtering** for clean, focused results.
6 |
7 | ## 🔍 **How to Search ABAP Documentation**
8 |
9 | ### **Standard Interface - No Special Tools**
10 | Use **`search`** for all ABAP queries - same as UI5, CAP, wdi5!
11 |
12 | ### **✅ General ABAP Queries (Latest + Context)**
13 |
14 | #### **Query Patterns:**
15 | ```javascript
16 | search: "inline declarations"
17 | search: "SELECT statements"
18 | search: "exception handling"
19 | search: "class definition"
20 | search: "internal table operations"
21 | ```
22 |
23 | #### **What You Get:**
24 | - **Latest ABAP documentation** - Most current syntax and features
25 | - **Clean ABAP style guides** - Best practices and guidelines
26 | - **ABAP cheat sheets** - Practical examples and working code
27 | - **4-5 focused results** - No version clutter or duplicates
28 |
29 | #### **Example Result:**
30 | ```
31 | Found 4 results for 'inline declarations':
32 |
33 | ⭐️ SAP Style Guides - Best practices (Score: 22.75)
34 | Prefer inline to up-front declarations
35 | 🔗 Clean ABAP guidelines
36 |
37 | ⭐️ ABAP Cheat Sheets - Examples (Score: 19.80)
38 | Inline Declaration, CAST Operator, Method Chaining
39 | 🔗 Practical code examples
40 |
41 | ⭐️ Latest ABAP Docs - Programming guide (Score: 18.59)
42 | Background The declaration operators - [DATA(var)] for variables
43 |
44 | ⭐️ Latest ABAP Docs - Language reference (Score: 17.72)
45 | An inline declaration is performed using a declaration operator...
46 | ```
47 |
48 | ### **✅ Version-Specific Queries (Targeted)**
49 |
50 | #### **Query Patterns:**
51 | ```javascript
52 | search: "LOOP 7.57" // → ABAP 7.57 only
53 | search: "SELECT statements 7.58" // → ABAP 7.58 only
54 | search: "exception handling latest" // → Latest ABAP only
55 | search: "class definition 7.53" // → ABAP 7.53 only
56 | ```
57 |
58 | #### **What You Get:**
59 | - **Requested ABAP version only** - No other versions shown
60 | - **Dramatically boosted scores** - Requested version gets priority
61 | - **Related sources included** - Style guides and cheat sheets for context
62 | - **Clean, targeted results** - 5-8 results, all relevant
63 |
64 | #### **Example Result:**
65 | ```
66 | Found 5 results for 'LOOP 7.57':
67 |
68 | ⭐️ /abap-docs-757/abenloop_glosry (Score: 14.35) - Boosted 7.57 docs
69 | Loops - This section describes the loops defined using DO-ENDDO, WHILE-ENDWHILE
70 |
71 | ⭐️ /abap-docs-757/abenabap_loops (Score: 14.08) - Boosted 7.57 docs
72 | ABAP Loops - Loop processing and control structures
73 |
74 | ⭐️ /abap-docs-757/abapexit_loop (Score: 13.53) - Boosted 7.57 docs
75 | EXIT, loop - Exits a loop completely with EXIT statement
76 |
77 | ⭐️ Style guides and cheat sheets for additional context
78 | ```
79 |
80 | ---
81 |
82 | ## **📖 Document Retrieval**
83 |
84 | ### **Standard Document Access**
85 | ```javascript
86 | // Use IDs from search results
87 | fetch: "/abap-docs-latest/abeninline_declarations"
88 | fetch: "/abap-docs-758/abenselect"
89 | fetch: "/abap-docs-757/abenloop_glosry"
90 | ```
91 |
92 | ### **What You Get:**
93 | - **Complete documentation** with full content and examples
94 | - **Official attribution** - Direct links to help.sap.com
95 | - **Rich formatting** - Optimized for LLM consumption
96 | - **Source context** - Version, category, and related concepts
97 |
98 | ---
99 |
100 | ## **🎯 Supported ABAP Versions**
101 |
102 | | Version | Library ID | Default Boost | When Shown |
103 | |---------|------------|---------------|------------|
104 | | **Latest** | `/abap-docs-latest` | 1.0 | Always (default) |
105 | | **7.58** | `/abap-docs-758` | 0.05 | When "7.58" in query |
106 | | **7.57** | `/abap-docs-757` | 0.02 | When "7.57" in query |
107 | | **7.56** | `/abap-docs-756` | 0.01 | When "7.56" in query |
108 | | **7.55** | `/abap-docs-755` | 0.01 | When "7.55" in query |
109 | | **7.54** | `/abap-docs-754` | 0.01 | When "7.54" in query |
110 | | **7.53** | `/abap-docs-753` | 0.01 | When "7.53" in query |
111 | | **7.52** | `/abap-docs-752` | 0.01 | When "7.52" in query |
112 |
113 | ### **Context Boosting**
114 | When a query mentions a version, that version's library gets a **2.0x boost** so its documents rank ahead of the other versions.
115 |
116 | ---
117 |
118 | ## **💡 Query Examples**
119 |
120 | ### **ABAP Language Concepts**
121 | ```javascript
122 | // General queries (latest ABAP + context)
123 | "How do I use inline declarations?" → Latest ABAP + style guides + examples
124 | "What are different LOOP statement types?" → Latest ABAP + best practices
125 | "Explain exception handling in ABAP" → Latest ABAP + clean code guidelines
126 | "ABAP object-oriented programming" → Latest ABAP + OOP examples
127 |
128 | // Expected: 4-5 clean, focused results
129 | ```
130 |
131 | ### **Version-Specific Development**
132 | ```javascript
133 | // Version-targeted queries (specific version only)
134 | "LOOP variations in 7.57" → ABAP 7.57 + related sources only
135 | "SELECT features in 7.58" → ABAP 7.58 + related sources only
136 | "What's new in ABAP latest?" → Latest ABAP + feature highlights
137 | "Exception handling in 7.53" → ABAP 7.53 + related sources only
138 |
139 | // Expected: 5-8 targeted results, dramatically boosted scores
140 | ```
141 |
142 | ### **Cross-Source Discovery**
143 | ```javascript
144 | // Finds related content across all sources
145 | "ABAP class definition best practices" → Official docs + Clean ABAP + examples
146 | "SELECT statement performance optimization" → ABAP syntax + performance guides + examples
147 | "ABAP clean code guidelines" → Style guides + latest syntax + examples
148 | ```
149 |
150 | ---
151 |
152 | ## **📈 Performance & Quality**
153 |
154 | ### **Search Quality**
155 | - **4-5 focused results** instead of 25 crowded duplicates
156 | - **Rich content descriptions** with actual explanations
157 | - **Cross-source intelligence** - finds related content everywhere
158 | - **Focused relevance** - shows only what the query actually needs
159 |
160 | ### **Version Management**
161 | - **Latest by default** - always current unless specified otherwise
162 | - **Smart targeting** - specific versions only when requested
163 | - **Automatic detection** - no need to specify version parameters manually
164 | - **Clean results** - no version clutter or noise
165 |
166 | ### **Content Quality**
167 | - **40,761 curated files** - irrelevant content filtered out
168 | - **Meaningful frontmatter** - structured metadata for better AI understanding
169 | - **Official attribution** - complete source linking to help.sap.com
170 | - **LLM-optimized** - perfect file sizes and content structure
171 |
172 | ---
173 |
174 | ## **🔄 Migration from Old Tools**
175 |
176 | ### **Old Approach (Deprecated)**
177 | ```javascript
178 | // Required specialized tools (now deprecated)
179 | abap_search: "inline declarations"
180 | abap_get: "abap-7.58-individual-abeninline_declarations"
181 | ```
182 |
183 | ### **New Approach (Standard)**
184 | ```javascript
185 | // Uses unified tool like everything else
186 | search: "inline declarations"
187 | fetch: "/abap-docs-latest/abeninline_declarations"
188 | ```
189 |
190 | ### **Benefits of Migration**
191 | - ✅ **Simpler interface** - one tool for all SAP development
192 | - ✅ **Better results** - intelligent filtering and cross-source discovery
193 | - ✅ **Rich content** - meaningful descriptions and context
194 | - ✅ **Version flexibility** - automatic management with manual override
195 |
196 | ---
197 |
198 | ## **🚀 Production Usage**
199 |
200 | ### **For LLM Interactions**
201 | ```
202 | Human: "How do I handle exceptions in ABAP?"
203 |
204 | LLM uses: search: "exception handling"
205 |
206 | Gets:
207 | ✅ Latest ABAP exception syntax
208 | ✅ Clean ABAP best practices
209 | ✅ Practical examples with TRY/CATCH
210 | ✅ Cross-references to related concepts
211 | ```
212 |
213 | ### **For Version-Specific Development**
214 | ```
215 | Human: "I'm working with ABAP 7.53, how do LOOP statements work?"
216 |
217 | LLM uses: search: "LOOP statements 7.53"
218 |
219 | Gets:
220 | ✅ ABAP 7.53 loop documentation only
221 | ✅ Version-specific features and limitations
222 | ✅ Related style guides and examples
223 | ✅ No confusion from other versions
224 | ```
225 |
226 | ---
227 |
228 | ## **📋 Summary**
229 |
230 | **The ABAP integration is now complete and production-ready with:**
231 |
232 | - ✅ **Unified interface** - same tool for all SAP development
233 | - ✅ **Intelligent filtering** - clean, focused results
234 | - ✅ **Rich content** - meaningful descriptions and context
235 | - ✅ **Version flexibility** - latest by default, specific when needed
236 | - ✅ **Cross-source intelligence** - finds related content everywhere
237 | - ✅ **Standard architecture** - proven, scalable, maintainable
238 |
239 | **Result: The cleanest, most intelligent ABAP documentation search experience available for LLMs!** 🎉
240 |
```
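The document IDs passed to `fetch` in the guide above follow a `/<library-id>/<topic>` shape, with the library IDs listed in the supported-versions table. Here is a minimal TypeScript sketch of splitting such an ID into its parts, assuming that shape holds for every ABAP source. The `parseAbapDocId` helper and the `AbapDocId` type are hypothetical names used for illustration and are not taken from the server code.

```typescript
// Hypothetical helper for illustration; not part of the repository.
interface AbapDocId {
  libraryId: string; // e.g. "/abap-docs-757"
  version: string;   // "latest" or a dotted version such as "7.57"
  topic: string;     // e.g. "abenloop_glosry"
}

function parseAbapDocId(id: string): AbapDocId | null {
  // Assumed ID shape: "/abap-docs-<latest|NNN>/<topic>"
  const match = /^\/abap-docs-(latest|\d{3})\/(.+)$/.exec(id);
  if (!match) return null;

  const rawVersion = match[1];
  const topic = match[2];

  // "757" becomes "7.57"; "latest" is kept unchanged.
  const version =
    rawVersion === "latest"
      ? "latest"
      : `${rawVersion.charAt(0)}.${rawVersion.slice(1)}`;

  return { libraryId: `/abap-docs-${rawVersion}`, version, topic };
}
```

A caller could use the parsed `version` to label results or to check that a fetched document belongs to the version the user asked for. IDs from non-ABAP sources return `null`.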
--------------------------------------------------------------------------------
/docs/ABAP-INTEGRATION-SUMMARY.md:
--------------------------------------------------------------------------------
```markdown
1 | # ABAP Integration Summary - Complete Standard System Integration
2 |
3 | ## 🎯 **What Was Accomplished**
4 |
5 | This major update integrates **40,761+ ABAP documentation files** across **8 versions** into the standard MCP system with intelligent version management and rich content extraction.
6 |
7 | ### **Key Changes Made**
8 |
9 | #### **1. Standard System Integration** ✅
10 | - ✅ **Removed specialized tools** - No more `abap_search`/`abap_get`
11 | - ✅ **Unified interface** - Uses standard `search` like UI5/CAP
12 | - ✅ **Multi-version support** - All 8 ABAP versions (7.52-7.58 + latest) integrated
13 | - ✅ **Clean architecture** - Same proven system powering other sources
14 |
15 | #### **2. Intelligent Version Management** ✅
16 | - ✅ **Latest by default** - General queries show only latest ABAP version
17 | - ✅ **Version auto-detection** - "LOOP 7.57" automatically searches ABAP 7.57
18 | - ✅ **Smart filtering** - Prevents crowded results with duplicate content
19 | - ✅ **Context boosting** - Requested versions get dramatically higher scores
20 |
21 | #### **3. Content Quality Revolution** ✅
22 | - ✅ **Rich frontmatter** - Every file has title, description, keywords, category
23 | - ✅ **Meaningful snippets** - Actual explanations instead of filenames
24 | - ✅ **Filtered noise** - Removed 2,156+ irrelevant `abennews` files
25 | - ✅ **YAML-safe generation** - Proper escaping for complex ABAP syntax
26 |
27 | #### **4. Enhanced Search Experience** ✅
28 | - ✅ **Perfect result focus** - 4-5 targeted results vs 25 crowded duplicates
29 | - ✅ **Cross-source intelligence** - Finds style guides + cheat sheets + docs
30 | - ✅ **Version-aware scoring** - Latest gets highest boost, specific versions when requested
31 | - ✅ **Error resilience** - Graceful handling of malformed content
32 |
33 | ---
34 |
35 | ## **📊 Integration Statistics**
36 |
37 | | Metric | Before | After | Change |
38 | |--------|--------|-------|--------|
39 | | **ABAP Tools** | 2 specialized | 0 (standard integration) | -2 tools |
40 | | **Total Documents** | 63,454 | 61,298 | -2,156 irrelevant files |
41 | | **ABAP Files** | 42,901 raw | 40,761 curated | Quality over quantity |
42 | | **Database Size** | 30.53 MB | 33.32 MB | +Rich content |
43 | | **Default Results** | 25 crowded | 4-5 focused | 80%+ noise reduction |
44 | | **Versions Supported** | 1 (specialized) | 8 (standard) | Full version coverage |
45 |
46 | ---
47 |
48 | ## **🚀 How to Use ABAP Search**
49 |
50 | ### **Standard Interface (Like UI5/CAP)**
51 | All ABAP search now uses the **unified `search` tool** - no special tools needed!
52 |
53 | #### **General ABAP Queries (Latest Version)**
54 | ```javascript
55 | // Shows latest ABAP docs + style guides + cheat sheets
56 | search: "inline declarations"
57 | search: "SELECT statements"
58 | search: "exception handling"
59 | search: "class definition"
60 | search: "internal table operations"
61 |
62 | // Example Result (Clean & Focused):
63 | Found 4 results for 'inline declarations':
64 | ✅ SAP Style Guides - Best practices
65 | ✅ ABAP Cheat Sheets - Practical examples
66 | ✅ Latest ABAP Docs - Official reference
67 | ✅ Cross-references - Related concepts
68 | ```
69 |
70 | #### **Version-Specific Queries (Targeted Results)**
71 | ```javascript
72 | // Auto-detects version and shows ONLY that version + related sources
73 | search: "LOOP 7.57" // → ABAP 7.57 only
74 | search: "SELECT statements 7.58" // → ABAP 7.58 only
75 | search: "exception handling latest" // → Latest ABAP only
76 | search: "class definition 7.53" // → ABAP 7.53 only
77 |
78 | // Example Result (Version-Targeted):
79 | Found 5 results for 'LOOP 7.57':
80 | ✅ /abap-docs-757/abenloop_glosry (Score: 14.35) - Boosted 7.57 docs
81 | ✅ /abap-docs-757/abenabap_loops (Score: 14.08) - Boosted 7.57 docs
82 | ✅ Style guides and cheat sheets for context
83 | ```
84 |
85 | #### **Document Retrieval (Standard)**
86 | ```javascript
87 | // Same as other sources - use IDs from search results
88 | fetch: "/abap-docs-758/abeninline_declarations"
89 | fetch: "/abap-docs-latest/abenselect"
90 | fetch: "/abap-docs-757/abenloop_glosry"
91 | ```
92 |
93 | ---
94 |
95 | ## **🔧 Technical Implementation**
96 |
97 | ### **Metadata Configuration**
98 | ```json
99 | // 8 ABAP versions with intelligent boosting
100 | {
101 | "sources": [
102 | { "id": "abap-docs-latest", "boost": 1.0 }, // Default
103 | { "id": "abap-docs-758", "boost": 0.05 }, // Background
104 | { "id": "abap-docs-757", "boost": 0.02 }, // Background
105 | // ... 7.56-7.52 with 0.01 boost
106 | ],
107 | "contextBoosts": {
108 | "7.58": { "/abap-docs-758": 2.0 }, // Massive boost when version specified
109 | "7.57": { "/abap-docs-757": 2.0 },
110 | "latest": { "/abap-docs-latest": 1.5 }
111 | }
112 | }
113 | ```
114 |
115 | ### **Search Logic Enhancement**
116 | ```typescript
117 | // Intelligent version detection and filtering
118 | const versionMatch = query.match(/\b(7\.\d{2}|latest)\b/i);
119 | const requestedVersion = versionMatch ? versionMatch[1] : null;
120 |
121 | if (!requestedVersion) {
122 | // General queries: Show ONLY latest ABAP
123 | results = results.filter(r =>
124 | !r.id.includes('/abap-docs-') || r.id.includes('/abap-docs-latest/')
125 | );
126 | } else {
127 | // Version-specific: Show ONLY requested version ("7.57" → "757", "latest" stays as-is)
128 | results = results.filter(r =>
129 | !r.id.includes('/abap-docs-') || r.id.includes(`/abap-docs-${requestedVersion.toLowerCase().replace('.', '')}/`)
130 | );
131 | }
132 | ```
133 |
134 | ### **Content Generation Optimization**
135 | ```javascript
136 | // Enhanced generate.js with frontmatter
137 | function generateFrontmatter(metadata) {
138 | return `title: "${metadata.title}"
139 | description: |
140 | ${metadata.description}
141 | version: "${metadata.version}"
142 | category: "${metadata.category}"
143 | keywords: [${metadata.keywords.join(', ')}]
144 | `;
145 | }
146 |
147 | // Skip irrelevant files
148 | if (htmlFile.startsWith('abennews')) {
149 | continue; // Skip 2,156+ news files
150 | }
151 | ```
152 |
153 | ---
154 |
155 | ## **💡 Usage Examples**
156 |
157 | ### **ABAP Language Questions**
158 | ```
159 | "How do I use inline declarations?"
160 | → Latest ABAP reference + Clean ABAP best practices + practical examples
161 |
162 | "What are the LOOP statement variations in 7.57?"
163 | → ABAP 7.57 loop documentation + style guides + cheat sheets
164 |
165 | "Show me exception handling patterns"
166 | → Latest ABAP TRY/CATCH reference + clean code guidelines + examples
167 | ```
168 |
169 | ### **Cross-Source Discovery**
170 | ```
171 | "ABAP class definition best practices"
172 | → Official ABAP OOP docs + Clean ABAP guidelines + practical examples
173 |
174 | "SELECT statement optimization"
175 | → Latest ABAP SQL reference + performance guidelines + working code
176 | ```
177 |
178 | ### **Version-Specific Development**
179 | ```
180 | "What's new in ABAP latest?"
181 | → Latest ABAP features and syntax changes
182 |
183 | "ABAP 7.53 specific features"
184 | → ABAP 7.53 documentation focused on version-specific capabilities
185 | ```
186 |
187 | ---
188 |
189 | ## **🎉 Benefits for Users**
190 |
191 | ### **✅ Simplified Experience**
192 | - **One tool** for all SAP development (ABAP + UI5 + CAP + testing)
193 | - **Clean results** - no more sifting through duplicate versions
194 | - **Intelligent defaults** - latest ABAP unless otherwise specified
195 |
196 | ### **✅ Comprehensive Coverage**
197 | - **40,761+ ABAP files** with rich, searchable content
198 | - **8 ABAP versions** available with smart targeting
199 | - **Cross-source intelligence** - related content across all documentation
200 |
201 | ### **✅ Perfect LLM Integration**
202 | - **Rich content snippets** with actual explanations
203 | - **Optimal file sizes** (3-8KB) for context windows
204 | - **Structured metadata** for better AI understanding
205 | - **Official attribution** with direct SAP documentation links
206 |
207 | ---
208 |
209 | ## **🔮 Future Extensibility**
210 |
211 | This architecture makes it trivial to:
212 | - ✅ **Add new ABAP versions** - just add to metadata and build index
213 | - ✅ **Add new sources** - same standard integration process
214 | - ✅ **Adjust version priorities** - modify boost values in metadata
215 | - ✅ **Enhance filtering** - extend version detection patterns
216 |
217 | The standard integration approach ensures **long-term maintainability** and **easy scaling** as the SAP ecosystem evolves.
218 |
219 | ---
220 |
221 | ## **📋 Migration Notes**
222 |
223 | ### **For Existing Users**
224 | - ✅ **No breaking changes** - `search` is called the same way as before; only the results improve
225 | - ✅ **Better results** - same queries now return higher quality, focused results
226 | - ✅ **New capabilities** - version auto-detection and cross-source intelligence
227 |
228 | ### **For New Users**
229 | - ✅ **Simple onboarding** - just one tool to learn (`search`)
230 | - ✅ **Intuitive behavior** - latest by default, specific versions on request
231 | - ✅ **Rich context** - meaningful results from day one
232 |
233 | **The ABAP integration represents a quantum leap in documentation accessibility and search quality for SAP development with LLMs.** 🚀
234 |
```
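The version detection and filtering described under "Search Logic Enhancement" in the summary above can be sketched as a self-contained TypeScript function. This is a sketch under stated assumptions, not the server's actual implementation. The `SearchResult` shape and the function names `detectAbapVersionId` and `filterAbapVersions` are hypothetical. The mapping from a dotted version such as "7.57" to the library suffix "757" is inferred from the library IDs listed in the docs.

```typescript
// Hypothetical result shape; the real type may carry more fields.
interface SearchResult {
  id: string;    // e.g. "/abap-docs-757/abenloop_glosry"
  score: number;
}

// Returns "757" for "LOOP 7.57", "latest" for "... latest", or null
// when the query mentions no version.
function detectAbapVersionId(query: string): string | null {
  const match = query.match(/\b(7\.\d{2}|latest)\b/i);
  return match ? match[1].toLowerCase().replace(".", "") : null;
}

// Keeps every non-ABAP result, and only the ABAP results that belong
// to the requested version (or to "latest" when none is requested).
function filterAbapVersions(
  results: SearchResult[],
  query: string
): SearchResult[] {
  const versionId = detectAbapVersionId(query) ?? "latest";
  const wanted = `/abap-docs-${versionId}/`;
  return results.filter(
    (r) => !r.id.includes("/abap-docs-") || r.id.includes(wanted)
  );
}
```

One consequence of this design is that the pattern also matches versions with no indexed source. A query mentioning "7.10", for example, would drop every ABAP result. A production version would likely check the detected ID against the configured sources before filtering.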
--------------------------------------------------------------------------------
/docs/CURSOR-SETUP.md:
--------------------------------------------------------------------------------
```markdown
1 | # 🎯 Cursor IDE Optimization Guide
2 |
3 | ## Overview
4 |
5 | This guide explains how to optimize Cursor IDE for the SAP Docs MCP project using `.cursorignore` and Project Rules to improve AI assistance quality and response speed.
6 |
7 | ## 📁 File Structure
8 |
9 | ```
10 | .cursorignore # Exclude large/irrelevant files
11 | .cursor/
12 | └── rules/ # Project-specific rules
13 | ├── 00-overview.mdc # High-level system overview
14 | ├── 10-search-stack.mdc # Search and indexing
15 | ├── 20-tools-and-apis.mdc # MCP tools and endpoints
16 | ├── 30-tests-and-output.mdc # Testing and validation
17 | ├── 40-deploy.mdc # Deployment and operations
18 | └── 50-metadata-config.mdc # Configuration management
19 | docs/
20 | ├── ARCHITECTURE.md # System architecture
21 | ├── DEV.md # Development guide
22 | ├── TESTS.md # Testing guide
23 | └── CURSOR-SETUP.md # This guide
24 | ```
25 |
26 | ## 🚫 .cursorignore Configuration
27 |
28 | ### Purpose
29 | Keeps the index small and responses focused by excluding:
30 | - Build artifacts and caches
31 | - Large vendor documentation
32 | - Generated search databases
33 | - Test artifacts and logs
34 |
35 | ### Current Configuration
36 | ```gitignore
37 | # Build output & caches
38 | dist/**
39 | node_modules/**
40 | .cache/**
41 | coverage/**
42 | *.log
43 |
44 | # Large vendor docs & tests
45 | sources/**/test/**
46 | sources/openui5/**/test/**
47 | sources/**/.git/**
48 | sources/**/.github/**
49 | sources/**/node_modules/**
50 |
51 | # Generated search artifacts
52 | dist/data/index.json
53 | dist/data/*.sqlite
54 | dist/data/*.db
55 |
56 | # Test artifacts
57 | test-*.js
58 | debug-*.js
59 | *.tmp
60 | ```
61 |
62 | ## 📋 Project Rules System
63 |
64 | ### Rule Structure
65 | Each rule file (`.mdc`) contains:
66 | - **Purpose**: When to use this rule
67 | - **Key Concepts**: Important information for that domain
68 | - **File References**: `@file` directives to auto-attach relevant context
69 |
70 | ### Current Rules
71 |
72 | #### **00-overview.mdc** - System Overview
73 | - **When**: "how it works", "where to change X", "what runs in prod"
74 | - **Covers**: Architecture, components, production setup
75 | - **Files**: Core system files (server.ts, metadata.json, config.ts)
76 |
77 | #### **10-search-stack.mdc** - Search & Indexing
78 | - **When**: Modifying search behavior, ranking, or index builds
79 | - **Covers**: BM25 search, FTS5, metadata APIs, query processing
80 | - **Files**: Search-related modules (search.ts, searchDb.ts, metadata.ts)
81 |
82 | #### **20-tools-and-apis.mdc** - MCP Tools & Endpoints
83 | - **When**: Tool schemas, request/response formats, endpoints
84 | - **Covers**: 5 MCP tools, server implementations, response formats
85 | - **Files**: Server implementations and tool handlers
86 |
87 | #### **30-tests-and-output.mdc** - Tests & Output
88 | - **When**: Changing output formatting or test stability
89 | - **Covers**: Test architecture, expected formats, validation
90 | - **Files**: Test runner, utilities, and test cases
91 |
92 | #### **40-deploy.mdc** - Deploy & Operations
93 | - **When**: PM2 processes, GitHub Actions, health checks
94 | - **Covers**: Deployment pipeline, PM2 config, monitoring
95 | - **Files**: Deployment configurations and workflows
96 |
97 | #### **50-metadata-config.mdc** - Configuration Management
98 | - **When**: Working with centralized configuration system
99 | - **Covers**: Metadata APIs, configuration structure, adding sources
100 | - **Files**: Metadata system files and documentation
101 |
102 | ## 🔄 Maintaining Rules (CRITICAL)
103 |
104 | ### ⚠️ **ALWAYS UPDATE RULES WHEN MAKING CHANGES**
105 |
106 | When you modify the system, **immediately update the relevant rules**:
107 |
108 | #### **Architecture Changes**
109 | - Update `00-overview.mdc` for system-level changes
110 | - Update `docs/ARCHITECTURE.md` for structural modifications
111 | - Add new `@file` references for new core modules
112 |
113 | #### **Search System Changes**
114 | - Update `10-search-stack.mdc` for search logic modifications
115 | - Update file references if search modules are renamed/moved
116 | - Document new search features or configuration options
117 |
118 | #### **API/Tool Changes**
119 | - Update `20-tools-and-apis.mdc` for new tools or endpoints
120 | - Update response format documentation
121 | - Add new server implementations to file references
122 |
123 | #### **Test Changes**
124 | - Update `30-tests-and-output.mdc` for test format changes
125 | - Document new test categories or validation rules
126 | - Update expected output format examples
127 |
128 | #### **Deployment Changes**
129 | - Update `40-deploy.mdc` for PM2 or workflow changes
130 | - Update environment variable documentation
131 | - Add new deployment artifacts or processes
132 |
133 | #### **Configuration Changes**
134 | - Update `50-metadata-config.mdc` for metadata system changes
135 | - Document new APIs or configuration options
136 | - Update examples for adding new sources
137 |
138 | ## 📖 Documentation Integration
139 |
140 | ### Reference Pattern
141 | Rules reference documentation files using `@file` directives:
142 |
143 | ```markdown
144 | @file docs/ARCHITECTURE.md
145 | @file docs/DEV.md
146 | @file docs/TESTS.md
147 | ```
148 |
149 | ### Documentation Files
150 | - **ARCHITECTURE.md**: System overview with diagrams
151 | - **DEV.md**: Development commands and common tasks
152 | - **TESTS.md**: Test execution and validation details
153 | - **METADATA-CONSOLIDATION.md**: Configuration system changes
154 |
155 | ## 🎯 Usage in Cursor
156 |
157 | ### Automatic Context
158 | Cursor automatically includes relevant rules and files based on:
159 | - Current file being edited
160 | - Keywords in your questions
161 | - Project structure analysis
162 |
163 | ### Manual Invocation
164 | You can explicitly reference rules:
165 | ```
166 | @Cursor Rules search-stack
167 | @Files src/lib/search.ts
168 | ```
169 |
170 | ### Best Practices
171 | 1. **Specific Questions**: Ask about specific components for better rule matching
172 | 2. **Context Hints**: Mention the area you're working on (search, deploy, tests)
173 | 3. **File References**: Open relevant files to provide additional context
174 |
175 | ## 🔧 Optimization Tips
176 |
177 | ### Performance
178 | - **Selective Ignoring**: Add large files to `.cursorignore` immediately
179 | - **Rule Specificity**: Keep rules focused on specific domains
180 | - **File References**: Only include essential files in `@file` directives
181 |
182 | ### Quality
183 | - **Regular Updates**: Update rules whenever the system changes
184 | - **Clear Descriptions**: Use descriptive rule purposes and coverage
185 | - **Comprehensive Coverage**: Ensure all major system areas have rules
186 |
187 | ### Maintenance
188 | - **Version Control**: Commit rule changes with related code changes
189 | - **Documentation Sync**: Keep rules and docs in sync
190 | - **Regular Review**: Periodically review and update rule effectiveness
191 |
192 | ## 📋 Rule Update Checklist
193 |
194 | When making system changes, check these items:
195 |
196 | ### ✅ **Before Coding**
197 | - [ ] Identify which rules might be affected
198 | - [ ] Review current rule content for accuracy
199 | - [ ] Plan rule updates alongside code changes
200 |
201 | ### ✅ **During Development**
202 | - [ ] Update rule content as you make changes
203 | - [ ] Add new `@file` references for new modules
204 | - [ ] Update documentation files if needed
205 |
206 | ### ✅ **After Changes**
207 | - [ ] Verify all affected rules are updated
208 | - [ ] Test rule effectiveness with sample questions
209 | - [ ] Commit rule changes with code changes
210 | - [ ] Update this guide if rule structure changes
211 |
212 | ## 🚀 Advanced Usage
213 |
214 | ### Custom Rules
215 | Create additional rules for:
216 | - **Feature-specific**: Complex features spanning multiple modules
217 | - **Team-specific**: Team conventions and practices
218 | - **Environment-specific**: Development vs production considerations
219 |
220 | ### Rule Templates
221 | Use this template for new rules:
222 |
223 | ```markdown
224 | # Rule Title (Rule)
225 |
226 | Brief description of when to use this rule.
227 |
228 | ## Key Concepts
229 | - Important concept 1
230 | - Important concept 2
231 |
232 | ## Coverage Areas
233 | - Area 1: Description
234 | - Area 2: Description
235 |
236 | @file relevant/file1.ts
237 | @file relevant/file2.ts
238 | @file docs/RELEVANT.md
239 | ```
240 |
241 | ### Integration with Workflow
242 | 1. **Planning**: Review relevant rules before starting work
243 | 2. **Development**: Keep rules open for reference
244 | 3. **Review**: Update rules as part of code review process
245 | 4. **Documentation**: Use rules to guide documentation updates
246 |
247 | ## 🎉 Benefits
248 |
249 | ### For Development
250 | - **Faster Context**: Cursor quickly understands project structure
251 | - **Better Suggestions**: More relevant code suggestions and fixes
252 | - **Reduced Repetition**: Less need to explain system architecture
253 |
254 | ### For Maintenance
255 | - **Knowledge Preservation**: System knowledge captured in rules
256 | - **Onboarding**: New developers can understand the system quickly
257 | - **Consistency**: Consistent approach to similar problems
258 |
259 | ### For AI Assistance
260 | - **Focused Responses**: AI responses are more targeted and relevant
261 | - **Better Understanding**: AI has deeper context about system design
262 | - **Accurate Suggestions**: Suggestions align with project patterns and conventions
263 |
264 | ---
265 |
266 | **Remember**: The key to effective Cursor optimization is keeping the rules current and comprehensive. Always update rules when making system changes!
267 |
```
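The rule update checklist in the Cursor guide above asks for `@file` references to be kept current. A small TypeScript sketch of collecting those references from a rule file's text is given here, assuming each directive sits on its own line in the form `@file <path>` as in the guide's rule template. The `extractFileRefs` function is a hypothetical helper and is not part of the repository.

```typescript
// Hypothetical helper: collects the paths named by `@file` directives.
// Assumes one directive per line, in the form "@file <path>".
function extractFileRefs(ruleText: string): string[] {
  const refs: string[] = [];
  for (const line of ruleText.split(/\r?\n/)) {
    const match = /^@file\s+(\S+)$/.exec(line.trim());
    if (match) {
      refs.push(match[1]);
    }
  }
  return refs;
}
```

The returned paths could then be checked for existence in a pre-commit or CI script, so that a renamed module does not leave a stale reference behind in a rule.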
--------------------------------------------------------------------------------
/docs/METADATA-CONSOLIDATION.md:
--------------------------------------------------------------------------------
```markdown
1 | # 🎯 Metadata-Driven Configuration Consolidation
2 |
3 | ## Overview
4 |
5 | This document describes the comprehensive consolidation of all hardcoded source configurations into a centralized, metadata-driven system. The changes eliminate scattered configuration values throughout the codebase and provide a single source of truth for all documentation source settings.
6 |
7 | ## 🚀 Key Changes Summary
8 |
9 | ### 1. **Centralized Metadata System**
10 | - **Moved** `data/metadata.json` → `src/metadata.json`
11 | - **Extended** metadata.json with comprehensive source configurations
12 | - **Created** type-safe APIs for accessing all configuration data
13 | - **Eliminated** all hardcoded configuration values from source code
14 |
15 | ### 2. **Enhanced Configuration Structure**
16 | - **12 documentation sources** with complete metadata
17 | - **Source paths, URLs, and anchor styles** for documentation generation
18 | - **Context-specific boosts** for intelligent query routing
19 | - **Library ID mappings** for source resolution
20 | - **Context emojis** for UI presentation
21 | - **Synonyms and acronyms** for query expansion
22 |
23 | ### 3. **Simplified Core Configuration**
24 | - **Removed** hardcoded `SOURCE_BOOSTS` from `config.ts`
25 | - **Centralized** all source-specific settings in metadata.json
26 | - **Maintained** core system settings (RETURN_K, DB_PATH, etc.)
27 |
28 | ## 📁 Files Modified
29 |
30 | ### **Core Configuration Files**
31 |
32 | #### `src/metadata.json` ✨ **NEW LOCATION**
33 | ```json
34 | {
35 | "version": 1,
36 | "sources": [
37 | {
38 | "id": "sapui5",
39 | "libraryId": "/sapui5",
40 | "sourcePath": "sapui5-docs/docs",
41 | "baseUrl": "https://ui5.sap.com",
42 | "pathPattern": "/#/topic/{file}",
43 | "anchorStyle": "custom",
44 | "boost": 0.1,
45 | "tags": ["ui5", "frontend", "javascript"]
46 | }
47 | // ... 11 more sources
48 | ],
49 | "contextBoosts": {
50 | "UI5": { "/sapui5": 0.9, "/openui5-api": 0.9 },
51 | "CAP": { "/cap": 1.0, "/sapui5": 0.2 }
52 | // ... more contexts
53 | },
54 | "libraryMappings": {
55 | "openui5-api": "sapui5",
56 | "openui5-samples": "sapui5"
57 | // ... more mappings
58 | },
59 | "contextEmojis": {
60 | "CAP": "🏗️", "UI5": "🎨", "wdi5": "🧪"
61 | // ... more emojis
62 | }
63 | }
64 | ```
65 |
66 | #### `src/lib/config.ts` 🔧 **SIMPLIFIED**
67 | ```typescript
68 | // After (the previous version was 25 lines with hardcoded SOURCE_BOOSTS)
69 | export const CONFIG = {
70 | RETURN_K: Number(process.env.RETURN_K || 25),
71 | DB_PATH: "dist/data/docs.sqlite",
72 | METADATA_PATH: "src/metadata.json", // Updated path
73 | USE_OR_LOGIC: true,
74 | // SOURCE_BOOSTS removed - now in metadata.json
75 | };
76 | ```
77 |
78 | #### `src/lib/metadata.ts` ✨ **ENHANCED**
79 | **New comprehensive API with 12 functions:**
80 | ```typescript
81 | // Documentation URL configuration
82 | export function getDocUrlConfig(libraryId: string): DocUrlConfig | null
83 | export function getAllDocUrlConfigs(): Record<string, DocUrlConfig>
84 |
85 | // Source path management
86 | export function getSourcePath(libraryId: string): string | null
87 | export function getAllSourcePaths(): Record<string, string>
88 |
89 | // Context-aware boosts
90 | export function getContextBoosts(context: string): Record<string, number>
91 | export function getAllContextBoosts(): Record<string, Record<string, number>>
92 |
93 | // Library mappings
94 | export function getLibraryMapping(sourceId: string): string | null
95 | export function getAllLibraryMappings(): Record<string, string>
96 |
97 | // UI presentation
98 | export function getContextEmoji(context: string): string
99 | export function getAllContextEmojis(): Record<string, string>
100 |
101 | // Source lookup
102 | export function getSourceByLibraryId(libraryId: string): SourceMeta | null
103 | export function getSourceById(id: string): SourceMeta | null
104 | ```
105 |
106 | ### **Updated Implementation Files**
107 |
108 | #### `src/lib/search.ts` 🔄 **REFACTORED**
109 | ```typescript
110 | // Before: Hardcoded library mappings (15 lines)
111 | const mapping: Record<string, string> = {
112 | 'sapui5': 'sapui5',
113 | 'openui5-api': 'sapui5', // Map UI5 API to sapui5 source
114 | // ... more hardcoded mappings
115 | };
116 |
117 | // After: Metadata-driven (2 lines)
118 | const mappings = getAllLibraryMappings();
119 | return mappings[sourceId] || sourceId;
120 | ```
121 |
122 | #### `src/lib/localDocs.ts` 🔄 **MAJOR REFACTOR**
123 | **Removed hardcoded configurations:**
124 | - ❌ `DOC_URL_CONFIGS` (45 lines) → ✅ `getDocUrlConfig()`
125 | - ❌ Source path mappings (75 lines × 3 locations) → ✅ `getSourcePath()`
126 | - ❌ Context boost logic (50 lines) → ✅ `getContextBoosts()`
127 | - ❌ Context emojis (10 lines) → ✅ `getContextEmoji()`
128 |
129 | **Total reduction: ~250+ lines of hardcoded configuration**
130 |
131 | ### **Deployment Configuration**
132 |
133 | #### `ecosystem.config.cjs` 🚀 **SIMPLIFIED**
134 | ```javascript
135 | // Before: 9 reranker environment variables per service
136 | env: {
137 | RERANKER_MODEL: "", SEARCH_K: "100", W_RERANKER: "0.8",
138 | W_BM25: "0.2", RERANKER_TIMEOUT_MS: "1000", // ... more
139 | }
140 |
141 | // After: Clean BM25-only configuration
142 | env: {
143 | NODE_ENV: "production",
144 | RETURN_K: "25" // Centralized result limit
145 | }
146 | ```
147 |
148 | #### `.github/workflows/deploy-mcp-sap-docs.yml` 🚀 **UPDATED**
149 | - Removed transformers cache directory creation
150 | - Updated deployment comments for BM25-only system
151 | - Added metadata.json existence check
152 |
153 | ## 🎯 Benefits Achieved
154 |
155 | ### **1. Single Source of Truth**
156 | - All source configurations in one file (`src/metadata.json`)
157 | - No more hunting through multiple files for settings
158 | - Consistent configuration across all components
159 |
160 | ### **2. Easy Maintenance**
161 | - Add new documentation sources without code changes
162 | - Modify boosts, URLs, or paths in metadata.json only
163 | - No need to update multiple hardcoded locations
164 |
165 | ### **3. Type Safety**
166 | - Comprehensive TypeScript interfaces for all metadata
167 | - Compile-time validation of configuration access
168 | - IntelliSense support for all configuration properties
169 |
170 | ### **4. Cleaner Codebase**
171 | - **~250+ lines** of hardcoded configuration removed
172 | - Simplified core configuration files
173 | - More readable and maintainable code
174 |
175 | ### **5. Flexible Configuration**
176 | - Environment variable overrides still supported
177 | - Easy to add new configuration properties
178 | - Backward compatibility maintained
179 |
180 | ## 🔧 Migration Impact
181 |
182 | ### **Zero Breaking Changes**
183 | - All existing functionality preserved
184 | - Same search results and behavior
185 | - All tests passing (TypeScript + smoke tests)
186 |
187 | ### **Performance Impact**
188 | - Minimal: Metadata loaded once at startup
189 | - No runtime performance degradation
190 | - Same search speed and accuracy
191 |
192 | ### **Deployment Impact**
193 | - Simplified PM2 configuration
194 | - Faster deployment (no model downloads)
195 | - Reduced memory usage in production
196 |
197 | ## 📊 Configuration Comparison
198 |
199 | ### **Before: Scattered Configuration**
200 | ```
201 | src/lib/config.ts - SOURCE_BOOSTS (9 sources)
202 | src/lib/localDocs.ts - DOC_URL_CONFIGS (11 sources)
203 | src/lib/localDocs.ts - Source paths (12 sources × 3 locations)
204 | src/lib/localDocs.ts - Context boosts (7 contexts)
205 | src/lib/localDocs.ts - Context emojis (7 emojis)
206 | src/lib/search.ts - Library mappings (9 mappings)
207 | ecosystem.config.cjs - Reranker env vars (9 vars × 3 services)
208 | ```
209 |
210 | ### **After: Centralized Configuration**
211 | ```
212 | src/metadata.json - ALL source configurations
213 | src/lib/metadata.ts - Type-safe APIs for access
214 | src/lib/config.ts - Core system settings only
215 | ecosystem.config.cjs - Essential env vars only
216 | ```
217 |
218 | ## 🚀 Usage Examples
219 |
220 | ### **Adding a New Documentation Source**
221 | ```jsonc
222 | // Just add to src/metadata.json - no code changes needed!
223 | {
224 | "id": "new-docs",
225 | "type": "documentation",
226 | "libraryId": "/new-docs",
227 | "sourcePath": "new-docs/content",
228 | "baseUrl": "https://example.com/docs",
229 | "pathPattern": "/{file}",
230 | "anchorStyle": "github",
231 | "boost": 0.05,
232 | "tags": ["new", "documentation"]
233 | }
234 | ```
235 |
236 | ### **Modifying Context Boosts**
237 | ```jsonc
238 | // Adjust in src/metadata.json
239 | "contextBoosts": {
240 | "New Context": {
241 | "/new-docs": 1.0,
242 | "/sapui5": 0.3
243 | }
244 | }
245 | ```
246 |
247 | ### **Using the New APIs**
248 | ```typescript
249 | // Get source path for any library
250 | const sourcePath = getSourcePath('/sapui5');
251 | // Returns: "sapui5-docs/docs"
252 |
253 | // Get URL configuration
254 | const urlConfig = getDocUrlConfig('/cap');
255 | // Returns: { baseUrl: "https://cap.cloud.sap", pathPattern: "/docs/{file}", ... }
256 |
257 | // Get context-specific boosts
258 | const boosts = getContextBoosts('UI5');
259 | // Returns: { "/sapui5": 0.9, "/openui5-api": 0.9, ... }
260 | ```
261 |
262 | ## 🧪 Testing & Validation
263 |
264 | ### **Comprehensive Testing**
265 | - ✅ TypeScript compilation successful
266 | - ✅ All smoke tests passing
267 | - ✅ No linting errors
268 | - ✅ Functionality preserved
269 | - ✅ Performance maintained
270 |
271 | ### **Validation Steps**
272 | 1. **Build Test**: `npm run build:tsc` - No compilation errors
273 | 2. **Smoke Test**: `npm run test:smoke` - All search functionality working
274 | 3. **Integration Test**: All metadata APIs returning expected values
275 | 4. **Deployment Test**: PM2 configuration validated
276 |
277 | ## 🔮 Future Enhancements
278 |
279 | ### **Easy Extensions**
280 | - **New Sources**: Add to metadata.json without code changes
281 | - **Custom Boosts**: Modify context boosts per environment
282 | - **A/B Testing**: Switch configurations via environment variables
283 | - **Dynamic Updates**: Hot-reload metadata without restarts
284 |
285 | ### **Advanced Features**
286 | - **User Preferences**: Per-user source preferences
287 | - **Analytics**: Track which sources are most useful
288 | - **Caching**: Cache frequently accessed metadata
289 | - **Validation**: Schema validation for metadata.json
290 |
291 | ## 📈 Metrics
292 |
293 | ### **Code Reduction**
294 | - **~250+ lines** of hardcoded configuration removed
295 | - **5 files** significantly simplified
296 | - **12 new APIs** for type-safe configuration access
297 | - **1 centralized** metadata file
298 |
299 | ### **Maintainability Improvement**
300 | - **100%** of source configurations centralized
301 | - **0** breaking changes to existing functionality
302 | - **12** type-safe APIs for configuration access
303 | - **1** single file to modify for source changes
304 |
305 | ## 🎉 Conclusion
306 |
307 | The metadata-driven configuration consolidation moves the SAP Docs MCP system from scattered, hardcoded configuration to a centralized, maintainable, type-safe metadata system.
308 |
309 | **Key Achievements:**
310 | - ✅ **Single source of truth** for all configurations
311 | - ✅ **Zero breaking changes** to existing functionality
312 | - ✅ **Comprehensive APIs** for type-safe configuration access
313 | - ✅ **Simplified maintenance** and deployment
314 | - ✅ **Future-proof architecture** for easy extensions
315 |
316 | The system is now significantly more maintainable, flexible, and ready for future enhancements while preserving all existing functionality and performance characteristics.
317 |
```
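
The "Using the New APIs" section of the document above names three accessors (`getSourcePath`, `getDocUrlConfig`, `getContextBoosts`) without showing how they might sit on top of `src/metadata.json`. The following is a minimal sketch under stated assumptions, not the project's actual `src/lib/metadata.ts`: the interface shapes, the top-level `sources` key, the in-memory sample data, and the `undefined` / empty-object fallbacks are all hypothetical and only mirror the field names shown in the document's JSON examples.

```typescript
// Hypothetical shapes, inferred from the JSON examples in the document above.
interface SourceMeta {
  id: string;
  libraryId: string;
  sourcePath: string;
  baseUrl: string;
  pathPattern: string;
  anchorStyle: string;
  boost: number;
}

interface Metadata {
  sources: SourceMeta[]; // assumed top-level key
  contextBoosts: Record<string, Record<string, number>>;
}

// Stand-in for the parsed contents of src/metadata.json (assumed to be loaded once at startup).
const metadata: Metadata = {
  sources: [
    {
      id: "new-docs",
      libraryId: "/new-docs",
      sourcePath: "new-docs/content",
      baseUrl: "https://example.com/docs",
      pathPattern: "/{file}",
      anchorStyle: "github",
      boost: 0.05,
    },
  ],
  contextBoosts: {
    "New Context": { "/new-docs": 1.0, "/sapui5": 0.3 },
  },
};

// Index sources by library ID once so each lookup is a single Map access.
const byLibraryId = new Map<string, SourceMeta>(
  metadata.sources.map((s): [string, SourceMeta] => [s.libraryId, s]),
);

// Returns the source path for a library, or undefined if the library is unknown.
function getSourcePath(libraryId: string): string | undefined {
  return byLibraryId.get(libraryId)?.sourcePath;
}

// Returns the URL-related fields for a library, or undefined if the library is unknown.
function getDocUrlConfig(
  libraryId: string,
): Pick<SourceMeta, "baseUrl" | "pathPattern" | "anchorStyle"> | undefined {
  const source = byLibraryId.get(libraryId);
  if (!source) return undefined;
  const { baseUrl, pathPattern, anchorStyle } = source;
  return { baseUrl, pathPattern, anchorStyle };
}

// Returns the per-library boosts for a context; an unknown context yields an empty object.
function getContextBoosts(context: string): Record<string, number> {
  return metadata.contextBoosts[context] ?? {};
}
```

Building the lookup map once matches the document's claim that metadata is loaded a single time at startup; whether the real implementation throws or returns a fallback for unknown libraries is not stated there, so the fallbacks here are only one possible choice.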