This is page 3 of 5. Use http://codebase.md/blankcut/kubernetes-mcp-server?page={x} to view the full context. # Directory Structure ``` ├── .gitignore ├── docs │ ├── .astro │ │ ├── collections │ │ │ └── docs.schema.json │ │ ├── content-assets.mjs │ │ ├── content-modules.mjs │ │ ├── content.d.ts │ │ ├── data-store.json │ │ ├── settings.json │ │ └── types.d.ts │ ├── .gitignore │ ├── astro.config.mjs │ ├── package-lock.json │ ├── package.json │ ├── public │ │ └── images │ │ └── logo.svg │ ├── README.md │ ├── src │ │ ├── components │ │ │ ├── CodeBlock.astro │ │ │ ├── DocSidebar.astro │ │ │ ├── Footer.astro │ │ │ ├── Header.astro │ │ │ ├── HeadSEO.astro │ │ │ ├── Search.astro │ │ │ ├── Sidebar.astro │ │ │ └── TableOfContents.astro │ │ ├── content │ │ │ ├── config.ts │ │ │ └── docs │ │ │ ├── api-overview.md │ │ │ ├── configuration.md │ │ │ ├── installation.md │ │ │ ├── introduction.md │ │ │ ├── model-context-protocol.md │ │ │ ├── quick-start.md │ │ │ └── troubleshooting-resources.md │ │ ├── env.d.ts │ │ ├── layouts │ │ │ ├── BaseLayout.astro │ │ │ └── DocLayout.astro │ │ ├── pages │ │ │ ├── [...slug].astro │ │ │ ├── 404.astro │ │ │ ├── docs │ │ │ │ └── index.astro │ │ │ ├── docs-test.astro │ │ │ ├── examples │ │ │ │ └── index.astro │ │ │ └── index.astro │ │ └── styles │ │ └── global.css │ ├── tailwind.config.cjs │ └── tsconfig.json ├── go.mod ├── kubernetes-claude-mcp │ ├── .gitignore │ ├── cmd │ │ └── server │ │ └── main.go │ ├── docker-compose.yml │ ├── Dockerfile │ ├── go.mod │ ├── go.sum │ ├── internal │ │ ├── api │ │ │ ├── namespace_routes.go │ │ │ ├── routes.go │ │ │ └── server.go │ │ ├── argocd │ │ │ ├── applications.go │ │ │ ├── client.go │ │ │ └── history.go │ │ ├── auth │ │ │ ├── credentials.go │ │ │ ├── secrets.go │ │ │ └── vault.go │ │ ├── claude │ │ │ ├── client.go │ │ │ └── protocol.go │ │ ├── correlator │ │ │ ├── gitops.go │ │ │ ├── helm_correlator.go │ │ │ └── troubleshoot.go │ │ ├── gitlab │ │ │ ├── client.go │ │ │ ├── mergerequests.go │ │ │ ├── pipelines.go │ │ │ └── repositories.go │ │ ├── helm │ │ │ └── parser.go │ │ ├── k8s │ │ │ ├── client.go │ │ │ ├── enhanced_client.go │ │ │ ├── events.go │ │ │ ├── resource_mapper.go │ │ │ └── resources.go │ │ ├── mcp │ │ │ ├── context.go │ │ │ ├── namespace_analyzer.go │ │ │ ├── prompt.go │ │ │ └── protocol.go │ │ └── models │ │ ├── argocd.go │ │ ├── context.go │ │ ├── gitlab.go │ │ └── kubernetes.go │ └── pkg │ ├── config │ │ └── config.go │ ├── logging │ │ └── logging.go │ └── utils │ ├── serialization.go │ └── truncation.go ├── LICENSE └── README.md ``` # Files -------------------------------------------------------------------------------- /docs/src/content/docs/installation.md: -------------------------------------------------------------------------------- ```markdown --- title: Installation Guide description: Comprehensive guide for installing and configuring the Kubernetes Claude MCP server in various environments. date: 2025-03-01 order: 3 tags: ['installation', 'deployment'] --- # Installation Guide This guide provides detailed instructions for installing Kubernetes Claude MCP in different environments. Choose the method that best suits your needs. ## Prerequisites Before installing Kubernetes Claude MCP, ensure you have: - Access to a Kubernetes cluster (v1.19+) - kubectl configured to access your cluster - Claude API key from Anthropic - Optional: ArgoCD instance (for GitOps integration) - Optional: GitLab access (for commit analysis) ## Installation Methods There are several ways to install Kubernetes Claude MCP: 1. 
[Docker Compose](#docker-compose) (for development/testing) 2. [Kubernetes Deployment](#kubernetes-deployment) (recommended for production) 3. [Helm Chart](#helm-chart) (easiest for Kubernetes) 4. [Manual Binary](#manual-binary) (for custom environments) ## Docker Compose Docker Compose is ideal for local development and testing. ### Step 1: Clone the Repository ```bash git clone https://github.com/blankcut/kubernetes-mcp-server.git cd kubernetes-mcp-server ``` ### Step 2: Configure Environment Variables Create a `.env` file with your credentials: ```bash CLAUDE_API_KEY=your_claude_api_key ARGOCD_USERNAME=your_argocd_username ARGOCD_PASSWORD=your_argocd_password GITLAB_AUTH_TOKEN=your_gitlab_token API_KEY=your_api_key_for_server_access ``` ### Step 3: Configure the Server Create or modify `config.yaml`: ```yaml server: address: ":8080" readTimeout: 30 writeTimeout: 60 auth: apiKey: "${API_KEY}" kubernetes: kubeconfig: "" inCluster: false defaultContext: "" defaultNamespace: "default" argocd: url: "${ARGOCD_URL}" authToken: "${ARGOCD_AUTH_TOKEN}" username: "${ARGOCD_USERNAME}" password: "${ARGOCD_PASSWORD}" insecure: true gitlab: url: "${GITLAB_URL}" authToken: "${GITLAB_AUTH_TOKEN}" apiVersion: "v4" projectPath: "${PROJECT_PATH}" claude: apiKey: "${CLAUDE_API_KEY}" baseURL: "https://api.anthropic.com" modelID: "claude-3-haiku-20240307" maxTokens: 4096 temperature: 0.7 ``` ### Step 4: Start the Service ```bash docker-compose up -d ``` The server will be available at http://localhost:8080. ## Kubernetes Deployment For production environments, deploying to Kubernetes is recommended. ### Step 1: Create a Namespace ```bash kubectl create namespace mcp-system ``` ### Step 2: Create Secrets ```bash kubectl create secret generic mcp-secrets \ --namespace mcp-system \ --from-literal=claude-api-key=your_claude_api_key \ --from-literal=argocd-username=your_argocd_username \ --from-literal=argocd-password=your_argocd_password \ --from-literal=gitlab-token=your_gitlab_token \ --from-literal=api-key=your_api_key_for_server_access ``` ### Step 3: Create ConfigMap ```bash kubectl create configmap mcp-config \ --namespace mcp-system \ --from-file=config.yaml ``` ### Step 4: Apply Deployment Manifest Create a file named `deployment.yaml`: ```yaml apiVersion: apps/v1 kind: Deployment metadata: name: kubernetes-mcp-server namespace: mcp-system labels: app: kubernetes-mcp-server spec: replicas: 1 selector: matchLabels: app: kubernetes-mcp-server template: metadata: labels: app: kubernetes-mcp-server spec: serviceAccountName: mcp-service-account containers: - name: server image: blankcut/kubernetes-mcp-server:latest imagePullPolicy: Always ports: - containerPort: 8080 env: - name: CLAUDE_API_KEY valueFrom: secretKeyRef: name: mcp-secrets key: claude-api-key - name: ARGOCD_USERNAME valueFrom: secretKeyRef: name: mcp-secrets key: argocd-username optional: true - name: ARGOCD_PASSWORD valueFrom: secretKeyRef: name: mcp-secrets key: argocd-password optional: true - name: GITLAB_AUTH_TOKEN valueFrom: secretKeyRef: name: mcp-secrets key: gitlab-token optional: true - name: API_KEY valueFrom: secretKeyRef: name: mcp-secrets key: api-key volumeMounts: - name: config mountPath: /app/config.yaml subPath: config.yaml volumes: - name: config configMap: name: mcp-config --- apiVersion: v1 kind: Service metadata: name: kubernetes-mcp-server namespace: mcp-system spec: selector: app: kubernetes-mcp-server ports: - port: 80 targetPort: 8080 type: ClusterIP --- apiVersion: v1 kind: ServiceAccount metadata: name: 
mcp-service-account namespace: mcp-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: mcp-cluster-role rules: - apiGroups: [""] resources: ["pods", "services", "events", "configmaps", "secrets", "namespaces", "nodes"] verbs: ["get", "list", "watch"] - apiGroups: ["apps"] resources: ["deployments", "statefulsets", "daemonsets", "replicasets"] verbs: ["get", "list", "watch"] - apiGroups: ["batch"] resources: ["jobs", "cronjobs"] verbs: ["get", "list", "watch"] - apiGroups: ["networking.k8s.io"] resources: ["ingresses"] verbs: ["get", "list", "watch"] --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: mcp-role-binding subjects: - kind: ServiceAccount name: mcp-service-account namespace: mcp-system roleRef: kind: ClusterRole name: mcp-cluster-role apiGroup: rbac.authorization.k8s.io ``` Apply the configuration: ```bash kubectl apply -f deployment.yaml ``` ### Step 5: Access the Server Create an Ingress or port-forward to access the server: ```bash kubectl port-forward -n mcp-system svc/kubernetes-mcp-server 8080:80 ``` ## Helm Chart For Kubernetes users, the Helm chart provides the easiest installation method. ### Step 1: Add the Helm Repository ```bash helm repo add blankcut https://blankcut.github.io/helm-charts helm repo update ``` ### Step 2: Configure Values Create a `values.yaml` file: ```yaml image: repository: blankcut/kubernetes-mcp-server tag: latest config: server: address: ":8080" kubernetes: inCluster: true defaultNamespace: "default" argocd: url: "https://argocd.example.com" gitlab: url: "https://gitlab.com" claude: modelID: "claude-3-haiku-20240307" secrets: claude: apiKey: "your_claude_api_key" argocd: username: "your_argocd_username" password: "your_argocd_password" gitlab: authToken: "your_gitlab_token" service: type: ClusterIP ingress: enabled: false # Uncomment to enable ingress # hosts: # - host: mcp.example.com # paths: # - path: / # pathType: Prefix ``` ### Step 3: Install the Chart ```bash helm install kubernetes-mcp-server blankcut/kubernetes-claude-mcp -f values.yaml -n mcp-system ``` ### Step 4: Verify the Installation ```bash kubectl get pods -n mcp-system ``` ## Manual Binary For environments where Docker or Kubernetes is not available, you can run the binary directly. ### Step 1: Download the Latest Release Visit the [Releases page](https://github.com/blankcut/kubernetes-mcp-server/releases) and download the appropriate binary for your platform. ### Step 2: Make the Binary Executable ```bash chmod +x mcp-server ``` ### Step 3: Create Configuration File Create a `config.yaml` file in the same directory: ```yaml server: address: ":8080" readTimeout: 30 writeTimeout: 60 auth: apiKey: "your_api_key_for_server_access" kubernetes: kubeconfig: "/path/to/.kube/config" # Path to your kubeconfig file inCluster: false defaultContext: "" defaultNamespace: "default" argocd: url: "https://argocd.example.com" username: "your_argocd_username" password: "your_argocd_password" insecure: true gitlab: url: "https://gitlab.com" authToken: "your_gitlab_token" apiVersion: "v4" projectPath: "" claude: apiKey: "your_claude_api_key" baseURL: "https://api.anthropic.com" modelID: "claude-3-haiku-20240307" maxTokens: 4096 temperature: 0.7 ``` ### Step 4: Run the Server ```bash export CLAUDE_API_KEY=your_claude_api_key export API_KEY=your_api_key_for_server ./mcp-server --config config.yaml ``` ## Verifying the Installation To verify your installation is working correctly: 1. 
Check the health endpoint: ```bash curl http://localhost:8080/api/v1/health ``` 2. List Kubernetes namespaces: ```bash curl -H "X-API-Key: your_api_key" http://localhost:8080/api/v1/namespaces ``` 3. Test a resource query: ```bash curl -X POST \ -H "Content-Type: application/json" \ -H "X-API-Key: your_api_key" \ -d '{ "action": "queryResource", "resource": "pod", "name": "example-pod", "namespace": "default", "query": "Is this pod healthy?" }' \ http://localhost:8080/api/v1/mcp/resource ``` ## Security Considerations When deploying Kubernetes Claude MCP, consider the following security best practices: 1. **API Access**: Use a strong API key and restrict access to the server. 2. **Kubernetes Permissions**: Use a service account with the minimum required permissions. 3. **Secrets Management**: Store credentials in Kubernetes Secrets or a secure vault. 4. **Network Isolation**: Consider network policies to limit access to the server. 5. **TLS**: Use TLS to encrypt connections to the server. For more security recommendations, see the [Security Best Practices](/docs/security-best-practices) guide. ## Troubleshooting If you encounter issues during installation, check: 1. **Logs**: View server logs for error messages ```bash # For Docker Compose docker-compose logs # For Kubernetes kubectl logs -n mcp-system deployment/kubernetes-mcp-server ``` 2. **Configuration**: Verify your `config.yaml` has the correct settings 3. **Connectivity**: Ensure the server can connect to Kubernetes, ArgoCD, and GitLab 4. **API Key**: Verify you're using the correct API key in requests For more troubleshooting tips, see the [Troubleshooting](/docs/troubleshooting-resources) guide. ## Next Steps After successful installation, continue with: - [Configuration Guide](/docs/configuration) - Configure the server for your environment - [API Reference](/docs/api-overview) - Explore the API endpoints - [Examples](/docs/examples/basic-usage) - See examples of common use cases ``` -------------------------------------------------------------------------------- /kubernetes-claude-mcp/internal/auth/credentials.go: -------------------------------------------------------------------------------- ```go package auth import ( "context" "fmt" "os" "sync" "time" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/pkg/config" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/pkg/logging" ) // ServiceType represents the type of service requiring credentials type ServiceType string const ( ServiceKubernetes ServiceType = "kubernetes" ServiceArgoCD ServiceType = "argocd" ServiceGitLab ServiceType = "gitlab" ServiceClaude ServiceType = "claude" ) // Credentials stores authentication information for various services type Credentials struct { // API tokens, oauth tokens, etc. 
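	// Not every service populates every field: Claude uses APIKey, GitLab
	// uses Token, and ArgoCD uses either Token or Username/Password.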
	Token       string
	APIKey      string
	Username    string
	Password    string
	Certificate []byte
	PrivateKey  []byte
	ExpiresAt   time.Time
}

// IsExpired checks if the credentials are expired
func (c *Credentials) IsExpired() bool {
	// If no expiration time is set, we'll assume credentials don't expire
	if c.ExpiresAt.IsZero() {
		return false
	}

	// Check if current time is past the expiration time
	return time.Now().After(c.ExpiresAt)
}

// CredentialProvider manages credentials for various services
type CredentialProvider struct {
	mu             sync.RWMutex
	credentials    map[ServiceType]*Credentials
	config         *config.Config
	logger         *logging.Logger
	secretsManager *SecretsManager
	vaultManager   *VaultManager
}

// NewCredentialProvider creates a new credential provider
func NewCredentialProvider(cfg *config.Config) *CredentialProvider {
	logger := logging.NewLogger().Named("auth")

	return &CredentialProvider{
		credentials:    make(map[ServiceType]*Credentials),
		config:         cfg,
		logger:         logger,
		secretsManager: NewSecretsManager(logger),
		vaultManager:   NewVaultManager(logger),
	}
}

// LoadCredentials loads all service credentials based on configuration
func (p *CredentialProvider) LoadCredentials(ctx context.Context) error {
	// Load credentials for each service type based on config
	if err := p.loadKubernetesCredentials(ctx); err != nil {
		return fmt.Errorf("failed to load Kubernetes credentials: %w", err)
	}

	if err := p.loadArgoCDCredentials(ctx); err != nil {
		return fmt.Errorf("failed to load ArgoCD credentials: %w", err)
	}

	if err := p.loadGitLabCredentials(ctx); err != nil {
		return fmt.Errorf("failed to load GitLab credentials: %w", err)
	}

	if err := p.loadClaudeCredentials(ctx); err != nil {
		return fmt.Errorf("failed to load Claude credentials: %w", err)
	}

	return nil
}

// GetCredentials returns credentials for the specified service,
// refreshing them first if they have expired.
func (p *CredentialProvider) GetCredentials(serviceType ServiceType) (*Credentials, error) {
	p.mu.RLock()
	creds, ok := p.credentials[serviceType]
	p.mu.RUnlock()

	if !ok {
		return nil, fmt.Errorf("credentials not found for service: %s", serviceType)
	}

	// Refresh expired credentials outside the lock; RefreshCredentials
	// acquires the write lock itself, so holding it here would deadlock.
	if creds.IsExpired() {
		p.logger.Info("Refreshing expired credentials", "serviceType", serviceType)
		if err := p.RefreshCredentials(context.Background(), serviceType); err != nil {
			return nil, fmt.Errorf("failed to refresh expired credentials: %w", err)
		}

		p.mu.RLock()
		creds = p.credentials[serviceType]
		p.mu.RUnlock()
	}

	return creds, nil
}

// loadKubernetesCredentials loads Kubernetes authentication credentials
func (p *CredentialProvider) loadKubernetesCredentials(ctx context.Context) error {
	p.mu.Lock()
	defer p.mu.Unlock()

	// For Kubernetes, we primarily rely on kubeconfig or in-cluster config
	// We won't need to store explicit credentials
	p.credentials[ServiceKubernetes] = &Credentials{}
	return nil
}

// loadArgoCDCredentials loads ArgoCD authentication credentials
func (p *CredentialProvider) loadArgoCDCredentials(ctx context.Context) error {
	p.mu.Lock()
	defer p.mu.Unlock()

	// Try to load from secrets manager if available
	if p.secretsManager != nil && p.secretsManager.IsAvailable() {
		creds, err := p.secretsManager.GetCredentials(ctx, "argocd")
		if err == nil && creds != nil {
			p.credentials[ServiceArgoCD] = creds
			p.logger.Info("Loaded ArgoCD credentials from secrets manager")
			return nil
		}
	}

	// Try to load from vault if available
	if p.vaultManager != nil
&& p.vaultManager.IsAvailable() { creds, err := p.vaultManager.GetCredentials(ctx, "argocd") if err == nil && creds != nil { p.credentials[ServiceArgoCD] = creds p.logger.Info("Loaded ArgoCD credentials from vault") return nil } } // Primary source: Environment variables token := os.Getenv("ARGOCD_AUTH_TOKEN") if token != "" { p.credentials[ServiceArgoCD] = &Credentials{ Token: token, } p.logger.Info("Loaded ArgoCD credentials from environment") return nil } // Secondary source: Config file if p.config.ArgoCD.AuthToken != "" { p.credentials[ServiceArgoCD] = &Credentials{ Token: p.config.ArgoCD.AuthToken, } p.logger.Info("Loaded ArgoCD credentials from config file") return nil } // Tertiary source: Username/password... username := os.Getenv("ARGOCD_USERNAME") password := os.Getenv("ARGOCD_PASSWORD") if username != "" && password != "" { p.credentials[ServiceArgoCD] = &Credentials{ Username: username, Password: password, } p.logger.Info("Loaded ArgoCD username/password from environment") return nil } // Final fallback to config if p.config.ArgoCD.Username != "" && p.config.ArgoCD.Password != "" { p.credentials[ServiceArgoCD] = &Credentials{ Username: p.config.ArgoCD.Username, Password: p.config.ArgoCD.Password, } p.logger.Info("Loaded ArgoCD username/password from config file") return nil } p.logger.Warn("No ArgoCD credentials found, continuing without them") // We don't want to fail if ArgoCD credentials are not found // since ArgoCD integration is optional p.credentials[ServiceArgoCD] = &Credentials{} return nil } // loadGitLabCredentials loads GitLab authentication credentials func (p *CredentialProvider) loadGitLabCredentials(ctx context.Context) error { p.mu.Lock() defer p.mu.Unlock() // Try to load from secrets manager if available if p.secretsManager != nil && p.secretsManager.IsAvailable() { creds, err := p.secretsManager.GetCredentials(ctx, "gitlab") if err == nil && creds != nil { p.credentials[ServiceGitLab] = creds p.logger.Info("Loaded GitLab credentials from secrets manager") return nil } } // Try to load from vault if available if p.vaultManager != nil && p.vaultManager.IsAvailable() { creds, err := p.vaultManager.GetCredentials(ctx, "gitlab") if err == nil && creds != nil { p.credentials[ServiceGitLab] = creds p.logger.Info("Loaded GitLab credentials from vault") return nil } } // Primary source: Environment variables token := os.Getenv("GITLAB_AUTH_TOKEN") if token != "" { p.credentials[ServiceGitLab] = &Credentials{ Token: token, } p.logger.Info("Loaded GitLab credentials from environment") return nil } // Secondary source: Config file if p.config.GitLab.AuthToken != "" { p.credentials[ServiceGitLab] = &Credentials{ Token: p.config.GitLab.AuthToken, } p.logger.Info("Loaded GitLab credentials from config file") return nil } p.logger.Warn("No GitLab credentials found, continuing without them") // We don't want to fail if GitLab credentials are not found // since GitLab integration is optional p.credentials[ServiceGitLab] = &Credentials{} return nil } // loadClaudeCredentials loads Claude API credentials func (p *CredentialProvider) loadClaudeCredentials(ctx context.Context) error { p.mu.Lock() defer p.mu.Unlock() // Try to load from secrets manager if available if p.secretsManager != nil && p.secretsManager.IsAvailable() { creds, err := p.secretsManager.GetCredentials(ctx, "claude") if err == nil && creds != nil { p.credentials[ServiceClaude] = creds p.logger.Info("Loaded Claude credentials from secrets manager") return nil } } // Try to load from vault if available if 
p.vaultManager != nil && p.vaultManager.IsAvailable() { creds, err := p.vaultManager.GetCredentials(ctx, "claude") if err == nil && creds != nil { p.credentials[ServiceClaude] = creds p.logger.Info("Loaded Claude credentials from vault") return nil } } // Primary source: Environment variables apiKey := os.Getenv("CLAUDE_API_KEY") if apiKey != "" { p.credentials[ServiceClaude] = &Credentials{ APIKey: apiKey, } p.logger.Info("Loaded Claude credentials from environment") return nil } // Secondary source: Config file if p.config.Claude.APIKey != "" { p.credentials[ServiceClaude] = &Credentials{ APIKey: p.config.Claude.APIKey, } p.logger.Info("Loaded Claude credentials from config file") return nil } p.logger.Warn("No Claude API key found") return fmt.Errorf("no Claude API key found") } // RefreshCredentials refreshes credentials for a specific service (for tokens that expire) func (p *CredentialProvider) RefreshCredentials(ctx context.Context, serviceType ServiceType) error { // Implement credential refresh logic based on service type switch serviceType { case ServiceArgoCD: return p.refreshArgoCDToken(ctx) default: p.logger.Debug("No refresh needed for service", "serviceType", serviceType) return nil // No refresh needed for other services } } // refreshArgoCDToken refreshes the ArgoCD token if using username/password auth func (p *CredentialProvider) refreshArgoCDToken(ctx context.Context) error { p.mu.Lock() defer p.mu.Unlock() creds, ok := p.credentials[ServiceArgoCD] if !ok { return fmt.Errorf("ArgoCD credentials not found") } // If using token authentication and it's not expired, no refresh needed if creds.Token != "" && !creds.IsExpired() { return nil } // If using username/password, we would implement logic to get a new token if creds.Username != "" && creds.Password != "" { p.logger.Info("Refreshing ArgoCD token using username/password") p.logger.Info("Successfully refreshed ArgoCD token") return nil } return fmt.Errorf("unable to refresh ArgoCD token: invalid credential type") } // UpdateArgoToken updates the ArgoCD token func (p *CredentialProvider) UpdateArgoToken(ctx context.Context, token string) { p.mu.Lock() defer p.mu.Unlock() if creds, ok := p.credentials[ServiceArgoCD]; ok { creds.Token = token creds.ExpiresAt = time.Now().Add(24 * time.Hour) p.logger.Info("Updated ArgoCD token") } else { p.credentials[ServiceArgoCD] = &Credentials{ Token: token, ExpiresAt: time.Now().Add(24 * time.Hour), } p.logger.Info("Created new ArgoCD token") } } ``` -------------------------------------------------------------------------------- /docs/src/content/docs/configuration.md: -------------------------------------------------------------------------------- ```markdown --- title: Configuration Guide description: Learn how to configure and customize the Kubernetes Claude MCP server to suit your needs and environment. date: 2025-03-01 order: 4 tags: ['configuration', 'setup'] --- # Configuration Guide This guide explains how to configure Kubernetes Claude MCP to work optimally in your environment. The server is highly configurable, allowing you to customize its behavior and integrations. ## Configuration File Kubernetes Claude MCP is primarily configured using a YAML file (`config.yaml`). This file contains settings for the server, Kubernetes connection, ArgoCD integration, GitLab integration, and Claude AI. 
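As a rough illustration of how the server might consume this file, the sketch below reads the YAML and expands `${VAR}` references from the environment (see [Variable Interpolation](#variable-interpolation) below). This is a simplified, hypothetical loader; the actual implementation lives in `pkg/config/config.go` and may differ:

```go
package config

import (
	"os"

	"gopkg.in/yaml.v3"
)

// Config mirrors a subset of the config.yaml sections shown below.
type Config struct {
	Server struct {
		Address      string `yaml:"address"`
		ReadTimeout  int    `yaml:"readTimeout"`
		WriteTimeout int    `yaml:"writeTimeout"`
	} `yaml:"server"`
	Claude struct {
		APIKey  string `yaml:"apiKey"`
		ModelID string `yaml:"modelID"`
	} `yaml:"claude"`
}

// Load reads the file, substitutes ${VAR} environment references,
// and unmarshals the result.
func Load(path string) (*Config, error) {
	raw, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}

	var cfg Config
	if err := yaml.Unmarshal([]byte(os.ExpandEnv(string(raw))), &cfg); err != nil {
		return nil, err
	}
	return cfg, nil
}
```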
Here's a complete example of the configuration file with explanations: ```yaml # Server configuration server: # Address to bind the server on (host:port) address: ":8080" # Read timeout in seconds readTimeout: 30 # Write timeout in seconds writeTimeout: 60 # Authentication settings auth: # API key for authenticating requests apiKey: "your_api_key_here" # Kubernetes connection settings kubernetes: # Path to kubeconfig file (leave empty for in-cluster) kubeconfig: "" # Whether to use in-cluster config inCluster: false # Default Kubernetes context (leave empty for current) defaultContext: "" # Default namespace defaultNamespace: "default" # ArgoCD integration settings argocd: # ArgoCD server URL url: "https://argocd.example.com" # ArgoCD auth token (optional if using username/password) authToken: "" # ArgoCD username (optional if using token) username: "admin" # ArgoCD password (optional if using token) password: "password" # Whether to allow insecure connections insecure: false # GitLab integration settings gitlab: # GitLab server URL url: "https://gitlab.com" # GitLab personal access token authToken: "your_gitlab_token" # GitLab API version apiVersion: "v4" # Default project path projectPath: "namespace/project" # Claude AI settings claude: # Claude API key apiKey: "your_claude_api_key" # Claude API base URL baseURL: "https://api.anthropic.com" # Claude model ID modelID: "claude-3-haiku-20240307" # Maximum tokens for Claude responses maxTokens: 4096 # Temperature for Claude responses (0.0-1.0) temperature: 0.7 ``` ## Configuration Options ### Server Configuration | Option | Description | Default | |--------|-------------|---------| | `address` | Host and port to bind the server (":8080" means all interfaces, port 8080) | ":8080" | | `readTimeout` | HTTP read timeout in seconds | 30 | | `writeTimeout` | HTTP write timeout in seconds | 60 | | `auth.apiKey` | API key for authenticating requests | - | ### Kubernetes Configuration | Option | Description | Default | |--------|-------------|---------| | `kubeconfig` | Path to kubeconfig file | "" (auto-detect) | | `inCluster` | Whether to use in-cluster configuration | false | | `defaultContext` | Default Kubernetes context | "" (current context) | | `defaultNamespace` | Default namespace for operations | "default" | ### ArgoCD Configuration | Option | Description | Default | |--------|-------------|---------| | `url` | ArgoCD server URL | - | | `authToken` | ArgoCD auth token | "" | | `username` | ArgoCD username | "" | | `password` | ArgoCD password | "" | | `insecure` | Allow insecure connections to ArgoCD | false | ### GitLab Configuration | Option | Description | Default | |--------|-------------|---------| | `url` | GitLab server URL | "https://gitlab.com" | | `authToken` | GitLab personal access token | - | | `apiVersion` | GitLab API version | "v4" | | `projectPath` | Default project path | "" | ### Claude Configuration | Option | Description | Default | |--------|-------------|---------| | `apiKey` | Claude API key | - | | `baseURL` | Claude API base URL | "https://api.anthropic.com" | | `modelID` | Claude model ID | "claude-3-haiku-20240307" | | `maxTokens` | Maximum tokens for response | 4096 | | `temperature` | Temperature for responses (0.0-1.0) | 0.7 | ## Environment Variables In addition to the configuration file, you can use environment variables to override any configuration option. This is especially useful for secrets and credentials. 
Environment variables follow this pattern: - For server options: `SERVER_OPTION_NAME` - For Kubernetes options: `KUBERNETES_OPTION_NAME` - For ArgoCD options: `ARGOCD_OPTION_NAME` - For GitLab options: `GITLAB_OPTION_NAME` - For Claude options: `CLAUDE_OPTION_NAME` Common examples: ```bash # API keys export CLAUDE_API_KEY=your_claude_api_key export API_KEY=your_api_key_for_server # ArgoCD credentials export ARGOCD_USERNAME=your_argocd_username export ARGOCD_PASSWORD=your_argocd_password # GitLab credentials export GITLAB_AUTH_TOKEN=your_gitlab_token ``` ## Variable Interpolation The configuration file supports variable interpolation, allowing you to reference environment variables in your config. This is useful for injecting secrets: ```yaml server: auth: apiKey: "${API_KEY}" claude: apiKey: "${CLAUDE_API_KEY}" ``` ## Configuration Hierarchy The server reads configuration in the following order (later overrides earlier): 1. Default values 2. Configuration file 3. Environment variables This allows you to have a base configuration file and override specific settings with environment variables. ## ArgoCD Integration ### Authentication Methods There are two ways to authenticate with ArgoCD: 1. **Token-based authentication**: Provide an auth token in `argocd.authToken`. 2. **Username/password authentication**: Provide username and password in `argocd.username` and `argocd.password`. For production environments, token-based authentication is recommended for security. ### Insecure Mode If you're using a self-signed certificate for ArgoCD, you can set `argocd.insecure` to `true` to skip certificate validation. However, this is not recommended for production environments. ## GitLab Integration ### Personal Access Token To integrate with GitLab, you need a personal access token with the following scopes: - `read_api` - For accessing repository information - `read_repository` - For accessing repository content - `read_registry` - For accessing container registry (if needed) ### Self-hosted GitLab If you're using a self-hosted GitLab instance, set the `gitlab.url` to your GitLab URL: ```yaml gitlab: url: "https://gitlab.your-company.com" # Other GitLab settings... ``` ## Claude AI Configuration ### Model Selection Kubernetes Claude MCP supports different Claude model variants. The default is `claude-3-haiku-20240307`, but you can choose others based on your needs: - `claude-3-opus-20240229` - Most capable model, best for complex analysis - `claude-3-sonnet-20240229` - Balanced performance and speed - `claude-3-haiku-20240307` - Fastest model, suitable for most use cases ### Response Parameters You can adjust two parameters that affect Claude's responses: 1. `maxTokens` - Maximum number of tokens in the response (1-4096) 2. `temperature` - Controls randomness in responses (0.0-1.0) - Lower values (e.g., 0.3) make responses more deterministic - Higher values (e.g., 0.7) make responses more creative For troubleshooting and analysis, a temperature of 0.3-0.5 is recommended. ## Advanced Configuration ### Running Behind a Proxy If the server needs to connect to external services through a proxy, set the standard HTTP proxy environment variables: ```bash export HTTP_PROXY=http://proxy.example.com:8080 export HTTPS_PROXY=http://proxy.example.com:8080 export NO_PROXY=localhost,127.0.0.1,.cluster.local ``` ### TLS Configuration For production deployments, it's recommended to use TLS. This is typically handled by your ingress controller, load balancer, or API gateway. 
If you need to terminate TLS at the server (not recommended for production), you can use a reverse proxy like Nginx or Traefik. ### Logging Configuration The logging level can be controlled with the `LOG_LEVEL` environment variable: ```bash export LOG_LEVEL=debug # debug, info, warn, error ``` For production, `info` is recommended. Use `debug` only for troubleshooting. ## Configuration Examples ### Minimal Configuration ```yaml server: address: ":8080" auth: apiKey: "your_api_key_here" kubernetes: inCluster: false claude: apiKey: "your_claude_api_key" modelID: "claude-3-haiku-20240307" ``` ### Production Kubernetes Configuration ```yaml server: address: ":8080" readTimeout: 60 writeTimeout: 120 auth: apiKey: "${API_KEY}" kubernetes: inCluster: true defaultNamespace: "default" argocd: url: "https://argocd.example.com" authToken: "${ARGOCD_AUTH_TOKEN}" insecure: false gitlab: url: "https://gitlab.example.com" authToken: "${GITLAB_AUTH_TOKEN}" apiVersion: "v4" claude: apiKey: "${CLAUDE_API_KEY}" baseURL: "https://api.anthropic.com" modelID: "claude-3-haiku-20240307" maxTokens: 4096 temperature: 0.5 ``` ## Troubleshooting Configuration If you encounter issues with your configuration: 1. Check that all required fields are set correctly 2. Verify that environment variables are correctly set and accessible to the server 3. Test connectivity to external services (Kubernetes, ArgoCD, GitLab) 4. Check the server logs for error messages 5. Ensure your Claude API key is valid and has sufficient quota ### Common Issues #### "Failed to create Kubernetes client" This usually indicates an issue with the Kubernetes configuration: - Check if the kubeconfig file exists and is accessible - Verify the permissions of the kubeconfig file - For in-cluster config, ensure the pod has the proper service account #### "Failed to connect to ArgoCD" ArgoCD connectivity issues are typically related to: - Incorrect URL or credentials - Network connectivity issues - Certificate validation (if `insecure: false`) Try using the `--log-level=debug` flag to get more details: ```bash LOG_LEVEL=debug ./mcp-server --config config.yaml ``` #### "Failed to connect to GitLab" GitLab connectivity issues may be due to: - Invalid personal access token - Insufficient permissions for the token - Network connectivity issues #### "Claude API error" Claude API errors usually indicate: - Invalid API key - Rate limiting or quota issues - Incorrect model ID ## Updating Configuration You can update the configuration without restarting the server by sending a SIGHUP signal: ```bash # Find the process ID ps aux | grep mcp-server # Send SIGHUP signal kill -HUP <process_id> ``` For containerized deployments, you'll need to restart the container to apply configuration changes. ## Next Steps Now that you've configured Kubernetes Claude MCP, you can: - [Explore the API](/docs/api-overview) to learn how to interact with the server - [Try some examples](/docs/examples/basic-usage) to see common use cases - [Learn about troubleshooting](/docs/troubleshooting-resources) to diagnose issues in your cluster ``` -------------------------------------------------------------------------------- /docs/src/content/docs/troubleshooting-resources.md: -------------------------------------------------------------------------------- ```markdown --- title: Troubleshooting Resources description: Learn how to use Kubernetes Claude MCP to diagnose and solve problems with your Kubernetes resources and applications. 
date: 2025-03-01 order: 6 tags: ['troubleshooting', 'guides'] --- # Troubleshooting Resources Kubernetes Claude MCP is a powerful tool for diagnosing and resolving issues in your Kubernetes environment. This guide will walk you through common troubleshooting scenarios and how to use the MCP server to address them. ## Getting Started with Troubleshooting The `/api/v1/mcp/troubleshoot` endpoint is specifically designed for troubleshooting. It automatically: 1. Collects all relevant information about a resource 2. Detects common issues and their severity 3. Correlates information across systems (Kubernetes, ArgoCD, GitLab) 4. Generates recommendations for fixing the issues 5. Provides Claude AI-powered analysis of the problems ## Troubleshooting Common Resource Types ### Troubleshooting Pods Pods are often the first place to look when troubleshooting application issues. **Example Request:** ```bash curl -X POST \ -H "Content-Type: application/json" \ -H "X-API-Key: your_api_key" \ -d '{ "resource": "pod", "name": "my-app-pod", "namespace": "default", "query": "Why is this pod not starting?" }' \ http://localhost:8080/api/v1/mcp/troubleshoot ``` **What MCP Detects:** - Pod status issues (Pending, CrashLoopBackOff, ImagePullBackOff, etc.) - Container status and restart counts - Resource constraints (CPU/memory limits) - Volume mounting issues - Init container failures - Image pull errors - Scheduling problems - Events related to the pod **Example Troubleshooting Output:** ```json { "success": true, "analysis": "The pod 'my-app-pod' is failing to start due to an ImagePullBackOff error. The container runtime is unable to pull the image 'myregistry.com/my-app:v1.2.3' because of authentication issues with the private registry. Looking at the events, there was an 'ErrImagePull' error with the message 'unauthorized: authentication required'...", "troubleshootResult": { "issues": [ { "title": "Image Pull Error", "category": "ImagePullError", "severity": "Error", "source": "Kubernetes", "description": "Failed to pull image 'myregistry.com/my-app:v1.2.3': unauthorized: authentication required" } ], "recommendations": [ "Create or update the ImagePullSecret for the private registry", "Verify the image name and tag are correct", "Check that the ServiceAccount has access to the ImagePullSecret" ] } } ``` ### Troubleshooting Deployments Deployments manage replica sets and pods, so issues can occur at multiple levels. **Example Request:** ```bash curl -X POST \ -H "Content-Type: application/json" \ -H "X-API-Key: your_api_key" \ -d '{ "resource": "deployment", "name": "my-app", "namespace": "default", "query": "Why are pods not scaling up?" }' \ http://localhost:8080/api/v1/mcp/troubleshoot ``` **What MCP Detects:** - ReplicaSet creation issues - Pod scaling issues - Resource quotas preventing scaling - Node capacity issues - Pod disruption budgets - Deployment strategy issues - Resource constraints on pods - Health check configuration issues **Example Troubleshooting Output:** ```json { "success": true, "analysis": "The deployment 'my-app' is unable to scale up because the pods are requesting more CPU resources than are available in the cluster. 
The deployment is configured to request 2 CPU cores per pod, but the nodes in your cluster only have 1.8 cores available per node...",
  "troubleshootResult": {
    "issues": [
      {
        "title": "Insufficient CPU Resources",
        "category": "ResourceConstraint",
        "severity": "Warning",
        "source": "Kubernetes",
        "description": "Insufficient CPU resources available to schedule pods (requested: 2, available: 1.8)"
      }
    ],
    "recommendations": [
      "Reduce the CPU request in the deployment specification",
      "Add more nodes to the cluster or use nodes with more CPU capacity",
      "Check if there are any resource quotas preventing the scaling"
    ]
  }
}
```

### Troubleshooting Services

Services provide network connectivity between components, and issues often relate to selector mismatches or port configurations.

**Example Request:**

```bash
curl -X POST \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your_api_key" \
  -d '{
    "resource": "service",
    "name": "my-app-service",
    "namespace": "default",
    "query": "Why am I unable to connect to this service?"
  }' \
  http://localhost:8080/api/v1/mcp/troubleshoot
```

**What MCP Detects:**

- Selector mismatches between service and pods
- Port configuration issues
- Endpoint availability
- Pod readiness issues
- Network policy restrictions
- Service type misconfigurations
- External name resolution issues (for ExternalName services)

**Example Troubleshooting Output:**

```json
{
  "success": true,
  "analysis": "The service 'my-app-service' is not working correctly because there are no endpoints being selected. The service uses the selector 'app=my-app,tier=frontend', but examining the pods in the namespace, I can see that the pods have the labels 'app=my-app,tier=web'. The mismatch in the 'tier' label (frontend vs web) is preventing the service from selecting any pods...",
  "troubleshootResult": {
    "issues": [
      {
        "title": "Selector Mismatch",
        "category": "ServiceSelectorIssue",
        "severity": "Error",
        "source": "Kubernetes",
        "description": "Service selector 'app=my-app,tier=frontend' doesn't match any pods (pods have 'app=my-app,tier=web')"
      }
    ],
    "recommendations": [
      "Update the service selector to match the actual pod labels: 'app=my-app,tier=web'",
      "Alternatively, update the pod labels to match the service selector",
      "Verify that pods are in the 'Running' state and passing readiness probes"
    ]
  }
}
```

### Troubleshooting Ingresses

Ingress resources configure external access to services, and issues often relate to hostname mismatches or TLS configuration.

**Example Request:**

```bash
curl -X POST \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your_api_key" \
  -d '{
    "resource": "ingress",
    "name": "my-app-ingress",
    "namespace": "default",
    "query": "Why is this ingress returning 404 errors?"
  }' \
  http://localhost:8080/api/v1/mcp/troubleshoot
```

**What MCP Detects:**

- Backend service existence and configuration
- Path routing rules
- TLS certificate issues
- Ingress controller availability
- Host name configurations
- Annotation misconfigurations
- Service port mappings

## Troubleshooting GitOps Resources

Kubernetes Claude MCP excels at diagnosing issues in GitOps workflows by correlating information between Kubernetes, ArgoCD, and GitLab.

### Troubleshooting ArgoCD Applications

**Example Request:**

```bash
curl -X POST \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your_api_key" \
  -d '{
    "resource": "application",
    "name": "my-argocd-app",
    "namespace": "argocd",
    "query": "Why is this application out of sync?"
  }' \
  http://localhost:8080/api/v1/mcp/troubleshoot
```

**What MCP Detects:**

- Sync status issues
- Sync history and recent failures
- Git repository connectivity issues
- Manifest validation errors
- Resource differences between desired and actual state
- Health status issues
- Related Kubernetes resources

**Example Troubleshooting Output:**

```json
{
  "success": true,
  "analysis": "The ArgoCD application 'my-argocd-app' is out of sync because there are local changes to the Deployment resource that differ from the version in Git. Specifically, someone has manually scaled the deployment from 3 replicas (as defined in Git) to 5 replicas using kubectl...",
  "troubleshootResult": {
    "issues": [
      {
        "title": "Manual Modification",
        "category": "SyncIssue",
        "severity": "Warning",
        "source": "ArgoCD",
        "description": "Deployment 'my-app' was manually modified: replicas changed from 3 to 5"
      }
    ],
    "recommendations": [
      "Use 'argocd app sync my-argocd-app' to revert to the state defined in Git",
      "Update the Git repository to reflect the desired replica count",
      "Enable self-healing in the ArgoCD application to prevent manual modifications"
    ]
  }
}
```

### Investigating Commit Impact

When a deployment fails after a GitLab commit, you can analyze the commit's impact:

**Example Request:**

```bash
curl -X POST \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your_api_key" \
  -d '{
    "projectId": "mygroup/myproject",
    "commitSha": "abcdef1234567890",
    "query": "How has this commit affected Kubernetes resources and what issues has it caused?"
  }' \
  http://localhost:8080/api/v1/mcp/commit
```

**What MCP Analyzes:**

- Files changed in the commit
- Connected ArgoCD applications
- Affected Kubernetes resources
- Subsequent pipeline results
- Changes in resource configurations
- Introduction of new errors or warnings

## Advanced Troubleshooting Scenarios

### Multi-Resource Analysis

You can troubleshoot complex issues by instructing Claude to correlate multiple resources:

**Example Request:**

```bash
curl -X POST \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your_api_key" \
  -d '{
    "query": "Analyze the connectivity issue between the frontend deployment and the backend service in the myapp namespace. Check both the deployment and the service configurations."
  }' \
  http://localhost:8080/api/v1/mcp
```

### Diagram Generation

For complex troubleshooting scenarios, you can request diagram generation to visualize relationships:

**Example Request:**

```bash
curl -X POST \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your_api_key" \
  -d '{
    "resource": "deployment",
    "name": "my-app",
    "namespace": "default",
    "query": "Create a diagram showing how this deployment relates to all associated resources, including services, ingresses, configmaps, and secrets."
  }' \
  http://localhost:8080/api/v1/mcp/resource
```

Claude can generate Mermaid diagrams within its response to visualize the relationships.

## Troubleshooting Best Practices

When using Kubernetes Claude MCP for troubleshooting:

1. **Start specific**: Begin with the resource that's showing symptoms
2. **Go broad**: If needed, expand to related resources
3. **Use specific queries**: The more specific your query, the better Claude can help
4. **Include context**: Mention what you've already tried or specific symptoms
5. **Follow recommendations**: Try the recommended fixes one at a time
6.
**Iterate**: Use follow-up queries to dive deeper ## Real-Time Troubleshooting For ongoing issues, you can set up continuous monitoring: ```bash # Watch a resource and get alerts when issues are detected watch -n 30 'curl -s -X POST \ -H "Content-Type: application/json" \ -H "X-API-Key: your_api_key" \ -d "{\"resource\":\"deployment\",\"name\":\"my-app\",\"namespace\":\"default\",\"query\":\"Report any new issues\"}" \ http://localhost:8080/api/v1/mcp/troubleshoot | jq .troubleshootResult.issues' ``` ## Troubleshooting Reference Here's a quick reference of what to check for common Kubernetes issues: | Symptom | Resource to Check | Common Issues | |---------|-------------------|---------------| | Application not starting | Pod | Image pull errors, resource constraints, configuration issues | | Cannot connect to app | Service | Selector mismatch, port configuration, pod health | | External access failing | Ingress | Path configuration, backend service, TLS issues | | Scaling issues | Deployment | Resource constraints, pod disruption budgets, affinity rules | | Configuration issues | ConfigMap/Secret | Missing keys, invalid format, mounting issues | | Persistent storage issues | PVC | Storage class, capacity issues, access modes | | GitOps sync failures | ArgoCD Application | Git repo issues, manifest errors, resource conflicts | ``` -------------------------------------------------------------------------------- /kubernetes-claude-mcp/internal/mcp/context.go: -------------------------------------------------------------------------------- ```go package mcp import ( "context" "fmt" "strings" "time" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/internal/models" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/pkg/logging" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/pkg/utils" ) // ContextManager handles the creation and management of context for Claude type ContextManager struct { maxContextSize int logger *logging.Logger } // NewContextManager creates a new context manager func NewContextManager(maxContextSize int, logger *logging.Logger) *ContextManager { if maxContextSize <= 0 { maxContextSize = 100000 } if logger == nil { logger = logging.NewLogger().Named("context") } return &ContextManager{ maxContextSize: maxContextSize, logger: logger, } } // FormatResourceContext formats a resource context for Claude func (cm *ContextManager) FormatResourceContext(rc models.ResourceContext) (string, error) { cm.logger.Debug("Formatting resource context", "kind", rc.Kind, "name", rc.Name, "namespace", rc.Namespace) var formattedContext string // Format the basic resource information formattedContext += fmt.Sprintf("# Kubernetes Resource: %s/%s\n", rc.Kind, rc.Name) if rc.Namespace != "" { formattedContext += fmt.Sprintf("Namespace: %s\n", rc.Namespace) } formattedContext += fmt.Sprintf("API Version: %s\n\n", rc.APIVersion) // Add the full resource data if available if rc.ResourceData != "" { formattedContext += "## Resource Details\n```json\n" formattedContext += rc.ResourceData formattedContext += "\n```\n\n" } // Add resource-specific metadata if available if rc.Metadata != nil { // Add deployment-specific information if strings.EqualFold(rc.Kind, "deployment") { formattedContext += "## Deployment Status\n" // Add replica information if desiredReplicas, ok := rc.Metadata["desiredReplicas"].(int64); ok { formattedContext += fmt.Sprintf("Desired Replicas: %d\n", desiredReplicas) } if currentReplicas, ok := rc.Metadata["currentReplicas"].(int64); 
ok { formattedContext += fmt.Sprintf("Current Replicas: %d\n", currentReplicas) } if readyReplicas, ok := rc.Metadata["readyReplicas"].(int64); ok { formattedContext += fmt.Sprintf("Ready Replicas: %d\n", readyReplicas) } if availableReplicas, ok := rc.Metadata["availableReplicas"].(int64); ok { formattedContext += fmt.Sprintf("Available Replicas: %d\n", availableReplicas) } // Add container information if containers, ok := rc.Metadata["containers"].([]map[string]interface{}); ok && len(containers) > 0 { formattedContext += "\n### Containers\n" for i, container := range containers { formattedContext += fmt.Sprintf("%d. Name: %s\n", i+1, container["name"]) if image, ok := container["image"].(string); ok { formattedContext += fmt.Sprintf(" Image: %s\n", image) } if resources, ok := container["resources"].(map[string]interface{}); ok { formattedContext += " Resources:\n" if requests, ok := resources["requests"].(map[string]interface{}); ok { formattedContext += " Requests:\n" for k, v := range requests { formattedContext += fmt.Sprintf(" %s: %v\n", k, v) } } if limits, ok := resources["limits"].(map[string]interface{}); ok { formattedContext += " Limits:\n" for k, v := range limits { formattedContext += fmt.Sprintf(" %s: %v\n", k, v) } } } } } formattedContext += "\n" } } // If this is a namespace, add namespace-specific information if strings.EqualFold(rc.Kind, "namespace") { // Add resource metadata if available if rc.Metadata != nil { if resourceCounts, ok := rc.Metadata["resourceCounts"].(map[string][]string); ok { formattedContext += "## Resources in Namespace\n" for kind, resources := range resourceCounts { formattedContext += fmt.Sprintf("- %s: %d resources\n", kind, len(resources)) // List up to 5 resources of each kind if len(resources) > 0 { formattedContext += " - " for i, name := range resources { if i > 4 { formattedContext += fmt.Sprintf("and %d more...", len(resources)-5) break } if i > 0 { formattedContext += ", " } formattedContext += name } formattedContext += "\n" } } formattedContext += "\n" } if health, ok := rc.Metadata["health"].(map[string]map[string]string); ok { formattedContext += "## Health Status\n" for kind, statuses := range health { healthy := 0 unhealthy := 0 progressing := 0 unknown := 0 for _, status := range statuses { switch status { case "healthy": healthy++ case "unhealthy": unhealthy++ case "progressing": progressing++ default: unknown++ } } formattedContext += fmt.Sprintf("- %s: %d healthy, %d unhealthy, %d progressing, %d unknown\n", kind, healthy, unhealthy, progressing, unknown) // List unhealthy resources unhealthyResources := []string{} for name, status := range statuses { if status == "unhealthy" { unhealthyResources = append(unhealthyResources, name) } } if len(unhealthyResources) > 0 { formattedContext += " Unhealthy: " for i, name := range unhealthyResources { if i > 4 { formattedContext += fmt.Sprintf("and %d more...", len(unhealthyResources)-5) break } if i > 0 { formattedContext += ", " } formattedContext += name } formattedContext += "\n" } } formattedContext += "\n" } } } // Format ArgoCD information if available if rc.ArgoApplication != nil { formattedContext += "## ArgoCD Application\n" formattedContext += fmt.Sprintf("Name: %s\n", rc.ArgoApplication.Name) formattedContext += fmt.Sprintf("Sync Status: %s\n", rc.ArgoSyncStatus) formattedContext += fmt.Sprintf("Health Status: %s\n", rc.ArgoHealthStatus) if rc.ArgoApplication.Spec.Source.RepoURL != "" { formattedContext += fmt.Sprintf("Source: %s\n", rc.ArgoApplication.Spec.Source.RepoURL) 
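		// Path and targetRevision pin down exactly which manifests ArgoCD
		// renders, which helps tie sync drift back to a directory and Git ref.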
formattedContext += fmt.Sprintf("Path: %s\n", rc.ArgoApplication.Spec.Source.Path) formattedContext += fmt.Sprintf("Target Revision: %s\n", rc.ArgoApplication.Spec.Source.TargetRevision) } formattedContext += "\n" // Add recent sync history if len(rc.ArgoSyncHistory) > 0 { formattedContext += "### Recent Sync History\n" for i, history := range rc.ArgoSyncHistory { formattedContext += fmt.Sprintf("%d. [%s] Revision: %s, Status: %s\n", i+1, history.DeployedAt.Format(time.RFC3339), history.Revision, history.Status) } formattedContext += "\n" } } // Format GitLab information if available if rc.GitLabProject != nil { formattedContext += "## GitLab Project\n" formattedContext += fmt.Sprintf("Name: %s\n", rc.GitLabProject.PathWithNamespace) formattedContext += fmt.Sprintf("URL: %s\n\n", rc.GitLabProject.WebURL) // Add last pipeline information if rc.LastPipeline != nil { formattedContext += "### Last Pipeline\n" // Handle pipeline CreatedAt timestamp var pipelineTimestamp string switch createdAt := rc.LastPipeline.CreatedAt.(type) { case int64: pipelineTimestamp = time.Unix(createdAt, 0).Format(time.RFC3339) case float64: pipelineTimestamp = time.Unix(int64(createdAt), 0).Format(time.RFC3339) case string: // Try to parse the string timestamp parsed, err := time.Parse(time.RFC3339, createdAt) if err != nil { // Try alternative format parsed, err = time.Parse("2006-01-02T15:04:05.000Z", createdAt) if err != nil { // Use raw string if parsing fails pipelineTimestamp = createdAt } else { pipelineTimestamp = parsed.Format(time.RFC3339) } } else { pipelineTimestamp = parsed.Format(time.RFC3339) } default: pipelineTimestamp = "unknown timestamp" } formattedContext += fmt.Sprintf("Status: %s\n", rc.LastPipeline.Status) formattedContext += fmt.Sprintf("Ref: %s\n", rc.LastPipeline.Ref) formattedContext += fmt.Sprintf("SHA: %s\n", rc.LastPipeline.SHA) formattedContext += fmt.Sprintf("Created At: %s\n\n", pipelineTimestamp) } // Add last deployment information if rc.LastDeployment != nil { formattedContext += "### Last Deployment\n" // Handle deployment CreatedAt timestamp var deploymentTimestamp string switch createdAt := rc.LastDeployment.CreatedAt.(type) { case int64: deploymentTimestamp = time.Unix(createdAt, 0).Format(time.RFC3339) case float64: deploymentTimestamp = time.Unix(int64(createdAt), 0).Format(time.RFC3339) case string: // Try to parse the string timestamp parsed, err := time.Parse(time.RFC3339, createdAt) if err != nil { // Try alternative format parsed, err = time.Parse("2006-01-02T15:04:05.000Z", createdAt) if err != nil { // Use raw string if parsing fails deploymentTimestamp = createdAt } else { deploymentTimestamp = parsed.Format(time.RFC3339) } } else { deploymentTimestamp = parsed.Format(time.RFC3339) } default: deploymentTimestamp = "unknown timestamp" } formattedContext += fmt.Sprintf("Status: %s\n", rc.LastDeployment.Status) formattedContext += fmt.Sprintf("Environment: %s\n", rc.LastDeployment.Environment.Name) formattedContext += fmt.Sprintf("Created At: %s\n\n", deploymentTimestamp) } // Add recent commits if len(rc.RecentCommits) > 0 { formattedContext += "### Recent Commits\n" for i, commit := range rc.RecentCommits { // Handle commit CreatedAt timestamp var commitTimestamp string switch createdAt := commit.CreatedAt.(type) { case int64: commitTimestamp = time.Unix(createdAt, 0).Format(time.RFC3339) case float64: commitTimestamp = time.Unix(int64(createdAt), 0).Format(time.RFC3339) case string: // Try to parse the string timestamp parsed, err := time.Parse(time.RFC3339, 
createdAt) if err != nil { // Try alternative format parsed, err = time.Parse("2006-01-02T15:04:05.000Z", createdAt) if err != nil { // Use raw string if parsing fails commitTimestamp = createdAt } else { commitTimestamp = parsed.Format(time.RFC3339) } } else { commitTimestamp = parsed.Format(time.RFC3339) } default: commitTimestamp = "unknown timestamp" } formattedContext += fmt.Sprintf("%d. [%s] %s by %s: %s\n", i+1, commitTimestamp, commit.ShortID, commit.AuthorName, commit.Title) } formattedContext += "\n" } } // Format Kubernetes events if len(rc.Events) > 0 { formattedContext += "## Recent Kubernetes Events\n" for i, event := range rc.Events { formattedContext += fmt.Sprintf("%d. [%s] %s: %s\n", i+1, event.Type, event.Reason, event.Message) } formattedContext += "\n" } if len(rc.RelatedResources) > 0 { formattedContext += "## Related Resources\n" // Group by resource kind resourcesByKind := make(map[string][]string) for _, resource := range rc.RelatedResources { parts := strings.Split(resource, "/") if len(parts) == 2 { kind := parts[0] name := parts[1] resourcesByKind[kind] = append(resourcesByKind[kind], name) } else { // If format is unexpected, just add as is formattedContext += fmt.Sprintf("- %s\n", resource) } } // Format resources by kind for kind, names := range resourcesByKind { formattedContext += fmt.Sprintf("- %s (%d):\n", kind, len(names)) // Show up to 10 resources per kind maxToShow := 10 if len(names) > maxToShow { for i := 0; i < maxToShow; i++ { formattedContext += fmt.Sprintf(" - %s\n", names[i]) } formattedContext += fmt.Sprintf(" - ... and %d more\n", len(names)-maxToShow) } else { for _, name := range names { formattedContext += fmt.Sprintf(" - %s\n", name) } } } formattedContext += "\n" } // Add errors if any if len(rc.Errors) > 0 { formattedContext += "## Errors in Data Collection\n" for _, err := range rc.Errors { formattedContext += fmt.Sprintf("- %s\n", err) } formattedContext += "\n" } // Ensure context doesn't exceed max size if len(formattedContext) > cm.maxContextSize { cm.logger.Debug("Context exceeds maximum size, truncating", "originalSize", len(formattedContext), "maxSize", cm.maxContextSize) formattedContext = utils.TruncateContextSmartly(formattedContext, cm.maxContextSize) } cm.logger.Debug("Formatted resource context", "kind", rc.Kind, "name", rc.Name, "contextSize", len(formattedContext)) return formattedContext, nil } // CombineContexts combines multiple resource contexts into a single context func (cm *ContextManager) CombineContexts(ctx context.Context, resourceContexts []models.ResourceContext) (string, error) { cm.logger.Debug("Combining resource contexts", "count", len(resourceContexts)) var combinedContext string combinedContext += fmt.Sprintf("# Kubernetes GitOps Context (%d resources)\n\n", len(resourceContexts)) // Add context for each resource for i, rc := range resourceContexts { resourceContext, err := cm.FormatResourceContext(rc) if err != nil { return "", fmt.Errorf("failed to format resource context #%d: %w", i+1, err) } combinedContext += fmt.Sprintf("--- RESOURCE %d/%d ---\n", i+1, len(resourceContexts)) combinedContext += resourceContext combinedContext += "------------------------\n\n" } // Ensure combined context doesn't exceed max size if len(combinedContext) > cm.maxContextSize { cm.logger.Debug("Combined context exceeds maximum size, truncating", "originalSize", len(combinedContext), "maxSize", cm.maxContextSize) combinedContext = utils.TruncateContextSmartly(combinedContext, cm.maxContextSize) } 
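	// Any truncation has already been applied above, so the size logged
	// here reflects the context that will actually be sent to Claude.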
cm.logger.Debug("Combined resource contexts", "resourceCount", len(resourceContexts), "contextSize", len(combinedContext)) return combinedContext, nil } ``` -------------------------------------------------------------------------------- /kubernetes-claude-mcp/internal/api/routes.go: -------------------------------------------------------------------------------- ```go package api import ( "encoding/json" "fmt" "net/http" "strings" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/internal/models" "github.com/gorilla/mux" ) // setupRoutes configures the API routes func (s *Server) setupRoutes() { // Apply CORS middleware to all routes s.router.Use(s.corsMiddleware) // API version prefix apiV1 := s.router.PathPrefix("/api/v1").Subrouter() // Health check endpoint (no auth required) apiV1.HandleFunc("/health", s.handleHealth).Methods("GET") // Add authentication middleware to all other routes apiSecure := apiV1.NewRoute().Subrouter() apiSecure.Use(s.authMiddleware) // MCP endpoints apiSecure.HandleFunc("/mcp", s.handleMCPRequest).Methods("POST") apiSecure.HandleFunc("/mcp/resource", s.handleResourceQuery).Methods("POST") apiSecure.HandleFunc("/mcp/commit", s.handleCommitQuery).Methods("POST") apiSecure.HandleFunc("/mcp/troubleshoot", s.handleTroubleshoot).Methods("POST") // Kubernetes resource endpoints apiSecure.HandleFunc("/namespaces", s.handleListNamespaces).Methods("GET") apiSecure.HandleFunc("/resources/{resource}", s.handleListResources).Methods("GET") apiSecure.HandleFunc("/resources/{resource}/{name}", s.handleGetResource).Methods("GET") apiSecure.HandleFunc("/events", s.handleGetEvents).Methods("GET") // ArgoCD endpoints apiSecure.HandleFunc("/argocd/applications", s.handleListArgoApplications).Methods("GET") apiSecure.HandleFunc("/argocd/applications/{name}", s.handleGetArgoApplication).Methods("GET") // GitLab endpoints apiSecure.HandleFunc("/gitlab/projects", s.handleListGitLabProjects).Methods("GET") apiSecure.HandleFunc("/gitlab/projects/{projectId}/pipelines", s.handleListGitLabPipelines).Methods("GET") // Merge Request endpoints apiSecure.HandleFunc("/mcp/mergeRequest", s.handleMergeRequestQuery).Methods("POST") } // handleMergeRequestQuery handles MCP requests for analyzing merge requests func (s *Server) handleMergeRequestQuery(w http.ResponseWriter, r *http.Request) { var request models.MCPRequest if err := json.NewDecoder(r.Body).Decode(&request); err != nil { s.respondWithError(w, http.StatusBadRequest, "Invalid request format", err) return } // Force action to be queryMergeRequest request.Action = "queryMergeRequest" // Validate merge request parameters if request.ProjectID == "" || request.MergeRequestIID <= 0 { s.respondWithError(w, http.StatusBadRequest, "Project ID and merge request IID are required", nil) return } s.logger.Info("Received merge request query", "projectId", request.ProjectID, "mergeRequestIID", request.MergeRequestIID) // Process the request response, err := s.mcpHandler.ProcessRequest(r.Context(), &request) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to process request", err) return } s.respondWithJSON(w, http.StatusOK, response) } // handleHealth handles health check requests func (s *Server) handleHealth(w http.ResponseWriter, r *http.Request) { type healthResponse struct { Status string `json:"status"` Services map[string]string `json:"services"` } // Check each service services := map[string]string{ "kubernetes": "unknown", "argocd": "unknown", "gitlab": "unknown", "claude": "unknown", } ctx := 
r.Context() // Check Kubernetes connectivity if err := s.k8sClient.CheckConnectivity(ctx); err != nil { services["kubernetes"] = "unavailable" s.logger.Warn("Kubernetes health check failed", "error", err) } else { services["kubernetes"] = "available" } // Check ArgoCD connectivity if err := s.argoClient.CheckConnectivity(ctx); err != nil { services["argocd"] = "unavailable" s.logger.Warn("ArgoCD health check failed", "error", err) } else { services["argocd"] = "available" } // Check GitLab connectivity if err := s.gitlabClient.CheckConnectivity(ctx); err != nil { services["gitlab"] = "unavailable" s.logger.Warn("GitLab health check failed", "error", err) } else { services["gitlab"] = "available" } // For Claude, we just assume it's available since we don't want to make an API call // in a health check endpoint services["claude"] = "assumed available" // Determine overall status status := "ok" if services["kubernetes"] != "available" { status = "degraded" } response := healthResponse{ Status: status, Services: services, } w.Header().Set("Content-Type", "application/json") w.WriteHeader(http.StatusOK) json.NewEncoder(w).Encode(response) } // handleMCPRequest handles generic MCP requests func (s *Server) handleMCPRequest(w http.ResponseWriter, r *http.Request) { var request models.MCPRequest if err := json.NewDecoder(r.Body).Decode(&request); err != nil { s.respondWithError(w, http.StatusBadRequest, "Invalid request format", err) return } s.logger.Info("Received MCP request", "action", request.Action) // Process the request response, err := s.mcpHandler.ProcessRequest(r.Context(), &request) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to process request", err) return } s.respondWithJSON(w, http.StatusOK, response) } // handleResourceQuery handles MCP requests for querying resources func (s *Server) handleResourceQuery(w http.ResponseWriter, r *http.Request) { var request models.MCPRequest if err := json.NewDecoder(r.Body).Decode(&request); err != nil { s.respondWithError(w, http.StatusBadRequest, "Invalid request format", err) return } // Force action to be queryResource request.Action = "queryResource" // Validate resource parameters if request.Resource == "" || request.Name == "" { s.respondWithError(w, http.StatusBadRequest, "Resource and name are required", nil) return } s.logger.Info("Received resource query", "resource", request.Resource, "name", request.Name, "namespace", request.Namespace) // Special handling for namespace resources to provide comprehensive data if strings.ToLower(request.Resource) == "namespace" { // Get namespace topology topology, err := s.k8sClient.GetNamespaceTopology(r.Context(), request.Name) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to get namespace topology", err) return } // Get all resources in the namespace resources, err := s.k8sClient.GetAllNamespaceResources(r.Context(), request.Name) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to get namespace resources", err) return } // Get namespace analysis analysis, err := s.mcpHandler.AnalyzeNamespace(r.Context(), request.Name) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to analyze namespace", err) return } // Create an enhanced request with the gathered data enhancedRequest := request enhancedRequest.Context = fmt.Sprintf("# Namespace Analysis: %s\n\n", request.Name) enhancedRequest.Context += fmt.Sprintf("## Resource Counts\n") for kind, count := range resources.Stats { 
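			// resources.Stats maps each resource kind to the number of instances
			// found in the namespace; each entry below becomes one Markdown
			// bullet in the context handed to Claude.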
enhancedRequest.Context += fmt.Sprintf("- %s: %d\n", kind, count) } enhancedRequest.Context += "\n## Resource Relationships\n" for _, rel := range topology.Relationships { enhancedRequest.Context += fmt.Sprintf("- %s/%s → %s/%s (%s)\n", rel.SourceKind, rel.SourceName, rel.TargetKind, rel.TargetName, rel.RelationType) } enhancedRequest.Context += "\n## Health Status\n" for kind, statuses := range topology.Health { healthy := 0 unhealthy := 0 progressing := 0 unknown := 0 for _, status := range statuses { switch status { case "healthy": healthy++ case "unhealthy": unhealthy++ case "progressing": progressing++ default: unknown++ } } enhancedRequest.Context += fmt.Sprintf("- %s: %d healthy, %d unhealthy, %d progressing, %d unknown\n", kind, healthy, unhealthy, progressing, unknown) } // Get events for the namespace events, err := s.k8sClient.GetNamespaceEvents(r.Context(), request.Name) if err == nil && len(events) > 0 { enhancedRequest.Context += "\n## Recent Events\n" for i, event := range events { if i >= 10 { break // Limit to 10 events } enhancedRequest.Context += fmt.Sprintf("- [%s] %s: %s\n", event.Type, event.Reason, event.Message) } } // Process the enhanced request response, err := s.mcpHandler.ProcessRequest(r.Context(), &enhancedRequest) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to process request", err) return } // Add analysis insights to the response if analysis != nil { response.NamespaceAnalysis = analysis } s.respondWithJSON(w, http.StatusOK, response) return } // Process regular resource query response, err := s.mcpHandler.ProcessRequest(r.Context(), &request) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to process request", err) return } s.respondWithJSON(w, http.StatusOK, response) } // handleCommitQuery handles MCP requests for analyzing commits func (s *Server) handleCommitQuery(w http.ResponseWriter, r *http.Request) { var request models.MCPRequest if err := json.NewDecoder(r.Body).Decode(&request); err != nil { s.respondWithError(w, http.StatusBadRequest, "Invalid request format", err) return } // Force action to be queryCommit request.Action = "queryCommit" // Validate commit parameters if request.ProjectID == "" || request.CommitSHA == "" { s.respondWithError(w, http.StatusBadRequest, "Project ID and commit SHA are required", nil) return } s.logger.Info("Received commit query", "projectId", request.ProjectID, "commitSha", request.CommitSHA) // Process the request response, err := s.mcpHandler.ProcessRequest(r.Context(), &request) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to process request", err) return } s.respondWithJSON(w, http.StatusOK, response) } // handleTroubleshoot handles troubleshooting requests func (s *Server) handleTroubleshoot(w http.ResponseWriter, r *http.Request) { var request struct { Resource string `json:"resource"` Name string `json:"name"` Namespace string `json:"namespace"` Query string `json:"query,omitempty"` } if err := json.NewDecoder(r.Body).Decode(&request); err != nil { s.respondWithError(w, http.StatusBadRequest, "Invalid request format", err) return } // Validate parameters if request.Resource == "" || request.Name == "" { s.respondWithError(w, http.StatusBadRequest, "Resource and name are required", nil) return } s.logger.Info("Received troubleshoot request", "resource", request.Resource, "name", request.Name, "namespace", request.Namespace) // Process the troubleshooting request result, err := 
s.troubleshootCorrelator.TroubleshootResource( r.Context(), request.Namespace, request.Resource, request.Name, ) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to troubleshoot resource", err) return } // If there's a query, use Claude to analyze the results if request.Query != "" { mcpRequest := &models.MCPRequest{ Resource: request.Resource, Name: request.Name, Namespace: request.Namespace, Query: request.Query, } response, err := s.mcpHandler.ProcessTroubleshootRequest(r.Context(), mcpRequest, result) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to process troubleshoot analysis", err) return } // Add the troubleshoot result to the response responseWithResult := struct { *models.MCPResponse TroubleshootResult *models.TroubleshootResult `json:"troubleshootResult"` }{ MCPResponse: response, TroubleshootResult: result, } s.respondWithJSON(w, http.StatusOK, responseWithResult) return } // If no query, just return the troubleshoot result s.respondWithJSON(w, http.StatusOK, result) } // handleListNamespaces handles requests to list namespaces func (s *Server) handleListNamespaces(w http.ResponseWriter, r *http.Request) { namespaces, err := s.k8sClient.GetNamespaces(r.Context()) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to list namespaces", err) return } s.respondWithJSON(w, http.StatusOK, map[string][]string{"namespaces": namespaces}) } // handleListResources handles requests to list resources of a specific type func (s *Server) handleListResources(w http.ResponseWriter, r *http.Request) { vars := mux.Vars(r) resourceType := vars["resource"] namespace := r.URL.Query().Get("namespace") resources, err := s.k8sClient.ListResources(r.Context(), resourceType, namespace) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to list resources", err) return } s.respondWithJSON(w, http.StatusOK, map[string]interface{}{"resources": resources}) } // handleGetResource handles requests to get a specific resource func (s *Server) handleGetResource(w http.ResponseWriter, r *http.Request) { vars := mux.Vars(r) resourceType := vars["resource"] name := vars["name"] namespace := r.URL.Query().Get("namespace") resource, err := s.k8sClient.GetResource(r.Context(), resourceType, namespace, name) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to get resource", err) return } s.respondWithJSON(w, http.StatusOK, resource) } // handleGetEvents handles requests to get events func (s *Server) handleGetEvents(w http.ResponseWriter, r *http.Request) { namespace := r.URL.Query().Get("namespace") resourceType := r.URL.Query().Get("resource") name := r.URL.Query().Get("name") events, err := s.k8sClient.GetResourceEvents(r.Context(), namespace, resourceType, name) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to get events", err) return } s.respondWithJSON(w, http.StatusOK, map[string]interface{}{"events": events}) } // handleListArgoApplications handles requests to list ArgoCD applications func (s *Server) handleListArgoApplications(w http.ResponseWriter, r *http.Request) { applications, err := s.argoClient.ListApplications(r.Context()) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to list ArgoCD applications", err) return } s.respondWithJSON(w, http.StatusOK, map[string]interface{}{"applications": applications}) } // handleGetArgoApplication handles requests to get a specific ArgoCD application func (s *Server) 
handleGetArgoApplication(w http.ResponseWriter, r *http.Request) { vars := mux.Vars(r) name := vars["name"] application, err := s.argoClient.GetApplication(r.Context(), name) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to get ArgoCD application", err) return } s.respondWithJSON(w, http.StatusOK, application) } // handleListGitLabProjects handles requests to list GitLab projects func (s *Server) handleListGitLabProjects(w http.ResponseWriter, r *http.Request) { // This would typically include pagination parameters projects, err := s.gitlabClient.ListProjects(r.Context()) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to list GitLab projects", err) return } s.respondWithJSON(w, http.StatusOK, map[string]interface{}{"projects": projects}) } // handleListGitLabPipelines handles requests to list GitLab pipelines func (s *Server) handleListGitLabPipelines(w http.ResponseWriter, r *http.Request) { vars := mux.Vars(r) projectId := vars["projectId"] pipelines, err := s.gitlabClient.ListPipelines(r.Context(), projectId) if err != nil { s.respondWithError(w, http.StatusInternalServerError, "Failed to list GitLab pipelines", err) return } s.respondWithJSON(w, http.StatusOK, map[string]interface{}{"pipelines": pipelines}) } // Helper methods // respondWithError sends an error response to the client func (s *Server) respondWithError(w http.ResponseWriter, code int, message string, err error) { errorResponse := map[string]string{ "error": message, } if err != nil { errorResponse["details"] = err.Error() s.logger.Error(message, "error", err, "code", code) } else { s.logger.Warn(message, "code", code) } w.Header().Set("Content-Type", "application/json") w.WriteHeader(code) json.NewEncoder(w).Encode(errorResponse) } // respondWithJSON sends a JSON response to the client func (s *Server) respondWithJSON(w http.ResponseWriter, code int, payload interface{}) { w.Header().Set("Content-Type", "application/json") w.WriteHeader(code) json.NewEncoder(w).Encode(payload) } ``` -------------------------------------------------------------------------------- /kubernetes-claude-mcp/internal/correlator/gitops.go: -------------------------------------------------------------------------------- ```go package correlator import ( "context" "fmt" "strings" "time" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/internal/argocd" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/internal/gitlab" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/internal/k8s" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/internal/models" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/pkg/logging" ) // GitOpsCorrelator correlates data between Kubernetes, ArgoCD, and GitLab type GitOpsCorrelator struct { k8sClient *k8s.Client argoClient *argocd.Client gitlabClient *gitlab.Client helmCorrelator *HelmCorrelator logger *logging.Logger } // NewGitOpsCorrelator creates a new GitOps correlator func NewGitOpsCorrelator(k8sClient *k8s.Client, argoClient *argocd.Client, gitlabClient *gitlab.Client, logger *logging.Logger) *GitOpsCorrelator { if logger == nil { logger = logging.NewLogger().Named("correlator") } correlator := &GitOpsCorrelator{ k8sClient: k8sClient, argoClient: argoClient, gitlabClient: gitlabClient, logger: logger, } // Initialize the Helm correlator correlator.helmCorrelator = NewHelmCorrelator(gitlabClient, logger.Named("helm")) return correlator } // Add a new method to analyze merge requests 
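// AnalyzeMergeRequest resolves which Kubernetes resources a GitLab merge
// request may affect by matching the MR's changed files (and any Helm-derived
// resources) against ArgoCD application sources. A hypothetical call, assuming
// a project "group/app" and merge request !42:
//
//	contexts, err := correlator.AnalyzeMergeRequest(ctx, "group/app", 42)
//	for _, rc := range contexts {
//		fmt.Printf("affected: %s/%s in %s\n", rc.Kind, rc.Name, rc.Namespace)
//	}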
func (c *GitOpsCorrelator) AnalyzeMergeRequest( ctx context.Context, projectID string, mergeRequestIID int, ) ([]models.ResourceContext, error) { c.logger.Info("Analyzing merge request", "projectID", projectID, "mergeRequestIID", mergeRequestIID) // Get merge request details mergeRequest, err := c.gitlabClient.AnalyzeMergeRequest(ctx, projectID, mergeRequestIID) if err != nil { return nil, fmt.Errorf("failed to analyze merge request: %w", err) } // Check if the MR affects Helm charts or Kubernetes manifests if !mergeRequest.MergeRequestContext.HelmChartAffected && !mergeRequest.MergeRequestContext.KubernetesManifest { c.logger.Info("Merge request does not affect Kubernetes resources") return []models.ResourceContext{}, nil } // Get all ArgoCD applications argoApps, err := c.argoClient.ListApplications(ctx) if err != nil { return nil, fmt.Errorf("failed to list ArgoCD applications: %w", err) } // Find the project path projectPath := fmt.Sprintf("%s", projectID) project, err := c.gitlabClient.GetProject(ctx, projectID) if err == nil && project != nil { projectPath = project.PathWithNamespace } // For Helm-affected MRs, analyze Helm changes var helmAffectedResources []string if mergeRequest.MergeRequestContext.HelmChartAffected { helmResources, err := c.helmCorrelator.AnalyzeMergeRequestHelmChanges(ctx, projectID, mergeRequestIID) if err != nil { c.logger.Warn("Failed to analyze Helm changes in MR", "error", err) } else if len(helmResources) > 0 { helmAffectedResources = helmResources c.logger.Info("Found resources affected by Helm changes in MR", "count", len(helmResources)) } } // Identify potentially affected applications var affectedApps []models.ArgoApplication for _, app := range argoApps { if isAppSourcedFromProject(app, projectPath) { // For each file changed in the MR, check if it affects the app isAffected := false // Check if any changed file affects the app for _, file := range mergeRequest.MergeRequestContext.AffectedFiles { if isFileInAppSourcePath(app, file) { isAffected = true break } } // Check Helm-derived resources if !isAffected && len(helmAffectedResources) > 0 { if appContainsAnyResource(ctx, c.argoClient, app, helmAffectedResources) { isAffected = true } } if isAffected { affectedApps = append(affectedApps, app) } } } // For each affected app, identify the resources that would be affected var result []models.ResourceContext for _, app := range affectedApps { c.logger.Info("Found potentially affected ArgoCD application", "app", app.Name) // Get resources managed by this application tree, err := c.argoClient.GetResourceTree(ctx, app.Name) if err != nil { c.logger.Warn("Failed to get resource tree", "app", app.Name, "error", err) continue } // For each resource in the app, create a deployment info object for _, node := range tree.Nodes { // Skip non-Kubernetes resources or resources with no name/namespace if node.Kind == "" || node.Name == "" { continue } // Avoid unnecessary duplicates in the result if isResourceAlreadyInResults(result, node.Kind, node.Name, node.Namespace) { continue } // Trace the deployment for this resource resourceContext, err := c.TraceResourceDeployment( ctx, node.Namespace, node.Kind, node.Name, ) if err != nil { c.logger.Warn("Failed to trace resource deployment", "kind", node.Kind, "name", node.Name, "namespace", node.Namespace, "error", err) continue } // Add source info resourceContext.RelatedResources = append(resourceContext.RelatedResources, fmt.Sprintf("MergeRequest/%d", mergeRequestIID)) // Add to results result = append(result, 
resourceContext) } } // Add cleanup on exit defer func() { if c.helmCorrelator != nil { c.helmCorrelator.Cleanup() } }() c.logger.Info("Analysis of merge request completed", "projectID", projectID, "mergeRequestIID", mergeRequestIID, "resourceCount", len(result)) return result, nil } func (c *GitOpsCorrelator) TraceResourceDeployment( ctx context.Context, namespace, kind, name string, ) (models.ResourceContext, error) { c.logger.Info("Tracing resource deployment", "kind", kind, "name", name, "namespace", namespace) resourceContext := models.ResourceContext{ Kind: kind, Name: name, Namespace: namespace, } var errors []string // Get Kubernetes resource information resource, err := c.k8sClient.GetResource(ctx, kind, namespace, name) if err != nil { errMsg := fmt.Sprintf("Failed to get Kubernetes resource: %v", err) errors = append(errors, errMsg) c.logger.Warn(errMsg) } else { resourceContext.APIVersion = resource.GetAPIVersion() // Get events related to this resource events, err := c.k8sClient.GetResourceEvents(ctx, namespace, kind, name) if err != nil { errMsg := fmt.Sprintf("Failed to get resource events: %v", err) errors = append(errors, errMsg) c.logger.Warn(errMsg) } else { resourceContext.Events = events } } // Find the ArgoCD application managing this resource argoApps, err := c.argoClient.FindApplicationsByResource(ctx, kind, name, namespace) if err != nil { errMsg := fmt.Sprintf("Failed to find ArgoCD applications: %v", err) errors = append(errors, errMsg) c.logger.Warn(errMsg) } else if len(argoApps) > 0 { // Use the first application that manages this resource app := argoApps[0] resourceContext.ArgoApplication = &app resourceContext.ArgoSyncStatus = app.Status.Sync.Status resourceContext.ArgoHealthStatus = app.Status.Health.Status // Get recent syncs history, err := c.argoClient.GetApplicationHistory(ctx, app.Name) if err != nil { errMsg := fmt.Sprintf("Failed to get application history: %v", err) errors = append(errors, errMsg) c.logger.Warn(errMsg) } else { // Limit to recent syncs (last 5) if len(history) > 5 { history = history[:5] } resourceContext.ArgoSyncHistory = history } // Connect to GitLab if we have source information if app.Spec.Source.RepoURL != "" { // Extract GitLab project path from repo URL projectPath := extractGitLabProjectPath(app.Spec.Source.RepoURL) if projectPath != "" { project, err := c.gitlabClient.GetProjectByPath(ctx, projectPath) if err != nil { errMsg := fmt.Sprintf("Failed to get GitLab project: %v", err) errors = append(errors, errMsg) c.logger.Warn(errMsg) } else { resourceContext.GitLabProject = project // Get recent pipelines pipelines, err := c.gitlabClient.ListPipelines(ctx, fmt.Sprintf("%d", project.ID)) if err != nil { errMsg := fmt.Sprintf("Failed to list pipelines: %v", err) errors = append(errors, errMsg) c.logger.Warn(errMsg) } else { // Get the latest pipeline if len(pipelines) > 0 { resourceContext.LastPipeline = &pipelines[0] } } // Find environment from ArgoCD application environment := extractEnvironmentFromArgoApp(app) if environment != "" { // Get recent deployments to this environment deployments, err := c.gitlabClient.FindRecentDeployments( ctx, fmt.Sprintf("%d", project.ID), environment, ) if err != nil { errMsg := fmt.Sprintf("Failed to find deployments: %v", err) errors = append(errors, errMsg) c.logger.Warn(errMsg) } else if len(deployments) > 0 { resourceContext.LastDeployment = &deployments[0] } } // Get recent commits sinceTime := time.Now().Add(-24 * time.Hour) // Last 24 hours commits, err := 
c.gitlabClient.FindRecentChanges( ctx, fmt.Sprintf("%d", project.ID), sinceTime, ) if err != nil { errMsg := fmt.Sprintf("Failed to find recent changes: %v", err) errors = append(errors, errMsg) c.logger.Warn(errMsg) } else { // Here we'll limit to recent commits (last 5)... if len(commits) > 5 { commits = commits[:5] } resourceContext.RecentCommits = commits } } } } } // Collect any errors that occurred during correlation resourceContext.Errors = errors c.logger.Info("Resource deployment traced", "kind", kind, "name", name, "namespace", namespace, "argoApp", resourceContext.ArgoApplication != nil, "gitlabProject", resourceContext.GitLabProject != nil, "errors", len(errors)) return resourceContext, nil } // isFileInAppSourcePath checks if a file is in the application's source path func isFileInAppSourcePath(app models.ArgoApplication, file string) bool { sourcePath := app.Spec.Source.Path if sourcePath == "" { // If no specific path is provided, any file could affect the app return true } return strings.HasPrefix(file, sourcePath) } // hasHelmChanges checks if any of the changed files are related to Helm charts func hasHelmChanges(diffs []models.GitLabDiff) bool { for _, diff := range diffs { path := diff.NewPath if strings.Contains(path, "Chart.yaml") || strings.Contains(path, "values.yaml") || (strings.Contains(path, "templates/") && strings.HasSuffix(path, ".yaml")) { return true } } return false } // appContainsAnyResource checks if an ArgoCD application contains any of the specified resources func appContainsAnyResource(ctx context.Context, argoClient *argocd.Client, app models.ArgoApplication, resources []string) bool { tree, err := argoClient.GetResourceTree(ctx, app.Name) if err != nil { return false } for _, resource := range resources { parts := strings.Split(resource, "/") if len(parts) == 2 { // Format: Kind/Name kind := parts[0] name := parts[1] for _, node := range tree.Nodes { if strings.EqualFold(node.Kind, kind) && node.Name == name { return true } } } else if len(parts) == 3 { // Format: Namespace/Kind/Name namespace := parts[0] kind := parts[1] name := parts[2] for _, node := range tree.Nodes { if strings.EqualFold(node.Kind, kind) && node.Name == name && node.Namespace == namespace { return true } } } } return false } // FindResourcesAffectedByCommit finds resources affected by a specific Git commit func (c *GitOpsCorrelator) FindResourcesAffectedByCommit( ctx context.Context, projectID string, commitSHA string, ) ([]models.ResourceContext, error) { c.logger.Info("Finding resources affected by commit", "projectID", projectID, "commitSHA", commitSHA) var result []models.ResourceContext // Get commit information from GitLab commit, err := c.gitlabClient.GetCommit(ctx, projectID, commitSHA) if err != nil { return nil, fmt.Errorf("failed to get commit: %w", err) } c.logger.Info("Processing commit", "author", commit.AuthorName, "message", commit.Title) // Get commit diff to see what files were changed diffs, err := c.gitlabClient.GetCommitDiff(ctx, projectID, commitSHA) if err != nil { return nil, fmt.Errorf("failed to get commit diff: %w", err) } // Get all ArgoCD applications argoApps, err := c.argoClient.ListApplications(ctx) if err != nil { return nil, fmt.Errorf("failed to list ArgoCD applications: %w", err) } // Find applications that use this GitLab project as source projectPath := fmt.Sprintf("%s", projectID) // This might need more parsing depending on projectID format project, err := c.gitlabClient.GetProject(ctx, projectID) if err == nil && project != nil { 
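		// Prefer the human-readable path (e.g. "group/project") over the numeric
		// ID: isAppSourcedFromProject compares this value against the project
		// path extracted from each ArgoCD application's repo URL.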
projectPath = project.PathWithNamespace } // For each application, check if it's affected by the changed files for _, app := range argoApps { if !isAppSourcedFromProject(app, projectPath) { continue } // Check if the commit affects files used by this application if isAppAffectedByDiffs(app, diffs) { c.logger.Info("Found affected ArgoCD application", "app", app.Name) // Get resources managed by this application tree, err := c.argoClient.GetResourceTree(ctx, app.Name) if err != nil { c.logger.Warn("Failed to get resource tree", "app", app.Name, "error", err) continue } // For each resource in the app, create a deployment info object for _, node := range tree.Nodes { // Skip non-Kubernetes resources or resources with no name/namespace if node.Kind == "" || node.Name == "" { continue } // Avoid unnecessary duplicates in the result if isResourceAlreadyInResults(result, node.Kind, node.Name, node.Namespace) { continue } // Trace the deployment for this resource resourceContext, err := c.TraceResourceDeployment( ctx, node.Namespace, node.Kind, node.Name, ) if err != nil { c.logger.Warn("Failed to trace resource deployment", "kind", node.Kind, "name", node.Name, "namespace", node.Namespace, "error", err) continue } result = append(result, resourceContext) } } } c.logger.Info("Found resources affected by commit", "projectID", projectID, "commitSHA", commitSHA, "resourceCount", len(result)) return result, nil } // Helper functions // extractGitLabProjectPath extracts the GitLab project path from a repo URL func extractGitLabProjectPath(repoURL string) string { // Handle different URL formats // Format: https://gitlab.com/namespace/project.git if strings.HasPrefix(repoURL, "https://") || strings.HasPrefix(repoURL, "http://") { parts := strings.Split(repoURL, "/") if len(parts) < 3 { return "" } // Remove ".git" suffix if present lastPart := parts[len(parts)-1] if strings.HasSuffix(lastPart, ".git") { parts[len(parts)-1] = lastPart[:len(lastPart)-4] } // Reconstruct path without protocol and domain domainIndex := 2 // After http:// or https:// if len(parts) <= domainIndex+1 { return "" } return strings.Join(parts[domainIndex+1:], "/") } // Format: git@gitlab.com:namespace/project.git if strings.HasPrefix(repoURL, "git@") { // Split at ":" to get the path part parts := strings.Split(repoURL, ":") if len(parts) != 2 { return "" } // Remove ".git" suffix if present pathPart := parts[1] if strings.HasSuffix(pathPart, ".git") { pathPart = pathPart[:len(pathPart)-4] } return pathPart } return "" } // extractEnvironmentFromArgoApp tries to determine the environment from an ArgoCD application func extractEnvironmentFromArgoApp(app models.ArgoApplication) string { // Check for environment in labels if env, ok := app.Metadata.Labels["environment"]; ok { return env } if env, ok := app.Metadata.Labels["env"]; ok { return env } // Check if environment is in the destination namespace if strings.Contains(app.Spec.Destination.Namespace, "prod") { return "production" } if strings.Contains(app.Spec.Destination.Namespace, "staging") { return "staging" } if strings.Contains(app.Spec.Destination.Namespace, "dev") { return "development" } // Check path in source for environment indicators if app.Spec.Source.Path != "" { if strings.Contains(app.Spec.Source.Path, "prod") { return "production" } if strings.Contains(app.Spec.Source.Path, "staging") { return "staging" } if strings.Contains(app.Spec.Source.Path, "dev") { return "development" } } // Default to destination namespace as a fallback return
app.Spec.Destination.Namespace } // isAppSourcedFromProject checks if an ArgoCD application uses a specific GitLab project func isAppSourcedFromProject(app models.ArgoApplication, projectPath string) bool { // Extract project path from app's repo URL appProjectPath := extractGitLabProjectPath(app.Spec.Source.RepoURL) // Compare paths return strings.EqualFold(appProjectPath, projectPath) } // isAppAffectedByDiffs checks if application manifests are affected by file changes func isAppAffectedByDiffs(app models.ArgoApplication, diffs []models.GitLabDiff) bool { sourcePath := app.Spec.Source.Path if sourcePath == "" { // If no specific path is provided, any change could affect the app return true } // Check if any changed file is in the application's source path for _, diff := range diffs { if strings.HasPrefix(diff.NewPath, sourcePath) || strings.HasPrefix(diff.OldPath, sourcePath) { return true } } return false } // isResourceAlreadyInResults checks if a resource is already in the results list func isResourceAlreadyInResults(results []models.ResourceContext, kind, name, namespace string) bool { for _, rc := range results { if rc.Kind == kind && rc.Name == name && rc.Namespace == namespace { return true } } return false } ``` -------------------------------------------------------------------------------- /kubernetes-claude-mcp/internal/k8s/resource_mapper.go: -------------------------------------------------------------------------------- ```go package k8s import ( "context" "fmt" "strings" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" "k8s.io/apimachinery/pkg/runtime/schema" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/pkg/logging" ) // ResourceMapper maps relationships between Kubernetes resources type ResourceMapper struct { client *Client logger *logging.Logger } // ResourceRelationship represents a relationship between two resources type ResourceRelationship struct { SourceKind string `json:"sourceKind"` SourceName string `json:"sourceName"` SourceNamespace string `json:"sourceNamespace"` TargetKind string `json:"targetKind"` TargetName string `json:"targetName"` TargetNamespace string `json:"targetNamespace"` RelationType string `json:"relationType"` } // NamespaceTopology represents the topology of resources in a namespace type NamespaceTopology struct { Namespace string `json:"namespace"` Resources map[string][]string `json:"resources"` Relationships []ResourceRelationship `json:"relationships"` Metrics map[string]map[string]int `json:"metrics"` Health map[string]map[string]string `json:"health"` } // NewResourceMapper creates a new resource mapper func NewResourceMapper(client *Client) *ResourceMapper { return &ResourceMapper{ client: client, logger: client.logger.Named("resource-mapper"), } } // GetNamespaceTopology maps all resources and their relationships in a namespace func (m *ResourceMapper) GetNamespaceTopology(ctx context.Context, namespace string) (*NamespaceTopology, error) { m.logger.Info("Mapping namespace topology", "namespace", namespace) // Initialize topology topology := &NamespaceTopology{ Namespace: namespace, Resources: make(map[string][]string), Relationships: []ResourceRelationship{}, Metrics: make(map[string]map[string]int), Health: make(map[string]map[string]string), } // Discover all available resource types resources, err := m.client.discoveryClient.ServerPreferredResources() if err != nil { return nil, fmt.Errorf("failed to get server resources: %w", err) } // Collect all namespaced 
resources for _, resourceList := range resources { gv, err := schema.ParseGroupVersion(resourceList.GroupVersion) if err != nil { m.logger.Warn("Failed to parse group version", "groupVersion", resourceList.GroupVersion) continue } for _, r := range resourceList.APIResources { // Skip resources that can't be listed or aren't namespaced if !strings.Contains(r.Verbs.String(), "list") || !r.Namespaced { continue } // Build GVR for this resource type gvr := schema.GroupVersionResource{ Group: gv.Group, Version: gv.Version, Resource: r.Name, } // List resources of this type m.logger.Debug("Listing resources", "namespace", namespace, "resource", r.Name) list, err := m.client.dynamicClient.Resource(gvr).Namespace(namespace).List(ctx, metav1.ListOptions{}) if err != nil { m.logger.Warn("Failed to list resources", "namespace", namespace, "resource", r.Name, "error", err) continue } // Add to topology if len(list.Items) > 0 { topology.Resources[r.Kind] = make([]string, len(list.Items)) topology.Metrics[r.Kind] = map[string]int{"count": len(list.Items)} topology.Health[r.Kind] = make(map[string]string) for i, item := range list.Items { topology.Resources[r.Kind][i] = item.GetName() // Determine health status health := m.determineResourceHealth(&item) topology.Health[r.Kind][item.GetName()] = health } // Find relationships for this resource type relationships := m.findRelationships(ctx, list.Items, namespace) topology.Relationships = append(topology.Relationships, relationships...) } } } m.logger.Info("Namespace topology mapped", "namespace", namespace, "resourceTypes", len(topology.Resources), "relationships", len(topology.Relationships)) return topology, nil } // GetResourceGraph returns a resource graph for visualization func (m *ResourceMapper) GetResourceGraph(ctx context.Context, namespace string) (map[string]interface{}, error) { topology, err := m.GetNamespaceTopology(ctx, namespace) if err != nil { return nil, err } // Convert topology to graph format graph := map[string]interface{}{ "nodes": []map[string]interface{}{}, "edges": []map[string]interface{}{}, } // Add nodes nodeIndex := make(map[string]int) nodeCount := 0 for kind, resources := range topology.Resources { for _, name := range resources { health := "unknown" if h, ok := topology.Health[kind][name]; ok { health = h } node := map[string]interface{}{ "id": nodeCount, "kind": kind, "name": name, "health": health, "group": kind, } // Add to nodes array graph["nodes"] = append(graph["nodes"].([]map[string]interface{}), node) // Save index for edge creation nodeIndex[fmt.Sprintf("%s/%s", kind, name)] = nodeCount nodeCount++ } } // Add edges for _, rel := range topology.Relationships { sourceKey := fmt.Sprintf("%s/%s", rel.SourceKind, rel.SourceName) targetKey := fmt.Sprintf("%s/%s", rel.TargetKind, rel.TargetName) sourceIdx, sourceOk := nodeIndex[sourceKey] targetIdx, targetOk := nodeIndex[targetKey] if sourceOk && targetOk { edge := map[string]interface{}{ "source": sourceIdx, "target": targetIdx, "relationship": rel.RelationType, } graph["edges"] = append(graph["edges"].([]map[string]interface{}), edge) } } return graph, nil } // findRelationships discovers relationships between resources func (m *ResourceMapper) findRelationships(ctx context.Context, resources []unstructured.Unstructured, namespace string) []ResourceRelationship { var relationships []ResourceRelationship for _, resource := range resources { // Check owner references for _, ownerRef := range resource.GetOwnerReferences() { rel := ResourceRelationship{ SourceKind: 
ownerRef.Kind, SourceName: ownerRef.Name, SourceNamespace: namespace, TargetKind: resource.GetKind(), TargetName: resource.GetName(), TargetNamespace: namespace, RelationType: "owns", } relationships = append(relationships, rel) } // Check for Pod -> Service relationships (via labels/selectors) if resource.GetKind() == "Service" { selector, found, _ := unstructured.NestedMap(resource.Object, "spec", "selector") if found && len(selector) > 0 { // Find pods matching this selector pods, err := m.client.clientset.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{ LabelSelector: m.labelsToSelector(selector), }) if err == nil { for _, pod := range pods.Items { rel := ResourceRelationship{ SourceKind: "Service", SourceName: resource.GetName(), SourceNamespace: namespace, TargetKind: "Pod", TargetName: pod.Name, TargetNamespace: namespace, RelationType: "selects", } relationships = append(relationships, rel) } } } } // Check for ConfigMap/Secret references in Pods if resource.GetKind() == "Pod" { // Check volumes for ConfigMap references volumes, found, _ := unstructured.NestedSlice(resource.Object, "spec", "volumes") if found { for _, v := range volumes { volume, ok := v.(map[string]interface{}) if !ok { continue } // Check for ConfigMap references if configMap, hasConfigMap, _ := unstructured.NestedMap(volume, "configMap"); hasConfigMap { if cmName, hasName, _ := unstructured.NestedString(configMap, "name"); hasName { rel := ResourceRelationship{ SourceKind: "Pod", SourceName: resource.GetName(), SourceNamespace: namespace, TargetKind: "ConfigMap", TargetName: cmName, TargetNamespace: namespace, RelationType: "mounts", } relationships = append(relationships, rel) } } // Check for Secret references if secret, hasSecret, _ := unstructured.NestedMap(volume, "secret"); hasSecret { if secretName, hasName, _ := unstructured.NestedString(secret, "secretName"); hasName { rel := ResourceRelationship{ SourceKind: "Pod", SourceName: resource.GetName(), SourceNamespace: namespace, TargetKind: "Secret", TargetName: secretName, TargetNamespace: namespace, RelationType: "mounts", } relationships = append(relationships, rel) } } } } // Check environment variables for ConfigMap/Secret references containers, found, _ := unstructured.NestedSlice(resource.Object, "spec", "containers") if found { for _, c := range containers { container, ok := c.(map[string]interface{}) if !ok { continue } // Check for EnvFrom references envFrom, hasEnvFrom, _ := unstructured.NestedSlice(container, "envFrom") if hasEnvFrom { for _, ef := range envFrom { envFromObj, ok := ef.(map[string]interface{}) if !ok { continue } // Check for ConfigMap references if configMap, hasConfigMap, _ := unstructured.NestedMap(envFromObj, "configMapRef"); hasConfigMap { if cmName, hasName, _ := unstructured.NestedString(configMap, "name"); hasName { rel := ResourceRelationship{ SourceKind: "Pod", SourceName: resource.GetName(), SourceNamespace: namespace, TargetKind: "ConfigMap", TargetName: cmName, TargetNamespace: namespace, RelationType: "configures", } relationships = append(relationships, rel) } } // Check for Secret references if secret, hasSecret, _ := unstructured.NestedMap(envFromObj, "secretRef"); hasSecret { if secretName, hasName, _ := unstructured.NestedString(secret, "name"); hasName { rel := ResourceRelationship{ SourceKind: "Pod", SourceName: resource.GetName(), SourceNamespace: namespace, TargetKind: "Secret", TargetName: secretName, TargetNamespace: namespace, RelationType: "configures", } relationships = append(relationships, rel) 
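					// envFrom imports every key of the referenced ConfigMap/Secret,
					// so a single "configures" edge per referenced object is enough.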
} } } } // Check individual env vars for ConfigMap/Secret references env, hasEnv, _ := unstructured.NestedSlice(container, "env") if hasEnv { for _, e := range env { envVar, ok := e.(map[string]interface{}) if !ok { continue } // Check for ConfigMap references if valueFrom, hasValueFrom, _ := unstructured.NestedMap(envVar, "valueFrom"); hasValueFrom { if configMap, hasConfigMap, _ := unstructured.NestedMap(valueFrom, "configMapKeyRef"); hasConfigMap { if cmName, hasName, _ := unstructured.NestedString(configMap, "name"); hasName { rel := ResourceRelationship{ SourceKind: "Pod", SourceName: resource.GetName(), SourceNamespace: namespace, TargetKind: "ConfigMap", TargetName: cmName, TargetNamespace: namespace, RelationType: "configures", } relationships = append(relationships, rel) } } // Check for Secret references if secret, hasSecret, _ := unstructured.NestedMap(valueFrom, "secretKeyRef"); hasSecret { if secretName, hasName, _ := unstructured.NestedString(secret, "name"); hasName { rel := ResourceRelationship{ SourceKind: "Pod", SourceName: resource.GetName(), SourceNamespace: namespace, TargetKind: "Secret", TargetName: secretName, TargetNamespace: namespace, RelationType: "configures", } relationships = append(relationships, rel) } } } } } } } } // Check for PVC -> PV relationships if resource.GetKind() == "PersistentVolumeClaim" { volumeName, found, _ := unstructured.NestedString(resource.Object, "spec", "volumeName") if found && volumeName != "" { rel := ResourceRelationship{ SourceKind: "PersistentVolumeClaim", SourceName: resource.GetName(), SourceNamespace: namespace, TargetKind: "PersistentVolume", TargetName: volumeName, TargetNamespace: "", RelationType: "binds", } relationships = append(relationships, rel) } } // Check for Ingress -> Service relationships if resource.GetKind() == "Ingress" { rules, found, _ := unstructured.NestedSlice(resource.Object, "spec", "rules") if found { for _, r := range rules { rule, ok := r.(map[string]interface{}) if !ok { continue } http, found, _ := unstructured.NestedMap(rule, "http") if !found { continue } paths, found, _ := unstructured.NestedSlice(http, "paths") if !found { continue } for _, p := range paths { path, ok := p.(map[string]interface{}) if !ok { continue } backend, found, _ := unstructured.NestedMap(path, "backend") if !found { // Check for newer API version format backend, found, _ = unstructured.NestedMap(path, "backend", "service") if !found { continue } } serviceName, found, _ := unstructured.NestedString(backend, "name") if found { rel := ResourceRelationship{ SourceKind: "Ingress", SourceName: resource.GetName(), SourceNamespace: namespace, TargetKind: "Service", TargetName: serviceName, TargetNamespace: namespace, RelationType: "routes", } relationships = append(relationships, rel) } } } } } } // Deduplicate relationships deduplicatedRelationships := make([]ResourceRelationship, 0) relMap := make(map[string]bool) for _, rel := range relationships { key := fmt.Sprintf("%s/%s/%s/%s/%s/%s/%s", rel.SourceKind, rel.SourceName, rel.SourceNamespace, rel.TargetKind, rel.TargetName, rel.TargetNamespace, rel.RelationType) if _, exists := relMap[key]; !exists { relMap[key] = true deduplicatedRelationships = append(deduplicatedRelationships, rel) } } return deduplicatedRelationships } // labelsToSelector converts a map of labels to a selector string func (m *ResourceMapper) labelsToSelector(labels map[string]interface{}) string { var selectors []string for key, value := range labels { if strValue, ok := value.(string); ok { selectors = 
append(selectors, fmt.Sprintf("%s=%s", key, strValue)) } } return strings.Join(selectors, ",") } // determineResourceHealth determines the health status of a resource func (m *ResourceMapper) determineResourceHealth(obj *unstructured.Unstructured) string { kind := obj.GetKind() // Check common status fields status, found, _ := unstructured.NestedMap(obj.Object, "status") if !found { return "unknown" } // Check different resource types switch kind { case "Pod": phase, found, _ := unstructured.NestedString(status, "phase") if found { switch phase { case "Running", "Succeeded": return "healthy" case "Pending": return "progressing" case "Failed": return "unhealthy" default: return "unknown" } } case "Deployment", "StatefulSet", "DaemonSet", "ReplicaSet": // Check if all replicas are available replicas, foundReplicas, _ := unstructured.NestedInt64(obj.Object, "spec", "replicas") if !foundReplicas { replicas = 1 // Default to 1 if not specified } availableReplicas, foundAvailable, _ := unstructured.NestedInt64(status, "availableReplicas") if foundAvailable && availableReplicas == replicas { return "healthy" } else if foundAvailable && availableReplicas > 0 { return "progressing" } else { return "unhealthy" } case "Service": // Services are typically healthy unless they have no endpoints // We'd need to check endpoints separately return "healthy" case "Ingress": // Check if LoadBalancer has assigned addresses ingress, found, _ := unstructured.NestedSlice(status, "loadBalancer", "ingress") if found && len(ingress) > 0 { return "healthy" } return "progressing" case "PersistentVolumeClaim": phase, found, _ := unstructured.NestedString(status, "phase") if found && phase == "Bound" { return "healthy" } else if found && phase == "Pending" { return "progressing" } else { return "unhealthy" } case "Job": conditions, found, _ := unstructured.NestedSlice(status, "conditions") if found { for _, c := range conditions { condition, ok := c.(map[string]interface{}) if !ok { continue } condType, typeFound, _ := unstructured.NestedString(condition, "type") condStatus, statusFound, _ := unstructured.NestedString(condition, "status") if typeFound && statusFound && condType == "Complete" && condStatus == "True" { return "healthy" } else if typeFound && statusFound && condType == "Failed" && condStatus == "True" { return "unhealthy" } } return "progressing" } default: // For other resources, try to check common status conditions conditions, found, _ := unstructured.NestedSlice(status, "conditions") if found { for _, c := range conditions { condition, ok := c.(map[string]interface{}) if !ok { continue } condType, typeFound, _ := unstructured.NestedString(condition, "type") condStatus, statusFound, _ := unstructured.NestedString(condition, "status") if typeFound && statusFound { // Check for common condition types indicating health if (condType == "Ready" || condType == "Available") && condStatus == "True" { return "healthy" } else if condType == "Progressing" && condStatus == "True" { return "progressing" } else if (condType == "Failed" || condType == "Error") && condStatus == "True" { return "unhealthy" } } } } } return "unknown" } ``` -------------------------------------------------------------------------------- /kubernetes-claude-mcp/internal/correlator/troubleshoot.go: -------------------------------------------------------------------------------- ```go package correlator import ( "context" "fmt" "strings" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/internal/k8s" 
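	// The troubleshooter reads status fields via apimachinery's unstructured
	// helpers (imported below), so it can analyze arbitrary resource kinds
	// without typed clients.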
"github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/internal/models" "github.com/Blankcut/kubernetes-mcp-server/kubernetes-claude-mcp/pkg/logging" "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" ) // TroubleshootCorrelator provides specialized logic for troubleshooting type TroubleshootCorrelator struct { gitOpsCorrelator *GitOpsCorrelator k8sClient *k8s.Client logger *logging.Logger } // NewTroubleshootCorrelator creates a new troubleshooting correlator func NewTroubleshootCorrelator(gitOpsCorrelator *GitOpsCorrelator, k8sClient *k8s.Client, logger *logging.Logger) *TroubleshootCorrelator { if logger == nil { logger = logging.NewLogger().Named("troubleshoot") } return &TroubleshootCorrelator{ gitOpsCorrelator: gitOpsCorrelator, k8sClient: k8sClient, logger: logger, } } // TroubleshootResource analyzes a resource for common issues func (tc *TroubleshootCorrelator) TroubleshootResource(ctx context.Context, namespace, kind, name string) (*models.TroubleshootResult, error) { tc.logger.Info("Troubleshooting resource", "kind", kind, "name", name, "namespace", namespace) // First, trace the resource deployment resourceContext, err := tc.gitOpsCorrelator.TraceResourceDeployment(ctx, namespace, kind, name) if err != nil { return nil, fmt.Errorf("failed to trace resource deployment: %w", err) } // Get the raw resource for detailed analysis resource, err := tc.k8sClient.GetResource(ctx, kind, namespace, name) if err != nil { tc.logger.Warn("Failed to get resource for detailed analysis", "error", err) } // Initialize troubleshooting result result := &models.TroubleshootResult{ ResourceContext: resourceContext, Issues: []models.Issue{}, Recommendations: []string{}, } // Analyze Kubernetes events for issues tc.analyzeKubernetesEvents(resourceContext, result) // Analyze resource status and conditions if resource was retrieved if resource != nil { // Pod-specific analysis if strings.EqualFold(kind, "pod") { tc.analyzePodStatus(ctx, resource, result) } // Deployment-specific analysis if strings.EqualFold(kind, "deployment") { tc.analyzeDeploymentStatus(resource, result) } } // Analyze ArgoCD sync status tc.analyzeArgoStatus(resourceContext, result) // Analyze GitLab pipeline status tc.analyzeGitLabStatus(resourceContext, result) // Check if resource is healthy if len(result.Issues) == 0 && resource != nil && !tc.isResourceHealthy(resource) { issue := models.Issue{ Source: "Kubernetes", Category: "UnknownIssue", Severity: "Warning", Title: "Resource Not Healthy", Description: fmt.Sprintf("%s %s/%s is not in a healthy state", kind, namespace, name), } result.Issues = append(result.Issues, issue) } // Generate recommendations based on issues tc.generateRecommendations(result) tc.logger.Info("Troubleshooting completed", "kind", kind, "name", name, "namespace", namespace, "issueCount", len(result.Issues), "recommendationCount", len(result.Recommendations)) return result, nil } // isResourceHealthy checks if a resource is in a healthy state func (tc *TroubleshootCorrelator) isResourceHealthy(resource *unstructured.Unstructured) bool { kind := resource.GetKind() // Pod health check if strings.EqualFold(kind, "pod") { phase, found, _ := unstructured.NestedString(resource.Object, "status", "phase") return found && phase == "Running" } // Deployment health check if strings.EqualFold(kind, "deployment") { // Check if available replicas match desired replicas desiredReplicas, found1, _ := unstructured.NestedInt64(resource.Object, "spec", "replicas") availableReplicas, found2, _ := 
unstructured.NestedInt64(resource.Object, "status", "availableReplicas") return found1 && found2 && desiredReplicas == availableReplicas && availableReplicas > 0 } // Default: assume healthy return true } // analyzeDeploymentStatus analyzes deployment-specific status func (tc *TroubleshootCorrelator) analyzeDeploymentStatus(deployment *unstructured.Unstructured, result *models.TroubleshootResult) { // Check if deployment is ready desiredReplicas, found1, _ := unstructured.NestedInt64(deployment.Object, "spec", "replicas") availableReplicas, found2, _ := unstructured.NestedInt64(deployment.Object, "status", "availableReplicas") readyReplicas, found3, _ := unstructured.NestedInt64(deployment.Object, "status", "readyReplicas") if !found1 || !found2 || availableReplicas < desiredReplicas { issue := models.Issue{ Source: "Kubernetes", Category: "DeploymentNotAvailable", Severity: "Warning", Title: "Deployment Not Fully Available", Description: fmt.Sprintf("Deployment has %d/%d available replicas", availableReplicas, desiredReplicas), } result.Issues = append(result.Issues, issue) } if !found1 || !found3 || readyReplicas < desiredReplicas { issue := models.Issue{ Source: "Kubernetes", Category: "DeploymentNotReady", Severity: "Warning", Title: "Deployment Not Fully Ready", Description: fmt.Sprintf("Deployment has %d/%d ready replicas", readyReplicas, desiredReplicas), } result.Issues = append(result.Issues, issue) } // Check deployment conditions conditions, found, _ := unstructured.NestedSlice(deployment.Object, "status", "conditions") if found { for _, c := range conditions { condition, ok := c.(map[string]interface{}) if !ok { continue } conditionType, _, _ := unstructured.NestedString(condition, "type") status, _, _ := unstructured.NestedString(condition, "status") reason, _, _ := unstructured.NestedString(condition, "reason") message, _, _ := unstructured.NestedString(condition, "message") if conditionType == "Available" && status != "True" { issue := models.Issue{ Source: "Kubernetes", Category: "DeploymentNotAvailable", Severity: "Warning", Title: "Deployment Not Available", Description: fmt.Sprintf("Deployment availability issue: %s - %s", reason, message), } result.Issues = append(result.Issues, issue) } if conditionType == "Progressing" && status != "True" { issue := models.Issue{ Source: "Kubernetes", Category: "DeploymentNotProgressing", Severity: "Warning", Title: "Deployment Not Progressing", Description: fmt.Sprintf("Deployment progress issue: %s - %s", reason, message), } result.Issues = append(result.Issues, issue) } } } } // analyzePodStatus analyzes pod-specific status information func (tc *TroubleshootCorrelator) analyzePodStatus(ctx context.Context, pod *unstructured.Unstructured, result *models.TroubleshootResult) { // Check pod phase phase, found, _ := unstructured.NestedString(pod.Object, "status", "phase") if found && phase != "Running" && phase != "Succeeded" { issue := models.Issue{ Source: "Kubernetes", Category: "PodNotRunning", Severity: "Warning", Title: "Pod Not Running", Description: fmt.Sprintf("Pod is in %s state", phase), } if phase == "Pending" { issue.Title = "Pod Pending" issue.Description = "Pod is still in Pending state and hasn't started running" } else if phase == "Failed" { issue.Severity = "Error" issue.Title = "Pod Failed" } result.Issues = append(result.Issues, issue) } // Check pod conditions conditions, found, _ := unstructured.NestedSlice(pod.Object, "status", "conditions") if found { for _, c := range conditions { condition, ok := 
c.(map[string]interface{}) if !ok { continue } conditionType, _, _ := unstructured.NestedString(condition, "type") status, _, _ := unstructured.NestedString(condition, "status") if conditionType == "PodScheduled" && status != "True" { issue := models.Issue{ Source: "Kubernetes", Category: "SchedulingIssue", Severity: "Warning", Title: "Pod Scheduling Issue", Description: "Pod cannot be scheduled onto a node", } result.Issues = append(result.Issues, issue) } if conditionType == "Initialized" && status != "True" { issue := models.Issue{ Source: "Kubernetes", Category: "InitializationIssue", Severity: "Warning", Title: "Pod Initialization Issue", Description: "Pod initialization containers have not completed successfully", } result.Issues = append(result.Issues, issue) } if conditionType == "ContainersReady" && status != "True" { issue := models.Issue{ Source: "Kubernetes", Category: "ContainerReadinessIssue", Severity: "Warning", Title: "Container Readiness Issue", Description: "One or more containers are not ready", } result.Issues = append(result.Issues, issue) } if conditionType == "Ready" && status != "True" { issue := models.Issue{ Source: "Kubernetes", Category: "PodNotReady", Severity: "Warning", Title: "Pod Not Ready", Description: "Pod is not ready to serve traffic", } result.Issues = append(result.Issues, issue) } } } // Check container statuses containerStatuses, found, _ := unstructured.NestedSlice(pod.Object, "status", "containerStatuses") if found { tc.analyzeContainerStatuses(containerStatuses, false, result) } // Check init container statuses if they exist initContainerStatuses, found, _ := unstructured.NestedSlice(pod.Object, "status", "initContainerStatuses") if found { tc.analyzeContainerStatuses(initContainerStatuses, true, result) } // Check for volume issues volumes, found, _ := unstructured.NestedSlice(pod.Object, "spec", "volumes") if found { // Track PVC usage pvcVolumes := []string{} for _, v := range volumes { volume, ok := v.(map[string]interface{}) if !ok { continue } // Check for PVC volumes pvc, pvcFound, _ := unstructured.NestedMap(volume, "persistentVolumeClaim") if pvcFound && pvc != nil { claimName, nameFound, _ := unstructured.NestedString(pvc, "claimName") if nameFound && claimName != "" { pvcVolumes = append(pvcVolumes, claimName) } } } // If PVC volumes found, check their status if len(pvcVolumes) > 0 { for _, pvcName := range pvcVolumes { pvc, err := tc.k8sClient.GetResource(ctx, "persistentvolumeclaim", pod.GetNamespace(), pvcName) if err != nil { issue := models.Issue{ Source: "Kubernetes", Category: "VolumeIssue", Severity: "Warning", Title: "PVC Not Found", Description: fmt.Sprintf("PersistentVolumeClaim %s not found", pvcName), } result.Issues = append(result.Issues, issue) continue } phase, phaseFound, _ := unstructured.NestedString(pvc.Object, "status", "phase") if !phaseFound || phase != "Bound" { issue := models.Issue{ Source: "Kubernetes", Category: "VolumeIssue", Severity: "Warning", Title: "PVC Not Bound", Description: fmt.Sprintf("PersistentVolumeClaim %s is in %s state", pvcName, phase), } result.Issues = append(result.Issues, issue) } } } } } // analyzeContainerStatuses analyzes container status information func (tc *TroubleshootCorrelator) analyzeContainerStatuses(statuses []interface{}, isInit bool, result *models.TroubleshootResult) { containerType := "Container" if isInit { containerType = "Init Container" } for _, cs := range statuses { containerStatus, ok := cs.(map[string]interface{}) if !ok { continue } containerName, _, _ := 

// analyzeContainerStatuses analyzes container status information
func (tc *TroubleshootCorrelator) analyzeContainerStatuses(statuses []interface{}, isInit bool, result *models.TroubleshootResult) {
	containerType := "Container"
	if isInit {
		containerType = "Init Container"
	}

	for _, cs := range statuses {
		containerStatus, ok := cs.(map[string]interface{})
		if !ok {
			continue
		}

		containerName, _, _ := unstructured.NestedString(containerStatus, "name")
		ready, _, _ := unstructured.NestedBool(containerStatus, "ready")
		restartCount, _, _ := unstructured.NestedInt64(containerStatus, "restartCount")

		if !ready {
			// Check for specific container state
			state, stateExists, _ := unstructured.NestedMap(containerStatus, "state")
			if stateExists && state != nil {
				waitingState, waitingExists, _ := unstructured.NestedMap(state, "waiting")
				if waitingExists && waitingState != nil {
					reason, reasonFound, _ := unstructured.NestedString(waitingState, "reason")
					message, messageFound, _ := unstructured.NestedString(waitingState, "message")

					reasonStr := ""
					if reasonFound {
						reasonStr = reason
					}
					messageStr := ""
					if messageFound {
						messageStr = message
					}

					issue := models.Issue{
						Source:      "Kubernetes",
						Category:    "ContainerWaiting",
						Severity:    "Warning",
						Title:       fmt.Sprintf("%s %s Waiting", containerType, containerName),
						Description: fmt.Sprintf("%s is waiting: %s - %s", containerType, reasonStr, messageStr),
					}

					if reason == "CrashLoopBackOff" {
						issue.Category = "CrashLoopBackOff"
						issue.Severity = "Error"
						issue.Title = fmt.Sprintf("%s %s CrashLoopBackOff", containerType, containerName)
					} else if reason == "ImagePullBackOff" || reason == "ErrImagePull" {
						issue.Category = "ImagePullError"
						issue.Title = fmt.Sprintf("%s %s Image Pull Error", containerType, containerName)
					} else if reason == "PodInitializing" || reason == "ContainerCreating" {
						issue.Category = "PodInitializing"
						issue.Title = fmt.Sprintf("%s Still Initializing", containerType)
						issue.Description = fmt.Sprintf("%s is still being created or initialized", containerType)
					}

					result.Issues = append(result.Issues, issue)
				}

				terminatedState, terminatedExists, _ := unstructured.NestedMap(state, "terminated")
				if terminatedExists && terminatedState != nil {
					reason, reasonFound, _ := unstructured.NestedString(terminatedState, "reason")
					exitCode, exitCodeFound, _ := unstructured.NestedInt64(terminatedState, "exitCode")
					message, messageFound, _ := unstructured.NestedString(terminatedState, "message")

					reasonStr := ""
					if reasonFound {
						reasonStr = reason
					}
					messageStr := ""
					if messageFound {
						messageStr = message
					}
					var exitCodeVal int64 = 0
					if exitCodeFound {
						exitCodeVal = exitCode
					}

					if exitCodeVal != 0 {
						issue := models.Issue{
							Source:      "Kubernetes",
							Category:    "ContainerTerminated",
							Severity:    "Error",
							Title:       fmt.Sprintf("%s %s Terminated", containerType, containerName),
							Description: fmt.Sprintf("%s terminated with exit code %d: %s - %s", containerType, exitCodeVal, reasonStr, messageStr),
						}
						result.Issues = append(result.Issues, issue)
					}
				}
			}
		}

		if restartCount > 3 {
			issue := models.Issue{
				Source:      "Kubernetes",
				Category:    "FrequentRestarts",
				Severity:    "Warning",
				Title:       fmt.Sprintf("%s %s Frequent Restarts", containerType, containerName),
				Description: fmt.Sprintf("%s has restarted %d times", containerType, restartCount),
			}
			result.Issues = append(result.Issues, issue)
		}
	}
}
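
// Illustrative sketch (editorial addition; container name is hypothetical):
// a containerStatus map shaped like
//
//	{"name": "app", "ready": false, "restartCount": 7,
//	 "state": {"waiting": {"reason": "CrashLoopBackOff", "message": "back-off 5m0s"}}}
//
// would produce two issues from the function above: an Error
// "Container app CrashLoopBackOff" and a Warning
// "Container app Frequent Restarts" (because restartCount > 3).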

// analyzeKubernetesEvents looks for common issues in Kubernetes events
func (tc *TroubleshootCorrelator) analyzeKubernetesEvents(rc models.ResourceContext, result *models.TroubleshootResult) {
	for _, event := range rc.Events {
		// Look for error events
		if event.Type == "Warning" {
			issue := models.Issue{
				Source:      "Kubernetes",
				Severity:    "Warning",
				Description: fmt.Sprintf("%s: %s", event.Reason, event.Message),
			}

			// Categorize common issues
			switch {
			case strings.Contains(event.Reason, "Failed") && strings.Contains(event.Message, "ImagePull"):
				issue.Category = "ImagePullError"
				issue.Title = "Image Pull Failure"
			case strings.Contains(event.Reason, "Unhealthy"):
				issue.Category = "HealthCheckFailure"
				issue.Title = "Health Check Failure"
			case strings.Contains(event.Message, "memory"):
				issue.Category = "ResourceIssue"
				issue.Title = "Memory Resource Issue"
			case strings.Contains(event.Message, "cpu"):
				issue.Category = "ResourceIssue"
				issue.Title = "CPU Resource Issue"
			case strings.Contains(event.Reason, "BackOff"):
				issue.Category = "CrashLoopBackOff"
				issue.Title = "Container Crash Loop"
			default:
				issue.Category = "OtherWarning"
				issue.Title = "Kubernetes Warning"
			}

			result.Issues = append(result.Issues, issue)
		}
	}
}

// analyzeArgoStatus looks for issues in ArgoCD status
func (tc *TroubleshootCorrelator) analyzeArgoStatus(rc models.ResourceContext, result *models.TroubleshootResult) {
	if rc.ArgoApplication == nil {
		// No ArgoCD application managing this resource
		return
	}

	// Check sync status
	if rc.ArgoSyncStatus != "Synced" {
		issue := models.Issue{
			Source:      "ArgoCD",
			Category:    "SyncIssue",
			Severity:    "Warning",
			Title:       "ArgoCD Sync Issue",
			Description: fmt.Sprintf("Application %s is not synced (status: %s)", rc.ArgoApplication.Name, rc.ArgoSyncStatus),
		}
		result.Issues = append(result.Issues, issue)
	}

	// Check health status
	if rc.ArgoHealthStatus != "Healthy" {
		issue := models.Issue{
			Source:      "ArgoCD",
			Category:    "HealthIssue",
			Severity:    "Warning",
			Title:       "ArgoCD Health Issue",
			Description: fmt.Sprintf("Application %s is not healthy (status: %s)", rc.ArgoApplication.Name, rc.ArgoHealthStatus),
		}
		result.Issues = append(result.Issues, issue)
	}

	// Check for recent sync failures
	for _, history := range rc.ArgoSyncHistory {
		if history.Status == "Failed" {
			issue := models.Issue{
				Source:      "ArgoCD",
				Category:    "SyncFailure",
				Severity:    "Error",
				Title:       "Recent Sync Failure",
				Description: fmt.Sprintf("Sync at %s failed with revision %s", history.DeployedAt.Format("2006-01-02 15:04:05"), history.Revision),
			}
			result.Issues = append(result.Issues, issue)
			break // Only report the most recent failure
		}
	}
}

// analyzeGitLabStatus looks for issues in GitLab pipelines and deployments
func (tc *TroubleshootCorrelator) analyzeGitLabStatus(rc models.ResourceContext, result *models.TroubleshootResult) {
	if rc.GitLabProject == nil {
		// No GitLab project information
		return
	}

	// Check last pipeline status
	if rc.LastPipeline != nil && rc.LastPipeline.Status != "success" {
		severity := "Warning"
		if rc.LastPipeline.Status == "failed" {
			severity = "Error"
		}

		issue := models.Issue{
			Source:      "GitLab",
			Category:    "PipelineIssue",
			Severity:    severity,
			Title:       "GitLab Pipeline Issue",
			Description: fmt.Sprintf("Pipeline #%d status: %s", rc.LastPipeline.ID, rc.LastPipeline.Status),
		}
		result.Issues = append(result.Issues, issue)
	}

	// Check last deployment status
	if rc.LastDeployment != nil && rc.LastDeployment.Status != "success" {
		severity := "Warning"
		if rc.LastDeployment.Status == "failed" {
			severity = "Error"
		}

		issue := models.Issue{
			Source:      "GitLab",
			Category:    "DeploymentIssue",
			Severity:    severity,
			Title:       "GitLab Deployment Issue",
			Description: fmt.Sprintf("Deployment to %s status: %s", rc.LastDeployment.Environment.Name, rc.LastDeployment.Status),
		}
		result.Issues = append(result.Issues, issue)
	}
}
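
// Illustrative sketch (editorial addition): if result.Issues contains one
// "CrashLoopBackOff" issue and one "ImagePullError" issue, then
// generateRecommendations (below) fills result.Recommendations with four
// deduplicated entries, e.g. "Check container logs for errors." and
// "Verify that the image tag exists in the registry."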
"ImagePullError": recommendationMap["Check image name and credentials for accessing private registries."] = true recommendationMap["Verify that the image tag exists in the registry."] = true case "HealthCheckFailure": recommendationMap["Review liveness and readiness probe configuration."] = true recommendationMap["Check application logs for errors during startup."] = true case "ResourceIssue": recommendationMap["Review resource requests and limits in the deployment."] = true recommendationMap["Monitor resource usage to determine appropriate values."] = true case "CrashLoopBackOff": recommendationMap["Check container logs for errors."] = true recommendationMap["Verify environment variables and configuration."] = true case "SyncIssue", "SyncFailure": recommendationMap["Check ArgoCD application manifest for errors."] = true recommendationMap["Verify that the target revision exists in the Git repository."] = true case "PipelineIssue": recommendationMap["Review GitLab pipeline logs for errors."] = true recommendationMap["Check if the pipeline configuration is valid."] = true case "DeploymentIssue": recommendationMap["Check GitLab deployment job logs for errors."] = true recommendationMap["Verify deployment environment configuration."] = true case "PodNotRunning", "PodNotReady", "PodInitializing": recommendationMap["Check pod events for scheduling or initialization issues."] = true recommendationMap["Examine init container logs for errors."] = true case "InitializationIssue": recommendationMap["Check init container logs for errors."] = true recommendationMap["Verify that volumes can be mounted properly."] = true case "ContainerReadinessIssue": recommendationMap["Review readiness probe configuration."] = true recommendationMap["Check container logs for application startup issues."] = true case "VolumeIssue": recommendationMap["Verify that PersistentVolumeClaims are bound."] = true recommendationMap["Check if storage classes are properly configured."] = true recommendationMap["Ensure sufficient storage space is available on the nodes."] = true case "SchedulingIssue": recommendationMap["Check if nodes have sufficient resources for the pod."] = true recommendationMap["Verify that node selectors or taints are not preventing scheduling."] = true } } // Add generic recommendations if no specific issues found if len(result.Issues) == 0 { recommendationMap["Check pod logs for errors."] = true recommendationMap["Examine Kubernetes events for the resource."] = true recommendationMap["Verify network connectivity between components."] = true } // Convert map to slice for rec := range recommendationMap { result.Recommendations = append(result.Recommendations, rec) } } ```