K8sGPT: The Basics and a Quick Tutorial

What Is K8sGPT?

K8sGPT is a tool that uses generative AI to improve Kubernetes management. It integrates with Kubernetes, allowing developers to simplify tasks like monitoring, troubleshooting, and optimizing workloads. 

By using state-of-the-art large language models (LLMs) such as OpenAI's GPT-4 and Google's Gemini, K8sGPT enables users to interact with Kubernetes clusters in a more intuitive manner. It makes complex operations accessible to developers with varying levels of Kubernetes expertise, reducing the learning curve associated with managing containerized applications.

Benefits of Generative AI for Kubernetes

Kubernetes administrators often encounter challenges such as the steep learning curve of large-scale cluster management; misconfigurations like missing service accounts or services without endpoints; and the difficulty of monitoring and troubleshooting Kubernetes environments. These issues can lead to failed deployments, creating problems for DevOps teams and site reliability engineers (SREs).

Traditionally, resolving these problems requires a meticulous manual process—sifting through logs, events, and configuration files to identify the root cause. K8sGPT addresses these challenges by using generative AI to automate the identification and resolution of common Kubernetes issues. It has evolved from a command-line interface tool to an automated SRE assistant.

The tool integrates with multiple AI platforms, including OpenAI, Azure, and Google Gemini, allowing it to analyze Kubernetes environments and generate descriptive problem summaries along with practical solutions. By anonymizing sensitive information like pod names, K8sGPT ensures data privacy.
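As a quick illustration, anonymization can be requested directly on the command line. This is a sketch based on the k8sgpt CLI's anonymize flag; confirm the exact flags for your version with `k8sgpt analyze --help`:

```shell
# Mask sensitive identifiers (such as pod names) before data is sent to the AI backend
k8sgpt analyze --explain --anonymize
```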

Related content: Read our guide to Kubernetes tools 

How K8sGPT Works

K8sGPT continuously monitors Kubernetes clusters to detect and diagnose potential issues, functioning similarly to a skilled site reliability engineer. The process begins with data collection, where K8sGPT selectively gathers relevant information from the clusters. This ensures that only the necessary data is processed, maintaining privacy.

Once the data is collected, K8sGPT uses AI algorithms to analyze and interpret it. These algorithms identify anomalies and potential problems, such as non-running pods or missing service accounts in replica sets. After detecting an issue, K8sGPT leverages generative AI models to generate explanations and recommendations for resolving the problem. 

K8sGPT Analyzers

K8sGPT provides the following analyzer modules:

podAnalyzer: Analyzes the health and performance of pods, identifying issues such as non-running pods, crashes, and restarts.
pvcAnalyzer: Focuses on Persistent Volume Claims (PVCs), checking for problems with volume binding, storage access, or incorrect configuration.
rsAnalyzer: Monitors ReplicaSets, detecting issues such as misconfigurations and failed replicas.
serviceAnalyzer: Examines Services for misconfigurations, missing endpoints, or network connectivity issues.
eventAnalyzer: Reviews Kubernetes events to uncover underlying problems like failed deployments, resource conflicts, and other critical alerts.
ingressAnalyzer: Analyzes Ingress resources for routing issues, DNS problems, or misconfigured rules that may block traffic.
statefulSetAnalyzer: Focuses on StatefulSets, identifying issues related to persistence, scaling, and proper pod ordering.
deploymentAnalyzer: Monitors Deployments for failed updates, incorrect resource definitions, and unhealthy replicas.
cronJobAnalyzer: Checks CronJobs for scheduling issues, missed executions, or failed jobs.
nodeAnalyzer: Monitors the health of nodes, including resource utilization, readiness, and potential failures.
mutatingWebhookAnalyzer: Examines mutating webhooks for configuration errors, which can lead to failed admissions or unintended behavior changes.
validatingWebhookAnalyzer: Reviews validating webhooks for misconfigurations that can cause failed resource validation or unwanted access restrictions.
hpaAnalyzer (optional): Analyzes Horizontal Pod Autoscalers (HPAs) for scaling issues, such as improper thresholds or failure to scale workloads efficiently.
pdbAnalyzer (optional): Monitors Pod Disruption Budgets (PDBs), ensuring they are correctly configured to prevent excessive pod disruptions during maintenance.
networkPolicyAnalyzer (optional): Analyzes NetworkPolicies for potential security risks or incorrect rules that could block or allow unwanted traffic.
gatewayClass (optional): Examines GatewayClass resources to ensure proper configuration for load balancing and traffic routing.
gateway (optional): Checks Gateway resources for issues with traffic management, routing, or policy application.
httproute (optional): Analyzes HTTPRoute objects for misconfigurations in routing rules, path matching, or service bindings.
logAnalyzer (optional): Reviews logs to detect potential errors, performance issues, and warnings across workloads and clusters.
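Analyzers can be listed and toggled from the command line. A minimal sketch, assuming the current `k8sgpt filters` syntax (check `k8sgpt filters --help` on your version):

```shell
# Show which analyzers are active and which are available
k8sgpt filters list

# Enable an optional analyzer, e.g., the NetworkPolicy analyzer
k8sgpt filters add NetworkPolicy

# Restrict an analysis run to a single analyzer
k8sgpt analyze --filter Service
```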
Dan Garfield
VP of Open Source, Octopus Deploy
Dan is a seasoned leader in the tech industry with a strong focus on open source initiatives. He currently serves as VP of Open Source at Octopus Deploy, contributes as an Argo maintainer, is a co-creator of Open GitOps, and brings extensive experience as a co-founder of Codefresh, now part of Octopus Deploy.

TIPS FROM THE EXPERT

In my experience, here are tips that can help you better use K8sGPT for Kubernetes management:

  1. Optimize AI model selection: Choose different generative AI models for varying tasks. For instance, use a smaller, faster model for real-time diagnostics and larger models like GPT-4 for deeper, detailed post-mortem analysis.
  2. Consider automating routine fixes: Set up K8sGPT to not only detect but also automatically apply predefined fixes for common issues like missing service accounts or incorrect pod specs. Try this in non-production environments first and start with simple, predictable issues.
  3. Cross-check recommendations with cost analysis tools: Integrate K8sGPT insights with cost-management tools like KubeCost to ensure that performance optimizations also translate into financial savings, especially in multi-cloud environments.
  4. Use filters to target namespaces: When dealing with large clusters, focus K8sGPT’s analysis on specific namespaces that are more prone to issues (e.g., staging environments) to optimize performance and insights.
  5. Deploy in server mode for continuous monitoring: Running K8sGPT in “serve” mode can offer continuous health checks of your cluster, providing real-time alerts and recommendations without the need for manual intervention.
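The namespace-targeting and server-mode tips above can be sketched as follows (flag names per the k8sgpt CLI; the `staging` namespace is an example, and you should verify flags with `k8sgpt --help` on your version):

```shell
# Tip 4: limit analysis to a problem-prone namespace (example: staging)
k8sgpt analyze --namespace staging --explain

# Tip 5: run K8sGPT as a server for continuous monitoring
k8sgpt serve
```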

Tutorial: Getting Started with K8sGPT

This tutorial covers how to install and use K8sGPT on different systems. These instructions are adapted from the K8sGPT documentation.

Prerequisites

To install K8sGPT with Homebrew, first ensure that Homebrew is installed on your machine. Homebrew is available for both macOS and Linux, and it also works on Windows Subsystem for Linux (WSL). Package-based installation options that do not require Homebrew are covered further below.

To install Homebrew on macOS, use the following command:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

For Linux, follow the instructions on the Homebrew documentation.

Installing K8sGPT 

macOS and Linux:

Once Homebrew is set up, you can install K8sGPT by first tapping the K8sGPT repository and then installing the package:

brew tap k8sgpt-ai/k8sgpt
brew install k8sgpt

This will download and install K8sGPT on your system. After installation, verify that K8sGPT is installed correctly by checking the version:

k8sgpt version

You should see an output similar to:

k8sgpt version 0.2.7

Windows:

For Windows users, download the latest Windows binaries from the K8sGPT GitHub Releases page. Choose the appropriate binary based on your system architecture (32-bit or 64-bit).

After downloading, extract the files to a directory of your choice, then add this directory to your system’s PATH environment variable to make the k8sgpt command globally accessible from any command prompt or terminal.

To verify that the installation was successful, run:

k8sgpt version

This should output the installed version of K8sGPT, confirming that it is ready for use.

RPM-based systems (RedHat, CentOS, Fedora):

For 32-bit systems:

curl -LO https://github.com/k8sgpt-ai/k8sgpt/releases/download/v0.3.24/k8sgpt_386.rpm
sudo rpm -ivh k8sgpt_386.rpm

For 64-bit systems:

curl -LO https://github.com/k8sgpt-ai/k8sgpt/releases/download/v0.3.24/k8sgpt_amd64.rpm
sudo rpm -ivh k8sgpt_amd64.rpm

DEB-based systems (Ubuntu, Debian):

For 32-bit systems:

curl -LO https://github.com/k8sgpt-ai/k8sgpt/releases/download/v0.3.24/k8sgpt_386.deb
sudo dpkg -i k8sgpt_386.deb

For 64-bit systems:

curl -LO https://github.com/k8sgpt-ai/k8sgpt/releases/download/v0.3.24/k8sgpt_amd64.deb
sudo dpkg -i k8sgpt_amd64.deb

APK-based systems (Alpine):

For 32-bit systems:

curl -LO https://github.com/k8sgpt-ai/k8sgpt/releases/download/v0.3.24/k8sgpt_386.apk
sudo apk add k8sgpt_386.apk

For 64-bit systems:

curl -LO https://github.com/k8sgpt-ai/k8sgpt/releases/download/v0.3.24/k8sgpt_amd64.apk
sudo apk add k8sgpt_amd64.apk

Setting up a Kubernetes Cluster

To try out k8sgpt, you can set up a basic Kubernetes cluster using KinD or Minikube:

  1. First, install KinD:

brew install kind

  2. Create a new Kubernetes cluster:

kind create cluster --name k8sgpt-demo

  3. Alternatively, you can set up a Minikube cluster if you prefer. Follow the Minikube documentation specific to your operating system to get started.

Using K8sGPT

Once the Kubernetes cluster is up and running, you can start using k8sgpt to monitor, diagnose, and troubleshoot issues within the cluster. It offers a range of commands that help interact with and analyze Kubernetes environments.

Begin by viewing the available command options:

k8sgpt --help

This command will display a list of commands that k8sgpt supports, each tailored to address specific aspects of Kubernetes management. Here’s a brief overview of some key commands:

  • analyze: Used for identifying and troubleshooting issues in the Kubernetes cluster. It scans the cluster for potential problems, such as misconfigurations or resource limitations, and provides insights to help you resolve them.
  • auth: Used to authenticate with AI backend providers like OpenAI. Proper authentication is essential for leveraging the AI-driven capabilities of k8sgpt.
  • cache: Allows you to manage the cache that stores the results of previous analyses, making it easier to track and revisit issues over time.
  • serve: Runs k8sgpt as a server, which can be particularly useful in continuous monitoring scenarios, where you need to keep an eye on the health of your cluster.
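As a quick illustration of how these commands fit into automation, analyze supports machine-readable output, which is useful when wiring k8sgpt into scripts or dashboards. This assumes the CLI's output flag; confirm with `k8sgpt analyze --help`:

```shell
# Emit findings as JSON for downstream tooling
k8sgpt analyze --output json
```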

Authenticating with OpenAI

To use k8sgpt with OpenAI, you need to authenticate with your OpenAI account. Follow these steps:

  1. Generate a token by running:

k8sgpt generate

This command will provide a URL that you need to visit in your browser to generate the token.

  2. Copy the generated token.

  3. Authenticate k8sgpt with OpenAI using the following command:

k8sgpt auth add --backend openai --model gpt-3.5-turbo

  4. When prompted, paste the token you generated earlier. You should see a success message confirming that OpenAI has been added to your AI backend provider list.

Analyzing the Cluster

With authentication complete, you can now analyze your Kubernetes cluster:

Ensure that the Kubernetes context is set to the correct cluster by running:

kubectl config current-context

Check the nodes in your cluster:

kubectl get nodes

To simulate an issue, create a Service with a wrong selector by saving the following YAML configuration to a file named wrong-selector-service.yml:

apiVersion: v1
kind: Service
metadata:
  name: wrong-selector-service
  namespace: default
spec:
  selector:
    app: nonexistent-app  # This selector does not match any existing pods
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

Apply the configuration to the cluster:

kubectl apply -f wrong-selector-service.yml

Check the status of the Service to see that it has no endpoints:

kubectl get endpoints wrong-selector-service

You should see that the Service has no endpoints because the selector does not match any running pods.

Analyze your cluster using k8sgpt:

k8sgpt analyze

This command will generate a list of issues in your cluster, including the problem with the Service selector that does not match any pods.

For detailed explanations run:

k8sgpt analyze --explain

Note: On Ubuntu, if you get an authentication error, also export your OpenAI key as an environment variable:

export OPENAI_KEY=<YOUR SECRET KEY FOR OPENAI>

This will provide a more in-depth analysis, helping you understand the issues in your cluster better.

Example of correct Service configuration:

apiVersion: v1
kind: Service
metadata:
  name: wrong-selector-service
  namespace: default
spec:
  selector:
    app: existing-app  # Ensure this matches the correct pod labels
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

Apply the updated YAML file (assuming it is saved as wrong-selector-service.yml):

kubectl apply -f wrong-selector-service.yml

Now run the tool again as shown below:

k8sgpt analyze --explain

You will see the previous issues have been resolved.

Kubernetes Deployment with Codefresh

Codefresh lets you answer many important questions within your organization, whether you’re a developer or a product manager. For example:

  • What features are deployed right now in any of your environments?
  • What features are waiting in Staging?
  • What features were deployed last Thursday?
  • Where is feature #53.6 in our environment chain?

What’s great is that you can answer all of these questions by viewing one single dashboard. Our applications dashboard shows:

  • Services affected by each deployment
  • The current state of Kubernetes components
  • Deployment history and log of who deployed what and when and the pull request or Jira ticket associated with each deployment

Ready to Get Started?

Deploy more and fail less with Codefresh and Argo