# Access Tokens

Source: https://docs.chkk.io/administration/access-tokens

Administration instructions for using and managing Access Tokens

Access Tokens are required to authenticate the Chkk Kubernetes Connector.

## Managing Tokens

Below are the steps to access, create, revoke, copy, and rename tokens in the Chkk Dashboard.

### Access the Tokens page

1. In the **Chkk Dashboard**, expand **Configure** on the left menu.
2. Click **Settings**, then select the **Tokens** tab.
3. Here you'll see a list of all your active tokens (if any), along with options to create new ones or revoke existing ones.

### Create a token

1. Click the **Add Token** button to open the token creation page.
2. **Name Your Token**: Provide a meaningful label for the token (e.g., `deployment-token`).
3. Click **Mark as Done** to generate a new token.
4. Click the code block or the clipboard icon to copy the token.
5. To reveal the token, click the eye icon next to the clipboard icon.

### Revoke a token

1. In the **Tokens** list, find the token you want to revoke.
2. Click the **x** icon next to that token.
3. A confirmation modal appears; confirm that you want to revoke the token.
4. Once revoked, the token is no longer valid for future requests.

### Copy a token

1. From the **Tokens** tab, locate the token whose value you wish to copy.
2. Click the **clipboard icon** next to the token name to copy the token string.

### Rename a token

1. From the **Tokens** tab, locate the token whose name you wish to edit.
2. Click the **pencil icon** next to the token name.
3. A modal appears where you can edit the token name.
4. Click **Save** to update the token name.

# Multi-Org Support

Source: https://docs.chkk.io/administration/multi-org-support

Administration instructions for managing users in multiple Organizations

**Multi-Org Support** lets you create and manage multiple, distinct Organizations, whether for dev, staging, and production environments or to separate different teams under their own tenants.
You can administer each Organization individually, view and control resources in context, maintain unique settings, and switch between them as needed.

## Switch Between Orgs

Follow these steps to switch between organizations in the **Chkk Dashboard**.

1. In the top-right corner of the **Chkk Dashboard**, click your **organization name**.
2. A dropdown will appear with a list of available organizations.
3. Select the organization you want to switch to.
4. The page will refresh, and you'll now be operating under the newly selected organization.

# Organization Settings

Source: https://docs.chkk.io/administration/organization-settings

Administration instructions for managing Organization Settings

A **Chkk Organization** groups together one or more **Chkk Accounts** under a common ownership. Within each account, users organize clusters, add-ons, application services, Kubernetes operators, risks, etc.

## Change Organization Settings

Follow the steps below to change your organization settings:

### Open Organization Settings

1. In the top-right corner of the Chkk Dashboard, click on your **organization name**.
2. Select **Organization Settings** from the dropdown.

### Review organization details

1. On the **Organization Settings** page, you will see:
   * **Organization Slug**: A unique identifier for your organization (read-only).
   * **Display Name**: A user-friendly name for your organization.
   * **Created Time**: The date when your organization was created (read-only).
2. These details help you quickly reference and organize your account information.

### Change the display name

1. In the **Display Name** field, type the new name you want for your organization.
2. Click **Save Changes** to confirm and update the organization name.

# Plan and Usage

Source: https://docs.chkk.io/administration/plan-and-usage

Administration instructions for plan usage management

Chkk offers two main plans: **Business** (for startups and scaleups) and **Enterprise** (for mission-critical workloads).
For pricing and a complete breakdown of features, refer to [Pricing and Plans](/product/pricing-plans). The Chkk Dashboard shows subscription status and usage metrics, and displays a badge when you approach plan thresholds.

## View Subscription usage and limits

Follow these steps to view the status and usage limits of your Subscription:

### Open Billing & Subscriptions

1. In the **Chkk Dashboard**, expand **Configure** on the left menu.
2. Select **Billing & Subscriptions** to open your billing details and plan information.

### Review your plan and usage

1. In this tab, you will see:
   * **Plan Summary**: Displays your **current plan**, the **start date**, and **activation date**.
   * **Usage**: Indicates your **Monthly Node Count** and **Upgrade Templates** usage.
2. There will also be a **Contact Us** button if you need to purchase more nodes or upgrade templates.

### Compare plan tiers

1. Here, you will see different **plan tiers** (e.g., Business, Enterprise), each with a feature breakdown.
2. This section compares available features (like **Clusters**, **Users**, **Nodes**, **Cloud Providers**, etc.) across different subscription levels.
3. Use this tab to gauge which tier meets your operational requirements and to explore upgrade possibilities.

### Read the billing FAQs

1. In the **FAQs** tab, you'll find common questions regarding **Chkk**'s billing model:
   * **Definition of a node**
   * **Pricing for serverless Kubernetes clusters**
   * **Handling unexpected node increases**
   * **Purchase options and volume discounts**
2. This tab addresses specific billing scenarios and clarifies how usage calculations or price adjustments occur.

# User Management

Source: https://docs.chkk.io/administration/user-management

Administration instructions for managing users

Chkk supports user management through **Teams**, which are groups of users within a Chkk Account. Team members share a common set of permissions and can manage resources within that account. You can also invite additional team members to join, ensuring straightforward collaboration and secure access control.
## Steps to perform User Management

Below are the steps to manage your team members in Chkk, covering how to access Teams, invite users, manage pending invites, remove users, leave an organization, and utilize the search feature.

### Access Teams

1. In the **Chkk Dashboard**, go to the left navigation panel.
2. Under **Configure**, click **Teams** to open the user management page.

### View team members

1. The **Members** tab displays all existing team members, showing:
   * **Name**
   * **Email**
   * **Date Added**
2. This page lets you see who currently has access to your Chkk organization.

### Invite a team member

1. Select **Invite Team Member** (top-right on the Teams page).
2. A **modal** will appear prompting you to enter the **email address** of the person you want to invite.
3. Click **Send Invite** in the modal to send an invitation email.
4. The sent invitation and its status can be found in the **Pending** tab next to **Members**.

### View pending invites

1. Go to the **Pending** tab.
2. You'll see the status of all the **Pending** invites alongside additional details:
   * **Email**: Email of the person invited to join the Organization
   * **Inviter**: Email of the person who sent the invite
   * **Status**: Status of the invite (Pending/Expired)
   * **Expiry**: Expiry date of the invite

### Resend an invite

1. In the **Pending** tab, locate the expired or pending invite.
2. Click **Resend Invite**, which opens a **modal** to confirm the re-invite action.
3. Click **Send Invite** (or equivalent) in the modal to issue a new invitation link.

### Leave an organization

1. In the **Members** tab, find your own **Name**.
2. Click the **arrow icon** to the right of your **Name**.
3. A **confirmation modal** will appear.
4. Confirm to **Leave** the organization.

### Remove a user

1. In the **Members** tab, locate the user you want to remove.
2. Click the **trash icon** to the right of the user's name.
3. A **confirmation modal** will appear.
4. Select **Remove** to remove the user.
# Introduction

Source: https://docs.chkk.io/api-reference/introduction

Welcome to the Chkk API, a REST interface that lets you query the Chkk resources surfaced by the Chkk Operational Safety Platform.

## Pagination

Bulk-fetch methods in Chkk use cursor-based pagination. Two parameters control the flow:

| Parameter            | Type    | Default | Max  | Description                                         |
| -------------------- | ------- | ------- | ---- | --------------------------------------------------- |
| `page_size`          | integer | 100     | 1000 | Maximum number of objects to return                 |
| `continuation_token` | string  | none    | none | Cursor pointing to where the next page should start |

Typical sequence

1. **First page**: Omit `continuation_token` (and optionally set `page_size`).
2. **Next page**: Pass the `continuation_token` returned in the prior response.
3. **Done**: When the response no longer includes `continuation_token`, you have reached the end.

Example

```
GET /risks?page_size=100&filter=cluster_id:k8scl_a1b2c3
→ 200 OK
{
  "data": [ /* up to 100 RiskSummary objects */ ],
  "continuation_token": "eyJ2IjoxLCJ..."
}

GET /risks?page_size=100&continuation_token=eyJ2IjoxLCJ...
→ 200 OK
{
  "data": [ /* next set of RiskSummary objects */ ]
}
```

If `page_size` exceeds 1000, Chkk returns HTTP 400 with an error object.

***

## Errors

Chkk uses standard HTTP response codes to indicate the outcome of an API request. Codes in the 2xx range indicate a successful request. Codes in the 4xx range indicate a client error, such as a missing required parameter. Codes in the 5xx range indicate a server error on Chkk’s end (these are uncommon).

| Code                    | Meaning                                                                         |
| ----------------------- | ------------------------------------------------------------------------------- |
| `200 OK`                | Request succeeded.                                                              |
| `400 Bad Request`       | Malformed filters, missing parameters.                                          |
| `401 Unauthorized`      | Missing/invalid bearer token.                                                   |
| `404 Not Found`         | Resource does not exist or is out of scope.                                     |
| `429 Too Many Requests` | Rate limit exceeded. Retry after the interval given in the `Retry-After` header. |
| `5xx`                   | Temporary service error on Chkk’s side.                                         |

## API Usage Examples

The Go program below:

1. Calls **`GET /risks`** to collect the detected Risk instances in a cluster.
2. For each Risk, calls **`GET /risks/{id}/resources`** to enumerate affected Kubernetes resources.
3. **Optionally** shows **only** the resources that live in the namespace you pass with `-namespace`.
4. Prints a summary to `stdout` and writes the affected resources to a CSV file, `risks.csv` by default (override with `-out`), in the current working directory.

### Prerequisites

| Requirement     | Notes                                                                                       |
| --------------- | ------------------------------------------------------------------------------------------- |
| Go 1.20+        | Requires the AWS SDK for Go v2 (`config`, `service/sts`) and `github.com/pkg/errors`.       |
| Outbound HTTPS  | Script contacts `https://api.us.chkk.io/v1`                                                 |
| AWS credentials | Script uses AWS STS to generate a presigned URL for authentication.                         |

### Example script

```go
package main

import (
	"context"
	"encoding/csv"
	"encoding/json"
	"flag"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"

	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/sts"
	"github.com/pkg/errors"
)

// loginResponse models the POST /login response body.
type loginResponse struct {
	AccessTokens map[string]map[string]struct {
		AccessToken string `json:"access_token"`
	} `json:"access_tokens"`
}

// riskSummary is one entry returned by GET /risks.
type riskSummary struct {
	ID        string `json:"id"`
	Signature struct {
		ID string `json:"id"`
	} `json:"signature"`
}

type listRisksResponse struct {
	Data []riskSummary `json:"data"`
}

// riskResource is one Kubernetes resource affected by a Risk.
type riskResource struct {
	Kind      string `json:"kind"`
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

type listRiskResourcesResponse struct {
	Data []riskResource `json:"data"`
}

var (
	clusterID  = flag.String("cluster-id", "", "Chkk cluster ID (required)")
	namespace  = flag.String("namespace", "", "Namespace filter (optional)")
	outFile    = flag.String("out", "risks.csv", "CSV output filename")
	apiBase    = "https://api.us.chkk.io/v1"
	httpClient = &http.Client{Timeout: 15 * time.Second}
)

func main() {
	flag.Parse()
	if *clusterID == "" {
		exitErr(errors.New("flag -cluster-id is required"))
	}

	ctx := context.Background()

	token, err := authenticate(ctx)
	if err != nil {
		exitErr(err)
	}

	risks, err := listRisks(ctx, token, *clusterID)
	if err != nil {
		exitErr(err)
	}

	if err := printSummary(ctx, token, risks); err != nil {
		exitErr(err)
	}
	if err := writeCSV(ctx, token, risks); err != nil {
		exitErr(err)
	}
}

// printSummary prints one line per Risk with the number of affected resources.
func printSummary(ctx context.Context, token string, risks []riskSummary) error {
	for _, rs := range risks {
		res, err := listRiskResources(ctx, token, rs.ID)
		if err != nil {
			return err
		}
		count := filterByNamespace(res, *namespace)
		if *namespace != "" {
			fmt.Printf("%s, %s: %d affected resources in namespace %s\n", *clusterID, rs.Signature.ID, count, *namespace)
		} else {
			fmt.Printf("%s, %s: %d affected resources\n", *clusterID, rs.Signature.ID, count)
		}
	}
	return nil
}

// writeCSV writes one row per affected resource, honoring the namespace filter.
func writeCSV(ctx context.Context, token string, risks []riskSummary) error {
	file, err := os.Create(*outFile)
	if err != nil {
		return errors.Wrap(err, "create CSV file")
	}
	defer file.Close()

	cw := csv.NewWriter(file)
	defer cw.Flush()

	if err := cw.Write([]string{"Cluster", "SignatureID", "Kind", "Name", "Namespace"}); err != nil {
		return errors.Wrap(err, "write CSV header")
	}

	for _, rs := range risks {
		resources, err := listRiskResources(ctx, token, rs.ID)
		if err != nil {
			return err
		}
		for _, r := range resources {
			if *namespace != "" && r.Namespace != *namespace {
				continue
			}
			if err := cw.Write([]string{
				*clusterID,
				rs.Signature.ID,
				r.Kind,
				r.Name,
				r.Namespace,
			}); err != nil {
				return errors.Wrap(err, "write CSV row")
			}
		}
	}
	return nil
}

// filterByNamespace counts the resources in ns, or all resources when ns is empty.
func filterByNamespace(resources []riskResource, ns string) int {
	if ns == "" {
		return len(resources)
	}
	count := 0
	for _, r := range resources {
		if r.Namespace == ns {
			count++
		}
	}
	return count
}

// authenticate exchanges a presigned STS URL for a Chkk access token.
func authenticate(ctx context.Context) (string, error) {
	url, err := presignSTS(ctx)
	if err != nil {
		return "", errors.Wrap(err, "generate presigned STS URL")
	}

	req, err := http.NewRequestWithContext(ctx, http.MethodPost, apiBase+"/login", nil)
	if err != nil {
		return "", errors.Wrap(err, "construct login request")
	}
	req.Header.Set("Authorization", "AWS "+url)

	resp, err := httpClient.Do(req)
	if err != nil {
		return "", errors.Wrap(err, "perform login request")
	}
	defer resp.Body.Close()

	if resp.StatusCode != 200 {
		data, err := io.ReadAll(resp.Body)
		if err != nil {
			return "", errors.Wrap(err, "failed to read response body")
		}
		return "", errors.Errorf("Received Login Error. Code: %d Body: %s", resp.StatusCode, string(data))
	}

	var lr loginResponse
	if err := json.NewDecoder(resp.Body).Decode(&lr); err != nil {
		return "", errors.Wrap(err, "decode login response")
	}

	// Return the first access token found in the response.
	for _, acct := range lr.AccessTokens {
		for _, bundle := range acct {
			return bundle.AccessToken, nil
		}
	}
	return "", errors.New("no access tokens returned")
}

// listRisks fetches the Risks detected in the given cluster.
func listRisks(ctx context.Context, token, cluster string) ([]riskSummary, error) {
	url := fmt.Sprintf("%s/risks?filter=cluster_id:%s", apiBase, cluster)
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, errors.Wrap(err, "construct list risks request")
	}
	req.Header.Set("Authorization", "Bearer "+token)

	resp, err := httpClient.Do(req)
	if err != nil {
		return nil, errors.Wrap(err, "perform list risks request")
	}
	defer resp.Body.Close()

	var lr listRisksResponse
	if err := json.NewDecoder(resp.Body).Decode(&lr); err != nil {
		return nil, errors.Wrap(err, "decode list risks response")
	}
	return lr.Data, nil
}

// listRiskResources fetches the Kubernetes resources affected by one Risk.
func listRiskResources(ctx context.Context, token, riskID string) ([]riskResource, error) {
	url := fmt.Sprintf("%s/risks/%s/resources", apiBase, riskID)
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, errors.Wrap(err, "construct list risk resources request")
	}
	req.Header.Set("Authorization", "Bearer "+token)

	resp, err := httpClient.Do(req)
	if err != nil {
		return nil, errors.Wrap(err, "perform list risk resources request")
	}
	defer resp.Body.Close()

	var r listRiskResourcesResponse
	if err := json.NewDecoder(resp.Body).Decode(&r); err != nil {
		return nil, errors.Wrap(err, "decode list risk resources response")
	}
	return r.Data, nil
}

// presignSTS builds a presigned STS GetCallerIdentity URL from the default AWS credentials.
func presignSTS(ctx context.Context) (string, error) {
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		return "", errors.Wrap(err, "load AWS config")
	}
	cfg.Region = "us-east-1"

	stsClient := sts.NewFromConfig(cfg)
	sign := sts.NewPresignClient(stsClient)

	identity, err := sign.PresignGetCallerIdentity(ctx, &sts.GetCallerIdentityInput{})
	if err != nil {
		return "", errors.Wrap(err, "failed to create presigned URL")
	}
	return identity.URL, nil
}

func exitErr(err error) {
	fmt.Fprintf(os.Stderr, "ERROR: %+v\n", err)
	os.Exit(1)
}
```

# Get Access Token

Source: https://docs.chkk.io/api-reference/login/get-access-token

`POST /login`

Authenticate and obtain access credentials for authorized operations.

# Get a specific Risk

Source: https://docs.chkk.io/api-reference/risk/get-a-specific-risk

`GET /risks/{riskId}`

Returns the full details of a single Risk.

# List resources for a specific Risk

Source: https://docs.chkk.io/api-reference/risk/list-resources-for-a-specific-risk

`GET /risks/{riskId}/resources`

Returns a paginated list of Kubernetes resources that are associated with the specified Operational Risk.

# List Risks

Source: https://docs.chkk.io/api-reference/risk/list-risks

`GET /risks`

List Risks matching one or more filter parameters.

# Chkk + EKS Auto Mode

Source: https://docs.chkk.io/chkk-eks-automode

Do EKS Auto Mode clusters need Chkk Upgrade Copilot?

## What is Amazon EKS Auto Mode?

Amazon EKS Auto Mode streamlines Kubernetes cluster management by automating infrastructure provisioning, compute instance selection, resource scaling, and core add-on management. It reduces operational overhead, allowing users to focus on application development rather than cluster management. Key features include automated compute, auto scaling, upgrades, load balancing, storage, and networking.

## What is Chkk Upgrade Copilot?
Chkk Upgrade Copilot is your trusted expert that provides a comprehensive set of recommendations, stateful workflows, and safety tooling to help you upgrade cloud substrate, control plane, nodes, add-ons, application services, Kubernetes operators, and applications in your Kubernetes infrastructure. Chkk Upgrade Copilot gives you the peace of mind that your upgrades are verified to succeed, while saving months of effort spent on preparing, staging, and executing upgrades.

## Do EKS Auto Mode clusters need Chkk Upgrade Copilot?

While EKS Auto Mode automates many aspects of Kubernetes cluster management, it does not eliminate the need for tools like Chkk Upgrade Copilot in all cases. Here's a breakdown of which Auto Mode clusters benefit most from Chkk Upgrade Copilot:

### EKS Auto Mode clusters that NEED Chkk Upgrade Copilot:

* **Clusters with add-ons, application services, and Kubernetes operators that aren't EKS Managed:** EKS Auto Mode does not manage all add-ons, application services, and Kubernetes operators. You are responsible for installing, managing, and upgrading add-ons, application services, and Kubernetes operators like Istio, Cert-Manager, Nginx, ArgoCD, External Secrets Operator, External DNS, Crossplane, KEDA, Prometheus, Alertmanager, Fluentd, Grafana, Loki, Keycloak, Contour, Cilium, Calico, Argo Rollouts, and Vault Secrets Operator. Chkk Upgrade Copilot helps verify compatibility and uncover hidden dependencies before upgrades.
* **Clusters Requiring Custom AMIs:** Auto Mode only supports EKS AMIs. If you need custom AMIs, you won't be able to use Auto Mode.
* **Clusters with Specific CNI Requirements:** EKS Auto Mode restricts the underlying CNI to AWS' VPC CNI plug-in. If your organization requires a different CNI (e.g., Calico, Cilium) for enhanced observability or advanced networking policies, Auto Mode may not be suitable.
* **Clusters with API Deprecations and Application Dependencies:** You are still responsible for migrating applications off deprecated APIs and fixing misconfigured Pod Disruption Budgets (PDBs) before an Auto Mode upgrade. Chkk helps ensure application teams update workloads and PDBs before upgrades.
* **Clusters that must maintain Compliance and Improve Security Posture:** Chkk maintains an accurate inventory of all clusters, add-ons, application services, and Kubernetes operators, alerting you to existing and upcoming End-of-Life (EOL) software so you perform timely upgrades, avoid vulnerabilities, ensure vendor support, and save significant costs.
* **When you want to Delegate and Parallelize Work:** Chkk's detailed Upgrade Plans standardize workflows, making it easy to confidently delegate tasks to any team member, reuse and share knowledge, and reduce the chance of human error and omissions, so your experts can focus on what they do best.
* **Organizations that want Standardization of Workflows, Knowledge Sharing, and Reuse of Best Practices:** Chkk ensures that institutional knowledge is retained and accessible, which is crucial during reorganizations or team changes. It also simplifies audits, enhances safety, shortens the time needed to find knowledge, minimizes context switching, and improves productivity.

### EKS Auto Mode clusters that may NOT need Chkk Upgrade Copilot:

* **Simple Container Workloads:** If you are running simple containerized workloads without Datapath or Stateful add-ons or application services, you might not need Chkk Upgrade Copilot.
* **Clusters Running CI Jobs:** Clusters running CI jobs that do not require special care and attention to add-on, application service, or Kubernetes operator dependencies may not need Chkk.

## Summary

* Use Chkk Upgrade Copilot with Auto Mode when clusters have non-EKS add-ons, application services, and Kubernetes operators. EKS Auto Mode does not manage the maintenance and upgrades of these add-ons, application services, and Kubernetes operators, so Chkk is crucial in these scenarios.
* Consider Chkk when custom AMIs or specific CNI requirements are present. Auto Mode might not accommodate these needs.
* Chkk is valuable because organizations are responsible for addressing API deprecations and application dependencies. Chkk can assist in ensuring timely updates of workloads and Pod Disruption Budgets (PDBs).
* If compliance and security are paramount, Chkk assists by providing an inventory of clusters, add-ons, application services, and Kubernetes operators, and by alerting you to end-of-life software.
* Chkk is helpful when you want to standardize workflows to enable task delegation, knowledge sharing, error reduction, and improved productivity.

# Chkk + EKS Upgrade Insights

Source: https://docs.chkk.io/chkk-eks-upgrade-insights

How Chkk Upgrade Copilot uses Amazon EKS Upgrade Insights

## What is Chkk Upgrade Copilot?

Chkk Upgrade Copilot is your trusted expert that provides a comprehensive set of recommendations, stateful workflows, and safety tooling to help you upgrade cloud substrate, control plane, nodes, add-ons, application services, Kubernetes operators, and applications in your Kubernetes infrastructure. Chkk Upgrade Copilot gives you the peace of mind that your upgrades are verified to succeed, while saving months of effort spent on preparing, staging, and executing upgrades.

## What is Amazon EKS Upgrade Insights?

Upstream Kubernetes releases 2-3 new versions every year. Kubernetes APIs continue to get deprecated and removed in each release. Any application using a removed API risks disruption if it is not migrated to supported APIs before the Kubernetes upgrade. EKS Upgrade Insights highlights deprecated and removed APIs being used by applications.
According to [AWS](https://docs.aws.amazon.com/eks/latest/userguide/cluster-insights.html): "Upgrade insights scans cluster's audit logs for events related to APIs that have been deprecated. These events include information about who initiated it (i.e., the caller) and the Kubernetes resource(s) that it was initiated against. Upgrade Insights presents this information to you in a concise and easily consumable way so you can identify and remediate the appropriate resources before executing the upgrade."

The information can also be retrieved programmatically using the Amazon EKS API or the AWS Command Line Interface (AWS CLI).

Insight statuses:

* **Error**: Impacted in the next version (N+1).
* **Warning**: Impacted in a future version (N+2 or more).
* **Passing**: No issues detected.
* **Unknown**: Unable to determine impact.

## How does Chkk Upgrade Copilot use Amazon EKS Upgrade Insights?

For EKS customers, Chkk uses EKS Upgrade Insights to detect Kubernetes APIs that have been deprecated and can cause application failures. Chkk also uses EKS Insights about version skew between control plane and kubelet versions in its Upgrade Plans.

While API deprecations/removals and version skew are critical to identify and address before an upgrade, there are many other dependencies, incompatibilities, and safety/availability risks across layers of infrastructure (cloud substrate, control plane, nodes, add-ons, application services, Kubernetes operators, and applications) that must be addressed to derisk upgrades. For reference, a typical cluster Upgrade Plan from Chkk comprises 80+ steps. API deprecations/removals and version skew are only relevant for a handful of these steps.
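The four insight statuses listed earlier lend themselves to a simple triage pass before an upgrade. The Go sketch below is illustrative only: the `Insight` type, the `triage` and `blocksNextUpgrade` helpers, and the sample insight names are hypothetical and are not part of the EKS or Chkk APIs. It assumes each insight carries one of the status strings `ERROR`, `WARNING`, `PASSING`, or `UNKNOWN`, mirroring the statuses above.

```go
package main

import (
	"fmt"
	"sort"
)

// Insight is a hypothetical, simplified stand-in for one upgrade insight.
type Insight struct {
	Name   string
	Status string // assumed values: "ERROR", "WARNING", "PASSING", "UNKNOWN"
}

// severity ranks statuses so the most urgent sort first.
// Unrecognized statuses are treated like UNKNOWN.
func severity(status string) int {
	switch status {
	case "ERROR": // impacted in the next version (N+1)
		return 0
	case "WARNING": // impacted in a future version (N+2 or more)
		return 1
	case "PASSING": // no issues detected
		return 3
	default: // UNKNOWN: impact could not be determined
		return 2
	}
}

// triage returns every insight that is not PASSING, most urgent first.
func triage(insights []Insight) []Insight {
	var out []Insight
	for _, in := range insights {
		if in.Status != "PASSING" {
			out = append(out, in)
		}
	}
	sort.SliceStable(out, func(i, j int) bool {
		return severity(out[i].Status) < severity(out[j].Status)
	})
	return out
}

// blocksNextUpgrade reports whether any insight has status ERROR.
func blocksNextUpgrade(insights []Insight) bool {
	for _, in := range insights {
		if in.Status == "ERROR" {
			return true
		}
	}
	return false
}

func main() {
	// Sample data invented for illustration.
	sample := []Insight{
		{Name: "example-passing-insight", Status: "PASSING"},
		{Name: "example-warning-insight", Status: "WARNING"},
		{Name: "example-error-insight", Status: "ERROR"},
	}
	for _, in := range triage(sample) {
		fmt.Printf("%-8s %s\n", in.Status, in.Name)
	}
	fmt.Println("blocks next upgrade:", blocksNextUpgrade(sample))
}
```

Treating `UNKNOWN` as more urgent than `PASSING` is a design choice in this sketch: an insight whose impact cannot be determined is surfaced for review rather than silently skipped.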
## Comparison Table

**Multi-layer Dependency Analysis**

| Cascading Incompatibilities, Misconfigurations, Coupled Changes | Chkk Upgrade Copilot | Amazon EKS Upgrade Insights |
| :-------------------------------------------------------------- | :------------------- | :-------------------------- |
| Application to Add-on Compatibility                             | ✅                    | ❌                           |
| Application to Application Service Compatibility                | ✅                    | ❌                           |
| Application to Operator Compatibility                           | ✅                    | ❌                           |
| Application to Nodes Compatibility                              | ✅                    | ❌                           |
| Add-on to Control Plane Compatibility                           | ✅                    | Limited                     |
| Application Service to Control Plane Compatibility              | ✅                    | Limited                     |
| Kubernetes Operator to Control Plane Compatibility              | ✅                    | Limited                     |
| Application to Control Plane (PDB, misconfigurations)           | ✅                    | ❌                           |
| Application to Control Plane (API Deprecations)                 | ✅                    | ✅                           |
| Node to Control Plane                                           | ✅                    | ❌                           |
| Add-on to Cloud Substrate Compatibility                         | ✅                    | ❌                           |
| Application Service to Cloud Substrate Compatibility            | ✅                    | ❌                           |
| Kubernetes Operator to Cloud Substrate Compatibility            | ✅                    | ❌                           |
| Add-on to Add-on Compatibility                                  | ✅                    | ❌                           |
| Add-on to Application Service Compatibility                     | ✅                    | ❌                           |
| Add-on to Kubernetes Operator Compatibility                     | ✅                    | ❌                           |
| Application Service to Application Service Compatibility        | ✅                    | ❌                           |
| Application Service to Kubernetes Operator Compatibility        | ✅                    | ❌                           |

**Contextualized Release Notes**

| Breaking changes, EOL detection, Default value changes | Chkk Upgrade Copilot | Amazon EKS Upgrade Insights |
| :----------------------------------------------------- | :------------------- | :-------------------------- |
| Clusters                                               | ✅                    | ❌                           |
| Nodes                                                  | ✅                    | ❌                           |
| Add-ons                                                | ✅                    | ❌                           |
| Application Services                                   | ✅                    | ❌                           |
| Kubernetes Operators                                   | ✅                    | ❌                           |
| Cloud Substrate (IAM, LB, etc.)                        | ✅                    | ❌                           |

**Upgrade Version Recommendations**

| Next Version Recommendations & Upgrade Considerations | Chkk Upgrade Copilot | Amazon EKS Upgrade Insights |
| :---------------------------------------------------- | :------------------- | :-------------------------- |
| Add-ons                                               | ✅                    | ❌                           |
| Application Services                                  | ✅                    | ❌                           |
| Kubernetes Operators                                  | ✅                    | ❌                           |
| Nodes (Rolling vs Blue/Green)                         | ✅                    | ❌                           |
| Clusters (In-place vs Blue/Green)                     | ✅                    | ❌                           |

**Safety, Health, and Readiness Checks**

| Preflight, Inflight, Postflight Checks           | Chkk Upgrade Copilot | Amazon EKS Upgrade Insights |
| :----------------------------------------------- | :------------------- | :-------------------------- |
| Add-on Preflight/Postflight Checks               | ✅                    | ❌                           |
| Application Services Preflight/Postflight Checks | ✅                    | ❌                           |
| Kubernetes Operators Preflight/Postflight Checks | ✅                    | ❌                           |
| Control Plane Preflight/Postflight Checks        | ✅                    | ✅                           |
| Node Preflight/Postflight Checks                 | ✅                    | ✅                           |
| Support for Custom Checks                        | ✅                    | ❌                           |

**Multi-cloud Support**

| Clouds | Chkk Upgrade Copilot | Amazon EKS Upgrade Insights |
| :----- | :------------------- | :-------------------------- |
| EKS    | ✅                    | ✅                           |
| GKE    | ✅                    | ❌                           |
| AKS    | ✅                    | ❌                           |

**Additional Capabilities**

| Other Features                    | Chkk Upgrade Copilot | Amazon EKS Upgrade Insights |
| :-------------------------------- | :------------------- | :-------------------------- |
| Preverification on a Digital Twin | ✅                    | ❌                           |
| Stateful Workflow                 | ✅                    | ❌                           |
| Activity Stream                   | ✅                    | ❌                           |

## Summary

* **EKS Upgrade Insights** focuses on detecting API deprecations/removals and ControlPlane-to-Node version skew.
* **Chkk Upgrade Copilot** provides a complete upgrade solution by identifying hidden dependencies, unknown incompatibilities, misconfigurations, and breaking changes across all infrastructure layers.
* Chkk includes preflight/postflight checks, contextual release notes, preverification on a digital twin, and stateful workflows to ensure upgrade success.
* Chkk supports multi-cloud environments (EKS, GKE, AKS), while EKS Upgrade Insights is specific to AWS.

# Chkk Cloud Connector

Source: https://docs.chkk.io/connectors/chkk-cloud-connector

An overview of the Chkk Cloud Connector

## Overview

### What is a Chkk Cloud Connector?

**Chkk Cloud Connector** is a secure, read-only integration that fetches relevant resource data from your cloud environment and correlates it with your Kubernetes clusters. By focusing on resources that affect, or are affected by, your clusters (e.g., security groups, IAM roles, networking settings), the Connector facilitates a unified view of your infrastructure. This insight helps detect potential incompatibilities and misconfigurations early, resulting in a more stable and secure environment.

### Supported Cloud Service Providers

Chkk supports connecting to the following major CSPs:

* **AWS** (Amazon Web Services)
* **GCP** (Google Cloud Platform)
* **Azure** (Microsoft Azure)

## Permissions

Chkk Cloud Connector operates under the principle of least privilege, utilizing read-only credentials to access only the necessary metadata in your cloud environment. This restricted, non-intrusive access allows Chkk to accurately map your configurations and deliver tailored guidance for upgrades and operational best practices. By granting the minimal permissions required, you maintain a strong security posture while benefiting from insights that reflect the actual state of your environment.

All IAM policies and service accounts associated with the Chkk Cloud Connector remain under your direct control.
You can modify, revoke, or remove these permissions at any time to align with your organization's security and compliance requirements.

## Setup Guide

This guide walks you through installing a Chkk Cloud Connector for **AWS**, **GCP**, or **Azure**.

1. In the **left-hand column** of the **Chkk Dashboard**, expand **Configure** and click **Cloud Accounts**.
2. In the top-right corner, click **Add Cloud Account**.
3. From the dropdown, select **AWS**, **GCP**, or **Azure**.

![Cloud Accounts main page screenshot](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-cloud-accounts.png)

Once you've selected your provider, follow the relevant instructions below to set up and verify the Cloud Connector.

### AWS

#### Enter AWS account details

1. **AWS Account ID**: Provide the 12-digit AWS Account ID (e.g., `123456789012`).
2. **AWS Region**: Specify your primary region (e.g., `us-east-1`).
3. *(Optional)* **Account Name**: Provide a name to reference this AWS Account in the Chkk Dashboard (e.g., `production-account`).
4. Click **Mark as done**.

![Enter AWS account details screenshot](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-aws-account-step-1.png)

#### Setup Environment

1. In **Setup Environment**, choose how you want to create the **read-only IAM Role** (CloudFormation, Console, CLI, or Terraform).
2. Follow the steps mentioned under the selected method to set up the IAM Role in your AWS account.
3. Once set up, click **Mark as done**.

![Setup Environment screenshot](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-aws-account-step-2.png)

#### Verify Connection

1. Wait until the IAM Role is fully created in AWS.
2. Chkk attempts to assume the newly created role to confirm connectivity.
3. Once the connection is verified, you'll see a success message on the Chkk Dashboard.
4. The **Redo** button allows you to retry the connection if needed.

![Verify Connection screenshot](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-aws-account-step-3.png)

#### Successful connection

1. A success message indicates Chkk can now access your AWS account.
2. Your AWS account will appear in the **Cloud Accounts** list with **Connected** status in the **Configure -> Cloud Accounts** view.

![Successful connection screenshot](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-aws-account-step-4.png)

### GCP

#### Enter Project Details

1. **GCP Project ID**: Provide the ID for the GCP project you want to connect (e.g., `gcp-proj-example`).
2. *(Optional)* **Account Name**: Provide a name to reference this project in the Chkk Dashboard (e.g., `staging-project`).
3. Click **Mark as done**.

![Add GCP Account - Enter Project Details](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-gcp-account-step-1.png)

#### Setup Environment

1. Under **Setup Environment**, choose your preferred method (e.g., **Manual (CLI)** or **Terraform**) to grant the Chkk service account **read-only** (`roles/viewer`) access to your project.
2. Follow the steps mentioned under the selected method.
3. After configuring your IAM policy, click **Mark as done**.

![Add GCP Account - Setup Environment](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-gcp-account-step-2.png)

#### Verify Connection

1. Once you finish creating the policy bindings in GCP, Chkk will attempt to connect using the newly granted service account permissions.
2. A success message indicates that Chkk can now retrieve data from your GCP project.
3. If needed, click **Redo** to retry or refresh the connection.

![Add GCP Account - Verify Connection](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-gcp-account-step-3.png)

#### GCP Account Connected

1. A final message confirms that your GCP project is **Connected**.
2. In the **Cloud Accounts** list (under **Configure** -> **Cloud Accounts**), your GCP project appears with a **Connected** status.

![Cloud Accounts - GCP Account Connected](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-gcp-account-step-4.png)

### Azure

#### Name Your Connection

1. **Account Name**: Provide a name to reference this Azure account in the Chkk Dashboard (e.g., `production-account`).
2. Click **Mark as done** when finished.

![Add Azure Account - Name Your Connection](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-azure-account-step-1.png)

#### Provide Subscription IDs

1. **Azure Subscription ID(s)**: Enter one or more Subscription IDs (e.g., `b6cbec...97995d`).
2. Click **Add** to include multiple subscription IDs if needed.
3. Click **Mark as done**.

![Add Azure Account - Provide Subscription IDs](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-azure-account-step-2.png)

#### Login to Azure

1. Open a terminal and log in to your Azure account using the CLI.

   ```bash
   az login
   ```

2. Once logged in, click **Mark as done**.

![Add Azure Account - Login to Azure](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-azure-account-step-3.png)

#### Create a Service Principal with Reader Role

1. Run the following command to create a Service Principal that has the **Reader** role, scoped to your subscription:

   ```bash
   az ad sp create-for-rbac \
     --display-name "chkk-cloud-connect-example" \
     --role Reader \
     --scopes /subscriptions/
   ```

2. **Note the output** of this command: it includes your **tenant**, **appId** (client ID), and **password** (client secret).
3. Click **Mark as done**.

![Add Azure Account - Create a Service Principal with Reader Role](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-azure-account-step-4.png)

1. 
Copy and paste the **Tenant ID**, **Client ID**, and **Client Secret** from the output of the previous command into the respective fields. 2. Click **Mark as done**. ![Add Azure Account - Provide Service Principal Details](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-azure-account-step-5.png) 1. After providing your Service Principal details, Chkk attempts to authenticate with Azure. 2. A success message indicates that your Azure account is now connected. 3. Use the **Redo** button if you need to retry or refresh the connection. ![Add Azure Account - Verify Connection](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-azure-account-step-6.png) 1. Navigate back to **Configure** -> **Cloud Accounts**. 2. Your Azure account appears in the list with a **Connected** status. ![Add Azure Account - Confirm Connection](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/chkk-cloud-connector/chkk-dashboard-configure-azure-account-step-7.png) # Chkk Kubernetes Connector Source: https://docs.chkk.io/connectors/chkk-kubernetes-connector An overview of Chkk Kubernetes Connector — what it is, why you need it, and how to install and configure it. ## Overview ### Key Components The Chkk Kubernetes Connector is composed of two main components: 1. Chkk Operator 2. Chkk Agent Working together, these components periodically (or on-demand) extract cluster metadata and ingest it into the Chkk SaaS platform. Once ingestion is complete, Chkk scans and analyzes your environment for potential risks or helpful insights (e.g., add-on, application service, and Kubernetes operator instances running in your cluster). *** ### Chkk Operator The Chkk Operator is a Kubernetes Operator that manages and configures the Chkk Kubernetes Connector. 
It deploys Chkk Agent through a single Custom Resource Definition (CRD) and simplifies configurations by: * Providing a single source of truth (the CRD) for your Connector. * Reporting deployment status, health, and errors in the CRD's status. * Limiting the risk of potential misconfigurations by enforcing higher-level settings. Once deployed, the Operator: * Validates your Chkk Connector configurations. * Keeps the Connector aligned with your CRD-based configuration. * Orchestrates creation and updates of the Connector resources. * Reports the Connector's status in the Operator's CRD. ### Chkk Agent Chkk Agent is a Kubernetes Custom Resource managed by the Operator. It defines how and when to collect data from your cluster. Some key features include: * Manages the Agent CronJob: Schedules periodic scans of your cluster to keep you informed of the latest known risks. * Resource Filtering: Allows you to include or exclude specific namespaces or resource types. *** ## Setup ### Prerequisites Before installing the Chkk Kubernetes Connector, ensure the following: 1. Allowlisted Access * You must be allowlisted to access the Chkk SaaS. Contact us to get a dedicated Chkk Organization provisioned for you: [chkk.io](https://www.chkk.io/). 2. Network Firewall Rules * If your cluster is in a restricted network, allow outbound connections to: * `chkk.io` and its subdomains * `s3.amazonaws.com` and its subdomains 3. Proxy Settings * If you use a proxy server, you will be required to configure the `HTTP_PROXY`, `HTTPS_PROXY`, and `NO_PROXY` environment variables at the time of installation. 4. Image Hosting * The Chkk Kubernetes Connector container images are hosted publicly on the [Amazon ECR Public Registry](https://gallery.ecr.aws/chkk/). Ensure your cluster can pull images from this registry. * Chkk supports custom registries. If you host all images in a private registry, detailed configuration instructions will be provided during installation. 
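
The network prerequisites above can be sanity-checked from a machine or pod inside the restricted network before installing. The following is a minimal sketch, not part of Chkk's tooling: the function names `probe` and `check_endpoints` and the `PROBE` override variable are hypothetical, and the endpoint list is an assumption built from the hosts named above plus the `public.ecr.aws` registry host that appears in the image names used in this guide. Subdomains that Chkk actually uses are not listed here and may need their own checks.

```bash
#!/usr/bin/env bash
# Assumed endpoint list: hosts named in the prerequisites, plus the public image registry.
ENDPOINTS=("https://chkk.io" "https://s3.amazonaws.com" "https://public.ecr.aws")

# Send a HEAD request and succeed if any HTTP response comes back within 10 seconds.
# curl generally honors HTTPS_PROXY and NO_PROXY from the environment.
probe() {
  curl --silent --show-error --head --max-time 10 --output /dev/null "$1"
}

# Probe every URL passed as an argument; return non-zero if any is unreachable.
# PROBE (hypothetical hook) lets you swap in a different probe function.
check_endpoints() {
  local failed=0 url
  for url in "$@"; do
    if "${PROBE:-probe}" "$url"; then
      echo "reachable: $url"
    else
      echo "UNREACHABLE: $url"
      failed=1
    fi
  done
  return "$failed"
}

# Usage (performs real network requests):
#   check_endpoints "${ENDPOINTS[@]}"
```

A reachable result only shows that an HTTP response was received; it does not confirm that image pulls or Chkk API calls will be authorized.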
### Resource Requirements

Below are the baseline resource requests for each component of the Chkk Kubernetes Connector. Actual usage varies by cluster size and scan frequency.

| Component          | CPU  | Memory |
| ------------------ | ---- | ------ |
| Chkk Operator      | 100m | 256Mi  |
| Chkk Agent         | 500m | 1024Mi |
| Chkk Agent Manager | 50m  | 128Mi  |

### Supported Kubernetes Distributions

The Chkk Kubernetes Connector is compatible with all Kubernetes providers that are compliant with the upstream API. For the list of supported Kubernetes providers and versions, refer to [Support and Compatibility](/overview/support-compatibility).

### Installation Modes

There are three deployment methods available for installing the Chkk Kubernetes Connector:

* Helm
* K8s YAML
* Terraform

### System Requirements

Before installing the Chkk Kubernetes Connector, ensure that your system meets the minimum requirements for the selected deployment method:

**Helm**

* Kubernetes >= v1.19 (tested on EKS, GKE, AKS)
* OS/Architecture: linux/amd64, linux/arm64
* kubectl: >= v1.19
* Helm: >= version 2

**K8s YAML**

* Kubernetes >= v1.19 (tested on EKS, GKE, AKS)
* OS/Architecture: linux/amd64, linux/arm64
* kubectl: >= v1.19

**Terraform**

* Kubernetes >= v1.19 (tested on EKS, GKE, AKS)
* OS/Architecture: linux/amd64, linux/arm64

Terraform providers:

* hashicorp/helm: >= version 2
* gavinbunney/kubectl: >= v1.19
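
The minimum versions listed above can be compared against what is installed as a quick pre-install check. The helper below is a sketch under stated assumptions rather than anything Chkk ships: `version_ge` is a hypothetical function name, and how you obtain the version strings (for example from `kubectl version`) is left open because output formats differ between releases.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 >= version $2.
# Accepts an optional leading "v" and compares major.minor.patch numerically.
version_ge() {
  local have="${1#v}" want="${2#v}"
  local -a h w
  IFS=. read -r -a h <<< "$have"
  IFS=. read -r -a w <<< "$want"
  local i a b
  for i in 0 1 2; do
    a="${h[i]:-0}"; b="${w[i]:-0}"
    # Keep only the leading digits so suffixes such as "-eks-1234" are ignored.
    a="${a%%[!0-9]*}"; b="${b%%[!0-9]*}"
    a="${a:-0}"; b="${b:-0}"
    # 10# forces base 10 so values like "08" are not parsed as octal.
    if (( 10#$a > 10#$b )); then return 0; fi
    if (( 10#$a < 10#$b )); then return 1; fi
  done
  return 0
}

# Example with a literal version string; replace it with the value from your cluster.
if version_ge "v1.28.3" "v1.19"; then
  echo "Kubernetes version meets the v1.19 minimum"
fi
```

Missing components are treated as zero, so `v1.19` and `v1.19.0` compare as equal.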

*** ## Installation & Validation 1. Log in to the Chkk Dashboard: [chkk.io](https://www.chkk.io/). 2. In the left-hand sidebar, navigate to **Risk Ledger** → **Clusters**. 3. Click **Add Cluster** in the top-right corner. 4. Follow the step-by-step instructions and select your preferred deployment mode. ## Configuration #### Configuration Parameters The table below lists the configurable parameters for installing the Chkk Operator. | Parameter | Description | Sample Default | | -------------------------- | ------------------------------------------------------------------------------------------------ | ----------------------------------- | | `image.repository` | Image repository | `public.ecr.aws/chkk/operator` | | `image.tag` | Image tag | `v0.0.14` | | `image.pullPolicy` | Image pull policy | `Always` | | `replicaCount` | Number of replicas | `1` | | `revisionHistoryLimit` | Revision history limit | `2` | | `secret.create` | Create a new secret | `true` | | `secret.chkkAccessToken` | Chkk access token | `CHKK-ACCESS-TOKEN` | | `secret.ref.secretName` | Name of an existing Secret with the Chkk access token (only used if `secret.create=false`) | `chkk-operator` | | `secret.ref.keyName` | Key in the existing Secret's `data` that contains the token (only used if `secret.create=false`) | `CHKK_ACCESS_TOKEN` | | `serviceAccount.create` | Create a service account | `true` | | `serviceAccount.name` | Service account name | `chkk-operator-sa` | | `podAnnotations` | Annotations applied to the Chkk Operator Pod | `{ chkk.io/name: "chkk-operator" }` | | `disableAnalytics` | Disable analytics data collection | `false` | | `proxy.http_proxy` | HTTP proxy | `""` | | `proxy.https_proxy` | HTTPS proxy | `""` | | `proxy.no_proxy` | No proxy | `""` | | `tolerations` | Node tolerations | See `values.yaml` | | `nodeSelector` | Node labels for scheduling | `{}` | | `affinity` | Pod scheduling affinity | See `values.yaml` | | `securityContext` | Pod-Level Security Context | See 
`values.yaml` |
| `containerSecurityContext` | Container-Level Security Context | See `values.yaml` |

#### Configuration Examples

If you prefer to manage the secret externally, set `secret.create` to `false` and reference your secret in the `values.yaml` file:

```yaml
secret:
  create: false
  ref:
    secretName: my-secret
    keyName: CHKK_ACCESS_TOKEN
```

To customize the RBAC settings, modify the `serviceAccount` parameters in the `values.yaml` file. You can specify whether to create a new service account and provide a custom name.

```yaml
serviceAccount:
  create: false
  name: chkkagent-custom-sa
```

To use a custom image, update the `image.repository` and `image.tag` fields in the `values.yaml` file. You can also set `image.pullPolicy` to control when the image is pulled.

```yaml
image:
  repository: custom-repo/chkk/operator
  tag: v0.0.14
  pullPolicy: IfNotPresent
```

To schedule the Chkk Operator on nodes with specific taints, configure the `tolerations` section in the `values.yaml` file. You can specify the key, operator, value, and effect for each toleration.

```yaml
tolerations:
  - key: "example.com/special-taint"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
```

When configuring the Chkk Connector to run behind a proxy, set the following fields in your `values.yaml` to ensure proper connectivity and to disable telemetry reporting. You **must** set `disableAnalytics: true` when defining proxy settings.

```yaml
proxy:
  http_proxy: "http://your-proxy.example.com:3128"
  https_proxy: "http://your-proxy.example.com:3128"
  no_proxy: "localhost,127.0.0.1,.svc,.cluster.local"
disableAnalytics: true
```

This ensures the Chkk Operator and Agent operate correctly within your network environment.

***

The `ChkkAgent` Custom Resource (CR) tells the Operator how cluster scanning and cluster context ingestion should function. When you apply a `ChkkAgent` resource, the Operator:

1. Creates or updates a CronJob resource to run scans on a schedule.
2.
Handles resource filtering (which namespaces or resource types to include/exclude). 3. Informs the **Chkk Agent Manager** about on-demand re-scan triggers from the Chkk Dashboard. #### Specification Overview ```yaml apiVersion: k8s.chkk.io/v1beta1 kind: ChkkAgent metadata: name: chkk-agent namespace: chkk-system spec: global: clusterName: "my-cluster" clusterEnvironment: "production" # ... agentOverride: name: chkk-cj image: name: public.ecr.aws/chkk/chkk-agent:v0.1.14 managerImage: name: public.ecr.aws/chkk/chkk-agent-manager:v0.1.14 # ... ``` #### Spec Fields ##### agentOverride | Field | Description | | ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------- | | `activeDeadlineSeconds` | Time (in seconds) after which the job is terminated if it hasn't finished. | | `backoffLimit` | Maximum number of retries for failed jobs before considering them permanently failed. | | `completions` | Number of pods that must successfully complete for the job to finish. | | `completionMode` | Method for tracking pod completions (`NonIndexed` or `Indexed`). | | `concurrencyPolicy` | Defines concurrency handling for the job (`Allow`, `Forbid`, or `Replace`). | | `createRbac` | Automatically create the necessary RBAC resources (Roles/ClusterRoles). | | `failedJobsHistoryLimit` | Number of failed job executions to keep for reference. | | `image` | Configuration for the Chkk Agent container image (repository, tag, etc.). | | `managerImage` | Configuration for the Chkk Agent Manager container image (repository, tag, etc.). | | `name` | Overrides the default name for the resource. Must be 1-65 characters if set. | | `schedule` | Cron expression defining how often the job runs. For example, `0 2 * * *` would run daily at 02:00. | | `serviceAccountName` | ServiceAccount used by this job. Ignored if `createRbac` is `true`. 
If you're manually managing RBAC, set this to reference your custom ServiceAccount. | | `template` | Pod template for the Chkk Agent, enabling detailed security, resource settings, etc. | | Field | Description | | ---------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `manualSelector` | When `true`, allows you to manually control pod labels and pod selectors. Typically leave this `false` or unset so the system manages labels uniquely. | | `parallelism` | The maximum number of pods the job can run in parallel at one time. | | `selector` | A label query used to match pods to the job. Normally left unset so the system generates unique labels automatically. | | `startingDeadlineSeconds` | A deadline in seconds for starting the job if a scheduled run is missed. Missed runs are counted as failures. | | `successfulJobsHistoryLimit` | The number of successful job runs to retain. Older completions are cleaned up if you exceed this limit. | | `suspend` | When `true`, no pods are created and active pods are terminated. This effectively pauses the job. | | `template` | Defines the pod template for the Chkk Agent. You can customize security contexts, resource requests, environment variables, etc. | | `timeZone` | Time zone name for interpreting the `schedule`. Defaults to the kube-controller-manager's time zone if unspecified. (Requires the `CronJobTimeZone` feature gate to be enabled in Kubernetes.) | | `ttlSecondsAfterFinished` | Time (in seconds) after the job finishes (Complete or Failed) before it's eligible for garbage collection (auto-deletion). A value of `0` deletes the job immediately upon completion. 
| *** ##### global | Field | Description | | ---------------------- | ------------------------------------------------------------------------------------ | | `clusterEnvironment` | Environment name (e.g., `production`, `development`). | | `clusterName` | Unique identifier for your cluster. | | `credentials` | API credentials for authenticating the agent with Chkk. | | `endpoint` | URL of the Chkk API. | | `filter` | Rules for including or excluding Kubernetes resources during scans. | | `logLevel` | Level of logging verbosity (e.g., `trace`, `debug`, `info`, `warn`, `error`, `off`). | | `podAnnotationsAsTags` | Maps selected Kubernetes pod annotations to Chkk tags for better traceability. | | `podLabelsAsTags` | Maps selected Kubernetes pod labels to Chkk tags. | | `tags` | Custom tags to apply across clusters/resources (e.g., `team:alpha`). | | `updates` | Controls agent auto-updates, specifying update frequency or behavior. | *** ##### Status Fields | Field | Description | | ------------------- | -------------------------------------------------------------------------------------------- | | `agent` | Indicates the current state of the associated CronJob (e.g., `Active`, `Suspended`). | | `conditions` | List of conditions describing the ChkkAgent's state (e.g., `Ready`, `Reconciling`). | | `lastScanTime` | Timestamp of the most recent scan performed by the agent. | | `latestUpdateState` | Reflects the status of the last update applied to the ChkkAgent (e.g., `Success`, `Failed`). 
| #### Configuration Examples The following example shows a basic deployment of the `ChkkAgent` resource: ```yaml apiVersion: k8s.chkk.io/v1beta1 kind: ChkkAgent metadata: name: chkk-agent namespace: chkk-system ``` Setting the `spec.global.clusterName` and `spec.global.clusterEnvironment` fields in the `ChkkAgent` resource allows you to customize the cluster name and environment displayed in the Chkk Dashboard: ```yaml apiVersion: k8s.chkk.io/v1beta1 kind: ChkkAgent metadata: name: chkk-agent namespace: chkk-system spec: global: clusterName: "my-cluster" clusterEnvironment: "production" ``` Note: This can also be done manually through the Chkk Dashboard itself. If you host the `public.ecr.aws/chkk/chkk-agent` and `public.ecr.aws/chkk/chkk-agent-manager` images in a private registry, or if you want to override the default versions of these images, you can do so by setting `spec.agentOverride.image.name` and `spec.agentOverride.managerImage.name` in the `ChkkAgent` resource: ```yaml apiVersion: k8s.chkk.io/v1beta1 kind: ChkkAgent metadata: name: chkk-agent namespace: chkk-system spec: agentOverride: image: name: public.ecr.aws/chkk/chkk-agent:v0.1.12 managerImage: name: public.ecr.aws/chkk/chkk-agent-manager:v0.1.12 ``` To customize the security context for the Chkk Agent, you can set the `spec.agentOverride.template.securityContext` field and/or the `spec.agentOverride.template.container.securityContext` field in the `ChkkAgent` resource: ```yaml apiVersion: k8s.chkk.io/v1beta1 kind: ChkkAgent metadata: name: chkk-agent namespace: chkk-system spec: agentOverride: template: securityContext: runAsNonRoot: true runAsUser: 12000 runAsGroup: 12000 fsGroup: 12000 seccompProfile: type: RuntimeDefault container: securityContext: allowPrivilegeEscalation: false capabilities: drop: - ALL readOnlyRootFilesystem: true runAsNonRoot: true ``` To use an existing Secret for the Chkk Agent's credentials, set the `spec.global.credentials.accessTokenSecret` field in the `ChkkAgent` 
resource: ```yaml apiVersion: k8s.chkk.io/v1beta1 kind: ChkkAgent metadata: name: chkk-agent namespace: chkk-system spec: global: credentials: accessTokenSecret: secretName: chkk-agent-token keyName: CHKK_ACCESS_TOKEN ``` If you want to use a custom Service Account for the Chkk Agent, set the `spec.agentOverride.serviceAccountName` field in the `ChkkAgent` resource: ```yaml apiVersion: k8s.chkk.io/v1beta1 kind: ChkkAgent metadata: name: chkk-agent namespace: chkk-system spec: agentOverride: createRbac: false serviceAccountName: chkk-agent-custom-sa ``` If you supply a custom ServiceAccount, ensure the associated ClusterRole has the following RBAC permissions: ```yaml - apiGroups: ["batch"] resources: ["jobs", "cronjobs"] verbs: ["get", "create", "list", "update"] - apiGroups: [""] resources: ["nodes"] verbs: ["get", "list", "watch"] ``` To include or exclude specific namespaces or resource types during scans, set the `spec.global.filter` field in the `ChkkAgent` resource. By default, the following filter is applied: ```yaml apiVersion: k8s.chkk.io/v1beta1 kind: ChkkAgent metadata: name: chkk-agent-1 namespace: chkk-system spec: global: credentials: accessToken: filter: |- rules: - exclude: - path: $.metadata.name match: ^chkk - include: - path: $.kind match: ^DaemonSet|Deployment|Pod|PodTemplate|ReplicationController|StatefulSet$ - include: - path: $.kind match: ^NetworkPolicy|CronJob|Namespace|Service|Job|Ingress|Node$ - include: - path: $.kind match: ^CSIStorageCapacity|PriorityLevelConfiguration|HorizontalPodAutoscaler|EndpointSlice$ - include: - path: $.kind match: ^PodDisruptionBudget|PodSecurityPolicy|RuntimeClass|ValidatingWebhookConfiguration|CustomResourceDefinition$ - include: - path: $.kind match: ^TokenReview|LocalSubjectAccessReview|SelfSubjectAccessReview|CertificateSigningRequest|Lease|ClusterRole$ - include: - path: $.kind match: ^ClusterRoleBinding|Role|RoleBinding|Ingress|IngressClass|PriorityClass$ - include: - path: $.kind match: 
^CSIDriver|CSINode|StorageClass|VolumeAttachment$ - include: - path: $.kind match: ^ConfigMap$ - path: $.metadata.name match: ^*dns*$ ``` The Chkk Agent CronJob runs every 12 hours by default. To customize how frequently the Chkk Agent runs, set the `spec.agentOverride.schedule` field in the `ChkkAgent` custom resource. The schedule follows standard Kubernetes CronJob format. ```yaml apiVersion: k8s.chkk.io/v1beta1 kind: ChkkAgent metadata: name: chkk-agent namespace: chkk-system spec: agentOverride: schedule: "0 */6 * * *" # Every 6 hours ``` If you prefer to manage your Kubernetes environment with Terraform, Chkk provides a **Terraform module** for deploying the Chkk Kubernetes Connector. This module automates creation of the Chkk Operator and Chkk Agent, handling secret management, RBAC, and other necessary resources. #### Providers The Terraform module relies on the following providers: | Name | Version | | ------------------- | --------: | | helm | >= 2.10.0 | | gavinbunney/kubectl | >= 1.7.0 | Ensure these versions (or newer) are installed and configured in your Terraform environment. *** #### Usage Below are several usage examples showing how to configure secrets, override Service Accounts, or set the cluster name and environment. For all these examples, set `source` to point to the appropriate Git repository and tag (e.g., `ref=v0.1.6`). This snippet deploys the Chkk Operator with a newly created Secret containing your Chkk access token. The Operator and Agent resources will be installed in the `chkk-system` namespace. ```hcl module "chkk_k8s_connector" { source = "git::https://github.com/chkk-io/terraform-chkk-k8s-connector.git?ref=v0.1.6" create_namespace = true namespace = "chkk-system" chkk_operator_config = { secret = { chkkAccessToken = "" } } } ``` If you already have a Secret named `chkk-operator-secret` containing your Chkk access token, you can reference it directly instead of creating a new one. 
```hcl module "chkk_k8s_connector" { source = "git::https://github.com/chkk-io/terraform-chkk-k8s-connector.git?ref=v0.1.6" create_namespace = true namespace = "chkk-system" chkk_operator_config = { secret = { create = false ref = { secretName = "chkk-operator-secret" keyName = "accessToken" } } } } ``` You can also configure separate secrets for the Operator and the Agent. The Operator references `chkk-operator-secret`, while the Agent references `chkk-agent-secret`. ```hcl module "chkk_k8s_connector" { source = "git::https://github.com/chkk-io/terraform-chkk-k8s-connector.git?ref=v0.1.6" create_namespace = true namespace = "chkk-system" chkk_operator_config = { secret = { create = false ref = { secretName = "chkk-operator-secret" keyName = "accessToken" } } } chkk_agent_config = { secret = { secretName = "chkk-agent-secret" keyName = "accessToken" } } } ``` Use a custom Service Account name and let the module create it for you: ```hcl module "chkk_k8s_connector" { source = "git::https://github.com/chkk-io/terraform-chkk-k8s-connector.git?ref=v0.1.6" create_namespace = true namespace = "chkk-system" chkk_operator_config = { secret = { chkkAccessToken = "" } serviceAccount = { create = true name = "chkk-operator-custom-sa" } } } ``` In this scenario, both Operator and Agent have their own existing Service Accounts (no new Service Accounts are created): ```hcl module "chkk_k8s_connector" { source = "git::https://github.com/chkk-io/terraform-chkk-k8s-connector.git?ref=v0.1.6" create_namespace = true namespace = "chkk-system" chkk_operator_config = { secret = { chkkAccessToken = "" } serviceAccount = { create = false name = "chkk-operator-custom-sa" } } chkk_agent_config = { serviceAccount = { create = false name = "chkk-agent-custom-sa" } } } ``` You can override the default cluster name and environment in the Chkk Dashboard by setting `cluster_name` and `cluster_environment`. In this example, a new secret is created for the Operator and Agent using ``. 
```hcl
module "chkk_k8s_connector" {
  source = "git::https://github.com/chkk-io/terraform-chkk-k8s-connector.git?ref=v0.1.6"

  create_namespace = true
  namespace        = "chkk-system"

  chkk_operator_config = {
    secret = {
      chkkAccessToken = ""
    }
  }

  # Override cluster name and environment displayed in Chkk
  cluster_name        = "eks-prod-uswest2"
  cluster_environment = "prod"
}
```

When configuring the Chkk Connector to run behind a proxy using the Terraform module, define the following fields in your module configuration to ensure proper connectivity and to disable telemetry reporting. You **must** set `disableAnalytics = true` when defining proxy settings.

```hcl
module "chkk_k8s_connector" {
  source = "git::https://github.com/chkk-io/terraform-chkk-k8s-connector.git?ref=v0.1.6"

  create_namespace = true
  namespace        = "chkk-system"

  chkk_operator_config = {
    secret = {
      chkkAccessToken = ""
    }
    proxy = {
      http_proxy  = ""
      https_proxy = ""
      no_proxy    = ""
    }
    disableAnalytics = true
  }
}
```

This ensures the Chkk Operator and Agent operate correctly within your network environment.

If you host the `public.ecr.aws/chkk/chkk-agent`, `public.ecr.aws/chkk/chkk-agent-manager`, or `public.ecr.aws/chkk/operator` images in a private registry, or if you want to override the default versions of these images, update the Terraform module with the following fields:

```hcl
module "chkk_k8s_connector" {
  source = "git::https://github.com/chkk-io/terraform-chkk-k8s-connector.git?ref=v0.1.6"

  create_namespace = true
  namespace        = "chkk-system"

  chkk_agent_config = {
    # Full image names (repository:tag), matching the agent_image.name and
    # manager_image.name inputs listed under Inputs
    agent_image = {
      name = "public.ecr.aws/chkk/cluster-agent:v0.1.10"
    }
    manager_image = {
      name = "public.ecr.aws/chkk/cluster-agent-manager:v0.1.10"
    }
  }

  chkk_operator_config = {
    secret = {
      chkkAccessToken = ""
    }
    image = {
      repository = "public.ecr.aws/chkk/operator"
      tag        = "v0.0.10"
    }
  }
}
```

The Chkk Agent CronJob runs every 12 hours by default.
To customize how frequently the Chkk Agent runs, you can set a custom schedule in the `chkk_agent_config` section. The schedule follows standard Kubernetes CronJob format. ```hcl module "chkk_k8s_connector" { source = "git::https://github.com/chkk-io/terraform-chkk-k8s-connector.git?ref=v0.1.6" create_namespace = true namespace = "chkk-system" chkk_operator_config = { secret = { chkkAccessToken = "" } } chkk_agent_config = { schedule = "0 */6 * * *" # Every 6 hours } } ``` *** #### Inputs Below is a reference of the module's input variables: | Name | Description | Type | Default | Required | | ----------------------------------------------------- | --------------------------------------------------------------------------------------------- | ------ | ------------------ | :------: | | `release_name` | The name of the Helm release. | string | `chkk-operator` | no | | `namespace` | The namespace where resources are deployed. | string | `chkk-system` | no | | `chart_version` | Version of the Helm chart to deploy. | string | `n/a` | no | | `create_namespace` | Whether to create the namespace if it doesn't exist. | bool | `true` | no | | `filter` | Override the default filter for the ChkkAgent. | string | `n/a` | no | | `cluster_name` | Override the default cluster name for the ChkkAgent. | string | `n/a` | no | | `cluster_environment` | Override the default cluster environment for the ChkkAgent. | string | `n/a` | no | | `chkk_operator_config` | Values for configuring the Chkk Operator Helm chart. | map | `{}` | no | | `chkk_operator_config.secret` | Details on how to set up the Secret for the Chkk Operator (create new or reference existing). | map | `{}` | no | | `chkk_operator_config.secret.create` | Whether to create a new Secret resource or use an existing one. | bool | `true` | no | | `chkk_operator_config.secret.chkkAccessToken` | If `create` is `true`, this token is stored in the new Secret. 
| string | `""` | no | | `chkk_operator_config.secret.ref.secretName` | If `create` is `false`, the name of an existing Secret containing the token. | string | `""` | no | | `chkk_operator_config.secret.ref.keyName` | If `create` is `false`, the key name inside the existing Secret that stores the token. | string | `""` | no | | `chkk_operator_config.serviceAccount` | Configure a dedicated Service Account for the Chkk Operator. | map | `{}` | no | | `chkk_operator_config.serviceAccount.create` | Whether to create a new Service Account or use an existing one. | bool | `true` | no | | `chkk_operator_config.serviceAccount.name` | Name of the Service Account (if `create` is `true`) or the existing SA name. | string | `chkk-operator-sa` | no | | `chkk_agent_config` | Configuration overrides for the ChkkAgent. | map | `{}` | no | | `chkk_agent_config.secret` | Secret object used by the ChkkAgent. | string | `""` | no | | `chkk_agent_config.secret.accessToken` | If the ChkkAgent is to create/use a new secret, specify the token here. | string | `""` | no | | `chkk_agent_config.secret.secretName` | Name of the existing Secret that the ChkkAgent should use (if not creating a new one). | string | `""` | no | | `chkk_agent_config.secret.keyName` | Key inside that Secret for the token. | string | `""` | no | | `chkk_agent_config.serviceAccount` | Configure a dedicated Service Account for the ChkkAgent. | map | `{}` | no | | `chkk_agent_config.serviceAccount.create` | Whether to create or reuse an existing Service Account for the ChkkAgent. | bool | `true` | no | | `chkk_agent_config.serviceAccount.serviceAccountName` | If `create = false`, name of the existing Service Account the agent should use. 
| string | `""` | no | | `chkk_agent_config.agent_image` | Agent Image object for ChkkAgent | map | `{}` | no | | `chkk_agent_config.agent_image.name` | Full image name for the agent image (repository:tag) | string | `""` | no | | `chkk_agent_config.manager_image` | Manager Image object for ChkkAgent | map | `{}` | no | | `chkk_agent_config.manager_image.name` | Full image name for the manager image (repository:tag) | string | `""` | no | | `chkk_agent_config.schedule` | Cron schedule for ChkkAgent CronJob execution. | string | `""` | no | #### Outputs Currently, this module does not produce any outputs. ## Upgrade ```bash helm list -n chkk-system -o json ``` Sample output: ```json [{ "name": "chkk-operator", "namespace": "chkk-system", "revision": "1", "chart": "chkk-operator-0.0.9", "app_version": "0.0.9" }] ``` Update the Helm repository to fetch latest chart ```bash helm repo update chkk ``` Replace `` with your Chkk ingestion token, which you can copy from the Chkk Dashboard under **Settings → Tokens**. ```bash helm upgrade chkk-operator chkk/chkk-operator \ --namespace chkk-system \ --set secret.chkkAccessToken= ``` ```bash kubectl get secret chkk-operator-token -n chkk-system ``` Sample output: ```bash NAME TYPE DATA AGE chkk-operator-token Opaque 1 10m ``` ```bash helm upgrade chkk-operator chkk/chkk-operator \ --namespace chkk-system \ --set secret.ref.secretName=chkk-operator-token \ --set secret.ref.keyName= \ --set secret.create=false ``` 1. Get the Secret: ```bash kubectl get secret chkk-operator-token -n chkk-system ``` Sample output: ```bash NAME TYPE DATA AGE chkk-operator-token Opaque 1 10m ``` 2. 
Get the ServiceAccount: ```bash kubectl get serviceaccount chkk-operator -n chkk-system ``` Sample output: ```bash NAME SECRETS AGE chkk-operator 1 10m ``` ```bash helm upgrade chkk-operator chkk/chkk-operator \ --namespace chkk-system \ --set secret.ref.secretName=chkk-operator-token \ --set secret.ref.keyName= \ --set secret.create=false \ --set serviceAccount.create=false \ --set serviceAccount.name= ``` ```bash kubectl get deployment chkk-operator -n chkk-system -o json \ | jq '.spec.template.spec.containers[].image' ``` Sample output: ```bash "public.ecr.aws/chkk/operator:v0.0.14" ``` Follow these steps to upgrade the Chkk Kubernetes Connector via Terraform: ```hcl module "chkk_k8s_connector" { source = "git::https://github.com/chkk-io/terraform-chkk-k8s-connector.git?ref=v0.1.6" create_namespace = true namespace = "chkk-system" chart_version = "v0.0.14" chkk_operator_config = { secret = { create = true chkkAccessToken = } } } ``` ```hcl module "chkk_k8s_connector" { source = "git::https://github.com/chkk-io/terraform-chkk-k8s-connector.git?ref=v0.1.6" create_namespace = false namespace = "chkk-system" chart_version = "v0.0.14" chkk_operator_config = { secret = { create = false ref = { secretName = "chkk-operator-token", keyName = "CHKK_ACCESS_TOKEN" } } } chkk_agent_config = { secret = { secretName = "chkk-agent-token", keyName = "CHKK_ACCESS_TOKEN" } } } ``` ```hcl module "chkk_k8s_connector" { source = "git::https://github.com/chkk-io/terraform-chkk-k8s-connector.git?ref=v0.1.6" create_namespace = false namespace = "chkk-system" chart_version = "v0.0.14" chkk_operator_config = { secret = { create = false ref = { secretName = "chkk-operator-token", keyName = "CHKK_ACCESS_TOKEN" } } serviceAccount = { create = false name = "chkk-operator" } } chkk_agent_config = { secret = { secretName = "chkk-agent-token", keyName = "CHKK_ACCESS_TOKEN" } serviceAccount = { create = false name = "chkk-agent" } } } ``` ```bash kubectl get deployment chkk-operator -n
chkk-system -o json \ | jq '.spec.template.spec.containers[].image' ``` Sample output: ```bash "public.ecr.aws/chkk/operator:v0.0.14" ``` # Activity Feed Source: https://docs.chkk.io/features/activity-feed A brief description, use-case, and usage instructions of the Activity Feed feature Chkk's **Activity Feed** feature provides a comprehensive, real-time record of all actions taken across your Upgrade Templates and Upgrade Plans. It captures both high-level events—such as status updates and who requested them—and step-specific updates (e.g., adding or editing custom steps). By centralizing each change and its context, the **Activity Feed** enhances collaboration, transparency, and auditability—making it easy to see what changed, who contributed, and when. This real-time visibility helps your team coordinate parallel upgrades with confidence, whether you're working on a single cluster or rolling out changes across multiple clusters simultaneously. ## Usage The provided steps also apply to Cluster Upgrade Plans, Add-on Upgrade Plans, and Add-on Upgrade Templates. Follow these steps to access the **Activity Feed** for both the entire **Cluster Upgrade Template** and for a specific step within that template. 1. In the **Chkk Dashboard**, expand **Upgrade Copilot** on the left menu. 2. Click **Upgrade Templates** and then **Clusters**. 3. You will see a list of all existing **Cluster Upgrade Templates**. 1. Select the **Cluster Upgrade Template** you want to examine. 2. Next to the **template name** at the top, click the **heartbeat icon**. 3. A **sidebar** will appear, showing the **Activity Feed** for the entire Upgrade Template. 1. Within the open **Upgrade Template**, navigate to a specific **step** of any **stage**. 2. Next to that step's name, click the **heartbeat icon**. 3. A **sidebar** will appear on the right, displaying the **Activity Feed** only for that step. 
# Comments Source: https://docs.chkk.io/features/add-comments A brief description, use-case, and usage instructions of the Add Comments feature Chkk's **Add Comments** feature allows teams to capture any additional context, internal guidelines, or unique considerations at every step of an Upgrade Template or Upgrade Plan. Each upgrade—whether for a cluster, a specific add-on, application service, or Kubernetes operator—consists of multiple stages, and each stage can contain multiple steps. By placing comments directly next to these steps, users can clarify decisions (e.g., why certain add-ons are deferred) or highlight key compliance requirements. They can also note any organization-wide policies—such as upgrading stateful add-ons or application services separately—so that future recommendations reflect these preferences. Centralizing comments at the step level **reduces guesswork**, **preserves institutional knowledge**, and **standardizes workflows**. Team members can quickly see the rationale behind key decisions, while parallel upgrade efforts become more efficient thanks to clear documentation. In short, adding comments fosters a transparent, collaborative, and consistent upgrade process for every environment. * Upgrade Templates (Cluster and Add-on) let you leave comments for Chkk and the team. * Upgrade Plans (Cluster and Add-on) let you leave comments for the team only. ## Usage The provided steps also apply to Cluster Upgrade Plans, Add-on Upgrade Plans, and Add-on Upgrade Templates. Below are the steps to **Add Comments** for both **Chkk** and **your team** on a specific step within a Cluster Upgrade Template. 1. In the **Chkk Dashboard**, expand **Upgrade Copilot** on the left menu. 2. Click **Upgrade Templates** and then **Clusters**. 3. You will see a list of all existing **Cluster Upgrade Templates**. 1. Locate the **step** you want to comment on. 2. Click the **heartbeat icon** next to that step to open the sidebar. 3. 
A sidebar will appear, showing the **Activity Feed** and comment sections. 1. Under **Leave Comment for your team**, type your message in the text area. 2. Click **Comment** to post it. 1. Under **Leave Comment for Chkk** in the same sidebar, enter your message or question for the Chkk team. 2. Click **Comment** to post it. 3. The Chkk team will be notified of your comment. # Custom Steps Source: https://docs.chkk.io/features/add-edit-delete-custom-steps A brief description, use-case, and usage instructions of the Add/Edit/Delete Custom Steps feature Chkk's **Add/Edit/Delete Custom Steps** feature lets users embed their organization's internal processes into any Upgrade Template or Upgrade Plan. This ensures every upgrade—across one or multiple clusters—follows the same best practices, compliance checks, and change-management guidelines. Because these custom steps carry forward from the template to each new Upgrade Plan, your team can delegate tasks in parallel without recreating instructions. Plus, you can still tailor steps for specific clusters even after a plan is instantiated, making it easy to address unique requirements. This flexible capability fosters the reuse of best practices, retains institutional knowledge, and standardizes workflows—allowing teams to align quickly on operational details and speed up upgrades across the environment. ## Usage The provided steps also apply to Cluster Upgrade Plans, Add-on Upgrade Plans, and Add-on Upgrade Templates. Below are the instructions for **adding, editing, and deleting custom steps** in an Upgrade Template. 1. In the **Chkk Dashboard**, expand **Upgrade Copilot** on the left menu. 2. Click **Upgrade Templates** and then **Clusters**. 3. You will see a list of all existing **Cluster Upgrade Templates**. ![Navigate to Upgrade Copilot](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/features/add-edit-remove-custom-steps-navigation.png) 1. Select the desired **Cluster Upgrade Template**. 2. 
Select the **three dots** icon on the right of any step to either **Add a step Before** or **Add a step After** the currently selected step.\ ![Add a Custom Step](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/features/add-custom-step-option.png) 3. A **modal** will appear where you can fill in: * **Title**: The name of the custom step. * **Badge**: A tag or label displayed next to the step. * **Required**: Checkbox to mark the step as **required** or **optional**. * **Write/Preview**: Use the Markdown editor and preview window to craft the step's content.\ ![Add a Custom Step Modal](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/features/add-custom-step-modal.png) 4. Click **Add Step** to confirm and insert the new custom step into the current stage. ![Custom Step Added](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/features/custom-step-added.png) 1. Click the **three dots** icon on the right of the custom step you wish to modify. * Select **Edit Step** from the dropdown.\ ![Edit Custom Step Option](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/features/edit-custom-step-option.png) 2. A modal appears with the current details of the step (title, badge, etc.). 3. Make the necessary changes and click **Update Step** to save your edits.\ ![Custom Step Edited](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/features/custom-step-edited.png) 1. Click the **three dots** icon on the right of the custom step you want to remove. * Select **Delete Step** from the dropdown.\ ![Delete Custom Step Option](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/features/delete-custom-step-option.png) 2. A confirmation modal will prompt you to confirm the deletion. 3. Click **Delete Step** to permanently remove the custom step from the selected stage. 
# Add-on Upgrade Recommendation Source: https://docs.chkk.io/features/addon-upgrade-recommendation A brief description and use-case of the Add-on Upgrade Recommendation feature Chkk's **Add-on Upgrade Recommendation** feature identifies which add-ons, application services, and Kubernetes operators in your clusters require upgrading, which upgrades are optional, and which can remain unchanged. Powered by Chkk's Collective Learning, the feature evaluates factors such as end-of-life (EOL) status, compatibility with your target Kubernetes version, known operational risks, and any enterprise policies you've defined (for example, preferring the latest version or sticking with stable releases). Based on these assessments, Chkk classifies each add-on, application service, and Kubernetes operator upgrade as ***Required***, ***Optional***, or ***Not Recommended***. An add-on, application service, or Kubernetes operator upgrade is deemed **Required** if it faces major compatibility concerns or presents critical risks. Optional upgrades are flagged when the add-on, application service, or Kubernetes operator remains compatible but may be outdated or has known issues. For each Required or Optional upgrade, Chkk recommends a target version that meets criteria such as compatibility, production mileage, and freedom from known risks. If your organization's internal standards differ from Chkk's recommendations—whether you prefer to upgrade all add-ons, application services, and Kubernetes operators or only those that are required—you can override the suggested version. Chkk will factor in your feedback during [Template Regeneration](/features/regenerate-template), producing updated recommendations tailored to your policies. By integrating these upgrade insights directly into your workflow, Chkk helps shorten research time, manage complex add-on, application service, and Kubernetes operator dependencies, and maintain smooth cluster operations with minimal disruptions. 
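The classification rule described above can be sketched as a small shell function. This is only an illustration of the stated rule, not Chkk's implementation: the function name `classify_upgrade` and its three yes/no flags are hypothetical, the real evaluation weighs more signals (EOL status, enterprise policies), and mapping "no change needed" to Not Recommended is an assumption.

```bash
# Hypothetical sketch of the classification rule described above; the real
# evaluation is performed by Chkk and considers more signals than these flags.
# Arguments (assumed names, each "yes" or "no"):
#   $1 incompatible with the target Kubernetes version
#   $2 critical risk present
#   $3 outdated or affected by known issues
classify_upgrade() {
  local incompatible="$1" critical_risk="$2" outdated_or_issues="$3"
  if [ "$incompatible" = "yes" ] || [ "$critical_risk" = "yes" ]; then
    # Major compatibility concerns or critical risks make the upgrade Required.
    echo "Required"
  elif [ "$outdated_or_issues" = "yes" ]; then
    # Still compatible, but outdated or carrying known issues.
    echo "Optional"
  else
    # Assumption: a component that can remain unchanged is "Not Recommended".
    echo "Not Recommended"
  fi
}
```

For example, `classify_upgrade no no yes` would label a compatible but outdated add-on as Optional.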
The image below shows an example of a Cluster Upgrade Template with Add-on Upgrade Recommendations: ![Add-on Upgrade Recommendation](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/features/addon-upgrade-recommendation.png) # Approve Upgrade Template Source: https://docs.chkk.io/features/approve-upgrade-template A brief description, use-case, and usage instructions for the Approve Upgrade Template feature Chkk's **Approve Upgrade Template** feature ensures that each Upgrade Template is thoroughly reviewed, adjusted to align with organizational guidelines, and officially marked as ready for use. Once an Upgrade Template is **Available**, users can add custom steps, confirm which add-on, application service, and Kubernetes operator versions to upgrade now versus later, and apply internal change-management standards. After all necessary adjustments are in place, the user marks the upgrade template as **Approved for Use**. Chkk automatically logs who approved the upgrade template, giving the entire team visibility into its readiness. This standardized approach **reduces context switching**, **preserves knowledge**, and **promotes the reuse of best practices**—enabling faster upgrades across multiple clusters and ensuring **consistent workflows** organization-wide. ## Usage The provided steps also apply to Add-on Upgrade Templates. Follow the steps below to **Approve a Cluster Upgrade Template**: 1. In the **left-hand column** of the **Chkk Dashboard**, expand **Upgrade Copilot**. 2. Click **Upgrade Templates**, then select **Clusters**. ![Approve an Upgrade Template](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/features/approve-a-template-before-approval.png) 1. Navigate to the **Upgrade Template** you want to approve. 2. Click the **thumbs up** icon to approve it. 1. Once approved, the upgrade template's status changes to **Approved for Use**. 2. The approval date appears under the **Approved** column, along with any comments made by the approver. 
1. After approval, a new **Instantiate Plan** button will appear, allowing you to [Instantiate Upgrade Plans](/features/instantiate-template-to-plan) from the approved Upgrade Template. ![Approve an Upgrade Template](https://mintlify.s3.us-west-1.amazonaws.com/chkk/images/features/approve-a-template-after-approval.png) # Cancel an Upgrade Template or Plan Source: https://docs.chkk.io/features/cancel-upgrade-template-plan A brief description, use-case, and usage instructions of the Cancel Upgrade Template/Plan feature The **Cancel an Upgrade Plan or Upgrade Template** feature allows you to stop an Upgrade Template or Upgrade Plan at any point in its execution, marking it with a **Canceled** status that remains visible in the dashboard. Once canceled, the workflow stops and the action cannot be undone, so you must either instantiate a new Plan or request a new Template to continue. This is especially useful when a Template or Plan has become outdated, letting you formally indicate it shouldn't be used. ## Usage The provided steps also apply to Cluster Upgrade Plans, Add-on Upgrade Plans, and Add-on Upgrade Templates. Follow the steps below to **Cancel** a Cluster Upgrade Template: 1. In the **left-hand column** of the **Chkk Dashboard**, expand **Upgrade Copilot**. 2. Select **Upgrade Templates** -> **Clusters**. 1. Find the **Upgrade Template** you wish to cancel in the list. 2. Click the **three dots** icon (⋮) at the far right of the template's entry. 1. From the dropdown, select **Cancel Upgrade Template**. 2. A **modal** will appear asking for confirmation. 3. Click **Cancel Template** to confirm. 1. After cancellation, the template's **Status** will change to **Canceled**. 2. A **Canceled** template can no longer be used or executed. 
# Captured Learnings Source: https://docs.chkk.io/features/captured-learnings A brief description and use-case of the Captured Learnings feature Chkk's **Captured Learnings** feature automatically preserves and reuses the unique insights your team adds during each upgrade—such as [Custom Steps](/features/add-edit-delete-custom-steps), guidelines, or specific version preferences. Rather than continually updating external runbooks or tools, you simply embed these details (e.g., always upgrading to the newest add-on, application service, or Kubernetes operator release) into an Upgrade Template or Plan. Chkk applies these insights—whether included in [Comments](/features/add-comments), [Custom Steps](/features/add-edit-delete-custom-steps), or shared via feedback—during [Template Regeneration](/features/regenerate-template). Additionally, on future upgrades, these policies and workflows are automatically included in newly generated recommendations, so your team can move faster while ensuring consistent, accurate upgrade practices across every cluster, add-on, application service, and Kubernetes operator. # Environment Upgraded Source: https://docs.chkk.io/features/environment-upgraded A brief description, use-case, and usage instructions of the Environment Upgraded feature Chkk allows users to Mark a Cluster Upgrade Template as **Environment Upgraded**. This feature provides a clear way to indicate that every cluster in an environment (e.g., dev, staging, production) has been upgraded using a specific template. Each template corresponds to a "representative" cluster and can spawn multiple Upgrade Plans—one per cluster in that environment. Once those plans are carried out, marking the template as **Environment Upgraded** signals that all clusters are now running the recommended versions. This doesn't require each plan to be formally marked as **Completed**; rather, it serves as a high-level confirmation that the environment is fully upgraded. 
As a result, your team gains a concise view of upgrade status without needing to inspect individual clusters, helping prevent confusion about whether further actions are needed. ## Usage Follow the steps below to mark a **Cluster Upgrade Template** as **Environment Upgraded**: 1. In the **left-hand column** of the **Chkk Dashboard**, expand **Upgrade Copilot**. 2. Select **Upgrade Templates** -> **Clusters**. 1. Locate the **Upgrade Template** you wish to mark as **Environment Upgraded** (it should already be marked as **Approved for Use**). 2. To the right of the template entry, select the **check mark** icon. 3. Select **Mark the Upgrade Template as Environment Upgraded** from the dropdown. 1. After confirmation, the **Status** of the upgrade template changes to **Environment Upgraded**. # Feedback Button Source: https://docs.chkk.io/features/feedback-button A brief description, use-case, and usage instructions of the Feedback Button feature The **Feedback Button** provides a quick, frictionless way for users to engage directly with the platform's development, allowing them to report bugs and request new features. It also lets users ask for help when something isn't clear, while capturing contextual data (such as the current page) to speed up troubleshooting. ## Usage Follow the steps below to provide feedback using the **Feedback** button: 1. In the top-right corner of the **Chkk Dashboard**, click the **messaging icon** to open the feedback modal. 2. Enter your **feedback** or **query** in the text box provided. 3. Click **Submit** to send your feedback. 4. Upon submission, you will see a toast message confirming that your feedback was submitted. 
# Upgrade Plan Instantiation Source: https://docs.chkk.io/features/instantiate-template-to-plan A brief description, use-case, and usage instructions of the Upgrade Plan Instantiation feature Chkk's **Upgrade Plan Instantiation** feature creates a tailored Upgrade Plan from an approved [Upgrade Template](/features/request-upgrade-template), whether you're upgrading a cluster or a specific add-on. For clusters, the plan automatically incorporates details like the cluster name, region, node group configurations, and active applications. For add-ons, instantiation takes an add-on Upgrade Template and produces a plan tailored to a particular add-on instance (e.g., same major.minor version) within a given cluster. All Chkk-recommended steps—and any custom steps you've added—carry forward into each new plan, ensuring **standardized** and **reusable** processes across multiple clusters or environments. Once instantiated, each plan can be tracked through distinct stages such as "In Progress" or "Completed," making it easy to monitor overall progress. This structured, step-by-step approach also simplifies *delegation*, allowing senior engineers to assign upgrade tasks to less-experienced team members with confidence. ## Usage Follow the steps below to **instantiate an Upgrade Plan** from an existing **Upgrade Template** in the Chkk Dashboard. 1. In the **left-hand column** of the **Chkk Dashboard**, expand **Upgrade Copilot**. 2. Select **Upgrade Templates** -> **Clusters**. 3. Locate the desired **Upgrade Template** that is **Approved for Use**. 1. Click the **Instantiate Plan** button next to the selected template. 2. You'll be taken to the **Instantiate Cluster Upgrade Plan** page, where the chosen template is preselected. 1. Under **Select Cluster to Upgrade**, pick the cluster you want to upgrade. 2. *(Optional)* Assign a **Cloud Account** if needed for your upgrade. 1. Click **Instantiate Upgrade Plan** button at the bottom to finalize and create your new plan. 2. 
You will be redirected to **Upgrade Plans** -> **Clusters** where you can see your newly instantiated Upgrade Plan with the status: **Waiting for Plan**. 1. In the **left-hand column** of the **Chkk Dashboard**, expand **Upgrade Copilot**. 2. Select **Upgrade Templates** -> **Add-ons**. 3. Locate the desired **Add-on Upgrade Template** that is **Approved for Use**. 1. Next to the **Approved** template, click **Instantiate Plan**. 2. The **Instantiate Add-on Upgrade Plan** screen will load, automatically highlighting the selected template. 1. Under **Select Add-on Instance to Upgrade**, pick the add-on you want to instantiate the Upgrade Plan for. 1. Click **Instantiate Upgrade Plan** to finalize your choices. 2. You will be redirected to **Upgrade Plans** -> **Add-ons** where you can see your newly instantiated Upgrade Plan with the status: **Waiting for Plan**. # Upgrade Plan Marked as Completed Source: https://docs.chkk.io/features/mark-upgrade-plan-completed A brief description, use-case, and usage instructions of the Upgrade Plan Marked as Completed feature Chkk's **Upgrade Plan Marked as Completed** feature provides a clear way to finalize the upgrade for each individual cluster. Once all recommended and custom steps are done, a user can mark that cluster's plan as complete. Because each plan is tied to one cluster, completing one upgrade doesn't affect the status of the others. By using this feature, your team can quickly see which clusters are fully upgraded and which still need attention, making it easier to coordinate parallel efforts, allocate resources effectively, and maintain a transparent view of your overall upgrade strategy. ## Usage The provided steps also apply to Add-on Upgrade Plans. Follow the steps below to **Mark an Upgrade Plan as Complete**: 1. In the **left-hand column** of the **Chkk Dashboard**, expand **Upgrade Copilot**. 2. Click **Upgrade Plans**, then select **Clusters**. 3. 
Identify the Upgrade Plan currently in **In Progress** status that you want to complete. 1. Next to the plan entry, click the **checkmark icon**. 2. A confirmation modal will appear once you click it. 1. *(Optional)* Add any **comments** you wish to include about the plan's outcome. 2. Click **Mark as Complete** to finalize the completion process. 1. Once marked complete, the Upgrade Plan becomes **read-only** and appears with a **Completed** status. 2. The **Marked Completed by** column updates to show the name of the user who completed the plan. # Readiness Checks Source: https://docs.chkk.io/features/readiness-checks A brief description, use-case, and usage instructions of the Readiness Checks feature Chkk's **Readiness Checks feature**—part of the *Upgrade Copilot*—verifies health and safety at each stage of an upgrade by running checks both before and after each change. These checks ensure that add-ons, application services, Kubernetes operators, configurations, and workflows align with your organization's best practices. Chkk delivers them through two complementary methods: 1. **Guided Checks (Manual)** Ideal for one-off or specialized validations, Guided Checks offer step-by-step instructions you can copy and run. This hands-on method ensures your team can inspect unique configurations on-demand without altering automated workflows. 2. **SHARC Checks (Automated)** For recurring or broadly applicable validations, Chkk uses its SHARC (safety, health, and readiness checks) framework. SHARC Checks integrate seamlessly into Upgrade Plans, tracking each run with a unique flow ID and mapping results back to your upgrade workflow. You can also request new SHARC-supported checks for specific add-ons, application services, Kubernetes operators, and configurations, tapping into Chkk's expansive library of SHARC packs. 
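As a rough illustration of what a manual, copy-and-run check of the kind described above might look like, the sketch below counts nodes that are not Ready, something a team could run before and after a change. It is not taken from Chkk's check library: the function name `count_not_ready_nodes` is hypothetical, and the logic assumes the standard column layout of `kubectl get nodes --no-headers` (node name first, STATUS second).

```bash
# Illustrative guided check (not an official Chkk check): count nodes whose
# STATUS column is anything other than Ready.
# Reads the node table on stdin so the logic can be tried without a cluster.
count_not_ready_nodes() {
  # Column 2 is STATUS. Values such as "Ready,SchedulingDisabled" still start
  # with "Ready" and are not counted; "NotReady" and similar values are.
  awk 'NF >= 2 && $2 !~ /^Ready/ { n++ } END { print n + 0 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get nodes --no-headers | count_not_ready_nodes
```

A result of `0` before the upgrade gives a baseline; running the same command afterwards makes regressions in node health easy to spot.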
By embedding these checks into each **preverified** upgrade plan—tested on a digital twin of your infrastructure—teams can identify potential issues early, reduce overall risk, and accelerate safe rollouts across single or multiple clusters. # Template Regeneration Source: https://docs.chkk.io/features/regenerate-template A brief description, use-case, and usage instructions of the Template Regeneration feature Chkk's **Template Regeneration** feature lets you quickly refresh an Upgrade Template that has grown stale or needs to reflect new enterprise-specific feedback. If an existing template sits unused for too long—perhaps due to shifting priorities or postponed upgrades—Chkk can generate a revised version aligned with the **constantly evolving Kubernetes ecosystem**. Upon requesting regeneration, Chkk automatically re-analyzes the previous recommendations and publishes an updated generation of the same template, ensuring you always benefit from the latest guidance. Regeneration also comes into play when you override recommended versions for certain add-ons, application services, and Kubernetes operators based on internal policies. For example, if your organization requires upgrading all add-ons, application services, and Kubernetes operators to the latest release, you can provide that feedback to Chkk, then use **Template Regeneration** to produce a fresh set of recommendations that blends your policies with Chkk's research. This way, your upgrade strategy remains consistent with your organization's rules while still incorporating Chkk's continually refreshed insights. The **Template Regeneration** option only appears after a user has used the [**Comments**](/features/add-comments) feature to leave a Comment for Chkk. ## Usage The provided steps also apply to Add-on Upgrade Templates. Follow the steps below to **regenerate** an Upgrade Template in the Chkk Dashboard. 1. In the **left-hand column** of the **Chkk Dashboard**, expand **Upgrade Copilot**. 2. 
Select **Upgrade Templates** -> **Clusters**. 3. Locate the desired **Upgrade Template** that you want to **Regenerate**. 1. Next to the template's entry, click the **Regenerate Template** button. 2. A modal will appear, informing you that the template will become read-only during regeneration. 1. (Optional) Add any relevant details or feedback in the **Additional Notes** field. 2. Click **Regenerate Template** to proceed. 1. A **toast message** will appear, indicating the template regeneration has been requested. 2. Once regeneration is complete, the template can be used or edited again if needed. # Release Notes Curation Source: https://docs.chkk.io/features/release-notes-curation A brief description, use-case, and usage instructions of the Release Notes Curation feature Chkk's **Release Notes Curation** feature drastically reduces the time spent researching and parsing raw upstream documentation by automatically collating, analyzing, and filtering only the most relevant release notes. Whether you're upgrading Kubernetes, add-ons, application services, operators, or the applications running on top of them, Chkk enriches each note with context-specific insights—from breaking changes and API deprecations to critical bug fixes—so your team knows exactly why a particular update matters. Thanks to Chkk's AI-driven analysis and dedicated research team, these curated notes highlight hidden dependencies and incompatibilities well before they become problems. Where upstream documentation may be ambiguous, Chkk provides clarifying guidance and links back to original sources for those who want deeper details. You'll also find these curated notes integrated directly into your upgrade steps, ensuring you never have to jump between multiple tools or guess which updates warrant attention. 
By filtering out irrelevant information and spotlighting what truly impacts your environment, Chkk's Release Notes Curation cuts down on research time, reduces uncertainty, and helps teams move faster and more confidently with every upgrade. # Upgrade Template Request Source: https://docs.chkk.io/features/request-upgrade-template A brief description, use-case, and usage instructions of the Upgrade Template Request feature Chkk's **Upgrade Template Request** feature provides a flexible, preverified plan for upgrading entire clusters, individual add-ons, application services, and Kubernetes operators—without scattering crucial details across multiple tools. When you need to upgrade a cluster, you can choose in-place, blue-green, rolling, or even skip versions. For add-ons, application services, and Kubernetes operators, simply request an Upgrade Template tailored to that specific component (e.g., Istio v1.20.2). If you're unsure which approach is best, Chkk can recommend an in-place, blue-green, rolling, or custom path based on your environment, internal strategies, and organizational guidelines. By centralizing all steps and recommendations in a single template, this feature gives teams a **single source of truth** for planning upgrades across multiple clusters, add-ons, application services, and Kubernetes operators. You can easily delegate tasks with other Chkk features—such as **Add Comments** and **Add Custom Steps**—to embed internal guidelines, tools, and workflows into every Upgrade Plan. Chkk also uses **Captured Learnings** to track and reapply your organization's proven upgrade practices, ensuring that future templates and plans automatically inherit the same best-practice steps where applicable. ## Usage Follow the steps below to **Request an Upgrade Template**: 1. In the **Chkk Dashboard**, expand **Upgrade Copilot** on the left menu. 2. Click **Upgrade Templates** and then **Clusters**. 3. 
You will see a list of all existing **Cluster Upgrade Templates**. 4. On the **Cluster Upgrade Templates** page, locate the **Request Cluster Upgrade Template** button (top-right). 1. Click **Request Cluster Upgrade Template**. 2. From the list presented, choose the **Cluster** that best represents the environment (e.g., dev, prod) for which you want to request a template. 1. Under **Select Upgrade Type**, choose one of the available Upgrade options (e.g., **Pick for me**, **In Place**, **Blue Green**, **Rolling**, etc.). 2. Each option specifies a different upgrade strategy and approach for your cluster. 3. After selecting the Upgrade Type, you'll be redirected back to the **Cluster Upgrade Templates** page, where your newly requested template will appear with the status: **Waiting for Template**. 1. In the **Chkk Dashboard**, expand **Upgrade Copilot** on the left menu. 2. Click **Upgrade Templates** and then **Add-ons**. 3. You will see a list of all existing **Add-on Upgrade Templates**. 4. On the **Add-on Upgrade Templates** page, locate and click the **Request Add-on Upgrade Template** button (top-right). 1. From the list of add-ons, choose the instance that most accurately represents all of the other add-on instances across your fleet. 2. This add-on instance will be used to generate the upgrade template. 1. Review your selected add-on details (e.g., tag, cluster, etc.). 2. Click **Request Add-on Upgrade Template** to finalize your request. 3. You'll be redirected back to the **Add-on Upgrade Templates** page, where your newly requested template will appear with the status: **Waiting for Template**. 
# Upgrade Guidance Source: https://docs.chkk.io/features/upgrade-guidance A brief description, use-case, and usage instructions of the Upgrade Guidance feature Chkk's **Upgrade Guidance** feature removes guesswork from each Upgrade Template by specifying which add-ons, application services, and Kubernetes operators need upgrading—and which are optional—alongside clear, environment-specific instructions. Chkk highlights important breaking changes, configuration updates, and any key considerations, simplifying raw release notes with concise, curated explanations. For each add-on, application service, and Kubernetes operator, you'll also see a **Helm chart values diff**, annotated to clarify new or removed configurations and their impact. Chkk then provides tailored upgrade steps based on the tools and configurations you use—be it Helm, Argo CD, managed EKS node groups, Karpenter, or custom setups. By embedding this directly in your Upgrade Templates, Chkk saves you hours of manual research and trial-and-error, helping you roll out upgrades with ease and confidence. # Upgrade Preverification Source: https://docs.chkk.io/features/upgrade-preverification A brief description, use-case, and usage instructions of the Upgrade Preverification feature Chkk's **Upgrade Preverification** checks every aspect of your prospective upgrade before publishing an Upgrade Template, ensuring that all recommended steps are valid, safe, and ready for production. By analyzing release notes, Helm charts, CRDs, and more, Chkk identifies potential pitfalls and includes any necessary remediation steps directly in the template. A key part of this process is the use of a **Digital Twin** of your representative cluster—a one-to-one replica that simulates real conditions and add-on configurations. Alongside these simulations, Chkk runs pre-flight and post-flight checks to confirm health, safety, and functionality at each stage of the upgrade. 
If blockers or compatibility gaps are found, you'll see targeted fixes and workarounds embedded in the final Upgrade Template. By surfacing problems early and providing explicit solutions, **Upgrade Preverification** frees your team from guesswork and time-consuming troubleshooting during live upgrades. Instead, you get clear, real-world tested instructions that streamline upgrades across your environment. # Upgrade Template/Plan Ownership Source: https://docs.chkk.io/features/upgrade-template-plan-ownership A brief description, use-case, and usage instructions of the Upgrade Template/Plan Ownership feature Chkk's Upgrade Template/Plan Ownership feature enables teams to assign clear responsibility to an individual for any Upgrade Template or Upgrade Plan. When an owner is assigned to a Template, only that person can approve it, ensuring that approval remains a single-threaded decision. Once approved, the owner is automatically carried forward to the Upgrade Plan that gets instantiated from that Template, though this can be changed as needed. Upgrade Plans can be marked as completed by any team member, regardless of who the owner is. This feature supports reliable governance across teams that manage specific areas of platform infrastructure—such as service mesh, observability, networking, or IAM—by providing a mechanism to assign accountability for upgrade readiness. Ownership helps streamline approvals, support smoother handoffs during staffing transitions, and maintain continuity in large or distributed teams. If no owner is assigned, the approval process remains open to all users. Ownership can also be reassigned at any time, allowing teams to adapt quickly to changes without interrupting upgrade workflows. ## Usage Follow the steps below to assign, re-assign, or remove an **Owner** for Upgrade Templates and Upgrade Plans: This workflow applies to both Cluster and Add-on Upgrade Templates. 1. 
In the **left-hand column** of the **Chkk Dashboard**, expand **Upgrade Copilot → Upgrade Templates** and select either **Clusters** *or* **Add-ons**. Click **Request Upgrade Template**. 2. Choose the **representative resource** (cluster for Cluster templates, add-on instance for Add-on templates) that will be used to generate the template. 3. In **Select Owner**, use the **search** bar to select an owner or choose **Owner Not Required** to leave the template ownerless. 4. Complete the remaining fields (Upgrade Type, etc.) and click **Request Upgrade Template**. 5. You'll be redirected back to the **Cluster** *or* **Add-on** **Upgrade Templates** view, where your newly requested template will be visible with the chosen owner shown in the **Owner** column. This workflow applies to Cluster Upgrade Templates, Add-on Upgrade Templates, Cluster Upgrade Plans, and Add-on Upgrade Plans. 1. Navigate to **Upgrade Copilot → Upgrade Templates** and open the **Clusters** or **Add-ons** tab, depending on which template you need to modify. 2. Locate the desired **Upgrade Template** and click the **︙ (three-dots)** menu at the far right of its row. 3. Click on the **search** bar next to the **Change Owner** field to select a new owner or pick **Owner Not Required** to clear the field. 4. The table refreshes automatically, and the **Owner** column now reflects the updated value (or is blank if no owner is set). # Glossary of Terms Source: https://docs.chkk.io/misc/glossary Definitions of key Chkk terms to help you navigate our platform and docs ## Account A grouping of Chkk resources, such as clusters and subscriptions. *Accounts* often map to a specific **billable entity boundary** in your organization. ## Actions An action is a single, side‑effect‑bearing operation executed by the Workflow—e.g., invoking a REST API, calling an internal micro‑service, running a shell command, or posting to Slack.
## Add-on Upgrade Plan Similar to a [Cluster Upgrade Plan](#cluster-upgrade-plan) but specific to a particular [Add-on](#kubernetes-add-on), [Application Service](#application-service), or [Kubernetes Operator](#kubernetes-operator) instance in a given cluster. Created from an approved [Add-on Upgrade Template](#add-on-upgrade-template). ## Add-on Upgrade Template An AI-curated workflow for a specific [Add-on](#kubernetes-add-on), [Application Service](#application-service), or [Kubernetes Operator](#kubernetes-operator) across multiple clusters. While simpler Add-ons and Application Services can be upgraded as part of the Cluster Upgrade Template, stateful or datapath Add-ons and Application Services have their own templates for specialized handling. ## AI Agent A software entity that perceives its environment, reasons about what to do next, and autonomously executes actions (often by calling tools) to achieve a user-specified goal. ## AI‑Driven ETL AI‑driven ETL pipelines use machine learning or LLMs to infer schema, cleanse anomalies, redact PII, map fields, and generate transformation code, replacing brittle hand‑coded mappings. ## Application An *Application* is a type of [Project](#project) that **implements business logic or end-user-facing functionality** and is deployed on Kubernetes as part of an [Application Stack](#application-stack). It typically consists of one or more services and depends on both [Application Services](#application-service) and [Kubernetes Add-ons](#kubernetes-add-on). ## Application Service An **Application Service** is a type of Project that provides essential services to the rest of an application stack. An application stack is one or more Projects that provide related non-Kubernetes functionality -- in other words, application-specific functionality. Application Services do *not* extend or implement cluster-level Kubernetes functionality.
Examples of Application Services include: * `MySQL`, `PostgreSQL` and `Redis` These provide database services to Application Stacks. * `Kafka` and `NATS` These provide message queueing and event streaming services to Application Stacks. * `ArgoCD` and `FluxCD` Both are [Kubernetes Custom Controllers](/misc/glossary#kubernetes-custom-controller) that provide GitOps and CI automation services to Application Stacks running on Kubernetes. * `ArgoRollouts` and `Flagger` Both are [Kubernetes Custom Controllers](/misc/glossary#kubernetes-custom-controller) that provide progressive delivery services for Application Stacks running on Kubernetes. * `Crossplane` and `ACK S3 Controller` Both are [Kubernetes Custom Controllers](/misc/glossary#kubernetes-custom-controller) that provide cloud resources and service integrations to Application Stacks. Dive deeper into Chkk's selection of [covered Application Services](/projects/application-services/overview). ## Application Stack An *Application Stack* is one or more [Projects](#project) that provide related non-Kubernetes functionality -- in other words, application-specific functionality. ## Blast Radius The **potential scope of impact** an error, failure, or disruption may have on your system. It represents how far negative consequences can propagate within your environment. ## Chkk Cloud Connector *Chkk Cloud Connector* is a secure, read-only integration that fetches relevant resource data from your cloud environment and correlates it with your [Kubernetes Clusters](#kubernetes-cluster). By focusing on resources that affect — or are affected by — your clusters (e.g., security groups, IAM roles, networking settings), the Connector facilitates a unified view of your infrastructure. Cloud Connector metadata is used to detect [Operational Risks](#operational-risk-or) that span beyond the K8s layers.
Examples of such cross-layer risks include security group misconfigurations, risks that are only latent for certain kernel versions, risks that only trigger when using certain AWS services (e.g., Route53 records with external-dns), etc. Details of Chkk Cloud Connector can be found [here](/connectors/chkk-cloud-connector). ## Chkk Dashboard Chkk Dashboard is a UI for you to interact with Chkk. Details of Chkk Dashboard can be found [here](/overview/understanding-chkk). ## Chkk Integrations A set of pre-built connectors which allow Chkk to fit smoothly with your existing tools. Integrations include operational tools (GitHub, Jira, Slack, etc.) and SSO (Okta, PingIdentity). ## Chkk Kubernetes Connector The *Chkk Kubernetes Connector* is composed of two main components: * Chkk Operator * Chkk Agent Working together, these components periodically (or on-demand) extract cluster metadata and ingest it into the Chkk SaaS platform. Once ingestion is complete, Chkk scans and analyzes your environment for potential risks or helpful insights (e.g., add-on instances running in your cluster). Details of Chkk Kubernetes Connector can be found [here](/connectors/chkk-kubernetes-connector). ## Chkk Proxy Filter Redacts any private or sensitive data before it leaves your cluster. It automatically excludes Kubernetes secrets and can be configured to exclude additional data as needed. ## Chkk Research Team A dedicated team of Kubernetes experts that reviews and curates [Operational Risks](#operational-risk-or) to make them actionable. ## Classifier The *Classifier* is Chkk's reasoning engine that transforms raw customer artifacts into richly-typed objects linked to Chkk’s **Knowledge Graph**. It runs a pipeline of specialized mappers—*digest*, *deduction*, *ruleset*, and *release*—to resolve each resource and container to its most likely Deployment System, Package, Project, Component, and specific Release.
## Contextualizer The *Contextualizer* takes the Classifier’s matched entities and enriches them with situation-specific guidance. It filters change logs, synthesizes readiness checks, flags application-client actions, and authors upgrade steps that make sense for the customer’s exact Deployment System, Package version, OCI repository, cluster topology, etc. In effect, the Contextualizer translates generic release knowledge into **high-fidelity, environment-aware instructions** that operators can execute with confidence. ## Cluster Metadata Information about a [Kubernetes Cluster](#kubernetes-cluster)—versions, [Pods](#kubernetes-pod), [Add-ons](#kubernetes-add-on), cloud-provided components, node configs, etc. Secrets and config maps are redacted by default, and additional filters can be configured. ## Cluster Upgrade Plan An instantiation of a [Cluster Upgrade Template](#cluster-upgrade-template), customized for a specific [Kubernetes Cluster](#kubernetes-cluster). The instantiated Upgrade Plans inherit all the information present in Upgrade Templates, plus additional cluster-specific information such as: * The cluster's name and region * Node Group details * Which [Applications](#application) are using deprecated/removed APIs? * Are there any Application client changes in [Add-ons](#kubernetes-add-on)? * Are there any Application misconfigurations, such as incorrect Pod Disruption Budgets (PDBs), that can cause the upgrade to fail? Upgrade Plans move through these statuses: 1. **Waiting for Plan** - Being generated. 2. **In Progress** - Ready for execution. 3. **Completed** - Cluster upgraded successfully. 4. **Canceled** - User-initiated cancellation. [More details here](/overview/understanding-chkk) ## Cluster Upgrade Template A *Cluster Upgrade Template* is an AI-curated workflow containing a tested and structured sequence of steps and stages to safely upgrade your [Kubernetes Clusters](#kubernetes-cluster).
A Cluster Upgrade Template is generated on-demand and is scoped to an Environment (e.g., dev, staging, or prod). Cluster Upgrade Templates support three commonly-used upgrade patterns: In-Place, Blue-Green, and Rolling/Surge. A template includes steps for each stage of an upgrade, preverified via a [Digital Twin](#digital-twin). Once an Upgrade Template is generated, you can perform the following actions: * **Action**: Add custom markdown steps before or after any step. * **Action**: Request regenerations for refined versions. * **Action**: Approve the template so other team members can instantiate it as an Upgrade Plan for specific clusters. Upgrade Templates move through these statuses: 1. **Waiting for Template** - Being generated. 2. **Available** - Ready for review/customization. 3. **Approved For Use** - May now be used to create Upgrade Plans. 4. **Environment Upgraded** - All clusters in the environment have used this template to upgrade. [More details here](/overview/understanding-chkk) ## Collective Learning Collective Learning comprises a suite of technologies that **codify operational wisdom to prevent incidents, breakages, and disruptions.** Collective Learning has two main parts: 1. [Operational Risk Signature Database (RSig DB)](#operational-risk-signature-database-rsig-db) 2. [Knowledge Graph](#knowledge-graph) These components are defined on the [Understanding Chkk](/overview/understanding-chkk) page, with additional details provided on the [Technology](https://chkk.io/technology) page. ## Deactivated Clusters [Kubernetes Clusters](#kubernetes-cluster) explicitly offboarded from Chkk via a **"Deactivate Cluster"** action in the [Chkk Dashboard](#chkk-dashboard). Deactivation in the dashboard doesn't remove all Chkk components; you can find instructions for complete removal in the [Troubleshooting](/misc/troubleshooting) page. ## Digital Twin A *Digital Twin* is a virtual replica of your infrastructure, simulating how it runs and interacts.
There are four levels of Digital Twins: * **Level 1**: Basic replica using the same cluster version, node versions, and [Add-on](#kubernetes-add-on) versions with default config. * **Level 2**: Extends Level 1 by including custom configs of all [Add-ons](#kubernetes-add-on). * **Level 3**: Adds dummy [Applications](#application) to emulate functionality on top of Level 2. * **Level 4**: Fully functioning staging environment, an exact replica of your cluster (including real Applications), typically running within your own cloud account. ## Disconnected Clusters [Kubernetes Clusters](#kubernetes-cluster) that have not been [Deactivated](#deactivated-clusters) but are not sending metadata to Chkk. They appear with an alert icon in the [Chkk Dashboard](#chkk-dashboard) so you can diagnose why the [Chkk Kubernetes Connector](#chkk-kubernetes-connector) isn't sending data. ## Grounding Grounding constrains an AI system’s outputs to verifiable facts and policies. ## Grounding Layer A curated corpus that integrates all authoritative sources consumed by the Chkk Knowledge Engine. AI pipelines ingest and normalize each source, while the Chkk Research Team continuously reviews and validates the content. The layer models clouds, open source projects, add-ons, and application services so that agents, workflows, and tools can reference the same trusted schema, avoid hallucinations, and maintain provable accuracy for every knowledge attribute. ## Guardrails Rules and policies from a cloud provider, [add-on vendor](#kubernetes-add-on), Kubernetes distribution, or the open source community. Whether or not to follow a Guardrail is a business/team decision. ## Helm Chart A *Helm Chart* is a type of [Package](#package) that uses the `helm` [Package System](#package-system) to format its artifacts (called "charts").
## Knowledge Base Article (KBA) A single page explaining an [RSig](#operational-risk-signature-rsig) or [Guardrail](#guardrails)—covering its severity, impact, trigger conditions, remediation steps, and potentially code snippets. All RSigs and Guardrails have an associated KBA. KBAs support multiple actions on Operational Risks and Guardrails, such as: 1. **Action: Create Ticket** - Generates a Jira ticket for tracking. 2. **Action: Mark** - Mark as False Positive, By Design, or other reasons if not fixing. 3. **Action: Ignore** - Stop receiving notifications for this risk (optionally ignoring only specific resources). ## Knowledge Graph *Knowledge Graph* models and stores AI-curated data and relationships across hundreds of open-source [Projects](#project) and [Add-ons](#kubernetes-add-on) in the Kubernetes ecosystem, modeling their impact and identifying the safest upgrade paths. The Chkk Research Team provides oversight for the AI-curated data and relationships. Knowledge Graph covers Kubernetes releases of all major clouds and distributions: EKS, GKE, AKS, VMware Tanzu, OpenShift, Rancher RKE1/RKE2, Nutanix. We also support DIY and Self-Hosted Kubernetes. Chkk also covers 250+ Add-ons, and coverage for a new Add-on can be extended within 48 hours. ## Kubernetes Add-on A **Kubernetes Add-on** is a type of Project that **extends Kubernetes cluster functionality but is not part of the Kubernetes core**. Kubernetes Add-ons typically run inside the cluster as regular workloads (e.g., Deployments, DaemonSets) and provide services like networking, monitoring, logging, and DNS. Kubernetes Add-ons typically modify or enhance *cluster-level behavior*. Examples of Kubernetes Add-ons include: * `CoreDNS` and `kube-dns` They provide cluster DNS services, a critical but optional functionality of the Kubernetes cluster used for service discovery.
* `Amazon VPC CNI`, `Cilium` and `Calico` They provide implementations of Kubernetes data plane networking and Container Networking Interface (CNI) plugins. * `Amazon EBS CSI Driver` and `AzureDisk CSI Driver` They provide implementations of the Container Storage Interface (CSI) plugin for plumbing Kubernetes Volumes to a cloud provider-specific storage backend. * `Ingress NGINX controller` and `AWS Load Balancer Controller` They provide implementations of Kubernetes Service and Ingress functionality, which are core Kubernetes Resources. * `Kubernetes Metrics Server` and `kube-state-metrics` They provide cluster-level aggregation of resource usage and resource health metrics that are used by other Kubernetes components like Horizontal Pod Autoscaler and Vertical Pod Autoscaler. * `External Secrets Operator` Despite being called an "Operator", the External Secrets Operator is *not* a [Kubernetes Operator](#kubernetes-operator) because it does not install or manage the lifecycle of some *other* Kubernetes Add-on or Application Service. However, it *is* a Kubernetes Add-on because it implements storage functionality for Kubernetes Secrets and therefore extends the Kubernetes cluster's functionality. For a complete list of Kubernetes Add-ons, see [covered Kubernetes Add-ons](/projects/addons/overview). ## Kubernetes Cluster A *Kubernetes Cluster* (or just "Cluster") is a group of physical or virtual machines ([Kubernetes Nodes](#kubernetes-node)) that run containerized applications. ## Kubernetes Controller A *Kubernetes Controller* is software that follows the [controller design pattern][kube-ctrl-pattern]. This design pattern features a control loop (also called a "reconciliation loop") that repeatedly attempts to make the actual state of some resource match the desired state of that resource.
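The reconciliation loop described above can be illustrated with a minimal, in-memory sketch. It does not use the Kubernetes API; `ReplicaState`, `reconcile`, and `control_loop` are hypothetical names chosen for this example, and a real controller would watch the API server rather than poll a local object.

```python
from dataclasses import dataclass


@dataclass
class ReplicaState:
    """Hypothetical resource state: only a replica count."""
    replicas: int


def reconcile(desired: ReplicaState, actual: ReplicaState) -> bool:
    """Move the actual state one step toward the desired state.

    Returns True once the two states match.
    """
    if actual.replicas < desired.replicas:
        actual.replicas += 1  # stand-in for "start one more replica"
    elif actual.replicas > desired.replicas:
        actual.replicas -= 1  # stand-in for "stop one replica"
    return actual.replicas == desired.replicas


def control_loop(desired: ReplicaState, actual: ReplicaState,
                 max_passes: int = 100) -> int:
    """Call reconcile repeatedly until converged; return the passes used."""
    for passes in range(1, max_passes + 1):
        if reconcile(desired, actual):
            return passes
    raise RuntimeError("actual state did not converge to desired state")
```

Each pass compares the two states and makes a small correction, so the loop is safe to re-run at any time: if the states already match, a pass changes nothing.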
## Kubernetes Custom Controller A *Kubernetes Custom Controller* is a [Kubernetes Controller](#kubernetes-controller) that tracks and reconciles [Kubernetes Custom Resources](#kubernetes-custom-resource). ## Kubernetes Custom Resource A *Kubernetes Custom Resource* is a special type of [Kubernetes Resource](#kubernetes-resource) that falls outside of the core Kubernetes API groups. ## Kubernetes DaemonSet A *Kubernetes DaemonSet* (or just "DaemonSet") is a type of [Kubernetes Resource](#kubernetes-resource) that describes a related set of [Pods](#kubernetes-pod) that will run on every [Node](#kubernetes-node) in the [Kubernetes Cluster](#kubernetes-cluster). ## Kubernetes Deployment A *Kubernetes Deployment* (or just "Deployment") is a type of [Kubernetes Resource](#kubernetes-resource) that describes a related set of [Pods](#kubernetes-pod) that run an application workload. ## Kubernetes Deployment System A *Kubernetes Deployment System* is something that manages the rollout of [Kubernetes Resources](#kubernetes-resource) like [Deployments](#kubernetes-deployment), [DaemonSets](#kubernetes-daemonset) and [StatefulSets](#kubernetes-statefulset). ## Kubernetes Operator A **Kubernetes Operator** is a type of Project that is responsible for installing and **managing the lifecycle of another Kubernetes Add-on or Application Service**. Generally, Kubernetes Operators encode domain-specific knowledge to manage complex stateful software on Kubernetes. A Kubernetes Operator is often confused with a [Kubernetes Controller](#kubernetes-controller). Almost all Kubernetes Operators use the Kubernetes Controller design pattern, but not all software Projects that use the Kubernetes Controller design pattern are Kubernetes Operators! A Kubernetes Operator is sometimes mistakenly defined as a Kubernetes Controller that uses Kubernetes Custom Resources. The more accurate term for *that* concept is [Kubernetes Custom Controller](#kubernetes-custom-controller).
What differentiates a Kubernetes Operator is the focus on **managing complex lifecycle operations for some *other* piece of software**. Examples of Kubernetes Operators include: * `Postgres Operator` A [Kubernetes Custom Controller](#kubernetes-custom-controller) that manages the installation and lifecycle of PostgreSQL database servers, database schemas and database users. * `Prometheus Operator` Manages the installation and lifecycle of Prometheus metrics database and associated Application Services like Prometheus Alertmanager. * `OpenTelemetry (OTEL) Operator` Manages the installation, lifecycle management and configuration of OpenTelemetry collectors and auto-instrumentation libraries. For a complete list of Kubernetes Operators, see [covered Kubernetes Operators](/projects/kubernetes-operators/overview). ## Kubernetes Pod A *Kubernetes Pod* (or just "Pod") is a type of [Kubernetes Resource](#kubernetes-resource) that describes a group of containers with shared storage and network resources. ## Kubernetes Node A *Kubernetes Node* (or just "Node") is a physical or virtual machine that comprises part of a [Kubernetes Cluster](#kubernetes-cluster). ## Kubernetes Resource A *Kubernetes Resource* is a representation of the desired and observed state of some object. Common Kubernetes Resources are [Deployments](#kubernetes-deployment), [DaemonSets](#kubernetes-daemonset) and [StatefulSets](#kubernetes-statefulset). ## Kubernetes StatefulSet A *Kubernetes StatefulSet* (or just "StatefulSet") is a type of [Kubernetes Resource](#kubernetes-resource) that describes a related set of [Pods](#kubernetes-pod) that run an application workload. The difference between a StatefulSet and a [Deployment](#kubernetes-deployment) is that the Pods in a StatefulSet typically have an ordered initialization and the application workload maintains some form of persistent state. 
## Mitigation Mitigation is a short‑term workaround that lowers the probability or impact of a risk until full remediation can occur. Typical mitigations include disabling a feature flag, throttling traffic, or rolling back a change. ## Mitigation Workflow A mitigation workflow is a durable workflow that automates or guides the application of mitigations. ## Notifications Inform about team invites, cluster onboarding, newly detected [Operational Risks](#operational-risk-or), or published [Add-on Upgrade Templates](#add-on-upgrade-template), [Cluster Upgrade Templates](#cluster-upgrade-template), [Add-on Upgrade Plans](#add-on-upgrade-plan), and [Cluster Upgrade Plans](#cluster-upgrade-plan). Chkk supports Email, Slack, and in-app notifications. ## Operational Risk (OR) *Operational Risk* refers to any known or potential defect, misconfiguration, or incompatibility in Kubernetes clusters and [Add-ons](#kubernetes-add-on) that can cause incidents, disruptions, or breakages. These risks, which may include known defects or issues stemming from unsupported versions, deprecated APIs, and software nearing end-of-life, are categorized by severity—Critical, High, Medium, or Low. An Operational Risk is detected by scanning for at-risk components, identifying trigger conditions, and assessing availability impact, root cause, remediation steps, and possible mitigations. In Chkk, these risks are codified as [Risk Signatures (RSigs)](#operational-risk-signature-rsig) that continuously scan customer environments to proactively uncover and address Operational Risks before they cause breakages or outages. ## Operational Risk Signature (RSig) An *RSig* is the logic used to detect the presence of a specific [Operational Risk](#operational-risk-or) in your environment. ## Operational Risk Signature Database (RSig DB) *RSig DB* takes inspiration from cybersecurity, where security vulnerabilities are reported publicly in the CVE Database. 
We extended this idea to operational safety: If there's an Operational Risk (e.g., an error, failure, or disruption) that has happened anywhere in the world, Chkk AI aggregators and data connectors learn about it, convert it into an [Operational Risk Signature](#operational-risk-signature-rsig) (similar to a virus signature), and store it in the RSig DB. Any new Operational Risk Signature is streamed to all our customers, whose environments are then scanned for it. That way, our customers can proactively detect, identify, and remediate Operational Risks before they cause breakages and disruptions, much like antivirus software detects and removes viruses before they start causing harm. ## Organization A grouping of multiple Chkk Accounts under a common ownership, typically matching an entire company or large department. ## Package A *Package* is a named bundle of software that is packaged (and identified) in the format of a specific [Package System](#package-system). ## Package System A *Package System* identifies a packaging ecosystem. ## Preverification Engines Preverification Engines create a [Digital Twin](#digital-twin) of your Kubernetes environment and simulate an [Add-on Upgrade Template](#add-on-upgrade-template) or [Cluster Upgrade Template](#cluster-upgrade-template) before it is published to you, ensuring that all steps are verified to execute without errors. ## Project A *Project* is **software that provides some functionality**. ## Project Release Series A [Project](#project) may have one or more *Project Release Series*. A Project Release Series is a single release *series* for the Project. Project Release Series are identified by a simple string that must be unique for the Project. Typically the Project Release Series name will be a `major` or `major.minor` version series, e.g. "4" or "1.28"; however, this is not universally true. Some Projects use date-based names for the release series, like "2024.04".
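As an illustration of the `major` and `major.minor` naming convention described above, the following sketch derives a series name from a version string. The `release_series` function and its `granularity` parameter are hypothetical; the source does not describe how Chkk actually computes series names, and Projects with other naming schemes would need their own rules.

```python
import re


def release_series(version: str, granularity: str = "major.minor") -> str:
    """Return an illustrative release-series name for a version string.

    Assumes the version starts with numeric major[.minor] components,
    optionally prefixed with "v". This is an assumption of the sketch,
    not a documented Chkk rule.
    """
    match = re.match(r"v?(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = match.groups()
    if granularity == "major" or minor is None:
        return major
    return f"{major}.{minor}"
```

For example, `release_series("1.28.3")` yields `"1.28"`, and a date-based version such as `"2024.04"` maps to itself because it already has only two components.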
## Project Release A [Project](#project) may have one or more *Project Releases*. A Project Release is the coordinated publication of one or more [Release Artifacts](#release-artifact) for the Project having the same Version string. [deps-dev]: https://docs.deps.dev/api/v3/ [kube-ctrl-pattern]: https://kubernetes.io/docs/concepts/architecture/controller/#controller-pattern ## Remediation Remediation is the permanent correction of a defect or risk, aimed at eliminating the root cause rather than merely reducing impact. Remediation is the long-term fix. ## Remediation Workflow The sequence of actions stitched together into a workflow to roll out the remediation of a risk or defect (for instance, code changes or package upgrades) and to validate success via post‑conditions and metrics. ## Revalidation *Revalidation* starts as soon as a new cluster is onboarded. The first pass of Revalidation leverages multiple [Classifiers](#classifier) (part of [Collective Learning](#collective-learning)) to catalog what's running in your Kubernetes fleet and where. Once Classifiers finish, a secondary pass is kicked off to extend coverage and remove false positives. This process can take up to 24 hours. You can also report false positives via the **Action: Feedback**. [Details of Chkk's coverage dimensions here](/overview/understanding-chkk) ## Risk Scan (aka RSig Scan) An *RSig Scan* matches your running Kubernetes environment against relevant [RSigs](#operational-risk-signature-rsig) in the [RSig DB](#operational-risk-signature-database-rsig-db). Scanning an RSig has two stages: 1. **Contextualize** - Check if the RSig's components/versions are present in your environment. 2. **Test** - If the context is relevant, compare version numbers and conditions to determine if the RSig is present. RSig Scans can be periodic (default: every 12 hours), on-demand, or event-driven (triggered by CI/CD).
You can also set the frequency of the scans; see the instructions [here](/connectors/chkk-kubernetes-connector#chkk-agent-helm-configuration). ## Scan Engines Scan Engines identify [Operational Risks](#operational-risk-or) at various layers of Kubernetes and the underlying cloud infrastructure. Multiple engines run in parallel, each focusing on a subset of risks. ## Source‑Grounded AI Source‑grounded AI produces answers that link directly back to the underlying asset record or source, enabling users to verify every factual claim and trace the agent’s chain of thought. ## Task‑Specific AI Agent A task‑specific AI agent is an agent pre‑loaded with exactly one core skill and an intentionally narrow toolset, making its behavior easier to predict, govern, and audit. ## Team A group of users or Cloud Identities (such as AWS Accounts, IAM users, and IAM roles) within a [Chkk Organization](#organization). Team members may be assigned ownership of certain resources, like clusters or certain add-ons across the fleet of clusters. ## Tool A tool is any external capability (software function, API endpoint, database query, CLI command, or workflow) that an AI Agent can invoke to extend its own reasoning and perception. ## Trigger A trigger is the initiating event, schedule, or external signal that instantiates a workflow run. It can be time‑based (cron, interval), state‑based (configuration drift detected), or externally invoked. ## Workflow A workflow is a sequence of coordinated steps that binds AI agents, tools, humans, and even nested workflows into a single deterministic process to achieve a defined operational outcome. Each step’s intent, inputs, outputs, and side-effects are durably recorded, so the entire flow can be replayed, resumed, or audited—guaranteeing that every decision and external action is applied exactly once, in order, and with full lineage.
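The idea of recording each step so that a workflow can be resumed without repeating completed work can be sketched as follows. This is a simplified illustration, not Chkk's implementation: `run_workflow` and the `journal` dictionary are hypothetical, and the journal lives in memory here, whereas a durable engine would persist it to storage.

```python
from typing import Callable

# A step is a (name, action) pair; the action returns a result string.
Step = tuple[str, Callable[[], str]]


def run_workflow(steps: list[Step], journal: dict[str, str]) -> list[str]:
    """Run steps in order, skipping any step already recorded in the journal.

    Returns the names of the steps executed during this run.
    """
    executed = []
    for name, action in steps:
        if name in journal:
            continue  # already applied in an earlier run; do not repeat it
        journal[name] = action()  # record the result before moving on
        executed.append(name)
    return executed
```

If a run is interrupted after the first step, calling `run_workflow` again with the same journal resumes at the second step, and the recorded results double as an audit trail.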
## Workflow Engine

Workflow Engine persists state for long-running multi-step operations such as Upgrade Templates and Upgrade Plans. Upgrade paths are pre-verified on a [Digital Twin](#digital-twin) of the underlying cluster, ensuring predictable and safe execution before changes are applied. After pre-verification, Workflow Engine generates long-running, durable upgrade workflows that are highly contextualized and pre-verified to prevent failures.

# Troubleshooting

Source: https://docs.chkk.io/misc/troubleshooting

Troubleshooting

**Answer:** These errors typically indicate that your token is either invalid or missing. If you are using a secret-based approach, verify that the secret contains a valid token. If you are installing via Helm, ensure that the Helm chart is upgraded using a valid token. Log in to the Chkk Dashboard and export a valid Access Token.
```
helm upgrade chkk-operator chkk/chkk-operator --namespace chkk-system --set secret.chkkAccessToken=
```

```
Release "chkk-operator" has been upgraded. Happy Helming!
NAME: chkk-operator
LAST DEPLOYED: Thu Aug 17 19:31:58 2023
NAMESPACE: chkk-system
STATUS: deployed
REVISION: 2
TEST SUITE: None
```

Use the command below to ensure the pod is running:

```
kubectl get pods -n chkk-system
```

```
NAME                                            READY   STATUS    RESTARTS   AGE
chkk-operator-chkk-agent-3kjfqe00fqpe-atpoiks   1/1     Running   0          4d13h
```

If these steps do not solve your issue, please reach out on your private Slack or MS Teams channel, or email [support@chkk.io](mailto:support@chkk.io).

**Answer:** Please refer to the [Chkk Kubernetes Connector documentation](/connectors/chkk-kubernetes-connector) for instructions on how to use an existing Service Account.

**Answer:** Please refer to the [Chkk Kubernetes Connector documentation](/connectors/chkk-kubernetes-connector) for instructions on how to use an existing Secret.

**Answer:** Make sure that the Jira project and issue type do not have any required custom fields. By default, Chkk only supports providing the default fields to Jira. If you require mandatory custom fields, contact us on the Chkk support Slack/MS Teams channel or email us at [support@chkk.io](mailto:support@chkk.io).

**Answer:** You can ignore Risks by adding the `chkk.io/ignore` annotation to your Kubernetes resources in your IaC.
Use a wildcard (`*`) in the annotation:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    chkk.io/ignore: "*"
    deployment.kubernetes.io/revision: "1"
    meta.helm.sh/release-name: traefik-1
    meta.helm.sh/release-namespace: traefik-ns
```

Specify the ARSig IDs you wish to ignore:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    chkk.io/ignore: CHKK-K8S-1002,CHKK-K8S-602
    deployment.kubernetes.io/revision: "1"
    meta.helm.sh/release-name: traefik-1
    meta.helm.sh/release-namespace: traefik-ns
```

**Answer:** Please refer to the [Chkk Kubernetes Connector documentation](/connectors/chkk-kubernetes-connector#configuration-examples-2) for guidance on configuring the Cluster Name and Environment using the ChkkAgent CRD. Alternatively, you can update these settings via the Chkk Dashboard by navigating to `Risk Ledger > Clusters` and clicking **Edit** on the relevant cluster card, or by modifying the values in the cluster's Properties tab.

> **Note:** If the Cluster Name or Environment is defined through Infrastructure as Code (IaC), it cannot be modified from the Dashboard.

**Answer:** Please refer to the [Chkk Kubernetes Connector documentation](/connectors/chkk-kubernetes-connector#usage) for guidance on configuring the Cluster Name and Environment using the Chkk Kubernetes Connector Terraform Module. Alternatively, you can update these settings via the Chkk Dashboard by navigating to `Risk Ledger > Clusters` and clicking **Edit** on the relevant cluster card, or by modifying the values in the cluster's Properties tab.

> **Note:** If the Cluster Name or Environment is defined through Infrastructure as Code (IaC), it cannot be modified from the Dashboard.

**Answer:** When a Kubernetes custom resource is deleted, any configured finalizers must be cleared before the object is fully removed. If a finalizer is misconfigured or cannot complete its cleanup, the resource remains stuck in the terminating state.
To force-remove the finalizer and allow the deletion to complete, run the following command:

```bash
kubectl patch chkkagent/chkk-agent -n chkk-system -p '{"metadata":{"finalizers":null}}' --type=merge
```

This command manually clears the finalizer from the metadata, allowing the resource to be removed successfully.

**Answer:** This error commonly indicates that a proxy server or firewall is blocking requests to the Kube API Server. Verify that your Kube API Server address is allowlisted or permitted within your network's proxy/firewall configurations. Example log snippet:

```
2024-06-26T18:19:47Z ERROR setup unable to start manager {"error": "failed to determine if *v1.ConfigMap is namespaced: failed to get restmapping: failed to get server groups: Get \"https://172.20.0.1:443/api\": Forbidden"}
```

**Answer:** This error is likely caused by your proxy server or firewall blocking traffic to and from the "chkk.io" domain. The ChkkAgent needs to communicate with the Chkk API to sync the cluster state. To fix this issue, you need to allowlist the "chkk.io" domain and its subdomains in your proxy server or firewall.

**Answer:** In the Dashboard, deactivate the cluster you want to remove.

1. List all `ChkkAgent` resources:
   ```
   kubectl get chkkagent --all-namespaces
   ```
2. Delete all `ChkkAgent` resources:
   ```
   kubectl delete chkkagent --all --all-namespaces
   ```

1. Check installed charts:
   ```
   helm ls -n chkk-system
   ```
2. Uninstall the chart:
   ```
   helm uninstall chkk-operator -n chkk-system
   ```
3. Delete the namespace:
   ```
   kubectl delete ns chkk-system
   ```

1. List resources in `chkk-system`:
   ```
   kubectl get all -n chkk-system
   ```
2. Delete Operator resources:
   ```
   kubectl delete -f https://helm.chkk.io/chkk-operator/manifest.yaml
   ```
3.
Delete the namespace:

```
kubectl delete ns chkk-system
```

Finally, remove the CRD:

```
kubectl delete crds chkkagents.k8s.chkk.io
```

**Answer:** This can happen due to a few common misconfigurations: either the Chkk Agent RBAC is incomplete or incorrect, explicit filter rules (especially wildcard-based) are excluding key namespaces, or Chkk API endpoints are not reachable due to network restrictions.

ChkkAgent requires specific Kubernetes permissions to access resources for analysis. Please ensure you are using the RBAC definitions provided with the official Chkk Operator Helm Chart or Kubernetes Manifests. Missing or custom-modified roles/clusterroles may cause incomplete onboarding.

If you have applied filter rules to exclude namespaces, review them carefully, especially if you're using a wildcard (e.g., `*`). Wildcard exclusions can unintentionally block all namespaces from being scanned, resulting in no or limited coverage.

The Chkk Agent must be able to communicate with Chkk's API services. Ensure your firewall or proxy settings allowlist all the domains listed in the Chkk Operator prerequisites documentation.

Once any misconfigurations are resolved, the Chkk Agent will pick up the changes during the next scheduled scan. The cluster should then get onboarded. If the issue persists after 24 hours, please reach out to your Chkk support contact for further investigation.

**Answer:** This error often occurs when a **proxy in the network is intercepting HTTPS traffic**. Specifically, if you use **Squid Proxy with SSL Bump enabled**, the proxy acts as a proxy-in-the-middle and presents its own certificate instead of the actual server certificate. Since this certificate is not signed by a known Certificate Authority (CA), the Chkk system refuses the connection due to failed certificate validation.

Squid Proxy with SSL Bump intercepts encrypted traffic and re-signs it with an internal/self-signed CA.
The Chkk system does not trust this certificate by default, which causes the error:

```
tls: failed to verify certificate: x509: certificate signed by unknown authority
```

To allow Chkk Operator to establish a secure connection without interference, please configure Squid Proxy to skip the SSL Bump. This will allow Chkk Operator to use its own certificates. If this does not resolve the issue, please contact your internal network/security team to confirm whether SSL Bump is still affecting `.chkk.io` traffic, and reach out to [support@chkk.io](mailto:support@chkk.io) for further assistance.

**Answer:** Both the Chkk Operator and the ChkkAgent Custom Resource Definition (CRD) support overriding default container images.

**Default Images**:

* **Chkk Operator**: `public.ecr.aws/chkk/operator:`
* **ChkkAgent**:
  * Agent Manager: `public.ecr.aws/chkk/cluster-agent-manager:`
  * Agent: `public.ecr.aws/chkk/cluster-agent:`

```
kubectl create ns chkk-system
```

```
helm repo add chkk https://helm.chkk.io
helm repo update chkk
```

```
helm install chkk-operator chkk/chkk-operator \
  --namespace chkk-system \
  --set secret.create=true \
  --set secret.chkkAccessToken= \
  --set image.repository= \
  --set image.tag=
```

```
kubectl apply -f - < managerImage: name: EOF
```

**Answer:**

1. Navigate to **Configure > Settings > Clusters > Deactivated Clusters** in your Chkk Dashboard.
2. Locate the cluster you wish to restore and select **Activate Cluster**.
3. After activation, the cluster will reappear in **Risk Ledger** and in the **Artifact Register**, allowing normal management.

# Getting Started

Source: https://docs.chkk.io/overview/getting-started

This Quick Start Guide provides a streamlined, step-by-step process for onboarding to Chkk, ensuring users can quickly start leveraging its risk detection, artifact tracking, and upgrade planning capabilities.

### Step 1: Get Allowlisted for Chkk Access

Your organization must be **allowlisted** in order to access and use Chkk SaaS.
To request access, contact Chkk Support to get an Organization provisioned for you: [https://www.chkk.io/](https://www.chkk.io/)

***

### Step 2: Onboard Your Kubernetes Clusters

Installing the **Chkk Kubernetes Connector** on your clusters enables **real-time risk detection, artifact tracking, and upgrade planning**.

#### Install Chkk Kubernetes Connector

1. **Log in** to [Chkk Dashboard](https://www.chkk.io/).
2. Navigate to **Risk Ledger** → **Clusters** and click **"Add Cluster"**.
3. Choose your preferred installation method:
   * **Helm**
   * **Kubernetes YAML**
   * **Terraform**
4. Follow the installation instructions to deploy the Chkk Kubernetes Connector.

#### What Happens Next?

* **Automated Risk Scan:** A **Risk Ledger scan** runs immediately upon onboarding. The scan completes within **15 minutes**. Click the Cluster card to see all the Operational Risks detected in your environment, broken down by risk category.
* **Cluster Insights:**
  * **Risk Ledger**: View all **Operational Risks** detected in your clusters.
  * **Artifact Register**: Get a **version-first inventory** of the control-planes, add-ons, application services, and Kubernetes operators running in your fleet.

**Next Steps:**

* Learn more about the [Chkk Kubernetes Connector](/connectors/chkk-kubernetes-connector).
* Learn more about the [Risk Ledger](/product/risk-ledger) and the [Artifact Register](/product/artifact-register).

***

### Step 3: Connect Your Cloud Accounts

Chkk integrates with your **AWS, GCP, and Azure** Cloud Accounts to facilitate the creation of holistic [Upgrade Templates and Upgrade Plans](/product/upgrade-copilot) that are tailored to your environment.

#### Install Chkk Cloud Connector

1. Navigate to **Configure** → **Cloud Accounts**.
2. Click **"Add Cloud Account"** and select your provider (**AWS, GCP, or Azure**).
3. Follow the guided steps to create a **read-only** IAM role or service account.
Once connected, your **Cloud Account** should show as **"Connected"** in the Chkk Dashboard. **Next Steps:** Learn more about the [Chkk Cloud Connector](/connectors/chkk-cloud-connector). *** ### Step 4: Install the Chkk Slack App Stay informed with **real-time alerts** from Chkk by integrating Slack. #### Setup Steps 1. Click **"Integrations"** in the left menu. 2. Select **Slack** and follow the guided installation steps. 3. Once installed, Chkk will send **real-time notifications** for: * **New Operational Risks detected** in clusters. * **Published Upgrade Templates and Plans**. ### Need Help? * **Use Slack Connect** to bring Chkk into your workspace for direct collaboration. * **Email Support:** [support@chkk.io](mailto:support@chkk.io) # Welcome to Chkk Source: https://docs.chkk.io/overview/introduction Your Upgrade Copilot for Open Source built for Kubernetes and hundreds of Kubernetes Add-ons, Application Services, and Operators Chkk is the smarter, safer way to upgrade Kubernetes. Designed to eliminate the complexity of managing dependencies, configurations, and upgrades across add-ons, application services, and Kubernetes operators, Chkk makes every upgrade seamless, reliable, and risk-free. ## What is Chkk? Chkk is an AI-curated Operational Safety Platform for Kubernetes and its vast ecosystem of add-ons, application services, and Kubernetes operators. It leverages **Knowledge Graphs**, **Risk Signature Databases**, and **Agentic AI systems** to proactively identify hidden dependencies, unknown incompatibilities, and potential risks before they cause failures. By generating **pre-verified, AI-curated upgrade workflows** on demand, **Chkk Upgrade Copilot** provides teams with structured, safe, and efficient upgrade plans—speeding up the upgrades by 3x to 5x while eliminating last-minute surprises and disruptions. ## Why Chkk? 
* **Pre-Verified, Safe Upgrade Workflows** - Chkk generates on-demand, AI-curated workflows that are tested and structured to eliminate risks.
* **AI-Powered Insights** - Uses up-to-date data from breaking changes, deprecated APIs, and dependency shifts to create a highly contextualized upgrade plan.
* **Long-Running & Durable Workflows** - Backed by a workflow engine, ensuring safe, structured, and repeatable upgrades.
* **Seamless Integration** - Works across cloud and on-prem Kubernetes environments without disrupting existing workflows. Supports EKS, GKE, AKS, on-prem and more. Check [Support](/overview/support-compatibility) for details.

## Safe Upgrades with Curated Workflows

Traditional runbooks and best practices quickly become outdated or inconsistently applied, leading to operational drift. Chkk Upgrade Copilot solves this by enabling teams to generate **Highly Contextualized, AI-Curated, Long-Running, Safe Upgrade workflows** on demand, ensuring every upgrade is structured, tested, and risk-free.

When a user initiates a request, Chkk Upgrade Copilot dynamically generates a contextualized **Upgrade Template**, tailored to their specific infrastructure. Behind the scenes, Chkk:

* **Reviews all release notes** to identify relevant breaking changes.
* **Analyzes dependencies and incompatibilities** to generate a pre-verified upgrade sequence, mitigating risks.
* **Tracks deprecated APIs, configurations, and features** to ensure compliance and stability.
* **Provides real-time visibility and actionable insights** as a sequence of steps, so teams can execute upgrades with confidence.

By transforming upgrade research, validation, and execution into **sequenced, repeatable workflows**, Chkk ensures that operational knowledge isn't just documented; it's **standardized, executed, and scaled across the organization**.

## How to Use This Documentation

Navigate through the sections based on your needs:

Set up Chkk and integrate it into your upgrade workflow.
Learn how Chkk identifies risks and generates safe upgrade paths. Get a quick overview of every Project Chkk covers. Learn about Chkk's use cases.

With Chkk, Platform, DevOps, and SRE teams can confidently upgrade faster, avoid unnecessary risks, and stay ahead of compliance, [**Extended Support**](https://www.chkk.io/eks-extended-support), and [**Forced Upgrade**](https://www.chkk.io/blog/why-forced-eks-gke-upgrades-are-a-business-continuity-risk) deadlines. Let's get started.

# Support and Compatibility

Source: https://docs.chkk.io/overview/support-compatibility

An overview of the cloud providers, Kubernetes distributions, Kubernetes Add-ons, Application Services, and Operators that Chkk supports

## Cloud Providers

* Amazon Web Services
* Google Cloud Platform
* Microsoft Azure

***

## Kubernetes Providers

* Amazon Elastic Kubernetes Service
* Google Kubernetes Engine
* Azure Kubernetes Service
* Enterprise-grade Kubernetes powered by VMware.
* Red Hat's Kubernetes platform.
* Kubernetes distributions from Rancher.
* Enterprise-grade Kubernetes powered by Nutanix.
* Self-installed vanilla Kubernetes.
* Fully on-premises Kubernetes deployments.

***

## Kubernetes Add-ons, Application Services, and Kubernetes Operators

Chkk provides coverage for 100s of Kubernetes add-ons, application services, and Kubernetes Operators, such as Istio, Cilium, NGINX Ingress Controller, Cert-Manager, Vault Secrets Operator and more. For each one, Chkk offers:

* Curated Release Notes
* Preflight/Postflight Checks (Safety, Health, and Readiness)
* End-Of-Life (EOL) Information
* Version Incompatibility Information
* On-demand, contextualized Upgrade Templates (In-Place, Blue-Green)
* Detection of Custom Built Images and Packages (Helm, Kustomize, Static Manifests etc)
* Preverification

[Click here](/projects/overview) to learn more about Chkk's coverage of add-ons, application services, and Kubernetes Operators.

# Technology

Source: https://docs.chkk.io/overview/technology

## Collective Learning Collective Learning is Chkk’s always-on knowledge refinery. Its purpose is to capture insights from across the open-source ecosystem—hundreds of OSS projects, add-ons, applications, and services—and convert that raw change stream into source-grounded, machine-actionable knowledge. Every downstream function—classification, risk scanning, upgrade planning, automated actions—depends on the accuracy, freshness, and auditability of this layer. Continuously running **Source Feeds** harvest upstream signals from release notes, official documentation, GitHub issues, container registries, cloud bulletins, and official blog posts. These incoming events are routed into highly specialized, **Task-Specific AI Agents**. Each agent is responsible for a distinct artifact type—such as a breaking change or OS compatibility—and executes a deterministic, **AI-driven ETL** pipeline: extract the relevant fragment, transform it to Chkk’s canonical schema, and load the candidate fact for validation. All authoritative sources used by the Knowledge Engine are integrated into a **Grounding Layer** that is both AI-curated and subject to human oversight. This Grounding Layer models and organizes all required inputs to reliably identify clouds, open-source projects, add-ons, and application services in a customer’s environment. The **Chkk Research Team** actively reviews and validates every source incorporated into this layer, ensuring ongoing trustworthiness. Agents, workflows, and tools then rely on this curated corpus to prevent AI hallucinations and uphold the accuracy of knowledge attributes. Curated facts are written to two data stores. The first is the **Risk Signature Database (RSig DB)**, which houses every Risk Signature (RSig)—complete with severity, trigger conditions, and mitigations. 
The second is the **Knowledge Graph**, which encodes compatibility edges, version metadata, packaging information, component hierarchies, end-of-life schedules, and safety guardrails for Kubernetes, add-ons, and applications. Whenever Chkk onboards a new OSS project, add-on, or cloud distribution, Collective Learning automatically extends coverage across changelogs, compatibility timelines, image registries, package systems, and upstream GitHub activity. This ensures that the moment a community, cloud or vendor discloses a breaking change, publishes a versioned artifact or posts changelogs, they are ingested, verified, tagged, curated and made available for downstream reasoning within minutes. ## Artifact Collection Artifact Collection involves the retrieval of raw customer configuration and metadata. Because this metadata is private to a customer, it is stored separately from any data refined through the Collective Learning systems. Configuration and metadata are collected on a continuous basis. These periodic collections form an auditable timeline of snapshots that enable Chkk to detect changes to the customer’s infrastructure configuration and risk profile. ## Classification Classification connects collected customer configuration and metadata to Chkk’s Collective Learning corpus, enabling the platform to richly contextualize and reason about every inventory object. Upon ingest, each resource arrives unclassified—identified only by its raw coordinates: cluster ID, namespace, kind, name, and hash. A multi-stage classification pipeline then resolves these opaque blobs into fully enriched inventory records. Classifiers systematically detect and assign properties, including Deployment System (e.g., Terraform, ArgoCD, FluxCD, Helm, kubectl), Project, Project Release, Project Component, Package System (e.g., Helm, Kustomize, Kube, Terraform), Package, Package Release, Package Component, OCI Registry, OCI Repository, OCI Tag, and OCI Artifact. 
The pipeline supports custom overrides to accommodate customer-specific realities—such as private registries, internal charts, or hardened AMIs—ensuring that even bespoke infrastructure components are correctly identified and reasoned over. ## Contextualization Contextualization begins once classification has pinned every inventory object to its exact Project, Package, Release, and deployment metadata. Contextualizers layer situational intelligence onto those links—pruning changelogs to only the deltas that affect the customer, composing readiness probes that reflect the cluster’s actual topology, flagging pre-upgrade actions for client teams, and generating upgrade steps that align with the specific Deployment System, Package version, and OCI artifacts in play. By translating canonical release knowledge into environment-aware instructions, contextualization turns abstract matches into clear, executable guidance operators can trust. ## Reasoning and Generation Engines Reasoning and Generation Engines power Chkk’s core reasoning modules—Upgrade Copilot, Artifact Register, and Risk Ledger. Together they reconcile two parallel truths: the classified inventory of what is currently running and the knowledge from Collective Learning. This is the platform’s decision cortex, converting source-grounded intelligence into production-safe change artefacts for the Action Engines to execute. ## Action Engines Action Engines convert high-level intent—“mitigate this risk,” “upgrade that add-on,” “snooze for 30 days”—into precise, auditable changes across code, infrastructure, and collaboration systems. Each engine is a purpose-built, domain-specific workflow. These engines span multiple functional categories—from planning and preverification to remediation, validation, collaboration, monitoring, and reporting. 
They generate Upgrade Plans and readiness reports, Preverify upgrades in a Digital Twin, apply temporary safeguards or permanent fixes, verify change safety, manage workflow ownership, enforce SLAs, and produce governance-ready audit reports. ## Workflows Actions are stitched into durable workflows that orchestrate every operational objective end-to-end. Chkk ships a library of workflow Blueprints—spanning planning, mitigation, remediation, preverification, and reporting—running on the Durable Workflow Fabric. For instance, a “Fix Misconfigured PDBs” workflow assigns the owner, defines fix criteria (e.g., making sure allowedDisruptions > 0), sets a need-by date with an SLA timer, schedules Slack or email reminders, opens a Jira ticket if one doesn’t exist, monitors continuously until the condition is fully remediated, auto-closes the ticket upon verification, and notifies stakeholders when the mitigation is complete. ## Blueprints A Blueprint is a reusable, parameter-driven recipe that compiles one or more canonical Actions into a fully wired workflow on Chkk’s Durable Workflow Fabric. Chkk ships Blueprints spanning 100s of open-source project change lifecycles and lets enterprises author custom Blueprints to encode their own governance, approval, and remediation processes. # Understanding Chkk Source: https://docs.chkk.io/overview/understanding-chkk Upgrading Kubernetes clusters and add-ons is complex, error-prone, and time-consuming. The process is highly manual, with teams sifting through scattered release notes and documentation for every component. Small changes in one layer can introduce hidden incompatibilities in another, risking outages and forcing expert-level troubleshooting at every step. Because it's so complex, it can't be easily delegated or automated, creating bottlenecks, long delays, and added costs—including extended support fees and mounting technical debt. This is the problem Chkk is designed to solve. 
Chkk is an AI-curated Operational Safety Platform for Kubernetes and its Add-ons. It leverages Knowledge Graphs, Risk Signature Databases, and Agentic AI systems to proactively identify hidden dependencies, unknown incompatibilities, and potential risks before they cause failures. By generating pre-verified, AI-curated upgrade workflows on demand, Chkk Upgrade Copilot provides teams with structured, safe, and efficient upgrade plans, speeding up upgrades by 3x to 5x while eliminating last-minute surprises and disruptions.

## Building Blocks

### Upgrade Templates

An **Upgrade Template** is an AI-curated workflow containing a tested and structured sequence of steps and stages to safely upgrade your Kubernetes clusters. An Upgrade Template is generated on-demand and is scoped to an Environment (e.g. dev, staging or prod). Upgrade Templates support three commonly used upgrade patterns: In-Place, Blue-Green, and Rolling/Surge.

Upgrade Templates answer all the questions that Platform Teams have to address in an upgrade:

It starts with "**Research**":

* What versions and configuration of control planes, nodes, add-ons and applications are running?
* Have any of these versions reached EOL?
* Are any of these versions incompatible with each other?
* What Operational Risks should I be aware of before upgrading? For example:
  * Are there open defects in specific versions that have caused breakages or disruptions?
  * Are there any misconfigurations that I should fix prior to upgrading?
* Which next version of control planes and add-ons should we upgrade to? More specifically:
* What's the version matrix where all Add-ons are compatible with the next Kubernetes version?
* What breaking changes (CRDs, configuration, features, behavioral changes,…) will we encounter when executing the upgrade?
* Are there any hidden dependencies that can break Add-ons or applications?
* Can I upgrade Add-ons to the desired version directly, or do we need a sequence of upgrade hops to avoid schema breakages and/or incompatibilities?

Then comes "**Preparation**":

* What changes in Add-ons' Helm charts and CRDs must we cater for before executing upgrades?
* What preflight checks for control plane and Add-ons should we run to ensure it's safe to execute upgrades?
* What exact steps (code diffs) should be applied to upgrade Add-ons, control plane, and nodes, and in which order?
* What postflight checks should be run to ensure everything is healthy after the upgrade is complete?

An Upgrade Template answers all the above questions, and its entire workflow of steps and stages is pre-verified to work without failures on a Digital Twin of your Kubernetes environment. While simple add-ons (e.g. VPC CNI, cert-manager, External Secrets Operator, etc.) can be upgraded with the cluster, complex add-ons (e.g. Istio, Contour, Consul, etc.) generally require a dedicated **Add-on Upgrade Template** with steps and stages specific to that add-on. Add-on Upgrade Templates ensure upgrade safety by enabling you to execute complex add-ons' upgrades before or after cluster upgrades.

Your team reviews and customizes an Upgrade Template by collaborating through comments inside the Template, adding your own custom steps, and finally approving the Upgrade Template for execution.

And now you are ready for "**Upgrade Execution**": This is where you instantiate **Upgrade Plans** for each cluster in the Environment. The instantiated Upgrade Plans inherit all the information present in Upgrade Templates, plus additional cluster-specific information like:

* Which applications are using deprecated/removed APIs?
* Are there any application client changes in Add-ons?
* Are there any application misconfigurations, like incorrect Pod Disruption Budgets (PDBs), that can cause the upgrade to fail?
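As one illustration of the last question above, a PDB that currently allows zero disruptions can block pod evictions during a node drain. The sketch below is not how Chkk performs this check; it only assumes you have the JSON produced by `kubectl get pdb --all-namespaces -o json` and reads the standard `status.disruptionsAllowed` field. The function name `blocking_pdbs` and the sample data are hypothetical.

```python
import json
from typing import List


def blocking_pdbs(pdb_list: dict) -> List[str]:
    """Return "namespace/name" for every PDB that currently allows zero disruptions."""
    blocked = []
    for item in pdb_list.get("items", []):
        meta = item.get("metadata", {})
        status = item.get("status", {})
        # A missing status is treated as zero allowed disruptions (conservative choice).
        if status.get("disruptionsAllowed", 0) == 0:
            blocked.append(f'{meta.get("namespace")}/{meta.get("name")}')
    return blocked


# Hypothetical sample shaped like the output of `kubectl get pdb -A -o json`.
sample = json.loads("""
{
  "items": [
    {"metadata": {"namespace": "shop", "name": "checkout"},
     "status": {"disruptionsAllowed": 0}},
    {"metadata": {"namespace": "shop", "name": "catalog"},
     "status": {"disruptionsAllowed": 1}}
  ]
}
""")

print(blocking_pdbs(sample))
```

Treating a missing `status` as blocking is a deliberate choice in this sketch: it flags PDBs whose state is unknown rather than silently passing them.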
All activities performed on the Upgrade Templates and Upgrade Plans are stored in long-running, durable workflows, ensuring safe, structured, and repeatable upgrades. Upgrade Templates get all the information they need from two foundational technology components: **Risk Signature Database (RSig DB)** and **Knowledge Graph**. ### Operational Risks Operational Risk refers to any known or potential defect, misconfiguration, or incompatibility in Kubernetes clusters and add-ons that can cause incidents, disruptions, or breakages. These risks, which may include known defects or issues stemming from unsupported versions, deprecated APIs, and software nearing end-of-life, are categorized by severity—Critical, High, Medium, or Low. An Operational Risk is detected by scanning for at-risk components, identifying trigger conditions, and assessing availability impact, root cause, remediation steps, and possible mitigations. In Chkk, these risks are codified as Risk Signatures (RSigs) that continuously scan customer environments to proactively uncover and address Operational Risks before they cause breakages or outages. ## Chkk Service Chkk is a secure, scalable multi-regional solution offered in the US and EU. It is built using a cell-based architecture and supports multi-tenant and single tenant instances. It comprises multiple services that run APIs and microservices to serve product modules and maintain inventory records. At the heart of the Chkk Service are Classifiers and Engines, working in tandem to provide real-time insights and operational resilience for Kubernetes environments. Classifiers leverage references from the Knowledge Graph and RSigDB, either extracting image digests (hash-based) or applying RuleSets (rule-based) to identify Kubernetes resources and their relationships. Engines then use these classification results to detect latent Operational Risks, ensure guardrail conformance, and generate Upgrade Templates. 
They also create digital twins to preverify proposed Upgrade Plans, helping ensure that all steps can be executed without failures, and enabling smoother, more reliable operations.

## Risk Signature Database (RSig DB)

RSig DB takes inspiration from cybersecurity, where security vulnerabilities are reported publicly in the CVE Database. We extended this idea to operational safety: if an Operational Risk (e.g. an error, failure, or disruption) has happened anywhere in the world, Chkk AI aggregators and data connectors learn about it, convert it into a Risk Signature (similar to a virus signature), and store it in the RSig DB. Any new Risk Signature is streamed to all our customers, and their environments are scanned for it. That way, our customers can proactively detect, identify, and remediate Operational Risks before they cause breakages and disruptions, much like antivirus software detects and removes viruses before they start causing harm.

RSig DB's AI aggregators and data connectors gather Operational Risks from the following sources of information:

* Release schedules
* Upstream Tickets / Issues
* Release notes / Changelogs
* Pull Requests
* Cloud provider knowledge bases and issue trackers

Customers can also voluntarily opt in to share Operational Risks with Chkk; otherwise, we don't learn anything from the customer directly.

RSig DB also codifies Guardrails, which represent operational best practices from across the Kubernetes ecosystem (upstream communities, cloud providers, and vendors). Platform Teams use these Guardrails to ensure Application developers are conforming to their Platform's operational excellence standards.

## Knowledge Graph

Knowledge Graph stores AI-curated data and relationships across hundreds of open-source projects and add-ons in the Kubernetes ecosystem, modeling their impact and identifying the safest upgrade paths. The Chkk Research Team provides oversight of the AI-curated data and relationships.
Knowledge Graph covers Kubernetes releases of all major clouds and distributions: EKS, GKE, AKS, VMware Tanzu, OpenShift, Rancher RKE1/RKE2, Nutanix. We also support DIY and Self-Hosted Kubernetes. Chkk also covers 250+ add-ons, and coverage for a new add-on can be extended within 48 hours.

Coverage Extension is done on a continual basis by an agentic AI architecture, with multiple task-specific AI agents identifying and curating information from the following sources:

* Release Notes are curated with emphasis on breaking changes and upgrade considerations
* Image hashes from upstream registries
* Components are modeled, where an add-on may package other add-ons (e.g. Redis-OSS is packaged inside ArgoCD)
* Package Systems (Helm, Kustomize, etc.)
* Package Sources (HelmCharts, KustomizeSources, KubeSources, etc.)
* Deployment Modes (e.g. Istio has two Deployment Modes: Sidecar, Ambient)
* EOL policies
* Version Compatibility:
  * With upstream Kubernetes versions
  * With the cloud substrate (e.g. Amazon EKS versions and Amazon AMI versions)
  * With other add-ons (e.g. a Contour version being compatible with certain Envoy versions)
* Safety, Health and Readiness Checks, which include per-version preflight, inflight and postflight checks packaged in single-click, ready-to-run containers

Using the Knowledge Graph, Chkk identifies the safest Upgrade Paths for clusters and add-ons. Path discovery sometimes yields a path with multiple upgrade hops.
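
One way to picture multi-hop path discovery is as a shortest-path search over a graph of supported version transitions. The sketch below is illustrative only: the `SUPPORTED_HOPS` data and the `find_upgrade_path` helper are hypothetical and do not represent Chkk's Knowledge Graph schema or its actual algorithm. The sample data assumes that each version can only move to the next minor version, which is why the result contains several hops.

```python
from collections import deque

# Hypothetical data: each version maps to the versions it can be upgraded to
# directly. Real transitions would come from curated compatibility data.
SUPPORTED_HOPS = {
    "1.27": ["1.28"],
    "1.28": ["1.29"],
    "1.29": ["1.30"],
    "1.30": [],
}


def find_upgrade_path(current, target, hops=SUPPORTED_HOPS):
    """Return the shortest list of versions from current to target, or None."""
    queue = deque([[current]])
    seen = {current}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in hops.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no supported sequence of hops reaches the target


print(find_upgrade_path("1.27", "1.30"))
# → ['1.27', '1.28', '1.29', '1.30']
```

Breadth-first search is used here because it returns the path with the fewest hops; a real system would likely also weigh risk, EOL dates, and add-on compatibility for each hop.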
## Chkk Dashboard

Chkk Dashboard is a UI for you to interact with Chkk. This interaction includes, but isn't limited to, the following actions:

* Onboarding clusters, managing access tokens
* Inviting team members
* Operational Risks: viewing Risks latent in clusters, reading Knowledge Base articles about these Risks, and performing actions (like ignoring a Risk or leaving comments for team members)
* Guardrails: viewing Guardrails that are not followed, reading Knowledge Base articles about these Guardrails, and exposing these Guardrails to Application Teams through an API integration
* Upgrade Templates: requesting, customizing, reviewing, and approving Upgrade Templates
* Upgrade Plans: instantiating and executing Upgrade Plans
* Artifacts: inventory extracted from running components, container images, repositories, and tools across multiple clusters, clouds, and layers of infrastructure
* Integrations: with internal tools like GitHub, Slack, etc. Also includes SSO integration.

# Artifact Register
Source: https://docs.chkk.io/product/artifact-register

Inventory of everything in a Kubernetes fleet

Chkk Artifact Register maintains an inventory of all components, container images, repositories, and tools across multiple clusters and clouds. It gives you visibility into what exists where, reducing the need for manual, error-prone tracking in spreadsheets and scripts.

Platform, DevOps, and SRE Teams use Artifact Register to:

* Stay up to date with visibility across all infrastructure layers.
* Consolidate insights across clouds, on-prem, and edge.
* Get alerted about all EOL and incompatible versions.
* Avoid maintaining internal spreadsheets or custom scripts to manage Kubernetes components' inventory.
# Chkk Operational Safety Platform Source: https://docs.chkk.io/product/introduction Chkk Operational Safety Platform Chkk Operational Safety Platform is your Upgrade Copilot for managing the lifecycle of Kubernetes, and hundreds of add-ons, application services, and Kubernetes operators. It helps enterprises avoid 500% Extended Support fees for EKS, GKE, and AKS while preventing 'Forced Upgrades' that disrupt operations. Enterprise customers rely on Chkk to upgrade smoothly—eliminating disruptions and last-minute scrambles. Platform, DevOps, and SRE teams use Chkk to anticipate and prevent breakages from deprecated features, hidden add-on, application service, and Kubernetes operator dependencies, and rushed upgrade cycles. With real-time visibility and actionable insights, Chkk ensures every upgrade is well-tested, safe, and compliant—eliminating last-minute scrambles to avoid Forced Upgrades. Chkk Operational Safety Platform is delivered as a SaaS which seamlessly integrates into existing workflows and scales across your entire Kubernetes fleet, whether in the cloud or on-prem. With Chkk, you stop overpaying for Extended Support, ship faster with fewer surprises, and stay ahead of Extended Support and Forced Upgrade deadlines. # Marketplaces Source: https://docs.chkk.io/product/marketplaces Purchase Chkk from AWS Marketplace and GCP Marketplace Chkk is available on the [Amazon Web Services (AWS) Marketplace](https://aws.amazon.com/marketplace/seller-profile?id=b4a9cf23-f6c9-48fe-bf73-e267f983ed27) and [Google Cloud Platform (GCP) Marketplace](https://console.cloud.google.com/marketplace/product/chkk-mp-public/chkk-business-edition). Our marketplace listings make it easier for AWS and GCP customers to adopt Chkk's solution through their existing cloud accounts. By being listed on AWS and Google Cloud Marketplaces, Chkk can be discovered and deployed with just a few clicks. 
You can leverage streamlined procurement including consolidated billing through your existing AWS and GCP subscriptions. AWS and GCP users can now integrate Chkk into their workflows faster and without the usual procurement hurdles, accelerating time-to-value.

# Subscription Plans
Source: https://docs.chkk.io/product/pricing-plans

Business and Enterprise Plans

| | **Business** | **Enterprise** |
| :--- | :--- | :--- |
| **Clusters** | Unlimited | Unlimited |
| **Users** | Unlimited | Unlimited |
| **Risk Ledger** (All Risk Signatures) | Included | Included |
| **Artifact Register - Clusters** (EKS, GKE, AKS, VMware Tanzu, OpenShift, Rancher RKE1/RKE2, Nutanix, DIY, Self-Hosted) | Included | Included |
| **Artifact Register - Add-Ons** | Included | Included |
| **Upgrade Copilot** | 3 Upgrade Cycles included | Fleetwide upgrades included |
| **Nodes** | Up to 100 included (additional nodes purchased separately) | Fleetwide nodes included |
| **SSO Support** | - | Included |
| **Support for Multiple Organizations** | - | Included |
| **Custom Integrations** | - | Included |
| **Dedicated Hosted VPC Support** | - | Available |
| **Custom Contracts, Multi-Year Contracts** | - | Available |
| **Premium Support Response SLA** | - | Included |
| **Contractual Uptime Guarantee** | - | Included |
| **Volume Discounts** | - | Available |

## FAQs

Chkk product plans can be configured to meet the needs of enterprises, startups, and service providers.

A node is a physical or virtual server/machine.

We don't charge you per-node. We only charge in slabs of 100 nodes (1-100, 101-200, …).

Chkk Enterprise Plan is tailored for organizations that need additional support and customization. It includes all features of the Business Plan plus single sign-on (SSO), multi-organization support, custom integrations, premium support with response SLAs, and contractual uptime guarantees, making it ideal for enterprises with strict compliance requirements.

Yes, we offer volume discounts as the number of nodes and Upgrade Templates increases. Contact us to discuss your use case.

You are not charged for infrequent bursts of autoscaled nodes because Chkk uses the average number of nodes in every monthly measurement cycle. We compute these average node counts based on a monthly measurement cycle from the first day to the last day of each month. At the end of each cycle, we sum all node count measurements for each cluster and divide by the total number of measurements for that cluster.

You are not charged for infrequent bursts of autoscaled nodes as Chkk uses the average number of nodes in every monthly measurement cycle. We compute this average node count based on a monthly measurement cycle that spans from the first day to the last day of every month.
To compute average node count across the fleet, at the end of each measurement cycle, we take the sum of all node count measurements taken across all the clusters and divide it by the total number of measurements. Yes, AWS and GCP customers can purchase Chkk from the [AWS Marketplace](https://aws.amazon.com/marketplace/seller-profile?id=b4a9cf23-f6c9-48fe-bf73-e267f983ed27) and [GCP Marketplace](https://console.cloud.google.com/marketplace/product/chkk-mp-public/chkk-business-edition). # Risk Ledger Source: https://docs.chkk.io/product/risk-ledger Eliminate risks before they cause incidents Chkk Risk Ledger is similar to security risk ledgers, but tailored specifically towards identifying contextualized Operational Risks in your Kubernetes infrastructures. It enables you to become proactive in addressing potential failures before they happen. Platform, DevOps, and SRE Teams use Risk Ledger to: * Eliminate risks before they cause incidents. * Implement Guardrails to ensure Platform and App teams comply with Kubernetes best practices * Better conformity, fewer incidents, lower operational toil, improved team productivity, and offsetting 1000s of hours of post-incident break-fix work * Shift-left your Platform and application's availability by proactively catching and fixing safety, health and readiness risks. * Prioritize top business-critical risks with clear and actionable insights. * Continuously harden your Kubernetes availability. * Learn from others and avoid repeating their mistakes. # Upgrade Copilot Source: https://docs.chkk.io/product/upgrade-copilot Your Upgrade Copilot for Kubernetes, Add-ons, Application Services, and Kubernetes operators. Avoid 500% EKS, AKS, and GKE Extended Support Fees. Prevent Forced Upgrades. Chkk Upgrade Copilot helps you plan and execute safe upgrades of clusters, add-ons, application services, Kubernetes operators, and applications. 
Upgrade Copilot provides Upgrade Plans containing a detailed sequence of steps that need to be executed for remediation. Upgrade Copilot also preverifies these steps on a digital twin of your infrastructure, executing the prescribed sequence of steps, to validate that the upgrade works as expected.

Platform, DevOps and SRE Teams use Chkk Upgrade Copilot to stay ahead of Kubernetes upgrades and:

* Avoid "forced upgrades" by cloud providers (EKS, GKE, AKS) [Blog: Forced Upgrades](https://www.chkk.io/blog/why-forced-eks-gke-upgrades-are-a-business-continuity-risk)
* Avoid 500% EKS Extended Support, GKE Extended Support, and AKS Long-Term Support Fees [Blog: EKS Extended Support](https://www.chkk.io/blog/amazon-eks-extended-support), [Blog: GKE Extended Support](https://www.chkk.io/blog/gke-extended-support), [Blog: AKS Long-Term Support](https://www.chkk.io/blog/aks-long-term-support-and-eks-extended-support-similarities-differences)
* Upgrade their fleets 3-5x faster and without any breakages
* Prevent errors, breakages, and failures by catching critical bugs before they are deployed in their Kubernetes environment
* Ensure security and compliance by patching CVEs and avoiding EOL versions running in prod environments
* Standardize and delegate the upgrade process, minimize errors, and free up time for expert team members
* Publish Preverified Upgrade Plans to application teams so they can upgrade their own clusters
* Unlock innovation by delivering new infrastructure features to application developers

# Amazon VPC CNI
Source: https://docs.chkk.io/projects/addons/amazon-vpc-cni

Chkk coverage for Amazon VPC CNI. We provide curated release notes, preflight/postflight checks, and Upgrade Templates, all tailored to your environment.
## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v1.7.1 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.11.0 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Amazon VPC CNI Overview Amazon VPC CNI (Amazon VPC Container Network Interface plugin) is the default networking plugin for AWS Kubernetes clusters (EKS). It runs on every node and creates Elastic Network Interfaces (ENIs) to assign VPC IP addresses to pods. This direct integration with AWS networking provides high-performance, low-latency pod connectivity on EKS. However, managing and upgrading the VPC CNI can be challenging - misconfigurations or unplanned changes could lead to IP exhaustion or network outages. ## Chkk Coverage ### Curated Release Notes Chkk tracks official VPC CNI release notes and flags updates that matter, such as IP allocation changes or new prerequisites. Instead of combing through upstream change logs, operators see the changes that matter most for running clusters. By focusing on operational impact, these curated notes help you quickly identify shifts in IP management behavior, required configuration updates, or potential upgrade pitfalls. ### Preflight & Postflight Checks Upgrading the VPC CNI without proper checks can result in pod networking failures or downtime. Chkk runs preflight checks to ensure your environment meets all requirements before an upgrade. 
It verifies Kubernetes version compatibility and checks for known prerequisites. For instance, AWS notes that to upgrade to VPC CNI v1.12.0+ you must first be on v1.7.0, so Chkk will warn you if an intermediate upgrade is needed. After the upgrade, postflight checks confirm everything is healthy: Chkk verifies that all nodes are running the new CNI version, pods are receiving IPs normally, and there are no abnormal CNI pod logs or performance regressions.

### Version Recommendations

Chkk continuously monitors Amazon VPC CNI's release cycle and support timeline to keep you on a stable, supported version. It tracks which plugin releases are recommended for each Kubernetes version and flags any end-of-life (EOL) or out-of-support versions running in your clusters. You'll get guidance on safe upgrade targets, ensuring you stay ahead of deprecations and security fixes before support lapses.

### Upgrade Templates

Chkk provides **Upgrade Templates** for both in-place updates and blue-green rollout strategies. For an in-place upgrade, Chkk walks you through updating the CNI DaemonSet or EKS add-on step by step while maintaining network availability. For a blue-green approach, Chkk can assist in setting up a parallel environment to test the new CNI version: for instance, bringing up a new set of nodes (or a staging cluster) with the updated CNI, moving a portion of workloads for validation, and then migrating traffic fully. Each template includes safety checkpoints and rollback instructions, so you can confidently promote the new CNI once it's verified, or quickly revert if needed.

### Preverification

By simulating the entire upgrade in an isolated "digital twin," Chkk detects conflicts, such as IP exhaustion or IAM permission gaps, before they hit production.
For example, if the new version changes how IP addresses are allocated or introduces stricter requirements for custom networking (ENIConfig) or security group per pod features, those incompatibilities or errors will surface in the simulation. Any anomalies like IP assignment failures, excessive latency, or configuration warnings are flagged early. This means you only proceed with the real upgrade once it's been proven safe in a replica of your environment. ### Supported Packages Whether you install the Amazon VPC CNI via Helm, Kustomize, or raw manifest, Chkk automatically detects your installation method and adapts its plan accordingly. Regardless of your packaging method (self-managed or EKS-managed add-on), Chkk's checks and recommendations adapt accordingly. It also supports custom images or private registries - ensuring that even if you're using a forked CNI build or an air-gapped registry, the upgrade workflow and validation checks still function seamlessly for your setup. ## Common Operational Considerations * **IP Allocation Challenges:** Each node has a finite pool of IPs based on how many ENIs it can attach, so running out of available addresses will halt pod scheduling. Adjusting instance sizes, using warm IP targets, or enabling prefix delegation helps avoid IP exhaustion. * **Scaling Considerations:** As clusters grow, VPC-level IP space can become a bottleneck if subnets run out of capacity. Monitoring IP usage at scale and tuning the warm IP pool protect against AWS API throttling and networking delays. * **Security Best Practices:** Ensure your nodes have the correct IAM roles (e.g., AmazonEKS\_CNI\_Policy) and consider using NetworkPolicies to enforce zero-trust. Regularly update the CNI for security patches and keep security groups aligned with least-privilege principles. * **Compatibility Issues:** VPC CNI is tightly integrated with EKS, so chaining other CNIs or supporting Windows nodes requires matching versions. 
* **Monitoring & Troubleshooting:** Pod IP assignment errors often surface in the aws-node logs or as failed scheduling events. Collecting Prometheus metrics and leveraging CloudWatch alarms allow quick detection of IP exhaustion or ENI attachment failures. ## Additional Resources * [Amazon VPC CNI Documentation](https://docs.aws.amazon.com/eks/latest/userguide/managing-vpc-cni.html) * [Amazon VPC CNI Releases](https://github.com/aws/amazon-vpc-cni-k8s/releases) # Calico Source: https://docs.chkk.io/projects/addons/calico Chkk coverage for Calico. We provide version recommendations, preflight/postflight checks, and Upgrade Templates—ensuring worry-free operations. ## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v3.14.1 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v3.18.0 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Calico Overview Calico is an open-source networking and security solution for container platforms, providing policy-driven controls to secure workloads. It can leverage layer 3 routing with BGP or encapsulation (VXLAN, IP-in-IP) for flexible deployment across cloud or on-premises environments. Calico supports advanced features like eBPF for improved performance and integrates with Kubernetes APIs for network policy enforcement. It scales with large clusters using components like Typha to reduce load on the datastore. Calico's breadth of features makes it a popular CNI choice for production-grade environments that require robust security and dynamic routing. 
## Chkk Coverage ### Curated Release Notes Chkk consolidates Calico release notes into a concise summary of relevant changes—highlighting major updates, security patches, and deprecations that directly affect your clusters. You no longer need to comb through every upstream note; Chkk filters the noise to provide tailored impact analysis. For instance, if a new Calico version requires updated IP pool settings or changes a default BGP parameter, you receive a targeted alert. This ensures platform engineers stay on top of key operational shifts without wading through less pertinent details. Chkk also calls out new features worth exploring, like eBPF improvements or helm chart modifications. ### Preflight & Postflight Checks Chkk's preflight checks confirm your environment meets the new Calico version's prerequisites, such as supported Kubernetes versions and no deprecated fields in your Calico CRDs. They also validate network configuration—e.g., ensuring IP pools don't overlap and that BGP or VXLAN settings won't break under new defaults. After the upgrade, Chkk's postflight checks verify that calico-node DaemonSet, kube-controllers, and Typha (if present) are running correctly and that network connectivity remains intact. These checks also look for anomalies in policy enforcement, such as unexpected traffic blocks. By rapidly diagnosing upgrade issues, Chkk saves time and reduces risk in production. ### Version Recommendations Chkk tracks Calico's support timeline and notifies you when a release is nearing or past EOL, mitigating security and compatibility risks. It also correlates which Calico versions best pair with your Kubernetes version—helping avoid combos that break key features like network policy or eBPF data-plane. If you're on a release series with known vulnerabilities or performance problems, Chkk flags a safer, more stable upgrade target. You can rely on this guidance to plan proactive upgrades, rather than reacting to urgent CVEs or operator surprises. 
This keeps clusters secure while ensuring the networking stack remains aligned with upstream support. ### Upgrade Templates Chkk provides step-by-step templates for Calico upgrades tailored to each packaging method (Helm, operator, or manifest). You can choose in-place updates, which automatically roll out new DaemonSet pods, or a blue-green approach that deploys a parallel Calico version before switching traffic. Each template includes recommended pre-upgrade tasks like snapshotting CRDs, verifying BGP sessions, or ensuring no IP pool exhaustion. Chkk also outlines rollback points in case new configurations create unexpected connectivity issues. This workflow reduces guesswork and ensures a consistent, repeatable upgrade process. ### Preverification Before rolling changes to production, Chkk tests the entire Calico upgrade in a controlled environment or via dry-run checks. It simulates applying the new CRDs, verifying network policies, and updating calico-node on a subset of nodes. If any conflicts arise—like unsupported config parameters or IP pool overlaps—Chkk flags them early so you can adjust. This "practice run" is especially valuable when enabling advanced features like eBPF or altering core networking modes. By catching issues ahead of time, preverification minimizes downtime and operational surprises. ### Supported Packages Chkk supports all major Calico installation methods: Helm, operator-based installs, and direct Kubernetes manifests. It respects your existing GitOps or CI/CD workflows and makes minimal, targeted changes to upgrade manifests or chart values. That means your custom images, private registries, and any additional patches remain intact after an upgrade. Even if you switch between manifest-based and operator-based management, Chkk tracks your resources consistently across the lifecycle. This flexibility helps you standardize Calico lifecycle tasks alongside other cluster components. 
## Common Operational Considerations * **BGP Peering Failures:** If you're using Calico in BGP mode, ensure node IP autodetection is correct and TCP port 179 is open between all peers. Misconfigured ASNs, firewall restrictions, or missing route reflectors can disrupt peering sessions and lead to traffic blackholes. * **IP Pool Exhaustion:** Large or dense clusters can quickly run out of available pod IP addresses when IP pools are too small or fragmented. Adding new pools, increasing block sizes, or enabling IPAM garbage collection helps maintain a healthy pool allocation. * **eBPF Mode Stability:** Calico's eBPF data plane can boost throughput and reduce latency, but it requires a modern Linux kernel (5.3+ recommended) and does not support IP-in-IP. Thorough testing is essential to confirm kernel compatibility and ensure stable operation under real workloads. * **Scaling (Large Clusters):** For clusters above 100 nodes, deploying Typha reduces load on the datastore by batching updates to calico-node. Monitoring Typha's resource usage and ensuring sufficient capacity for policy processing are key to preventing performance bottlenecks. * **Multi-Cluster Networking:** Calico supports cluster mesh for cross-cluster communication via BGP or overlay tunnels, but each cluster must have unique IP pools and correct route exports. Careful coordination of routes, policies, and potential IP pool overlaps is crucial to avoid connectivity conflicts. * **Policy Enforcement Pitfalls:** Calico's network policies default to a deny-all stance once any policy is applied, requiring explicit allow rules for essential traffic like DNS. Overlapping global or tiered policies can override local settings, making periodic policy audits important for consistent security. 
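
The requirement that clusters use unique, non-overlapping IP pools can be checked mechanically. The sketch below uses Python's standard `ipaddress` module; the pool names and CIDRs are made-up sample data, and `find_overlapping_pools` is a hypothetical helper, not part of Calico or Chkk.

```python
import ipaddress
from itertools import combinations


def find_overlapping_pools(pools):
    """Return pairs of pool names whose CIDR ranges overlap.

    `pools` maps a pool name to a CIDR string.
    """
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in pools.items()}
    overlaps = []
    for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2):
        # Only compare networks of the same IP version (IPv4 vs IPv6).
        if net_a.version == net_b.version and net_a.overlaps(net_b):
            overlaps.append((name_a, name_b))
    return overlaps


# Hypothetical pools from three clusters; the third sits inside the first.
sample_pools = {
    "cluster-a-pool": "10.244.0.0/16",
    "cluster-b-pool": "10.245.0.0/16",
    "cluster-c-pool": "10.244.128.0/17",
}

print(find_overlapping_pools(sample_pools))
# → [('cluster-a-pool', 'cluster-c-pool')]
```

In practice the CIDRs would be read from each cluster's IP pool resources rather than hard-coded, and the same comparison could run before a new pool is added.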
## Additional Resources * [Calico Documentation](https://docs.tigera.io/) * [Calico Releases](https://github.com/projectcalico/calico/releases) # cert-manager Source: https://docs.chkk.io/projects/addons/cert-manager Chkk coverage for cert-manager. We provide preflight/postflight checks, curated release notes, and Upgrade Templates—designed for seamless upgrades. ## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v0.14.3 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.1.0 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## cert-manager Overview cert-manager is a Kubernetes add-on that automates issuing and renewing TLS certificates for your workloads. It integrates with multiple CAs (including ACME providers like Let's Encrypt) to streamline certificate requests and validation. The project provides CRDs for defining issuers and certificates, along with a webhook for enforcing policy. By continuously monitoring and renewing certificates before they expire, cert-manager significantly reduces manual overhead. With proper resource tuning, it scales in both small and large clusters. ## Chkk Coverage ### Curated Release Notes Chkk tracks cert-manager release notes to highlight new features, breaking changes, or CRD updates relevant to your environment. It alerts you to shifts like a renamed API group or removed fields so you can adapt configurations before upgrading. Each curated summary points to potential operational impacts, saving teams from combing through long changelogs. You stay focused on what matters, avoiding unexpected downtime. 
### Preflight & Postflight Checks Chkk's preflight checks validate Kubernetes version, CRDs, and webhook readiness before any cert-manager upgrade. They also spot deprecated API usage that could fail in the new release. Postflight checks confirm healthy controller pods, functioning webhooks, and successful certificate issuance after the upgrade. This ensures certificate renewals continue uninterrupted. By quickly catching anomalies, teams can address issues early with Chkk. ### Version Recommendations Chkk continuously tracks cert-manager's support timeline and flags EOL risks or security patches for your version. It factors in Kubernetes compatibility to help you pick stable, fully-supported releases. Alerts arrive well before your version becomes unsafe or unmaintained. You also get suggestions on the best minor version to minimize breakage. By following these recommendations, you stay ahead of critical updates. ### Upgrade Templates Chkk provides step-by-step templates for either in-place or blue-green cert-manager upgrades. In-place workflows update CRDs and the controller in sequence, while blue-green approaches deploy a parallel cert-manager instance before switching over. Both methods include rollback points and checks to ensure healthy certificate issuance. Automation is straightforward via Helm, Kustomize, or kubectl. Each template simplifies the overall upgrade experience. ### Preverification Chkk can simulate each cert-manager upgrade in a test environment to identify issues early. It applies your Issuer and Certificate configurations, runs the new controller, and checks for ACME or webhook errors. This dry-run approach lets you fix problems before rolling out to production. By validating your exact setup, preverification reduces downtime risk. Quick feedback loops give you confidence for a trouble-free upgrade. 
## Common Operational Considerations * **Webhook Availability & Readiness:** The cert-manager webhook must be reachable by the API server to approve certificate-related custom resources. Downtime or misconfiguration can halt certificate issuance across the cluster. * **ACME Challenges:** Ensure proper DNS and ingress settings for HTTP-01 or DNS-01 challenge methods. Misconfigurations can cause certificates to remain pending or fail to renew. * **Certificate Expiration Monitoring:** cert-manager auto-renews certificates, but it's critical to monitor expiring certificates and observe renewal events. Prompt alerts prevent overlooked renewals and outage risks. * **Issuer & CA Rotation:** Rotating CAs or issuers requires re-issuing certificates with the new trust chain. Plan overlapping trust windows and verify the new CA is correctly propagated to workloads. * **Performance Tuning:** Large volumes of certificates can strain the controller. Allocate adequate CPU/memory, monitor queue lengths, and scale replicas if issuance throughput is impacted. ## References * [cert-manager Documentation](https://cert-manager.io/docs/) * [cert-manager Releases](https://github.com/cert-manager/cert-manager/releases) # Cilium Source: https://docs.chkk.io/projects/addons/cilium Chkk coverage for Cilium. We provide curated release notes, preflight/postflight checks, and Upgrade Templates—all tailored to your environment. 
## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------------------- | | **Chkk Curated Release Notes** | v1.7.0 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.10.0 to latest | | **Supported Packages** | Helm, Kustomize, Static Manifests | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Cilium Overview Cilium is a cloud-native networking and security solution for Kubernetes that uses eBPF (Extended Berkeley Packet Filter) to dynamically program the Linux kernel. By doing so, it achieves highly efficient network routing, identity-aware policy enforcement, and fine-grained observability through Hubble. This approach avoids the overhead of traditional iptables-based CNIs, delivering better performance at scale. With Cilium, platform teams can seamlessly enforce zero-trust policies, enable transparent encryption of pod-to-pod traffic, and gain deep insights into network flows without needing application-level instrumentation. ## Chkk Coverage ### Curated Release Notes Chkk tracks official Cilium release notes and distills them into briefings that highlight key updates, new features, security advisories, and deprecations. Rather than sifting through every upstream change, you get a concise overview of what has changed and how it might affect your clusters. For example, if Cilium removes the deprecated `cidrs` field from the `CiliumLoadBalancerIPPool` CRD, Chkk flags any IP pool definitions still using that field and helps you transition to the new `blocks` syntax before it impacts IP allocation. 
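
A check of the kind described above could look roughly like the following sketch. It assumes the manifests have already been parsed into Python dictionaries (for example by a YAML loader) and that the deprecated field sits at `spec.cidrs`; the helper name and sample objects are hypothetical, and the exact field layout should be confirmed against the Cilium documentation for your version.

```python
def pools_using_deprecated_cidrs(manifests):
    """Return names of CiliumLoadBalancerIPPool objects that still set spec.cidrs."""
    flagged = []
    for doc in manifests:
        if doc.get("kind") != "CiliumLoadBalancerIPPool":
            continue
        # Assumption: the deprecated field lives directly under `spec`.
        if "cidrs" in doc.get("spec", {}):
            flagged.append(doc.get("metadata", {}).get("name", "<unnamed>"))
    return flagged


# Hypothetical, already-parsed manifests for illustration.
sample_manifests = [
    {
        "kind": "CiliumLoadBalancerIPPool",
        "metadata": {"name": "legacy-pool"},
        "spec": {"cidrs": [{"cidr": "10.0.10.0/24"}]},
    },
    {
        "kind": "CiliumLoadBalancerIPPool",
        "metadata": {"name": "migrated-pool"},
        "spec": {"blocks": [{"cidr": "10.0.20.0/24"}]},
    },
    {"kind": "ConfigMap", "metadata": {"name": "unrelated"}},
]

print(pools_using_deprecated_cidrs(sample_manifests))
# → ['legacy-pool']
```

Running a scan like this against the manifests in a Git repository before an upgrade gives a short list of pools to migrate to `blocks`.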
### Preflight & Postflight Checks Before upgrading Cilium, Chkk runs preflight checks to verify kernel capabilities, ensure your Kubernetes version is compatible, and identify any usage of deprecated APIs. It validates that key components like Hubble, Cilium DaemonSet configurations, and network policies meet the new version's requirements. After the upgrade, postflight checks confirm your pods are using the correct Cilium agents, check for datapath errors, and ensure that all cluster nodes remain reachable. This extra validation step catches issues (e.g., resource constraints or leftover pods on outdated sidecar versions) that might otherwise cause intermittent downtime. ### Version Recommendations Chkk continuously monitors the Cilium support matrix, pointing out when your current version is nearing EOL or has known issues impacting stability. By cross-referencing upstream release notes and Kubernetes compatibility, it suggests stable upgrade targets that reduce risk. Instead of blindly picking the newest release, operators can rely on Chkk's recommendations for a version that aligns with long-term support guidelines, ensuring you maintain a secure and reliable platform. ### Upgrade Templates Chkk provides two main upgrade paths for Cilium: in-place and blue-green. With an in-place upgrade, the existing Cilium DaemonSet is updated sequentially across all nodes. This minimalistic approach is straightforward but demands careful monitoring to avoid disruption. The blue-green method, on the other hand, stands up a second Cilium deployment in parallel. Operators can progressively migrate workloads to the new version, verifying stability before retiring the old. Chkk's templates for both approaches define step-by-step commands, rollback options, and best practices—helping to minimize downtime if an unexpected issue arises mid-upgrade. 
### Preverification

For major Cilium upgrades or large multi-cluster environments, Chkk's preverification feature offers a risk-free way to test changes. It spins up a temporary environment mirroring your current configuration (complete with your Cilium CustomResourceDefinitions, network policies, and Hubble settings) and applies the new Cilium version. The simulator checks for configuration conflicts, resource bottlenecks, or datapath issues triggered by new eBPF features. This helps ensure that the real upgrade will roll out smoothly, saving you from surprises in production.

### Supported Packages

Chkk integrates seamlessly with whatever installation method you use (Helm, Kustomize, or plain Kubernetes manifests), so you can continue managing Cilium in your existing GitOps or CI/CD pipelines. It also supports private registries and custom-built images, ensuring you maintain consistency with your organization's security and compliance mandates. Whether you apply a standard Helm chart or maintain your own patched Cilium container images, Chkk can parse your manifests or values files, detect your current version, and propose an optimized upgrade plan.

## Common Operational Considerations

* **Gradual Node Upgrades:** Avoid upgrading all nodes simultaneously; instead, update Cilium on a subset of nodes, validate performance, then proceed to the rest.
* **Preserve Custom Configurations:** If you have custom Helm values or YAML overrides for encryption, policy settings, or L7 proxies, merge those carefully when moving to a new chart or manifest.
* **Kernel & eBPF Requirements:** Verify that your kernel version includes the eBPF features required by the new release. Missing modules or too-strict sysctl settings can cause failures in Cilium's datapath.
* **Policy Testing:** After each upgrade, test ingress and egress rules to confirm enforcement. Subtle changes in how Cilium interprets policy CRDs can disrupt traffic if you rely on advanced features like L7 filtering.
* **Hubble Observability:** Watch Hubble flows and metrics for anomalies. If flow logs suddenly drop or specific flows are denied, it may indicate CRD or agent misconfigurations post-upgrade.
* **Track Resource Usage:** Large-scale clusters running eBPF programs can stress node resources. Monitor CPU and memory usage on worker nodes, especially if you enable additional Cilium features like kube-proxy replacement.

## Additional Resources

* [Cilium Documentation](https://docs.cilium.io/en/stable/)
* [Cilium Releases](https://github.com/cilium/cilium/releases)
* [Hubble UI Documentation](https://docs.cilium.io/en/stable/observability/hubble/hubble-ui/index.html)

# Cluster Autoscaler

Source: https://docs.chkk.io/projects/addons/cluster-autoscaler

Chkk coverage for Cluster Autoscaler. We provide version recommendations, preflight/postflight checks, and Upgrade Templates, ensuring worry-free operations.

## Coverage Matrix

| | |
| --- | --- |
| **Chkk Curated Release Notes** | v1.19.0 to latest |
| **Private Registries** | Covered |
| **Custom Built Images** | Covered |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.21.0 to latest |
| **Supported Packages** | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information** | Covered |
| **Version Incompatibility Information** | Covered |
| **Upgrade Templates** | In-Place, Blue-Green |
| **Preverification** | Covered |

## Cluster Autoscaler Overview

The Cluster Autoscaler automatically adjusts your Kubernetes worker nodes to match demand, adding nodes when resources are scarce and removing them when they're underutilized. It integrates deeply with the scheduler, simulating how pods would fit on current or potential nodes. With provider-specific implementations, it manages resources on AWS, Azure, GCP, and more. Version alignment with your Kubernetes release is critical to avoid simulation mismatches.
Proper configuration can optimize both performance and cost efficiency, making the autoscaler indispensable for production-grade clusters.

## Chkk Coverage

### Curated Release Notes

Chkk constantly scans Cluster Autoscaler's upstream releases and consolidates the findings into concise updates for your specific environment. These updates highlight critical features, deprecations, and bug fixes without forcing you to sift through raw GitHub logs. Each note is contextualized against your current setup, making it easy to see potential impacts. You'll know immediately if new flags, provider integrations, or default behaviors could affect your clusters. This targeted approach helps you prioritize important patches and coordinate safe rollouts.

### Preflight & Postflight Checks

Before upgrades, Chkk performs thorough preflight checks to confirm that credentials, autoscaling ranges, and provider-specific configurations align with the new release. These checks catch issues (such as missing IAM roles or malformed node pool tags) before they can disrupt cluster operations. After deployment, postflight checks validate that the autoscaler handles real scale-up and scale-down events without errors. Should logs or metrics indicate issues, Chkk alerts you right away. This end-to-end validation reduces unexpected downtime.

### Version Recommendations

Chkk tracks the autoscaler's compatibility matrix to ensure you run a version that matches your Kubernetes release. It alerts you when your current autoscaler build enters end-of-life or has known security flaws, prompting a timely upgrade. By monitoring both community support cycles and practical runtime feedback, Chkk avoids recommending a version that's either too cutting-edge or already obsolete. It also factors in existing cluster size, workload patterns, and provider constraints to guide your choice. With Chkk's version picks, you maintain alignment with best practices and stable operation.
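As a rough illustration of version alignment, the sketch below compares the major.minor of the autoscaler with that of the cluster. It assumes the common convention that the two should match; the exact compatibility rules Chkk applies are not described here, and the function names and version strings are hypothetical.

```python
import re
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.29.3' or '1.29'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_aligned(autoscaler_version: str, kubernetes_version: str) -> bool:
    """True when both versions share the same major.minor (assumed convention)."""
    return major_minor(autoscaler_version) == major_minor(kubernetes_version)


print(versions_aligned("v1.29.3", "v1.29.0"))  # same minor
print(versions_aligned("v1.27.5", "v1.29.0"))  # autoscaler lags the cluster
```

Because the regular expression only anchors at the start, vendor suffixes after the patch number are ignored rather than rejected.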
### Upgrade Templates

Chkk provides **Upgrade Templates** for safely upgrading the Cluster Autoscaler using either in-place or blue-green strategies. Each template spells out prerequisite checks, step-by-step rollout instructions, and clear rollback procedures to mitigate risk. By embedding these templates in your CI/CD pipeline, you enforce consistent processes across multiple clusters. This standardization reduces manual errors and ensures a defined path to revert if something goes wrong. Overall, it's a systematic way to keep your autoscaler current without sacrificing service continuity.

### Preverification

Through preverification, Chkk simulates the entire upgrade process in a controlled environment, mirroring your real cluster as closely as possible. It installs and tests the new autoscaler version, triggers scale-up/down actions, and monitors for errors or unexpected behavior. Issues like broken flags, slower-than-expected provisioning, or changes in default thresholds are detected before they hit production. You can then refine your configurations or resource allocations with minimal risk. By revealing potential pitfalls early, preverification streamlines your production rollout.

### Supported Packages

Chkk supports multiple installation methods (Helm charts, Kustomize overlays, or plain YAML manifests) to accommodate diverse infrastructure needs. It recognizes official builds, forks, and private registry images alike, mapping each to relevant checks and release notes. This means you won't miss critical updates just because you use a custom image or a GitOps-driven workflow. Chkk adapts seamlessly to your deployment model, preserving consistency across all clusters. The end result is complete autoscaler coverage, regardless of how you package it.

## Common Operational Considerations

* **Scale-down Disruptions:** Use PodDisruptionBudgets to ensure critical workloads aren't over-evicted, and only label pods as non-evictable when truly necessary.
This guards against unexpected downtime or lost replicas when nodes are removed.
* **Resource Fragmentation:** Configure the autoscaler's expander (e.g., `least-waste`) to avoid leaving half-empty nodes. Regularly tune pod resource requests and pick instance types that align with average workload footprints.
* **Cloud Provider Limitations:** Adjust poll intervals and keep node group sizes within your API limits to avoid throttling. Monitor resource quotas so the autoscaler isn't blocked by capacity constraints in your account or region.
* **Priority Expansions:** Configure node group priority expanders to ensure high-priority workloads always get capacity first. Set the expendable pods priority cutoff (`--expendable-pods-priority-cutoff`) so that low-priority pods don't trigger large and costly scale-ups.
* **Graceful Termination:** Check that your workload containers handle SIGTERM properly and complete shutdown before autoscaler timeouts. Use a `terminationGracePeriodSeconds` value aligned with the autoscaler's max drain interval to prevent forced kills.

## Additional Resources

* [Node Autoscaling Documentation](https://kubernetes.io/docs/concepts/cluster-administration/node-autoscaling/)
* [Kubernetes Autoscaler Releases](https://github.com/kubernetes/autoscaler/releases)

# Contour

Source: https://docs.chkk.io/projects/addons/contour

Chkk coverage for Contour. We provide version recommendations, preflight/postflight checks, and Upgrade Templates, ensuring worry-free operations.
## Coverage Matrix

| | |
| --- | --- |
| **Chkk Curated Release Notes** | v1.2.1 to latest |
| **Private Registries** | Covered |
| **Custom Built Images** | Covered |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.17.0 to latest |
| **Supported Packages** | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information** | Covered |
| **Version Incompatibility Information** | Covered |
| **Upgrade Templates** | In-Place, Blue-Green |
| **Preverification** | Covered |

## Contour Overview

Contour is a Kubernetes ingress controller that uses Envoy as the data plane, allowing dynamic routing and real-time updates. It introduces the HTTPProxy CRD for advanced capabilities, including path rewrites, TLS configuration, and delegation. Contour separates the control plane (Contour) from the data plane (Envoy) so configuration changes do not require restarts. It is a CNCF Incubating project that stresses performance, multi-tenancy, and security. Recent releases add deeper Gateway API support and refined CRDs, making it easier to modernize ingress traffic management.

## Chkk Coverage

### Curated Release Notes

Chkk tracks and summarizes official Contour releases, highlighting critical changes, deprecations, and feature updates. It saves time by providing an actionable overview of new config requirements, resource shifts, or potential security implications. Contextual flags help you quickly identify if older CRDs like IngressRoute are no longer supported. The release notes also warn you if Envoy defaults have changed, potentially altering routing or TLS behavior. This means you won't have to read every upstream release detail to spot what impacts your clusters.

### Preflight & Postflight Checks

Chkk runs automated checks before and after a Contour upgrade to confirm version compatibility, CRD alignment, and resource health.
Preflight checks validate cluster readiness, required CRDs, and any deprecated objects, while postflight confirms that the new Contour and Envoy instances are functioning correctly. This approach proactively catches common problems, such as invalid certificate setups or leftover outdated configurations. It also monitors logs, traffic, and workload states, ensuring your routes remain healthy. As a result, you can upgrade with confidence knowing each stage was validated.

### Version Recommendations

Chkk continuously monitors Contour's lifecycle to warn you of any EOL versions or upcoming support drops. It suggests stable, compatible releases that align with your Kubernetes environment and security requirements, referencing official support timelines. You can rely on its guidance to upgrade at the right time without skipping critical patches. The platform pinpoints known issues or deprecated CRDs, helping you focus your efforts on necessary migrations. This balance of proactive alerts and curated recommendations streamlines upgrade planning.

### Upgrade Templates

Chkk provides in-place or blue-green upgrade templates, each detailing best-practice steps to reduce downtime. These templates walk you through CRD updates, rolling out new Contour pods, and verifying Envoy connections. In-place upgrades preserve the existing deployment, while blue-green deploys a new version in parallel, letting you switch traffic when ready. Both approaches include rollback paths in case of unexpected errors or misconfigurations. By following these structured plans, you minimize disruptions to external traffic during Contour upgrades.

### Preverification

Chkk offers a dry-run "digital twin" of your Contour upgrade in an isolated environment. This simulated deployment uncovers resource conflicts, CRD migration issues, or Envoy filter mismatches before affecting production. If anomalies are detected (such as broken routes or high resource usage), you can adjust configurations early.
This proactive rehearsal helps reduce rollbacks and downtime. Ultimately, it provides an extra layer of certainty when deploying new Contour versions in production.

### Supported Packages

Chkk works with common Contour deployment methods, including Helm charts, Kustomize, and raw Kubernetes manifests. It detects which approach you're using and tailors upgrade commands, patch files, or Helm steps accordingly. Private registries and custom builds are also supported, with checks to ensure images are tagged and available. This flexibility means you won't have to change your existing workflows or repository structure. The result is a consistent upgrade experience regardless of how Contour is managed.

## Common Operational Considerations

* **Sequential Upgrades:** Avoid skipping minor versions to ensure CRD changes and config migrations run smoothly. This reduces unexpected incompatibilities and makes rollback safer if issues arise.
* **DaemonSet Resource Usage:** Envoy typically runs as a DaemonSet with hostPorts on each node, so plan rolling updates carefully. Conflicts can arise if the new version tries to bind ports before the old Envoy pod is terminated.
* **CRD Deprecation:** Watch for deprecated or removed objects like IngressRoute, and migrate them to HTTPProxy. Keeping CRDs up to date avoids validation failures and controller errors in new Contour releases.
* **Envoy/Contour Certificates:** Ensure TLS certificates for Contour-Envoy communication and ingress workloads remain valid and updated. A mismatch can break traffic or block new pods from registering properly.
* **Monitoring & Metrics:** Contour and Envoy both provide metrics that should be scraped by Prometheus or equivalent. Monitor these for signs of configuration errors, traffic anomalies, or resource saturation after upgrades.
* **Ingress Class:** If you run multiple ingress controllers or parallel Contour versions, use distinct ingress classes or namespaces.
This prevents conflicting status updates and helps isolate troubleshooting when testing new releases.

## Additional Resources

* [Contour Documentation](https://projectcontour.io/docs/1.30/)
* [Contour Releases](https://github.com/projectcontour/contour/releases)

# CoreDNS

Source: https://docs.chkk.io/projects/addons/coredns

Chkk coverage for CoreDNS. We provide version recommendations, preflight/postflight checks, and Upgrade Templates, ensuring worry-free operations.

## Coverage Matrix

| | |
| --- | --- |
| **Chkk Curated Release Notes** | v1.6.7 to latest |
| **Private Registries** | Covered |
| **Custom Built Images** | Covered |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.7.1 to latest |
| **Supported Packages** | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information** | Covered |
| **Version Incompatibility Information** | Covered |
| **Upgrade Templates** | In-Place, Blue-Green |
| **Preverification** | Covered |

## CoreDNS Overview

CoreDNS is a lightweight, extensible DNS server for modern deployments. It uses a plugin-based model, enabling quick customization and dynamic updates based on cluster state. As the default DNS and service discovery component in Kubernetes, CoreDNS integrates tightly with the Kubernetes API. In addition to scalability, it offers a consistent interface for managing DNS-based traffic policies. CoreDNS's design helps teams efficiently handle large-scale DNS requests with minimal overhead.

## Chkk Coverage

### Curated Release Notes

Chkk monitors official CoreDNS release notes and flags new features, plugin changes, or deprecations relevant to your clusters. This helps avoid unexpected DNS failures caused by removed directives. Chkk prioritizes details that could break upgrade paths in your environment. It streamlines discovery of known issues, so you can plan updates effectively.
Each alert includes step-by-step guidance for mitigating potential risks.

### Preflight & Postflight Checks

Chkk performs preflight scans to detect incompatible Corefile directives, ensuring your config is ready for an upgrade. After upgrading, postflight checks verify that new CoreDNS pods are healthy and serving DNS correctly. This reduces risk by catching syntax errors, plugin mismatches, or resource constraints early. You'll receive alerts if pods fail readiness probes or crash-loop. That way, you can correct issues before affecting critical workloads.

### Version Recommendations

Chkk tracks CoreDNS versions and correlates them with Kubernetes versions, highlighting when your current DNS version is at risk. The tool factors in security patches and plugin maturity to suggest stable upgrade paths. This keeps you aligned with actively maintained CoreDNS releases while accounting for real-world usage. If you've pinned an outdated version, you'll see clear warnings about EOL or security vulnerabilities. You get peace of mind knowing your DNS stack remains fully supported.

### Upgrade Templates

Chkk offers in-place and blue-green upgrade guides for CoreDNS, detailing each step to avoid downtime. An in-place upgrade updates existing pods in a rolling manner with minimal disruption. Blue-green provides a parallel deployment for canary testing before switching all traffic, giving you a safer fallback. These templates integrate with Helm, Kustomize, or raw YAML flows. Automated checks during each phase help prevent mistakes and ease rollback if necessary.

### Preverification

Preverification executes a dry-run of your CoreDNS upgrade in a controlled environment, validating your Corefile against the new release. This process proactively detects plugin incompatibilities and configuration errors before they affect production. By simulating the upgrade, it surfaces potential bottlenecks and service disruptions early, enabling you to proceed with confidence.
With preverification, your upgrade path is both tailored and tested for your specific configuration, minimizing risk and ensuring a smooth transition.

### Supported Packages

Chkk is compatible with Helm, Kustomize, and plain Kubernetes YAML distributions of CoreDNS. It recognizes custom images, private registries, and alternative configurations. This allows you to maintain consistency in your existing deployment pipelines. Chkk's intelligence adapts to your chosen method, providing targeted guidance for upgrades. Regardless of how you deploy CoreDNS, you get curated checks and detailed recommendations.

## Common Operational Considerations

* **Plugin Compatibility Issues:** When upgrading CoreDNS, ensure your Corefile no longer uses deprecated plugins or syntax. Failing to do so can cause pods to crash and break DNS resolution.
* **Configuration Pitfalls:** Typos in the Corefile or invalid upstream DNS entries can lead to immediate failures. Always validate config changes in a non-production environment to avoid cluster-wide impact.
* **Performance Tuning:** Right-size CoreDNS resources and replica counts, especially in large clusters, to maintain stability, scalability, and performance. Monitoring query times and memory usage helps you adjust replicas or enable caching plugins.
* **Debugging Failures:** Use logs to trace DNS errors or crashes, and confirm pods are healthy and Ready. Triage with test pods (e.g., `dnsutils`) to confirm end-to-end name resolution.

## Additional Resources

* [CoreDNS Documentation](https://coredns.io/tags/documentation/)
* [CoreDNS Releases](https://github.com/coredns/coredns/releases)

# External DNS

Source: https://docs.chkk.io/projects/addons/external-dns

Chkk coverage for External DNS. We provide preflight/postflight checks, curated release notes, and Upgrade Templates, designed for seamless upgrades.
## Coverage Matrix

| | |
| --- | --- |
| **Chkk Curated Release Notes** | v0.7.3 to latest |
| **Private Registries** | Covered |
| **Custom Built Images** | Covered |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v0.8.0 to latest |
| **Supported Packages** | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information** | Covered |
| **Version Incompatibility Information** | Covered |
| **Upgrade Templates** | In-Place, Blue-Green |
| **Preverification** | Covered |

## External DNS Overview

External DNS automatically manages DNS records for Kubernetes Services and Ingresses, eliminating the need for manual updates whenever an IP address or hostname changes. By watching the cluster's API for resource changes, External DNS updates records on external providers (e.g., AWS Route 53, Google Cloud DNS, Azure DNS, Cloudflare) in real time. This ensures services remain accessible under consistent domain names without manual intervention, reducing configuration drift and minimizing downtime from outdated DNS entries.

## Chkk Coverage

### Curated Release Notes

Chkk tracks official External DNS releases and distills changes into concise summaries. Operators see highlights of breaking changes, security fixes, and new features relevant to DNS automation, rather than combing through lengthy changelogs. This helps teams quickly assess whether an upgrade might impact configuration flags, provider APIs, or critical security patches.

### Preflight & Postflight Checks

Before an upgrade, preflight checks verify that External DNS is set up correctly, ensuring provider credentials are valid, domain filters or annotations remain compatible, and resource limits are adequate. After the upgrade, postflight checks confirm DNS records synchronize properly and that no new error messages (e.g., API rate limits or authentication failures) appear.
This validation guards against misconfigurations that could break record updates.

### Version Recommendations

Chkk continuously monitors External DNS's lifecycle, flagging older releases as they near end-of-life or become incompatible with newer Kubernetes APIs. It suggests stable releases that align with your cluster version and known provider constraints, helping you avoid versions with critical bugs or deprecations. Staying on recommended releases ensures reliable DNS management and reduces the risk of unsupported features.

### Upgrade Templates

Chkk offers two strategies for upgrading External DNS: in-place and blue-green. In-place applies rolling updates to the existing deployment, leveraging Kubernetes rolling strategies to maintain uptime. Blue-green spins up a parallel External DNS deployment, tests it against DNS providers, and cuts over traffic after validation. This method offers near-zero downtime and a straightforward rollback if unexpected issues arise.

### Preverification

For major changes or critical environments, Chkk's preverification simulates the upgrade in a sandbox. It checks whether your DNS provider credentials, flags, or domain filters remain valid on the new version. It also evaluates whether DNS updates succeed without hitting rate limits or lingering propagation delays. This catch-issues-early approach reduces production surprises.

### Supported Packages

No matter if External DNS was deployed with Helm, Kustomize, or raw YAML, Chkk aligns its upgrade steps with your current method. It supports private registries and custom-built images, ensuring that your images, security settings, and Helm/Kustomize overlays remain intact throughout the upgrade. This integration keeps your workflow consistent while benefitting from Chkk's checks and recommendations.

## Common Operational Considerations

* **Ownership Conflicts:** TXT-based ownership prevents ExternalDNS from altering records it doesn't "own."
Manually added DNS entries or multiple ExternalDNS instances using the same TXT owner ID can cause record flapping or collisions.
* **RBAC and IAM Gaps:** Missing permissions in Kubernetes RBAC or cloud provider IAM can silently block record changes. Audit the ServiceAccount and associated roles, particularly in multi-tenant, production-grade setups.
* **Garbage Collection Issues:** In "upsert-only" mode, ExternalDNS won't remove obsolete records. Switch to "sync" (with caution), or manually clean DNS zones to avoid stale entries exposing decommissioned services.
* **Large-Cluster Overheads:** ExternalDNS scales poorly if it must track thousands of Services or Ingresses. Use `--provider-cache-time` and consider filtering by domain, zone ID, or namespace to reduce API calls and memory usage.
* **Multi-Cluster Collisions:** Multiple clusters writing to the same zone must use unique TXT owner IDs or separate subdomains. Otherwise, they risk overwriting each other's records and causing unpredictable DNS behavior.

## Additional Resources

* [External DNS Documentation](https://kubernetes-sigs.github.io/external-dns/latest/)
* [External DNS Releases](https://github.com/kubernetes-sigs/external-dns/releases)

# External Secrets Operator

Source: https://docs.chkk.io/projects/addons/external-secrets-operator

Chkk coverage for External Secrets Operator. We provide curated release notes, preflight/postflight checks, and Upgrade Templates, all tailored to your environment.
## Coverage Matrix

| | |
| --- | --- |
| **Chkk Curated Release Notes** | v0.3.0 to latest |
| **Private Registries** | Covered |
| **Custom Built Images** | Covered |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v0.4.0 to latest |
| **Supported Packages** | Helm, Kustomize, Static Manifests |
| **End-Of-Life (EOL) Information** | Covered |
| **Version Incompatibility Information** | Covered |
| **Upgrade Templates** | In-Place, Blue-Green |
| **Preverification** | Covered |

## External Secrets Operator Overview

External Secrets Operator (ESO) helps Kubernetes clusters automatically fetch and update secrets from external secret managers such as AWS Secrets Manager, HashiCorp Vault, Azure Key Vault, and others. Rather than storing sensitive data directly in-cluster, platform teams keep passwords and tokens in a secure external store. ESO periodically syncs these values into Kubernetes Secret objects, ensuring applications always have the latest credentials while reducing the risk of accidental exposure. By leveraging custom resources (like ExternalSecret and SecretStore), the operator seamlessly fits into Kubernetes workflows and centralizes secret management practices.

## Chkk Coverage

### Curated Release Notes

Chkk consolidates ESO release notes into concise summaries, highlighting critical changes like new provider integrations, deprecated CRD fields, or security patches. Instead of combing through lengthy changelogs, platform teams see only what matters most, such as API shifts that could break existing ExternalSecret configurations. If a version includes urgent bug fixes or vulnerability patches, Chkk flags those immediately so you can prioritize upgrades accordingly.

### Preflight & Postflight Checks

Before upgrading, Chkk scans your current ESO deployment to detect deprecated fields, validate provider permissions, and confirm CRD compatibility.
This proactive check prevents downtime caused by missing credentials or removed APIs. Once the upgrade is complete, Chkk runs postflight checks to ensure the new operator is healthy, verifying that all ExternalSecret resources still reconcile properly. By reviewing logs, events, and status conditions, it alerts you to any remaining issues (like secrets stuck in an error state or misconfigured SecretStores), enabling quick remediation.

### Version Recommendations

Chkk continuously tracks ESO's release cadence and support timelines, surfacing which versions are nearing end-of-life or have known issues. It compares your existing deployment against official EOL announcements and CVE reports, then suggests a stable target release. Chkk balances adopting the latest features with maintaining operational stability, guiding you away from risky versions and steering you to secure, fully supported upgrades.

### Upgrade Templates

For an efficient transition, Chkk provides two upgrade pathways: in-place and blue-green. In-place updates the existing ESO instance directly, minimizing resource overhead but requiring careful monitoring to catch potential regressions. Blue-green, on the other hand, deploys a parallel ESO instance first, letting you verify it before switching over. Both templates include rollback instructions and recommended checks, helping maintain uninterrupted secret synchronization even if unexpected issues arise.

### Preverification

Chkk's preverification feature simulates the upgrade in a controlled environment, applying your actual ExternalSecret and SecretStore definitions against the new ESO version. This dress rehearsal pinpoints schema conflicts, missing permissions, or other incompatibilities before you touch production. With a detailed report of any errors or warnings, you can fix problems and rerun tests until everything works, reducing the risk of failures during live upgrades.
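A report of errors like the one described above could, for instance, be derived from the `Ready` condition on each ExternalSecret. The sketch below is a minimal illustration under that assumption, not Chkk's implementation: `not_ready` is a hypothetical helper, and the sample objects are invented dicts shaped like Kubernetes API responses.

```python
from typing import Dict, List


def not_ready(external_secrets: List[dict]) -> Dict[str, str]:
    """Map 'namespace/name' to a reason for each ExternalSecret that is not Ready."""
    failures: Dict[str, str] = {}
    for item in external_secrets:
        meta = item.get("metadata", {})
        key = f"{meta.get('namespace', 'default')}/{meta.get('name', '<unnamed>')}"
        conditions = item.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        if ready is None:
            failures[key] = "no Ready condition reported"
        elif ready.get("status") != "True":
            failures[key] = ready.get("reason") or ready.get("message") or "not Ready"
    return failures


# Invented sample data for illustration only.
sample = [
    {
        "metadata": {"namespace": "apps", "name": "db-credentials"},
        "status": {"conditions": [{"type": "Ready", "status": "True", "reason": "SecretSynced"}]},
    },
    {
        "metadata": {"namespace": "apps", "name": "api-token"},
        "status": {"conditions": [{"type": "Ready", "status": "False", "reason": "SecretSyncedError"}]},
    },
    {"metadata": {"namespace": "apps", "name": "new-secret"}},
]

for name, reason in not_ready(sample).items():
    print(f"{name}: {reason}")
```

Treating a missing `Ready` condition as a failure is a deliberate choice here: a resource the operator has not reconciled yet is worth a second look after an upgrade.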
### Supported Packages

No matter how you've installed ESO (via Helm, Kustomize, or raw YAML), Chkk adapts to your existing workflow. It parses Helm values, merges Kustomize overlays, or patches manifests to align with private registries, custom images, and organizational security policies. By automating version bumps and CRD updates within your chosen toolchain, Chkk ensures the entire upgrade process stays consistent, secure, and compliant with your established deployment practices.

## Common Operational Considerations

* **Least-Privilege Credentials:** Restrict ESO's service account to only the external secrets needed. Overly broad permissions can expose sensitive data across the cluster.
* **Validate Secret Syncing:** After upgrades or config changes, confirm each ExternalSecret transitions to Ready. Check Kubernetes events for "Access Denied" or "UpdateFailed" errors.
* **Avoid Overlapping Secrets:** Two ExternalSecrets writing to the same Kubernetes Secret can cause conflicts. Use unique naming or scope secret references carefully.
* **Keep ESO Updated:** New releases often fix provider-specific issues or introduce vital security patches. Chkk flags older versions nearing EOL to prevent running unmaintained code.
* **Network Policy Enforcement:** If you're using Cilium or another CNI with network policies, limit ESO's egress to recognized secret manager endpoints to minimize attack surface.

## Additional Resources

* [External Secrets Documentation](https://external-secrets.io/latest/)
* [External Secrets Releases](https://github.com/external-secrets/external-secrets/releases)

# Gloo Edge OSS

Source: https://docs.chkk.io/projects/addons/gloo-edge-oss

Chkk coverage for Gloo Edge OSS. We provide curated release notes, preflight/postflight checks, and Upgrade Templates, all tailored to your environment.
## Coverage Matrix

|                                                                 |                       |
| --------------------------------------------------------------- | --------------------- |
| **Chkk Curated Release Notes**                                  | v1.13.0 to latest     |
| **Private Registries**                                          | Covered               |
| **Custom Built Images**                                         | Covered               |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.14.19 to latest    |
| **Supported Packages**                                          | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information**                               | Covered               |
| **Version Incompatibility Information**                         | Covered               |
| **Upgrade Templates**                                           | In-Place, Blue-Green  |
| **Preverification**                                             | Covered               |

## Gloo Edge OSS Overview

Gloo Edge OSS is an Envoy-based gateway that manages traffic routing, transforms, and integrations across Kubernetes clusters and legacy systems. It unifies API gateway capabilities (like authentication and rate-limiting) with flexible route configuration. By aggregating services and functions under one control plane, Gloo Edge streamlines operations in modern app environments. It can also handle secure east-west traffic between microservices. Operators use Gloo Edge to future-proof their architectures while maintaining low latency and high extensibility.

## Chkk Coverage

### Curated Release Notes

Chkk surfaces the most critical changes from Gloo Edge release notes so you can quickly see new features, fixes, or major shifts. It flags relevant deprecations or default behavior changes that might affect your clusters. These curated notes cut through the noise and focus on operational impact. They also map each item to your environment for direct relevance. As a result, you're never caught off guard when a key config field or CRD is updated.

### Preflight & Postflight Checks

Before upgrading Gloo Edge, Chkk's preflight checks examine your cluster, CRDs, and configurations for compatibility and deprecations. After upgrading, postflight checks confirm the new control plane is healthy, routes are correct, and no Envoy rejections have occurred.
This reduces the risk of sudden downtime or unexpected traffic behavior. By automating this validation, you gain a quick turnaround on identifying issues. The checks provide step-by-step guidance to fix them before they become production outages. ### Version Recommendations Chkk tracks Gloo Edge release schedules and support timelines, notifying you when your current version goes EOL or loses patch coverage. It provides guidance on stable upgrade targets, matching your Kubernetes version and other dependencies. This avoids unsupported states or risky leaps between releases. Chkk's recommendations are continually updated so you can plan upgrades proactively. That way, you stay current with minimal disruption. ### Upgrade Templates Chkk provides in-place and blue-green upgrade templates for Gloo Edge that build on official best practices and incorporate safety checkpoints and rollbacks. With in-place upgrades, you sequentially update the Gloo Edge control plane, validate cluster health, then roll out Envoy data planes to minimize traffic disruption. Meanwhile, blue-green upgrades guide you to deploy a parallel set of Gloo Edge components (control plane and proxies) on the new version and migrate traffic gradually, ensuring the older version remains available if issues arise. These automated workflows catch configuration drift and errors early, so you can revert smoothly when needed. By standardizing each step of the upgrade, Chkk removes guesswork and helps maintain consistent results across large environments. ### Preverification With **Preverification**, Chkk simulates your Gloo Edge upgrade in a staging environment, using the same CRDs and configs as production. This reveals any blockers—like incompatible fields or resource constraints—before the real deployment. It also estimates the potential runtime impact in terms of CPU, memory, or performance. If errors appear, you can fix them well ahead of time. 
**Preverification** essentially de-risks large environment changes by surfacing issues in a controlled test run. ### Supported Packages Chkk supports Gloo Edge installed via Helm, Kustomize, or plain Kubernetes manifests, automatically detecting your deployment method. It adapts its checks and upgrade flow to match how you're currently managing Gloo Edge—be it GitOps-based YAML or a Helm chart in a private registry. Custom-built or vendor-specific images are also recognized and tracked during upgrades. This flexibility ensures you can maintain your existing provisioning workflow while gaining consistent oversight from Chkk. ## Common Operational Considerations * **Running Supported Versions:** Gloo Edge typically supports the current release plus three prior minor versions. Plan upgrades to avoid losing security patches or hitting deprecated APIs without warning. * **Upgrading Glooctl:** Always match glooctl with your Gloo Edge version so new features or fields don't break older CLI commands. This prevents deployment of invalid configs and reduces friction when debugging. * **Parallel Deployments:** If you choose a canary rollout, run a second Gloo Edge instance in parallel. Migrate traffic incrementally, observe for errors, and revert if needed. * **CRD Changes:** Watch for CRD schema updates that may invalidate existing configuration. Ensure custom fields or deprecated routes aren't left behind, which can stop traffic flows or break routing rules. * **HA Configurations:** Maintain multiple replicas of key components to minimize downtime during rolling upgrades. Check for correct leader-election settings and confirm Envoy pods are gradually updated. * **Post-Upgrade Verification:** After upgrading, confirm that VirtualServices, RouteTables, and AuthConfigs all have "Accepted" statuses. Monitor logs and metrics (e.g., 5xx spikes) and be ready to roll back if you see critical errors. 
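
The "current release plus three prior minor versions" rule from the first bullet can be turned into a quick arithmetic check. This is a minimal sketch, not an official Gloo Edge or Chkk tool: the helper names are hypothetical, the version strings in the usage lines are made up for illustration, and it assumes the window never spans a major version.

```python
def parse_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version such as "v1.17.4" or "1.17"."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def within_support_window(installed: str, latest: str, prior_minors: int = 3) -> bool:
    """True if `installed` is the latest minor or one of the `prior_minors` before it."""
    inst_major, inst_minor = parse_minor(installed)
    late_major, late_minor = parse_minor(latest)
    if inst_major != late_major:
        return False  # assumption: the window does not cross major versions
    return 0 <= late_minor - inst_minor <= prior_minors


# Illustrative version numbers only.
print(within_support_window("v1.14.19", "v1.17.0"))  # 17 - 14 = 3, inside the window
print(within_support_window("v1.13.0", "v1.17.0"))   # 17 - 13 = 4, outside the window
```

Comparing only the major and minor components keeps patch releases out of the decision, which matches how the bullet phrases the policy.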
## Additional Resources

* [Gloo Edge OSS Documentation](https://github.com/solo-io/workshops/blob/master/gloo-edge/README.md)
* [Gloo Edge OSS Releases](https://github.com/solo-io/gloo/releases)

# Ingress NGINX Controller

Source: https://docs.chkk.io/projects/addons/ingress-nginx-controller

Chkk coverage for Ingress NGINX Controller. We provide curated release notes, preflight/postflight checks, and Upgrade Templates, all tailored to your environment.

## Coverage Matrix

|                                                                 |                       |
| --------------------------------------------------------------- | --------------------- |
| **Chkk Curated Release Notes**                                  | v2.11.0 to latest     |
| **Private Registries**                                          | Covered               |
| **Custom Built Images**                                         | Covered               |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v3.15.1 to latest     |
| **Supported Packages**                                          | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information**                               | Covered               |
| **Version Incompatibility Information**                         | Covered               |
| **Upgrade Templates**                                           | In-Place, Blue-Green  |
| **Preverification**                                             | Covered               |

## Ingress NGINX Controller Overview

Ingress NGINX Controller provides a production-ready reverse proxy for Kubernetes Ingress resources, built on the NGINX web server. It dynamically configures routes based on Ingress objects, supporting host, path, and TLS-based routing. Platform teams rely on it to centralize external access, implement custom traffic rules, and perform SSL termination. This controller is known for high performance, flexibility through annotations, and seamless integration with Kubernetes networking. Its active community ensures timely updates and consistent feature enhancements.

## Chkk Coverage

### Curated Release Notes

Chkk filters official release notes to highlight only critical changes like deprecated annotations, default behavior shifts, and new configuration flags. This saves time by consolidating key operational details into a concise summary.
It flags changes like stricter path validation rules or security improvements that might require advanced planning. Platform teams can quickly see which features are relevant and how they impact existing setups. By focusing on actionable insights, Chkk helps avoid risks when new versions are adopted. ### Preflight & Postflight Checks Preflight checks confirm that your Kubernetes version, CRDs, and Ingress configurations remain compatible with the upcoming NGINX release. Chkk identifies any deprecations or upgrades that risk causing downtime. Postflight checks verify whether the new controller is healthy and all Ingress routes function as expected. It detects misconfigurations by examining controller logs, readiness endpoints, and resource usage. This proactive approach avoids rollout failures and quickly flags any lingering issues. ### Version Recommendations Chkk evaluates each Ingress NGINX release against your cluster's Kubernetes version and usage patterns. It warns when your current version is nearing or past community support, and provides guidance on stable upgrade targets. If a new version introduces security fixes or performance benefits, Chkk highlights these improvements for informed decision-making. It also accounts for real-world feedback on known bugs or regressions. By monitoring EOL dates and new features, Chkk ensures you're on a safe, reliable version. ### Upgrade Templates Chkk publishes step-by-step procedures for both in-place and blue-green upgrades of the controller. In-place upgrades guide you through a safe rolling update, while blue-green approaches spin up a parallel controller revision for canary testing. These templates include clear rollback steps and recommended monitoring checkpoints. They align with community best practices to minimize disruption. By systematically walking through each stage, platform teams reduce the chance of downtime. 
### Preverification **Preverification** simulates the entire Ingress NGINX upgrade in a dedicated test environment mirroring production. It evaluates whether older Ingress rules, annotations, or resource constraints clash with the new release. Chkk exposes any issues—like invalid config or broken CRDs—before traffic is impacted. This rehearsal approach gives teams the confidence to fix problems early. It's particularly valuable when introducing major changes or toggling advanced NGINX features. ### Supported Packages Chkk recognizes Ingress NGINX whether deployed via Helm charts, Kustomize overlays, or plain YAML. It aligns checks and upgrades with your specific packaging method, ensuring minimal friction. Custom images or private registries are fully supported, including specialized vendor builds. Chkk also tracks Helm chart versions to verify compatibility with corresponding controller releases. This broad support lets platform teams maintain their preferred deployment strategy without losing coverage. ## Common Operational Considerations * **Performance and Scalability:** Tune worker processes, keepalive settings, and concurrency limits for high-traffic environments. Scale horizontally with multiple controller replicas and ensure node-level resources support peak throughput. * **Security and TLS Configuration:** Enforce strong TLS ciphers and upgrade regularly for security fixes. Integrate with cert-manager or external certificate automation to eliminate expired cert risks. * **Advanced Routing and Configuration:** Use separate IngressClasses for distinct traffic patterns, and carefully handle regex paths and rewrites. Combine custom annotations, ConfigMap overrides, or carefully managed snippets for specialized NGINX settings. * **Monitoring and Troubleshooting:** Scrape ingress-nginx metrics for real-time insights and set up meaningful alerts. 
Consult logs for 4xx/5xx error patterns, keep track of pod restarts, and ensure readiness/liveness endpoints are functioning.
* **Multiple Ingress Controllers or Canary Testing:** Split public vs. private traffic across separate controllers for tighter security boundaries. Leverage canary deployments to test new versions or features on a subset of routes before a full rollout.

## Additional Resources

* [Ingress NGINX Controller Documentation](https://docs.nginx.com/nginx-ingress-controller/)
* [Ingress NGINX Releases](https://github.com/kubernetes/ingress-nginx/releases)

# Istio

Source: https://docs.chkk.io/projects/addons/istio

Chkk coverage for Istio. We provide version recommendations, preflight/postflight checks, and Upgrade Templates, ensuring worry-free operations.

## Coverage Matrix

|                                                                 |                                   |
| --------------------------------------------------------------- | --------------------------------- |
| **Chkk Curated Release Notes**                                  | v1.12.0 to latest                 |
| **Private Registries**                                          | Covered                           |
| **Custom Built Images**                                         | Covered                           |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.17.0 to latest                 |
| **Supported Packages**                                          | Helm, Kustomize, Static Manifests |
| **End-Of-Life (EOL) Information**                               | Covered                           |
| **Version Incompatibility Information**                         | Covered                           |
| **Upgrade Templates**                                           | In-Place, Blue-Green              |
| **Preverification**                                             | Covered                           |

## Istio Overview

Istio is an open-source service mesh built on Envoy proxies that manage critical operational features like routing, mutual TLS encryption, and policy controls. By injecting an Envoy sidecar with each workload, Istio allows platform teams to standardize traffic management, security rules, and observability without altering microservices code. Advanced traffic shaping tactics, including canary releases and fault injection, are possible at layer 7, which L3/L4-only networking solutions cannot match.
Because Istio centralizes configuration and enforcement, it lets organizations apply zero-trust security policies (such as mTLS by default) and uniform telemetry collection in large-scale clusters without requiring one-off efforts in each application. ## Chkk Coverage ### Curated Release Notes Chkk monitors and curates the official Istio release notes, flagging new features, breaking changes, or API/CRD deprecations that are specifically relevant to your clusters. Rather than manually parsing every upstream detail, platform teams receive a contextualized briefing focused on operational impact. For example, if Istio 1.24 removes a default behavior—such as retries on HTTP 503 errors—Chkk highlights how this change might alter traffic patterns in your environment. Similarly, if a particular version introduces new CRDs or requires a configuration shift, Chkk pinpoints exactly where your environment could be affected. ### Preflight & Postflight Checks Before an Istio upgrade, Chkk runs preflight checks that examine your cluster's Kubernetes version, relevant CRDs, EnvoyFilters, and resource constraints to confirm that you are within Istio's official support range. This ensures you do not jump more than one minor release in a single upgrade—a scenario that could lead to unpredictable behavior. The checks also detect deprecated fields—such as older syntax in VirtualService or DestinationRule objects—so you can address them before applying new manifests. After the upgrade, Chkk's postflight checks verify that the new istiod is healthy, analyze Istio injection logs for errors, and monitor for traffic anomalies or pods running mismatched sidecars. This approach simplifies large fleet operations by quickly identifying issues (such as leftover pods running the old proxy version) that could undermine mesh consistency. ### Version Recommendations Chkk constantly monitors Istio's support timeline and flags when your current version is nearing or has passed its end-of-life. 
This is crucial for platform engineers who must ensure compliance and maintain up-to-date security patches. By referencing Istio's official support matrix, Chkk explains why certain versions are considered risky—whether due to dropping out of patch availability or incompatibility with specific Kubernetes releases—helping you avoid unplanned downtime caused by outdated APIs. Beyond notifications, Chkk even suggests a stable upgrade target that aligns with both community feedback and Istio's known issues, enabling platform teams to balance the urgency of new features with operational stability. If you've forked Istio and follow a custom support policy, Chkk accommodates custom end-of-life (EOL) policies. ### Upgrade Templates Chkk delivers detailed **Upgrade Templates** for both in-place and blue-green (canary) upgrade methods, reflecting the best practices documented by the Istio project. An in-place upgrade updates the existing control plane directly, followed by rolling restarts of sidecar proxies. In contrast, a blue-green (canary) strategy deploys a new istiod revision, gradually migrates workloads, and retires the old control plane only after verifying stability. These templates integrate seamlessly with GitOps or CI/CD pipelines by providing a clear, step-by-step process complete with rollback points. This approach allows platform teams to confidently adopt either method while reducing the risk of human error that could compromise mesh traffic flows. ### Preverification **Preverification** is Chkk's "digital twin" approach to rehearsing the entire Istio upgrade in an isolated environment before any production deployment. It spins up a representative cluster—including your existing Istio configurations—and runs every upgrade command in sequence. This detects complications such as CRD conflicts, EnvoyFilter breakages, or resource overconsumption in the new istiod. 
Because these issues appear during the simulated upgrade rather than in production, you can adjust configurations or resource allocations in advance. Many organizations build preverification into their process to ensure each upgrade step passes automated checks, making live deployments safer.

### Supported Packages

Chkk supports multiple packaging formats for Istio (Helm, Kustomize, or plain Kubernetes YAML), allowing platform engineers to manage Istio just as they do other cluster resources. It respects custom images, private registries, and specialized vendor builds, ensuring your environment remains consistent when transitioning between Istio versions. If you already maintain a GitOps repository for Istio, Chkk can parse those manifests, map them to the appropriate target version, and suggest only the changes necessary for a safe upgrade.

## Common Operational Considerations

* **Traffic Routing Mismatches:** VirtualService and Gateway hosts must align precisely. A mismatch often results in traffic never reaching the intended service, with no obvious error.
* **Sidecar Injection Failures:** Label conflicts or multiple admission webhooks can prevent pods from being injected with Envoy. Watch istiod logs and use istioctl commands to detect any pods that lack a sidecar.
* **Strict mTLS Rollout:** Enabling strict mTLS cluster-wide without verifying all services can block traffic to unmeshed workloads. Use permissive mode first, then tighten once all workloads comply.
* **EnvoyFilter Limitations:** Since EnvoyFilter depends on implementation details, upgrades can break custom filters. Prefer stable APIs like WasmPlugin or Telemetry whenever possible.
* **Resource Overhead:** Each sidecar consumes CPU and memory. For large clusters, use the Sidecar resource to limit which hosts Envoy must watch, and tune requests/limits for both sidecars and istiod.
* **Canary Upgrades:** Minimize downtime by running multiple Istio revisions in parallel.
Canary releases reduce the blast radius if new configuration or CRD changes cause unexpected behavior.

## Additional Resources

* [Istio Documentation](https://istio.io/latest/docs/)
* [Istio Releases](https://github.com/istio/istio/releases)

# kube-proxy

Source: https://docs.chkk.io/projects/addons/kube-proxy

Chkk coverage for kube-proxy. We provide curated release notes, preflight/postflight checks, and Upgrade Templates, all tailored to your environment.

## Coverage Matrix

|                                                                 |                       |
| --------------------------------------------------------------- | --------------------- |
| **Chkk Curated Release Notes**                                  | v1.20.0 to latest     |
| **Private Registries**                                          | Covered               |
| **Custom Built Images**                                         | Covered               |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.20.0 to latest     |
| **Supported Packages**                                          | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information**                               | Covered               |
| **Version Incompatibility Information**                         | Covered               |
| **Upgrade Templates**                                           | In-Place, Blue-Green  |
| **Preverification**                                             | Covered               |

## kube-proxy Overview

kube-proxy manages network rules on each node, forwarding traffic to the correct backend pods for Kubernetes Services. It can run in iptables, IPVS, or userspace mode, with iptables and IPVS being the most common. IPVS performs better in larger clusters because of constant-time lookups, while iptables is generally sufficient for moderate environments. By abstracting the details of Service routing, kube-proxy simplifies application networking. This ensures highly available and scalable traffic distribution at the node level.

## Chkk Coverage

### Curated Release Notes

Chkk continuously monitors kube-proxy related Kubernetes release notes and surfaces relevant performance, security, or deprecation changes. This means you get curated highlights on how upgrades or patches affect your iptables, IPVS configurations, or node-level networking features. Chkk also alerts you when key flags or config settings are removed or replaced.
This avoids sifting through extensive Kubernetes changelogs. You stay informed about precisely what matters to your cluster. ### Preflight & Postflight Checks Chkk runs pre-upgrade checks to verify kernel modules for IPVS, correct iptables settings, and any deprecated kube-proxy flags that might break post-upgrade. After your rollout, Chkk's postflight checks confirm that the new kube-proxy pods are running, iptables or IPVS rules are accurate, and services remain reachable. If it detects anomalies—like iptables-restore errors or missing rules—it flags them early. This helps you address connectivity gaps before they escalate. The result is safer, more predictable kube-proxy updates across all nodes. ### Version Recommendations Chkk tracks kube-proxy versions in tandem with Kubernetes releases, ensuring you don't run end-of-life or unsupported combinations. It compares your current version against known advisories, highlighting critical security patches or incompatibilities. If you're trailing behind, Chkk provides a stable upgrade target aligned with both your Kubernetes version and the broader community's feedback. This reduces risk of unexpected downtime from outdated iptables or networking behavior. You stay aligned with best practices for node-level proxying. ### Upgrade Templates Chkk provides in-place and blue-green templates for kube-proxy upgrades, guiding you through a node-by-node rollout or parallel DaemonSet deployments. Each step includes draining or cordoning to avoid traffic disruption and verifying new rules are applied before moving on. Rollback guidance is included if any node runs into problems with iptables or IPVS. By following these clear instructions, your team can adopt a consistent and low-risk approach to network updates. You combine automation with real-world best practices around incremental changes. 
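
The node-by-node flow described above (cordon, upgrade, verify the rules, then move on, stopping when a node has problems) can be sketched as a small control loop. This is an illustrative outline only, not Chkk's Upgrade Template: `rolling_upgrade` and the four callables are hypothetical placeholders that you would back with your own tooling.

```python
from typing import Callable, Iterable, Optional


def rolling_upgrade(
    nodes: Iterable[str],
    cordon: Callable[[str], None],
    upgrade: Callable[[str], None],
    verify: Callable[[str], bool],
    uncordon: Callable[[str], None],
) -> tuple[list[str], Optional[str]]:
    """Upgrade nodes one at a time; stop at the first node that fails verification.

    Returns (nodes upgraded successfully, failed node or None).
    """
    done: list[str] = []
    for node in nodes:
        cordon(node)
        upgrade(node)
        if not verify(node):
            # Leave the node cordoned so it can be rolled back before taking traffic.
            return done, node
        uncordon(node)
        done.append(node)
    return done, None


# Stub callables that only record what would happen; node-b fails verification.
log: list[str] = []
result = rolling_upgrade(
    ["node-a", "node-b", "node-c"],
    cordon=lambda n: log.append(f"cordon {n}"),
    upgrade=lambda n: log.append(f"upgrade {n}"),
    verify=lambda n: n != "node-b",
    uncordon=lambda n: log.append(f"uncordon {n}"),
)
print(result)
print(log)
```

Stopping at the first failure keeps the remaining nodes on the old version, so a rollback only has to deal with a single node.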
### Preverification Chkk's preverification rehearses your exact kube-proxy upgrade plan in an isolated environment, detecting kernel module, iptables, or config errors before production. It ensures the new configuration can properly set up rules and handle your existing Services without triggering outages. If issues arise—like a missing IPVS module—Chkk highlights them so you can fix them first. This drastically reduces the risk of network disruptions during real rollouts. Teams gain confidence by testing all aspects of the new kube-proxy version ahead of time. ### Supported Packages Whether you manage kube-proxy via Helm, Kustomize, or raw manifests, Chkk seamlessly integrates with your workflow. It locates the kube-proxy image, command-line flags, and config data within your chosen package format. This ensures every recommendation, check, and upgrade template matches your setup. Private registries and custom builds are fully supported, so no special steps are needed to stay aligned with best practices. Chkk helps you maintain a standardized pipeline regardless of how you package kube-proxy. ## Common Operational Considerations * **Performance Tuning Considerations:** If iptables becomes too large or slow, switch to IPVS, which uses constant-time lookups. Monitor kube-proxy CPU usage and sync times to avoid performance bottlenecks. * **Impact on Network Latency and Scalability:** IPVS typically offers lower latency under heavy loads and large service counts. iptables remains sufficient for smaller clusters but scales less effectively in massive environments. * **Handling Rule Inconsistencies and Troubleshooting:** Inconsistencies can arise from manual iptables changes or reboots; restarting kube-proxy can fix stale or missing rules. Check kube-proxy logs and use iptables or ipvsadm commands to validate correct forwarding. 
* **Ensuring Kube-Proxy High Availability and Redundancy:** Each node runs kube-proxy, so a single failure typically doesn't cause cluster-wide disruptions. DaemonSet rolling updates, combined with liveness probes, ensure each node's proxy stays functional.
* **Security Considerations for Kube-Proxy:** Kube-proxy doesn't enforce network policies, so rely on CNI-based NetworkPolicies for pod-to-pod restrictions. Keep kube-proxy images up to date and secure NodePorts at the host or cloud firewall level.

## Additional Resources

* [kube-proxy Command Line Tools Reference](https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/)
* [Kubernetes Releases](https://github.com/kubernetes/kubernetes/releases)

# kube-state-metrics

Source: https://docs.chkk.io/projects/addons/kube-state-metrics

Chkk coverage for kube-state-metrics. We provide preflight/postflight checks, curated release notes, and Upgrade Templates, designed for seamless upgrades.

## Coverage Matrix

|                                                                 |                       |
| --------------------------------------------------------------- | --------------------- |
| **Chkk Curated Release Notes**                                  | v1.9.2 to latest      |
| **Private Registries**                                          | Covered               |
| **Custom Built Images**                                         | Covered               |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v2.2.4 to latest      |
| **Supported Packages**                                          | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information**                               | Covered               |
| **Version Incompatibility Information**                         | Covered               |
| **Upgrade Templates**                                           | In-Place, Blue-Green  |
| **Preverification**                                             | Covered               |

## kube-state-metrics Overview

kube-state-metrics (KSM) listens to the Kubernetes API and generates metrics about the state of cluster objects, such as pods and deployments. It presents these metrics in a Prometheus-friendly format, offering granular visibility into workloads and resource objects. This data complements the Kubernetes Metrics Server by focusing on object states rather than resource utilization.
Many organizations pair KSM with Prometheus and Grafana to get clear dashboards of key cluster metrics. Because it's lightweight and widely adopted, KSM is a core tool for platform observability and troubleshooting. ## Chkk Coverage ### Curated Release Notes Chkk continuously scans official KSM releases, identifying metric additions, deprecations, and critical bug fixes. It highlights changes that could break existing Prometheus queries or dashboards, preventing alert failures. In other words, you get a direct heads-up if a particular metric has been removed or altered. Chkk thereby streamlines the process of tracking upstream changes. You see only the details that matter rather than sifting through lengthy release notes. ### Preflight & Postflight Checks Chkk's preflight checks confirm that your KSM version is compatible with your Kubernetes release and that it has the correct RBAC permissions. It flags any metrics that are likely to disappear or change when you move to a new KSM version. After the upgrade, postflight checks ensure the new KSM is healthy, exposing all expected metrics with no major variance from the baseline. This approach reduces the risk of missing critical data and shortens recovery time if something goes off track. ### Version Recommendations Chkk reads the release notes for KSM and alerts you when your version nears end-of-life or lacks compatibility with your Kubernetes cluster. It recommends a stable release that aligns with your Prometheus stack and avoids known issues. As new versions emerge, Chkk points out compelling updates (like performance improvements) or necessary security fixes. By following these pointers, you ensure continuous, reliable upgrades. This proactive approach helps you stay aligned with best practices. ### Upgrade Templates Chkk provides both in-place and blue-green upgrade templates tailored for KSM deployments. An in-place approach updates the existing KSM, typically with a rolling restart that minimizes downtime. 
A blue-green strategy runs two KSM versions in parallel, switching traffic only after verifying the new version is producing correct metrics. These templates cover image tags, resource requests, and recommended gating checks. They integrate easily with GitOps flows, giving you a consistent procedure every time you upgrade. ### Preverification Chkk's preverification ensures a smooth and reliable KSM upgrade by testing it in a controlled environment before deployment. It compares key metrics between your current and new versions, identifying renamed or removed series and changes in label formats that could break queries. By proactively surfacing these differences, you can adjust dashboards and alerts ahead of time, preventing disruptions. With Chkk's preverification as a safety net, you can upgrade with confidence, minimizing downtime and ensuring a seamless transition to production. ### Supported Packages Chkk accommodates multiple packaging methods for KSM—Helm, Kustomize, and plain Kubernetes manifests. It respects private registries and custom-built images, so you can maintain your own hardened version of KSM. The system automatically parses your chosen deployment format to detect required changes and check for known pitfalls. If you've rolled out KSM via GitOps, Chkk integrates seamlessly with that pipeline. In every scenario, coverage remains consistent and thorough. ## Common Operational Considerations * **Scaling & Resource Usage:** KSM's CPU/memory usage grows with cluster objects. Size or shard KSM appropriately to avoid performance issues and scraped metric delays. * **High-Cardinality Metrics & Filtering:** Some labels can explode time-series counts in Prometheus. Use allowlists or denylists to reduce unnecessary data that can overload your monitoring stack. * **RBAC Permissions:** KSM requires broad read access to various Kubernetes objects. Missing permissions can silently drop metrics, so validate role bindings and watch for incomplete data sets. 
* **Monitor KSM Itself:** Scrape KSM's internal metrics for watch failures and memory usage. Early detection of restarts or scrape errors ensures continuous, correct monitoring coverage.

## Additional Resources

* [kube-state-metrics Documentation](https://github.com/kubernetes/kube-state-metrics/blob/main/docs/README.md)
* [kube-state-metrics Releases](https://github.com/kubernetes/kube-state-metrics/releases)

# Kubernetes Dashboard

Source: https://docs.chkk.io/projects/addons/kubernetes-dashboard

Chkk coverage for Kubernetes Dashboard. We provide version recommendations, preflight/postflight checks, and Upgrade Templates, ensuring worry-free operations.

## Coverage Matrix

|                                                                 |                       |
| --------------------------------------------------------------- | --------------------- |
| **Chkk Curated Release Notes**                                  | v2.0.3 to latest      |
| **Private Registries**                                          | Covered               |
| **Custom Built Images**                                         | Covered               |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v2.4.0 to latest      |
| **Supported Packages**                                          | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information**                               | Covered               |
| **Version Incompatibility Information**                         | Covered               |
| **Upgrade Templates**                                           | In-Place, Blue-Green  |
| **Preverification**                                             | Covered               |

## Kubernetes Dashboard Overview

Kubernetes Dashboard is a web-based UI that helps you view, manage, and troubleshoot workloads running on Kubernetes. It allows real-time monitoring of Pods, Deployments, and other resources with minimal overhead. By combining resource visualization with RBAC-based access control, it gives platform teams a convenient method of cluster management. Whether used for quick troubleshooting or routine maintenance, the Dashboard offers a straightforward alternative to CLI-based operations. With broad support for modern Kubernetes releases, it remains a staple for rapid cluster visibility.

## Chkk Coverage

### Curated Release Notes

Chkk automatically tracks upstream Kubernetes Dashboard releases and flags relevant changes.
This saves time spent combing through raw release notes for bug fixes, new UI features, and security patches. If a release modifies RBAC requirements or drops a feature, Chkk highlights that in context. It then provides a concise summary tailored to your cluster's current usage. That way, you never miss critical shifts that could affect operations. ### Preflight & Postflight Checks Before you upgrade, Chkk's preflight checks confirm that your cluster meets all prerequisites for the new Dashboard release, including Kubernetes version compatibility. They inspect the existing Dashboard deployment and flag any deprecated configs or missing roles. After deployment, Chkk's postflight checks verify that the UI is accessible and that the logs reveal no issues. By automating these checks, you minimize risks like sudden access failures or broken permissions. The outcome is a seamless transition with immediate confirmation of success. ### Version Recommendations Chkk monitors which Dashboard versions are actively maintained and highlights any release nearing end-of-life. It compares your installed version against known stable lines to guide timely upgrades. If upstream support is dropping, you'll see clear alerts about potential security gaps and missing patches. Chkk also factors in your Kubernetes version to ensure tested compatibility. This proactive approach secures continuity and helps plan your upgrade roadmap effectively. ### Upgrade Templates Chkk provides step-by-step procedures for upgrading the Dashboard either in-place or via a blue-green strategy. The in-place approach updates the same Deployment while keeping downtime minimal. Blue-green deploys a new Dashboard in parallel so you can verify functionality before decommissioning the old version. Both templates include rollback points and recommended resource modifications to maintain continuity. By removing guesswork, these templates streamline each step and reduce the risk of breakage.
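
The version comparison behind the Coverage Matrix and Version Recommendations can be pictured with a small sketch. This is only an illustration, not Chkk's implementation: the two version floors are taken from the Coverage Matrix for Kubernetes Dashboard, while the function names (`parse_version`, `coverage_for`) and the dictionary keys are hypothetical.

```python
from __future__ import annotations

import re

# Floors copied from the Kubernetes Dashboard Coverage Matrix.
# Everything else in this sketch is an illustrative assumption.
RELEASE_NOTES_FLOOR = "v2.0.3"
CHECKS_FLOOR = "v2.4.0"


def parse_version(tag: str) -> tuple[int, int, int]:
    """Turn a tag such as 'v2.4.0' or '2.7.0' into a comparable tuple."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", tag.strip())
    if match is None:
        raise ValueError(f"unrecognized version tag: {tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def coverage_for(installed: str) -> dict[str, bool]:
    """Report which coverage areas apply to the installed Dashboard version."""
    version = parse_version(installed)
    return {
        "curated_release_notes": version >= parse_version(RELEASE_NOTES_FLOOR),
        "preflight_postflight_checks": version >= parse_version(CHECKS_FLOOR),
    }


if __name__ == "__main__":
    for tag in ("v2.0.2", "v2.3.1", "v2.7.0"):
        print(tag, coverage_for(tag))
```

Comparing parsed tuples rather than raw strings avoids the classic pitfall where `"v2.10.0"` sorts before `"v2.4.0"` lexicographically.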
### Preverification Chkk offers a simulated "digital twin" environment to validate your Dashboard upgrade before affecting production. It recreates your cluster's resources and applies the new version, watching for RBAC conflicts or other errors. If the new Dashboard fails to load or triggers issues, you learn in the sandbox rather than production. You can then adjust configurations or patch manifests prior to the real rollout. This approach protects uptime and adds confidence to any upgrade. ### Supported Packages Kubernetes Dashboard can be installed via Helm, plain YAML, or Kustomize, and Chkk supports all of these. It detects your package method and tailors its recommendations accordingly. Helm users see guided `helm upgrade` instructions, while manifest-based users see updated YAML diffs. If you rely on a private registry or custom images, Chkk respects those references for consistent usage. In every case, you gain a single interface for analyzing, planning, and executing Dashboard updates. ## Common Operational Considerations * **Authentication and Single Sign-On (SSO):** If you're integrating the Dashboard with an external identity provider (e.g., OIDC or LDAP), ensure your cluster's API server and Dashboard configuration sync properly. Any misalignment between OIDC claims and Dashboard roles can lock out or overprivilege users, so validate tokens carefully and audit RBAC mappings for SSO groups. * **Namespace-Scoped Dashboards:** For large environments, consider deploying multiple Dashboards restricted to distinct namespaces or teams. This reduces cross-namespace congestion in the UI, lowers resource overhead in massive clusters, and tightens permissions so users only see relevant workloads. * **Metrics Scraper Efficiency:** The Dashboard often relies on an accompanying metrics scraper to collect CPU/memory data. 
In bigger clusters, the scraper can become a performance bottleneck if it polls too frequently, so tune scrape intervals or disable unused resource views to avoid flooding the API server. * **Auditing and Log Review:** Because the Dashboard aggregates sensitive operations into a single interface, turn on Kubernetes API audit logging and compare the audit logs with the Dashboard's logs. Look for anomalies such as repeated failed logins, unusual PUT/DELETE actions, or tokens being used across unexpected namespaces. ## Additional Resources * [Kubernetes Dashboard Documentation](https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/) * [Kubernetes Dashboard Releases](https://github.com/kubernetes/dashboard/releases) # Kubernetes Metrics Server Source: https://docs.chkk.io/projects/addons/kubernetes-metrics-server Chkk coverage for Kubernetes Metrics Server. We provide curated release notes, preflight/postflight checks, and Upgrade Templates, all tailored to your environment. ## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v0.4.0 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v0.5.1 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Kubernetes Metrics Server Overview Kubernetes Metrics Server is a lightweight aggregator for resource usage metrics (CPU and memory), exposing them to the Kubernetes API. By default, it scrapes kubelets at short intervals and provides "current" usage metrics for HPA and `kubectl top`. It's designed for minimal resource overhead, while still supporting clusters of thousands of nodes.
Because it does not store historical data, it's best paired with full-fledged monitoring solutions for long-term metrics. Metrics Server is a critical add-on for autoscaling, granting platform engineers real-time resource snapshots without invasive instrumentation. ## Chkk Coverage ### Curated Release Notes Chkk regularly scans Metrics Server releases for new features, deprecations, and breaking changes impacting cluster operations. By highlighting only the most relevant updates—like new resource flags or security patches—teams avoid sifting through extensive upstream notes. If a new Metrics Server version changes default CPU requests, Chkk calls out this change so cluster administrators can plan accordingly. Chkk's curated summaries help you rapidly assess upgrade impacts. This streamlined insight reduces the risk of downtime during deployment. ### Preflight & Postflight Checks Chkk's preflight checks validate that a new Metrics Server version is compatible with your cluster's API configuration, RBAC, and resource constraints. They verify kubelet ports, aggregator settings, and any new flags needed by the upgrade. After deployment, postflight checks confirm that Metrics Server is serving metrics accurately and all HPAs remain functional. This ensures critical autoscaling mechanisms continue working as expected. Any anomalies—like missing metrics or errors in the Metrics API—are promptly surfaced. ### Version Recommendations Chkk tracks Metrics Server support timelines and flags when your current release is nearing or past end-of-life. Its guidance considers Kubernetes version compatibility and known upstream issues, recommending stable versions that won't disrupt cluster autoscaling. If your setup relies on a custom build or private registry, Chkk provides tailored advice while aligning with official support guidelines. This prevents running outdated, unsupported versions that could break production workloads. 
As updates roll out, Chkk consistently highlights safe upgrade targets. ### Upgrade Templates Chkk's **Upgrade Templates** provide step-by-step instructions for in-place or blue-green rollouts of Metrics Server. The in-place approach replaces pods with minimal disruption, while the blue-green method creates a parallel deployment for a seamless transition. Both methods address health checks, aggregator API settings, and potential resource usage spikes. By detailing each step (including rollbacks), Chkk reduces manual errors. These repeatable processes simplify upgrades across multiple clusters or environments. ### Preverification Chkk's preverification simulates the entire Metrics Server upgrade in a controlled environment. This helps detect issues like missing RBAC permissions, incorrect flags, or network blocks prior to live rollout. When the real upgrade happens, there's higher confidence that metrics collection will remain stable. **Preverification** also pinpoints necessary config adjustments for large-scale clusters. This approach prevents production disruptions and shortens troubleshooting time. ### Supported Packages Chkk works seamlessly with any Metrics Server installation method—Helm, Kustomize, or raw manifests—without forcing you to change existing workflows. It respects custom images and private registries so you can enforce internal security standards. Chkk detects your Metrics Server deployment and correlates it with upstream versions to provide relevant checks. This consistency allows teams to manage upgrades across different cluster setups. By avoiding lock-in, you can maintain the same operational patterns for all environments. ## Common Operational Considerations * **Aggregation Latency Impact on Autoscaling:** If scrape intervals are too long, HPAs will scale based on stale usage data. Ensure interval settings and resource allocations keep the latency within acceptable bounds. 
* **Resource Requests & Limits:** Under-provisioning can lead to missing metrics or slow responses, so allocate sufficient CPU/memory based on cluster size. Regularly review usage to adjust resource settings when the cluster grows. * **RBAC & API Access Considerations:** Metrics Server must have appropriate permissions to fetch node data and serve the metrics.k8s.io API. Recheck roles and aggregator settings after major Kubernetes updates or changes. * **Handling Large-Scale Clusters:** When node counts are high, consider tuning scrape intervals or running multiple replicas. Monitor performance so that neither kubelet nor Metrics Server becomes overloaded. * **TLS & Secure Communications:** Use proper certificates to avoid insecure skip-flags, which expose the cluster to possible man-in-the-middle attacks. Ensure your aggregator and kubelets are configured with trusted CA certificates. * **Known Conflicts with Other Monitoring Solutions:** Conflicting API registrations (e.g. dual metrics providers) can break the metrics.k8s.io endpoint. Integrate Metrics Server with Prometheus or custom adapters by separating responsibilities and verifying no overlap. ## Additional Resources * [Metrics Server Documentation](https://kubernetes-sigs.github.io/metrics-server/) * [Metrics Server Releases](https://github.com/kubernetes-sigs/metrics-server/releases) # Add-ons Source: https://docs.chkk.io/projects/addons/overview Explore All the Kubernetes Add-ons Chkk Covers A [**Kubernetes Add-on**](/misc/glossary#kubernetes-add-on) is a type of [Project](/projects/overview) that **extends Kubernetes cluster functionality but is not part of the Kubernetes core**. Kubernetes Add-ons typically run inside the cluster as regular workloads (e.g., Deployments, DaemonSets) and provide services like networking, monitoring, logging, and DNS. 
Get a quick overview of every Kubernetes Add-on Chkk covers, complete with curated release notes, private registry and custom image coverage, pre- and post-flight checks, and comprehensive EOL and version compatibility details. Dive deeper into each Kubernetes Add-on to see supported upgrade templates (in-place, blue-green) and preverification. Simply select a card below to learn more about that specific Kubernetes Add-on. | | | | | ----------------------------------------------------------------------- | --------------------------------------------------------------------- | ----------------------------------------------------------------------- | | [Amazon VPC CNI Plugin](/projects/addons/amazon-vpc-cni) | [Calico](/projects/addons/calico) | [Cert Manager](/projects/addons/cert-manager) | | [Cilium](/projects/addons/cilium) | [Cluster Autoscaler](/projects/addons/cluster-autoscaler) | [Contour](/projects/addons/contour) | | [CoreDNS](/projects/addons/coredns) | [External DNS](/projects/addons/external-dns) | [External Secrets Operator](/projects/addons/external-secrets-operator) | | [Gloo Edge OSS](/projects/addons/gloo-edge-oss) | [Ingress NGINX Controller](/projects/addons/ingress-nginx-controller) | [Istio](/projects/addons/istio) | | [kube-proxy](/projects/addons/kube-proxy) | [kube-state-metrics](/projects/addons/kube-state-metrics) | [Kubernetes Dashboard](/projects/addons/kubernetes-dashboard) | | [Kubernetes Metrics Server](/projects/addons/kubernetes-metrics-server) | [Vertical Pod Autoscaler](/projects/addons/vertical-pod-autoscaler) | AWS Load Balancer Controller | | AWS Network Policy Agent | AWS Node Termination Handler | Addon Resizer | | Amazon EBS CSI Driver | Amazon EFS CSI Driver | Antrea | | Azure Active Directory Workload Identity | Azure Cloud Node Manager | Azure Disk CSI Driver | | Azure File CSI Driver | Azure IP Masquerade Agent | Azure Key Vault Provider for Secrets Store CSI Driver | | Azure Key Vault to Kubernetes | Azure Monitor | 
Azure Network Policy Manager | | Buoyant Enterprise for Linkerd | cAdvisor | CSI External Attacher | | CSI External Attacher | CSI External Provisioner | CSI External Resizer | | CSI External Snapshotter | CSI Liveness Probe | CSI Node Driver Registrar | | Cluster Proportional Autoscaler | Cluster Proportional Vertical Autoscaler | Envoy Proxy | | Google Compute Engine (GCE) Persistent Disk CSI Driver | GCP Filestore CSI Driver | GCP GCS FUSE CSI Driver | | GKE Egress NAT Controller | GKE Metrics Agent | GKE Metrics Collector | | GKE Stackdriver Logging Agent | GKE TPU Device Plugin | Gloo Edge Enterprise | | Ingress GCE Controller | Istio CNI Node Agent | Istio CSR | | Karpenter | Kong Ingress Controller | Kubernetes Dashboard Metrics Scraper | | Kubernetes DNS | Kubernetes External Secrets | Kubernetes Secrets Store CSI Driver | | Linkerd OSS | Longhorn | MetalLB | | MinIO | Mountpoint for AWS S3 CSI Driver | netd | | Node Feature Discovery | NodeLocal DNS Cache | Nvidia Device Plugin for Kubernetes | | OpenEBS | Portworx | Rook | | Sealed Secrets | SPIRE | Traefik | # Vertical Pod Autoscaler Source: https://docs.chkk.io/projects/addons/vertical-pod-autoscaler Chkk coverage for Vertical Pod Autoscaler. We provide preflight/postflight checks, curated release notes, and Upgrade Templates—designed for seamless upgrades. 
## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v0.8.0 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v0.9.0 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Vertical Pod Autoscaler Overview Vertical Pod Autoscaler (VPA) automatically adjusts the CPU and memory requests of Kubernetes workloads based on real-time usage patterns. It analyzes historical metrics to right-size resources, reducing the risk of OOM kills and preventing excessive over-provisioning. By dynamically evicting and recreating pods with updated resource requests, it helps maintain efficient cluster utilization. Since VPA acts as an admission webhook, it integrates seamlessly with the Kubernetes control plane while requiring careful coordination if other autoscalers are in use. Overall, VPA provides a hands-off way to continuously match resource consumption to actual demand. ## Chkk Coverage ### Curated Release Notes Chkk continuously monitors the official release notes and changelogs for the Vertical Pod Autoscaler. It distills these into actionable insights—highlighting any breaking changes, deprecated APIs/CRDs, and important operational impacts in each release. Instead of wading through raw upstream notes, you get a curated summary focused on what matters for your clusters. For example, if a new VPA version removes a beta API or changes how recommendations are calculated, Chkk will flag this prominently so you can prepare in advance. This saves time and ensures you don't miss critical changes when planning an upgrade. 
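
To make the right-sizing idea from the overview concrete, here is a deliberately simplified sketch of turning historical usage samples into a resource request. It is not the VPA recommender's actual algorithm, which is more elaborate; the 90th-percentile target, the 15% safety margin, and the function name `recommend_request` are assumptions chosen for illustration.

```python
from __future__ import annotations

import math


def recommend_request(
    samples_millicores: list[int],
    percentile_pct: int = 90,  # assumed target percentile
    margin_pct: int = 15,      # assumed safety margin on top of it
) -> int:
    """Suggest a CPU request (millicores) from historical usage samples.

    Uses a nearest-rank percentile so that a single spike does not
    dictate the request, then adds a safety margin.
    """
    if not samples_millicores:
        raise ValueError("at least one usage sample is required")
    if not 0 < percentile_pct <= 100:
        raise ValueError("percentile_pct must be in (0, 100]")

    ordered = sorted(samples_millicores)
    rank = max(1, math.ceil(percentile_pct * len(ordered) / 100))
    base = ordered[rank - 1]
    return math.ceil(base * (100 + margin_pct) / 100)


if __name__ == "__main__":
    usage = [100, 120, 110, 130, 500, 115, 125, 105, 118, 122]
    print(recommend_request(usage))
```

In this toy data set the single 500m spike sits above the chosen percentile, so it does not inflate the suggested request; that is the over-provisioning trade-off the overview describes.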
### Preflight & Postflight Checks Upgrading VPA without proper checks can lead to misconfigurations or downtime. Chkk performs preflight validations to ensure your environment is ready for the new VPA version. These preflight checks catch issues like missing Metrics Server, outdated VPA CRDs, or insufficient cluster resources, preventing a failed rollout due to configuration mismatches. After the upgrade, Chkk runs postflight checks to confirm everything is working correctly. It ensures the new VPA components are running healthy and that your workloads are receiving recommendations as expected. Together, the pre- and postflight checks provide a safety net around the upgrade, ensuring any problems are identified and addressed either before changes are applied or immediately after the new VPA is in place. ### Version Recommendations Chkk's platform keeps track of VPA release milestones and end-of-life (EOL) information. It will proactively alert you if the VPA version you're running is approaching EOL or has become outdated. Chkk suggests safe upgrade paths to a supported version. These version recommendations are tailored to your environment - for example, if you're on a legacy 0.x or 1.x VPA release, Chkk will identify a stable newer release that is compatible with your current Kubernetes cluster version and VPA configuration. Chkk helps you plan upgrades ahead of time so you're never caught running an unsupported VPA. ### Upgrade Templates Upgrading the Vertical Pod Autoscaler can involve multiple steps (updating CRDs, deploying new controller components) in a live cluster, so Chkk provides **Upgrade Templates** for two strategies: in-place upgrades and blue-green (canary) upgrades. The in-place template guides you through backing up current VPA configurations, applying new CRDs, and updating the recommender, updater, and admission webhook with safety checks and rollback points to ensure high availability. 
The blue-green template runs a new VPA version in parallel (for example, in a separate namespace), letting you gradually migrate VPA objects and observe behavior on subsets of workloads. It ensures only one VPA actively adjusts a given workload at a time and offers a clear path to fully cut over or roll back if issues arise. ### Preverification One of Chkk's most powerful capabilities is preverification, essentially a dry-run of your VPA upgrade in a controlled environment. Before you ever apply changes to your production cluster, Chkk can simulate the upgrade in an isolated cluster that mirrors your production setup. In this preverification phase, Chkk deploys the target version of the Vertical Pod Autoscaler along with your current VPA configuration to see how they interact. This process can catch a range of potential issues early. Chkk will report any detected anomalies during this dry-run. This feature gives platform engineers a chance to test-drive the VPA upgrade safely, ensuring that by the time you perform the real upgrade, you've already resolved the major risks in a sandbox. ### Supported Packages Chkk has built-in support for VPA deployments via Helm charts, Kustomize overlays, plain Kubernetes YAML manifests, or add-on managers. Upon connecting to your cluster, Chkk auto-detects your current VPA version and manifest source. It highlights changes in Helm chart values, provides manifest patches or overlay updates, and verifies operator version requirements while preserving custom configurations, images, and private registry references. Even if you use a custom-built VPA image, Chkk can still track version parity with the upstream project. ## Common Operational Considerations * **Coordinate VPA with HPA carefully** - Avoid running both on the same resource metric, as VPA adjusts CPU/memory requests while HPA scales replicas. If both target CPU, VPA changes may confuse HPA's calculations, causing oscillations; isolate metrics or use custom ones.
* **Expect and manage pod restarts** - VPA updates require pod restarts, making it a disruptive event. Use Pod Disruption Budgets (PDBs) and rollout strategies to prevent downtime, ensuring pods don't restart simultaneously or move unpredictably between nodes. * **Ensure cluster capacity for scaling** - VPA may suggest resources beyond what the cluster can provide, causing pods to remain unschedulable. Use Cluster Autoscaler with VPA and enforce resource limits to prevent excessive recommendations that outpace hardware. * **Update CRDs and handle API deprecations** - VPA CRDs and API versions evolve, so always upgrade them before updating VPA. Failure to migrate deprecated API versions can break VPA functionality; review release notes and apply the latest manifests to avoid issues. * **Mind admission webhook interactions** - VPA acts as an admission controller, and conflicts can arise with other webhooks like OPA or PSPs. Verify that VPA's webhook is still applying resource updates correctly after upgrades or adding new admission controllers. ## Additional Resources * [Vertical Pod Autoscaler Documentation](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler) * [Vertical Pod Autoscaler Releases](https://github.com/kubernetes/autoscaler/releases) # Alertmanager Source: https://docs.chkk.io/projects/application-services/alertmanager Chkk coverage for Alertmanager. We provide version recommendations, preflight/postflight checks, and Upgrade Templates—ensuring worry-free operations. 
## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v0.20.0 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v0.22.2 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Alertmanager Overview Alertmanager handles alert deduplication, routing, silencing, and inhibition for Prometheus. It ensures that alerts reach the right destinations while preventing unnecessary noise. Upgrades can introduce breaking changes, impact notification routing, or remove deprecated APIs, requiring careful preparation. Recent updates have removed long-standing APIs, enforced stricter configuration validation, and patched security vulnerabilities. Chkk helps platform engineers by automating upgrade checks, highlighting impactful changes, and providing structured upgrade guidance. ## Chkk Coverage ### Curated Release Notes Chkk extracts key updates from Alertmanager release notes, filtering out minor details and focusing on breaking changes, security patches, and new features. It flags major API removals, such as the elimination of the v1 API in v0.27, so engineers can prepare configurations in advance. Important security patches, like the XSS fix in v0.26, are surfaced for prioritization. Configuration shifts, such as stricter UTF-8 validation for labels, are highlighted to prevent unexpected failures. ### Preflight & Postflight Checks Chkk preflight checks validate configuration syntax, detect deprecated fields, and confirm compatibility with the new version before upgrading. 
They ensure HA setups are correctly configured, cluster communication is functional, and notification endpoints are reachable. After the upgrade, postflight checks verify that Alertmanager joins clusters successfully, continues processing alerts, and logs notifications without errors. Any misconfigurations or alert delivery failures are flagged immediately for resolution. ### Version Recommendations Chkk continuously monitors Alertmanager's support lifecycle, highlighting when a deployed version is nearing EOL or poses security risks. Engineers receive recommendations for stable versions that align with Prometheus and ensure long-term support. If a release introduces significant deprecations, Chkk warns about potential issues before an upgrade. This ensures that platform teams stay on a supported version without disruption. ### Upgrade Templates Chkk provides **Upgrade Templates** for both in-place and blue-green strategies, ensuring controlled and reliable upgrades. Each template includes pre-upgrade backups, step-by-step instructions, and health checks to minimize alerting disruptions. In-place upgrades update HA instances sequentially, maintaining continuity in smaller clusters. Blue-green upgrades reduce risk by deploying a parallel Alertmanager instance, validating alert processing, and then cutting over to the updated instance. These templates ensure smooth upgrades, whether for a small development cluster or a large-scale production environment. ### Preverification Chkk runs a dry-run upgrade in an isolated environment, replicating existing configurations to detect errors before changes are applied in production. It validates configuration syntax, identifies incompatibilities, and simulates alert routing behavior to catch failures early. This helps teams proactively address issues such as deprecated fields, notification mismatches, or breaking API changes before deployment.
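
As a rough picture of what "simulating alert routing behavior" involves, the sketch below walks a simplified routing tree and reports which receivers an alert would reach. It is an illustration under stated assumptions, not Chkk's or Alertmanager's code: the `Route` class and `resolve_receivers` function are hypothetical, only equality matchers are modeled, and every route must name its receiver explicitly. For real configurations, upstream `amtool` provides route-testing subcommands that work against the actual config file.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Route:
    """A simplified routing-tree node (equality matchers only)."""

    receiver: str
    match: dict[str, str] = field(default_factory=dict)
    continue_matching: bool = False  # mirrors the `continue` route option
    routes: list["Route"] = field(default_factory=list)


def matches(route: Route, labels: dict[str, str]) -> bool:
    """A route matches when every one of its matchers equals the alert's label."""
    return all(labels.get(name) == value for name, value in route.match.items())


def resolve_receivers(route: Route, labels: dict[str, str]) -> list[str]:
    """Return the receivers for an alert that has already matched `route`.

    Children are tried in order. The first matching child handles the
    alert unless it sets `continue_matching`, in which case later
    siblings are tried as well. If no child matches, the current
    route's own receiver is used.
    """
    receivers: list[str] = []
    for child in route.routes:
        if not matches(child, labels):
            continue
        receivers.extend(resolve_receivers(child, labels))
        if not child.continue_matching:
            break
    return receivers or [route.receiver]


if __name__ == "__main__":
    tree = Route(
        receiver="default",
        routes=[
            Route("pager", {"severity": "critical"},
                  routes=[Route("db-team", {"service": "db"})]),
            Route("slack", {"team": "web"}),
        ],
    )
    print(resolve_receivers(tree, {"severity": "critical", "service": "db"}))
    print(resolve_receivers(tree, {"severity": "warning"}))
```

Running the same set of representative label sets against the routing tree before and after a configuration change is one way to spot a notification that would silently move to a different receiver.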
### Supported Packages Chkk supports Alertmanager installations via Helm, Kustomize, and Kubernetes manifests, adapting its validation and upgrade checks accordingly. It seamlessly integrates with Prometheus Operator deployments, standalone Kubernetes setups, and custom-built images. Regardless of the deployment method, Chkk ensures a safe and reliable upgrade process without requiring modifications to existing workflows. ## Common Operational Considerations * **High Availability & Clustering**: For redundancy, deploy Alertmanager in HA mode and configure Prometheus to send alerts to all instances. Ensure all replicas run the same version and share identical configurations to avoid inconsistencies in alert processing. * **State Persistence (Silences)**: Alertmanager keeps silences and its notification log in a local data directory (`--storage.path`), so without a persistent volume a full restart will erase active silences. Use persistent storage for long-term state retention or export silences before upgrading to reapply them after deployment. * **Configuration and Templates**: Validate alerting configuration before deploying updates to prevent notification failures. Use `amtool check-config` to catch errors, and ensure custom notification templates work correctly after an upgrade. * **Post-Upgrade Verification**: After upgrading, send a test alert to confirm routing works as expected. Monitor logs and metrics like `alertmanager_notifications_failed_total` to detect delivery issues and address misconfigurations promptly. * **Capacity & Performance**: Tune Alertmanager's memory and CPU resources based on alert volume to prevent processing slowdowns. Adjust grouping and throttling settings to control notification frequency and avoid overwhelming alerting channels.
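
The "send a test alert" step from the list above can be sketched as follows. This is a minimal example, assuming Alertmanager's v2 HTTP API is reachable at a base URL you supply; the alert name `UpgradeSmokeTest`, the label values, and the helper names are hypothetical and should be adapted so the alert matches a route you actually want to exercise.

```python
from __future__ import annotations

import json
import urllib.request
from datetime import datetime, timedelta, timezone


def build_test_alert(now: datetime | None = None, ttl_minutes: int = 5) -> list[dict]:
    """Build a one-element alert list in the shape the v2 alerts endpoint accepts."""
    now = now or datetime.now(timezone.utc)
    return [
        {
            # Hypothetical labels: pick ones that match a real route.
            "labels": {"alertname": "UpgradeSmokeTest", "severity": "info"},
            "annotations": {"summary": "Synthetic alert sent after an Alertmanager upgrade"},
            "startsAt": now.isoformat(),
            # A short lifetime lets the alert resolve on its own.
            "endsAt": (now + timedelta(minutes=ttl_minutes)).isoformat(),
        }
    ]


def send_alerts(base_url: str, alerts: list[dict], timeout: float = 5.0) -> int:
    """POST the alerts to <base_url>/api/v2/alerts and return the HTTP status."""
    request = urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # The address below is an assumption (for example, a local port-forward).
    print(send_alerts("http://localhost:9093", build_test_alert()))
```

After sending, confirm that the expected receiver got the notification and that `alertmanager_notifications_failed_total` did not increase.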
## Additional Resources * [Alertmanager Documentation](https://prometheus.io/docs/alerting/latest/alertmanager/) * [Alertmanager Releases](https://github.com/prometheus/alertmanager/releases) # Apache Kafka Source: https://docs.chkk.io/projects/application-services/apache-kafka Chkk coverage for Apache Kafka. We provide preflight/postflight checks, curated release notes, and Upgrade Templates—designed for seamless upgrades. ## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v2.5.1 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v2.6.3 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Apache Kafka Overview Apache Kafka is a distributed event streaming platform designed to handle real-time data pipelines and event-driven applications at scale. It uses a partitioned, replicated broker architecture to deliver high throughput, durability, and fault tolerance. Kafka relies on ZooKeeper or its newer KRaft quorum for metadata management and partition leadership coordination, making the health of these components critical to cluster stability. Producers write records to Kafka topics, while consumers subscribe to specific partitions to read data. Kafka's proven scalability and flexibility have made it essential for modern infrastructure teams managing complex data workflows. ## Chkk Coverage ### Curated Release Notes Chkk consolidates official Kafka release notes and KIPs, highlighting only impactful changes such as new configurations, defaults adjustments, or API deprecations affecting your clusters. 
Instead of manually parsing upstream notes, you receive targeted updates relevant to production environments. For example, Chkk alerts you if a Kafka update changes replication defaults or removes legacy consumer configurations, ensuring you're not caught off guard. Summaries link directly to detailed upstream documentation for further reference. Chkk's tailored notifications streamline operational awareness and minimize unexpected disruptions. ### Preflight & Postflight Checks Chkk performs thorough preflight checks before Kafka upgrades, verifying broker compatibility, ZooKeeper (or KRaft) health, and configuration readiness to prevent upgrade issues. Post-upgrade, it validates partition leadership, consumer group offsets, broker synchronization, and replication health. Any anomalies, such as increased consumer lag or under-replicated partitions, are quickly flagged for immediate action. These automated checks help ensure upgrade consistency, minimize downtime, and reduce manual troubleshooting. Platform engineers rely on Chkk's checks for confidence during Kafka maintenance. ### Version Recommendations Chkk tracks Kafka's support lifecycle and proactively alerts you about upcoming or past EOL versions, helping mitigate security and stability risks. It recommends stable Kafka versions based on community feedback, known issues, and compatibility with your Kubernetes clusters. When your Kafka release approaches end-of-life or carries known vulnerabilities, Chkk provides actionable upgrade paths tailored to your environment. This proactive approach simplifies upgrade planning, reducing reactive firefighting from unexpected CVEs or version incompatibilities. With Chkk's recommendations, you maintain stable, secure Kafka deployments. ### Upgrade Templates Chkk offers detailed Kafka **Upgrade Templates** for in-place upgrades or blue-green deployments, aligning with common GitOps and CI/CD practices. 
These upgrade templates guide you step-by-step through broker restarts, inter-broker protocol adjustments, and partition reassignments. Upgrade Templates also include recommended pre-upgrade validations (e.g., verifying cluster health, ISR status, and resource availability) and rollback procedures if issues arise post-upgrade. This structured approach removes guesswork, enabling predictable, repeatable Kafka upgrades. You can confidently manage upgrades without compromising data availability. ### Preverification Chkk provides preverification in a controlled, sandboxed environment before applying changes to your production clusters. It replicates broker configurations, topic layouts, and workloads to identify hidden risks such as resource constraints, configuration mismatches, or protocol incompatibilities. Any upgrade issues detected during this simulation (such as partition replication failures or consumer connectivity problems) are flagged early for remediation. By catching potential problems upfront, Chkk's preverification significantly reduces the risk and uncertainty of live Kafka upgrades, ensuring smooth deployments. ### Supported Packages Chkk supports Kafka deployments across multiple installation methods, including Helm charts, Kustomize overlays, and plain Kubernetes manifests. It automatically recognizes your deployment type, ensuring upgrade instructions match your operational patterns. Customizations such as private registries, bespoke image builds, or specialized patches remain intact throughout upgrades. Chkk's compatibility across installation methods means you can maintain Kafka consistently alongside other critical components without changing your preferred workflows. This flexibility streamlines Kafka management regardless of how it's deployed.
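
The pre-upgrade validations mentioned under Upgrade Templates (cluster health and ISR status) can be pictured with a small sketch that works on partition metadata you have already collected. It does not talk to Kafka and is not Chkk's implementation; the `Partition` class and both function names are hypothetical, and the threshold argument stands in for the topic's `min.insync.replicas` setting.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """Replica assignment and in-sync replica set for one partition."""

    topic: str
    index: int
    replicas: tuple[int, ...]  # broker IDs assigned to the partition
    isr: tuple[int, ...]       # broker IDs currently in sync


def under_replicated(partitions: list[Partition]) -> list[Partition]:
    """Partitions whose in-sync set is smaller than their replica set."""
    return [p for p in partitions if len(p.isr) < len(p.replicas)]


def safe_to_restart(
    broker_id: int, partitions: list[Partition], min_insync_replicas: int
) -> bool:
    """True if taking `broker_id` offline keeps every partition at or above
    the required number of in-sync replicas.

    Partitions that are already below the threshold also make this
    return False, since the cluster is not healthy enough to proceed.
    """
    for partition in partitions:
        remaining = [b for b in partition.isr if b != broker_id]
        if len(remaining) < min_insync_replicas:
            return False
    return True


if __name__ == "__main__":
    state = [
        Partition("orders", 0, replicas=(1, 2, 3), isr=(1, 2, 3)),
        Partition("orders", 1, replicas=(1, 2, 3), isr=(2, 3)),
    ]
    print(under_replicated(state))
    print(safe_to_restart(2, state, min_insync_replicas=2))
```

In a rolling upgrade, a check of this kind would run before each broker restart, so that a broker is only taken down once the partitions it hosts have caught up.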
## Common Operational Considerations * **Broker Quorum Stability:** Ensure a replication factor of at least 3, enforce `min.insync.replicas`, and perform controlled shutdowns to prevent data loss or unstable leader elections. * **ZooKeeper Dependencies:** Maintain ZooKeeper in a resilient quorum (3+ nodes) with careful latency monitoring, or plan structured migrations to Kafka's KRaft mode to avoid irreversible metadata issues. * **ISR and Under-Replicated Partitions:** Actively monitor under-replicated partitions and use replication throttles during maintenance; consistent ISR health ensures reliable real-time message handling. * **Consumer Lag Management:** Scale consumer instances or optimize processing when lag increases, regularly tracking offsets to maintain real-time data consumption. * **Rolling Upgrades and Downgrades:** Upgrade Kafka brokers sequentially, verifying partition synchronization after each upgrade, and avoid premature inter-broker protocol version upgrades to retain rollback capabilities. * **Security and Authentication Pitfalls:** Enforce SASL and TLS authentication on brokers, implementing precise ACLs; misconfigurations in certificates or ACL rules can disrupt client connections and security. * **Networking and Load Balancing Issues:** Kafka clients connect directly to partition leaders, requiring partition-aware connectivity instead of traditional load balancers; use rack awareness and leader balancing to prevent bottlenecks. ## Additional Resources * [Apache Kafka Documentation](https://kafka.apache.org/documentation/) * [Apache Kafka Releases](https://github.com/apache/kafka/tags) # Apache ZooKeeper Source: https://docs.chkk.io/projects/application-services/apache-zookeeper Chkk coverage for Apache ZooKeeper. We provide curated release notes, preflight/postflight checks, and Upgrade Templates, all tailored to your environment.
## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v3.5.8 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v3.7.0 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Apache ZooKeeper Overview Apache ZooKeeper is a distributed coordination service for managing application configurations, leader election, and synchronization. It uses a quorum-based approach, replicating data across nodes to ensure high availability. A single leader processes writes, and followers replicate changes to maintain consistency. Many systems (e.g., Kafka, Hadoop) depend on ZooKeeper to track essential metadata. By providing robust primitives like locks and watches, ZooKeeper simplifies the complexity of distributed coordination. ## Chkk Coverage ### Curated Release Notes Chkk closely tracks official ZooKeeper updates and extracts only the changes that matter to your environment. You'll receive concise, actionable briefs on new features, security patches, or deprecations. It flags changes that might affect configurations or break existing setups. Chkk also highlights when certain dependencies (like Java versions) change. This saves you time by filtering through Apache's upstream details for operational relevance. ### Preflight & Postflight Checks Before upgrading, Chkk's preflight checks confirm that your ZooKeeper cluster is healthy and that your target version is compatible with existing configurations. It scans for deprecated settings, ensures each node is in sync, and verifies that you aren't skipping required upgrade paths. 
After the upgrade, postflight checks confirm that quorum is established, nodes are on the correct version, and logs show no critical errors. If issues arise, Chkk pinpoints them immediately for quick remediation. ### Version Recommendations Chkk maps ZooKeeper's official support timelines and flags EOL versions in use. It tracks which versions align with critical bug fixes, stable patches, and security updates. If a newly released version contains regressions or doesn't yet have community consensus, Chkk advises a safer path. This guidance ensures your cluster stays in a healthy support window, balancing features and stability. It also helps align upgrades with internal compliance and Kubernetes version constraints. ### Upgrade Templates Chkk provides **Upgrade Templates** for both in-place and blue-green upgrades of ZooKeeper. In-place upgrades guide you through safely updating nodes one at a time while maintaining quorum, whereas blue-green upgrades deploy a parallel ZooKeeper ensemble to test stability before switching traffic. Each procedure includes clear rollback steps and recommended monitoring checkpoints. They align with ZooKeeper community best practices to minimize service disruption. By systematically following each upgrade phase, platform teams significantly reduce the risk of downtime. ### Preverification Chkk's preverification feature simulates the upgrade on a sandbox digital twin, mirroring your production configuration. It flags conflicts such as parameter deprecations or resource limits that might cause downtime in production. By iterating in a safe environment, you can update configs, address performance issues, and validate the process before going live. This workflow significantly reduces the odds of discovering breaking changes mid-upgrade. Once preverification is successful, you can confidently apply the final plan. 
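The requirement to maintain quorum during a one-node-at-a-time in-place upgrade comes down to simple majority arithmetic. The following Python sketch shows that arithmetic only; the function names are hypothetical and it does not talk to a real ensemble or reflect how Chkk performs its checks.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Nodes that can be down while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


def can_take_node_offline(ensemble_size: int, healthy_nodes: int) -> bool:
    """True if stopping one healthy node still leaves a majority running."""
    return healthy_nodes - 1 >= quorum_size(ensemble_size)
```

A 3-node and a 4-node ensemble both tolerate only one failure, which is why adding a fourth node buys no extra headroom. It also means that in a 3-node ensemble a node should be restarted only while the other two are healthy and in sync.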
### Supported Packages Chkk supports Apache ZooKeeper deployments across multiple installation methods, including Helm charts, Kustomize overlays, and standard Kubernetes manifests. It automatically recognizes your ZooKeeper deployment method, ensuring upgrade instructions align with your existing operations. Customizations such as private registries, custom-built images, or specialized patches are fully supported during upgrades. Chkk's broad compatibility ensures that you can manage ZooKeeper consistently alongside other infrastructure components without altering established workflows. This flexibility simplifies ZooKeeper lifecycle management regardless of your chosen deployment approach. ## Common Operational Considerations * **Quorum and Leader Election Issues:** Use an odd number of nodes to maintain a clear majority, and ensure each node has consistent network connectivity to avoid election delays. Tuning `initLimit` and `syncLimit` is critical so that leader/follower negotiations complete promptly without premature node removal. * **Data Consistency and Syncing:** Use autopurge to remove old transaction logs and snapshots, preventing disk bloat and speeding up node recovery. Watch for lagging followers that might drop from quorum or cause consistency drift if they can't catch up quickly. * **Latency and Performance Tuning:** Keep `tickTime` and session timeouts at sensible defaults; overly aggressive settings can trigger false timeouts. Allocate enough JVM heap to avoid swapping and consider dedicating disks for transaction logs to minimize I/O latencies. * **Rolling Upgrades with Minimal Downtime:** Upgrade one node at a time, waiting for re-sync and stable quorum before moving to the next. Confirm leader re-election events are smooth and track client connections if the leader is temporarily offline. * **Handling ZooKeeper Node Failures:** Replace crashed nodes quickly and ensure each new node rejoins with a consistent data directory.
Monitor logs and check each node's mode with the `stat` four-letter command, confirming that nodes return to following or leading states without extended election times. * **ACL and Security Management:** Enable authentication schemes (e.g., SASL or digest) and lock down sensitive znodes with ACLs. Consider encrypting inter-node and client-server traffic (TLS) to protect data in transit and restrict unauthorized access. ## Additional Resources * [Apache ZooKeeper Documentation](https://zookeeper.apache.org/doc/current/) * [Apache ZooKeeper Releases](https://zookeeper.apache.org/releases.html) # Argo CD Source: https://docs.chkk.io/projects/application-services/argo-cd Chkk coverage for Argo CD. We provide version recommendations, preflight/postflight checks, and Upgrade Templates, ensuring worry-free operations. ## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v1.8.0 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v2.0.0 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Argo CD Overview Argo CD is a declarative GitOps continuous deployment tool that manages Kubernetes applications by keeping all definitions in Git and automatically syncing the cluster state to match. It runs as a controller, detecting and optionally correcting drift between the live state and the desired configuration. This approach keeps workloads aligned with Git, supporting Helm charts, Kustomize overlays, and other packaging tools for flexible declarative management. Argo CD also offers robust rollback capabilities, allowing teams to revert to any previous version.
Its web UI, CLI, SSO integration, and multi-cluster support make Argo CD a powerful engine for continuous delivery. ## Chkk Coverage ### Curated Release Notes Chkk tracks the official Argo CD release notes and change logs to produce curated summaries for operators. This means you get highlights of upgrade changes, security patches, and new features from each Argo CD release that are relevant to your deployments. By taking on the research work of combing through release notes, Chkk's curated notes make sure you're aware of any changes (for example, deprecated APIs or config keys) that could impact your Argo CD installation before you upgrade. ### Preflight & Postflight Checks Chkk performs automated preflight and postflight checks around Argo CD upgrades to catch issues early. Before an upgrade, Chkk runs validations to identify potential problems such as deprecated Kubernetes APIs (or Argo CD CRD changes) that the new version might use, configuration settings that need updating, and any drift between the Git declarative state and the live cluster state. After the upgrade, Chkk's postflight checks verify that Argo CD is healthy and behaving as expected: for instance, all Argo CD components are running, applications remain in sync, and no new errors or warnings appear. These pre/post-flight checks help detect misconfigurations or regressions early and confirm that the upgrade was successful. ### Version Recommendations Chkk continuously monitors Argo CD's support lifecycle and known compatibility issues to provide version recommendations. It flags when you are running an outdated or end-of-life Argo CD version and suggests a stable version for upgrade. This recommendation is based on the latest patches and security updates, as well as compatibility with your Kubernetes cluster version.
In practice, Chkk's Upgrade Copilot identifies the "correct next version" of Argo CD to upgrade to - one that resolves known vulnerabilities or deprecations - and it alerts you if your current Argo CD release is approaching EOL or has known incompatibilities. ### Upgrade Templates Chkk provides proven **Upgrade Templates** for Argo CD, supporting both in-place and blue-green deployment strategies. In an in-place upgrade, it orchestrates the correct sequence of component updates to minimize downtime. For blue-green, Chkk helps spin up a parallel "green" instance, switch over once validated, and then retire the old "blue" deployment. It also determines when to pick one method over the other, aiming to keep GitOps services running smoothly. Throughout either approach, Chkk preserves application synchronization and state, ensuring upgrades do not disrupt managed apps. ### Preverification Before upgrading Argo CD in production, Chkk conducts a preverification in a controlled environment to validate compatibility with your current configuration. It performs dry-run checks, detecting any issues with Helm charts, Kustomize overlays, or custom resources like Applications and AppProjects. Chkk also confirms that your RBAC configurations remain valid and that repo server integrations (Helm, Git, etc.) continue functioning. If the new version needs additional Kubernetes permissions, Chkk highlights these before proceeding. By verifying all upgrade steps in advance, Chkk ensures you encounter fewer risks during the actual production upgrade. ### Supported Packages Chkk's Argo CD add-on works with all common installation methods—Helm, Kustomize, or raw YAML. Whether you installed it via the official Helm chart, a Kustomize overlay, or direct manifests, Chkk automatically detects and manages your deployment. In practice, it tailors checks and upgrades to fit your approach, preserving GitOps workflows for Argo CD itself. 
This includes suggesting updated Helm chart versions, patching manifests, and validating across all package types to ensure continued compatibility. By covering every packaging method, Chkk keeps Argo CD running smoothly and aligned with your desired state. ## Common Operational Considerations * **Repository Sync Failures:** Often caused by incorrect credentials or invalid manifests. Monitor repo-server logs and Argo CD UI events to pinpoint the issue. * **RBAC Misconfigurations:** Can lock users out or grant excessive privileges. Double-check Argo CD's internal roles and cluster RBAC rules to ensure correct access levels. * **Reconciliation Delays:** Large installations may see slower sync times if the controller is overloaded. Tune Argo CD's concurrency settings and watch out for apps with constantly flapping differences. * **API Server Load:** Argo CD can make frequent API calls in big clusters; consider polling less frequently, adjusting client QPS/burst, or sharding Argo CD instances if API requests spike. * **Secrets Management:** Use tools like Sealed Secrets or SOPS to avoid storing raw secrets in Git. Secure Argo CD's service accounts, storage, and logs to prevent unauthorized exposure of sensitive data. ## Additional Resources * [Argo CD Documentation](https://argo-cd.readthedocs.io/en/stable/) * [Argo CD Releases](https://github.com/argoproj/argo-cd/releases) # Argo Rollouts Source: https://docs.chkk.io/projects/application-services/argo-rollouts Chkk coverage for Argo Rollouts. We provide preflight/postflight checks, curated release notes, and Upgrade Templates, designed for seamless upgrades.
## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v0.10.2 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.0.3 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Argo Rollouts Overview Argo Rollouts is a Kubernetes controller that enables advanced deployment strategies such as blue-green, canary, and experiment-driven progressive delivery. It introduces custom resource definitions (CRDs) to manage these rollouts, integrating with ingress controllers and service meshes to gradually shift traffic between versions. Argo Rollouts can automatically analyze metrics and perform automated promotions or rollbacks based on health criteria. These capabilities provide fine-grained control over deployments, allowing teams to limit the blast radius of changes and safely introduce new application versions even in high-traffic production environments. ## Chkk Coverage ### Curated Release Notes Chkk filters Argo Rollouts release notes, highlighting features, deprecations, and breaking changes impacting your environment. This eliminates manual tracking and flags shifts like default behavior changes in auto-promotion. Deprecations in CRDs and major API changes are surfaced before they cause issues. Security patches and lifecycle support updates are flagged to ensure compliance. These insights help teams stay ahead of potential disruptions. ### Preflight & Postflight Checks Preflight checks validate Kubernetes version compatibility, detect deprecated Rollout fields, and assess rollout strategy correctness before an upgrade. 
Chkk scans your existing Rollout custom resources for deprecated fields or incompatible spec usages that the upcoming version would no longer support. Postflight checks confirm the new Rollouts controller is healthy and existing Rollout objects function correctly. It flags failed rollouts, stuck resources, or unexpected behavior. This prevents unnoticed failures and keeps deployments reliable. ### Version Recommendations Chkk tracks Argo Rollouts' support lifecycle, flagging EOL risks and version incompatibilities. It helps teams avoid known-bugged releases and recommends the safest upgrade paths. Chkk prioritizes stability, guiding upgrades based on support timelines and community feedback. Historical issue tracking prevents adopting versions with recurring performance or reliability problems. This ensures that deployments remain on supported, stable builds. ### Upgrade Templates Chkk provides structured in-place and blue-green upgrade templates tailored for Argo Rollouts. In-place upgrades guide step-by-step controller updates while ensuring availability. Blue-green upgrades deploy a new Rollouts instance in parallel before fully switching over, allowing controlled migration. These templates integrate with GitOps pipelines and include rollback plans for quick recovery. They minimize human error and simplify controlled updates. ### Preverification Chkk pre-verifies upgrades by simulating them in an isolated digital twin environment before production rollout. This detects CRD conflicts, configuration mismatches, and potential resource regressions early. Sample rollouts validate that canary steps, analysis, and rollback conditions work as expected. Performance metrics are analyzed to prevent unexpected controller load increases. This process eliminates surprises and enables safer upgrades. ### Supported Packages Chkk supports Argo Rollouts deployments through Helm, Kustomize, and raw Kubernetes manifests. 
It detects installation methods and aligns upgrade guidance accordingly. Private registry and custom-built image compatibility are maintained without forcing external dependencies. Chkk validates custom configurations, ensuring security and compliance. This ensures seamless integration with existing CI/CD pipelines and infrastructure tooling. ## Common Operational Considerations * **Rollback Triggers & Stability Criteria:** Rollbacks rely on analysis templates; poorly defined conditions can cause premature failures. Configure thresholds to prevent flapping, require multiple failed intervals before rollback, and use consecutive success criteria for safe promotions. * **Ingress Controller & Service Mesh Compatibility:** Different ingress solutions handle traffic shifting differently, which may cause unexpected routing behavior. Ensure your ingress/service mesh is compatible with Argo Rollouts and prevent sync conflicts in GitOps-managed environments. * **Resource Consumption & Performance:** High deployment frequency can lead to excessive AnalysisRuns, increasing controller CPU and memory load. Monitor Rollouts controller resource usage and prune old analysis objects to prevent unnecessary overhead. * **Multi-Cluster Rollout Coordination:** Argo Rollouts operates per cluster and does not natively sync rollouts across multiple clusters. Use CI/CD workflows to coordinate staged deployments and enforce consistency across environments. ## Additional Resources * [Argo Rollouts Documentation](https://argoproj.github.io/argo-rollouts/) * [Argo Rollouts Releases](https://github.com/argoproj/argo-rollouts/releases) # Argo Workflows Source: https://docs.chkk.io/projects/application-services/argo-workflows Chkk coverage for Argo Workflows. We provide version recommendations, preflight/postflight checks, and Upgrade Templates—ensuring worry-free operations. 
## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v2.5.1 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v2.11.1 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Argo Workflows Overview Argo Workflows is a Kubernetes-native workflow engine that uses CRDs to define multi-step processes as directed acyclic graphs. It's widely adopted for automating CI/CD pipelines, ML tasks, and data processing at scale. By offloading orchestration logic onto Kubernetes, Argo Workflows reduces operational overhead. Its design allows dynamic scaling, native container isolation, and extensive integration with CNCF projects. As a CNCF graduated project, it adheres to cloud-native best practices for performance and reliability. ## Chkk Coverage ### Curated Release Notes Staying on top of Argo Workflows release changes is critical, especially since Argo does not strictly adhere to semantic versioning and even minor releases may introduce breaking changes. Chkk curates the upstream Argo Workflows release notes into an actionable summary for your team. This means you'll see highlights of new features, critical bug fixes, security patches, and any breaking changes, deprecations, or removed APIs that could affect your workflow executions. ### Preflight & Postflight Checks Upgrading Argo Workflows can be risky without thorough validation. Chkk performs automated preflight checks to ensure your environment is ready for the new version.
After the upgrade, postflight checks confirm that the Argo Workflows controller and related components are healthy and functioning. Chkk will verify that the new controller pod is running with the expected version, all CRDs have been upgraded successfully, and existing workflows (including any cron schedules) continue to run without errors. ### Version Recommendations Chkk continuously monitors Argo Workflows release cycles and support timelines to help you stay on a safe version. The Argo project maintains active support only for the most recent two minor releases, which means running an older version could leave you without patches or security fixes. Chkk keeps track of when your current Argo Workflows version is approaching end-of-life. It will alert you if you're on an outdated release stream and recommend a target version to upgrade to that is both stable and compatible with your Kubernetes cluster. ### Upgrade Templates To simplify the upgrade process, Chkk provides **Upgrade Templates** for Argo Workflows. For in-place upgrades, Chkk walks you through each step in the sequence: applying CRD updates, upgrading the controller deployment, and restarting the Argo server/UI if present. For a blue-green upgrade, Chkk helps you deploy a new instance of Argo Workflows (a "green" controller) alongside the old ("blue") one. The goal is to run the old and new versions in parallel, allowing you to gradually direct non-critical workflows to the new controller and observe results. Once the new version is proven stable (i.e., workflows are running successfully under the new controller), you can promote it to replace the old version for all workflows. In both scenarios, Chkk's templates build safety checks and rollback procedures at each stage. ### Preverification One of Chkk's most powerful features is the ability to simulate an Argo Workflows upgrade in a digital twin environment before it goes to the production cluster. 
Chkk runs an isolated instance that mirrors your cluster's Argo Workflows setup (including the same version of Kubernetes and similar Argo configurations) and then applies the upgrade there first. During this simulation, Chkk checks for any compatibility problems. By catching CRD conflicts, performance regressions, or logic changes in this sandbox environment, Chkk lets you proceed with the real upgrade only once you have high confidence that it will succeed. ### Supported Packages Chkk has built-in support for Argo Workflows deployments via Helm charts, Kustomize overlays, or plain Kubernetes YAML manifests. Upon connecting to your cluster, Chkk auto-detects the installation method and tracks the current Argo Workflows version and manifest source. It preserves custom configurations, images, and private registry references within your workflows. ## Common Operational Considerations * **Workflow Execution Performance:** Large, concurrent workflow runs can overload the Kubernetes API and the workflow controller. Offloading logs and status, plus careful resource tuning, helps maintain stability at scale. * **CRD Scaling & Persistence:** High-volume or resource-intensive workflows can surpass etcd capacity and cause performance issues. Persisting workflow data to an external database ensures smooth operation under heavy loads. * **Version Mismatches:** Using different releases of the controller, UI, and CLI can lead to incompatible API calls and random failures. Align all components to the same supported version to avoid unexpected errors. * **RBAC in Multi-Tenant Clusters:** Multi-team environments often need isolation through separate namespaces and controllers. Clear roles, permissions, and namespace boundaries prevent unauthorized access or workflow overlap. * **Namespace & Template Best Practices:** Duplicating workflow definitions in multiple namespaces can lead to version drift. Reusing standardized templates and storing them in a central repository keeps definitions consistent.
* **Workflow Failures from Dependency Changes:** Shifts in external services or system settings can break previously functional tasks. Monitoring upstream dependencies and updating workflow steps accordingly prevents unplanned downtime. ## Additional Resources * [Argo Workflows Documentation](https://argo-workflows.readthedocs.io/en/latest/) * [Argo Workflows Releases](https://github.com/argoproj/argo-workflows/releases) # Crossplane Source: https://docs.chkk.io/projects/application-services/crossplane Chkk coverage for Crossplane. We provide curated release notes, preflight/postflight checks, and Upgrade Templates—all tailored to your environment. ## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v1.1.0 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.2.1 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Crossplane Overview Crossplane is a Kubernetes add-on that manages external infrastructure via K8s-native APIs. Instead of separate dashboards, teams declare cloud resources (e.g., S3 buckets, databases) using CRDs and controllers. Providers connect Kubernetes to AWS, GCP, or Azure, while Compositions bundle multiple resources into user-friendly composite objects (e.g., kind: XPostgresInstance). Since everything is declared in YAML, GitOps flows naturally: changing manifests in Git triggers Crossplane to reconcile real-world infrastructure. ## Chkk Coverage ### Curated Release Notes Chkk continuously monitors Crossplane releases, summarizing new features, security patches, and breaking changes relevant to cloud resource management. 
You see concise highlights, such as provider enhancements, CRD deprecations, or EOL warnings, without digging into raw changelogs. This streamlined view helps you quickly assess upgrade impact (e.g., if Composition fields changed or a provider plugin requires new RBAC permissions). ### Preflight & Postflight Checks Chkk's preflight checks ensure your Crossplane environment and CRDs are compatible with an upcoming version, verifying that providers aren't on an unsupported API and that your cluster meets the new release's Kubernetes requirements. After the upgrade, postflight checks confirm the new Crossplane controller and providers are healthy, scanning logs for errors and verifying managed resources continue reconciling without breakage. This two-phase validation reduces risk by detecting issues, such as a missing CRD field or outdated provider image, early in the process. ### Version Recommendations Chkk tracks Crossplane's support milestones, alerting you when you're on a release nearing end-of-life or missing critical patches. It notes known incompatibilities with certain Kubernetes versions or provider releases. This approach prevents teams from unexpectedly hitting unsupported Crossplane features or security gaps. By following Chkk's guidance, you stay on stable releases aligned with your broader cluster roadmap. ### Upgrade Templates Chkk provides in-place and blue-green upgrade playbooks for Crossplane. An in-place upgrade updates your existing deployment, typically via rolling updates or Helm chart bumps; it adds minimal overhead but carries a small risk if changes break the controller mid-upgrade. A blue-green upgrade spins up a parallel "green" Crossplane instance (in a staging namespace or cluster), validates it, then cuts over once stable. This strategy ensures near-zero downtime, with an easy rollback if the new version misbehaves.
Both methods include rollback instructions, recommended checks, and best practices for CRD updates, ensuring safe transitions with minimal disruption to managed infrastructure. ### Preverification For major version jumps or critical environments, Chkk's preverification simulates a Crossplane upgrade in a safe sandbox. It applies your actual CRDs, Providers, and Compositions to the new version, checking for schema conflicts, API deprecations, or updated RBAC needs. Any mismatch—like a provider that can't load its resources—appears in a detailed report, letting you fix problems before going live. This rehearsal significantly reduces the likelihood of unexpected production breakage. ### Supported Packages Chkk recognizes Helm, Kustomize, and raw YAML deployments of Crossplane. If you install via Helm, Chkk patches chart values and uses Helm upgrade; for Kustomize or YAML, it generates updated manifests that match your current approach. Private registries, custom images, and organization-specific security requirements are also respected. This ensures you don't have to refactor your workflow—Chkk fits seamlessly whether you're using GitOps, plain YAML, or hybrid setups. ## Common Operational Considerations * **Provider Credential Collisions:** Multiple teams may share the same ProviderConfig secret, so a single compromise or misconfiguration affects all Crossplane-managed resources. Assign unique provider credentials per namespace or team and restrict secret access via RBAC so only Crossplane can read them. * **Overly Broad XRD Schemas:** Exposing too many raw provider fields lets users override critical settings (like subnets or firewall rules). Keep Crossplane APIs minimal and validated, then patch in safe defaults in the Composition to avoid unintended or malicious config changes. * **Controller Overload:** Each Crossplane provider polls external APIs on a schedule, so hundreds of managed resources can trigger rate limits or overwhelm the control plane. 
Increase poll intervals (--poll-interval) and split large deployments across multiple providers to prevent reconcile bottlenecks.
* **Security Leakage in Connection Secrets:** Connection details (DB passwords, tokens) often appear in namespace Secrets, risking exposure if the namespace is unprotected. Use dedicated secret managers (Vault, AWS Secrets Manager) or enforce strict RBAC and access policies to safeguard these credentials.

## Additional Resources

* [Crossplane Documentation](https://docs.crossplane.io/latest/)
* [Crossplane Releases](https://github.com/crossplane/crossplane/releases)

# Datadog Agent

Source: https://docs.chkk.io/projects/application-services/datadog-agent

Chkk coverage for Datadog Agent. We provide curated release notes, preflight/postflight checks, and Upgrade Templates, all tailored to your environment.

## Coverage Matrix

| | |
| --- | --- |
| **Chkk Curated Release Notes** | v6.19.0 to latest |
| **Private Registries** | Covered |
| **Custom Built Images** | Covered |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v7.23.1 to latest |
| **Supported Packages** | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information** | Covered |
| **Version Incompatibility Information** | Covered |
| **Upgrade Templates** | In-Place, Blue-Green |
| **Preverification** | Covered |

## Datadog Agent Overview

Datadog Agent collects metrics, logs, and traces from Kubernetes clusters, running as a DaemonSet on each node. It integrates with the Kubernetes API and workloads to provide observability, forwarding data to Datadog for monitoring. The Cluster Agent coordinates metadata collection and reduces API load, improving efficiency in large clusters. Datadog Agent supports auto-discovery, integrations, and security monitoring, making it a flexible solution for infrastructure visibility.
Chkk ensures its seamless deployment, monitoring, and upgrade safety in Kubernetes environments.

## Chkk Coverage

### Curated Release Notes

Chkk curates Datadog Agent release notes, highlighting new features, breaking changes, and operational impacts. Instead of parsing every upstream detail, engineers receive targeted insights on critical changes, deprecated configurations, and feature adjustments. If an update requires RBAC changes or introduces new API dependencies, Chkk flags it beforehand. Release summaries also assess potential resource impact and performance shifts. This approach streamlines decision-making and prevents misconfigurations.

### Preflight & Postflight Checks

Before upgrades, Chkk runs preflight checks to verify Kubernetes version compatibility, deprecated fields, and required permissions. It ensures configuration changes (e.g., log collection toggles, new API calls) won't disrupt monitoring. Post-upgrade, Chkk confirms Agent pods are healthy, data is flowing, and no unexpected failures have occurred. It detects memory spikes, missing logs, or CRD issues early, preventing monitoring blind spots. These validations reduce risk and improve upgrade confidence.

### Version Recommendations

Chkk monitors Datadog Agent's lifecycle, notifying teams of security risks and upcoming EOL versions. It suggests stable upgrade paths based on support status and community-reported issues, ensuring compatibility and reliability. If a version has known bugs, Chkk recommends skipping or waiting for a fix. It also tracks Kubernetes API changes that might impact older Agent versions. This proactive approach minimizes unplanned outages and ensures ongoing support.

### Upgrade Templates

Chkk provides **Upgrade Templates** for in-place rolling updates and blue-green deployments. In-place upgrades roll out Agents node by node while monitoring resource impact. Blue-green strategies deploy a parallel Agent set for validation before cluster-wide adoption.
Chkk includes rollback steps, ensuring quick recovery if issues arise. These templates integrate with GitOps workflows and reduce manual intervention risks. Teams gain predictable, controlled upgrades with minimal downtime.

### Preverification

**Preverification** simulates Datadog Agent upgrades in an isolated environment before live deployment. This dry-run detects configuration conflicts, RBAC gaps, or API incompatibilities without affecting production. If a new Agent version crashes due to missing dependencies, Chkk flags it early. Engineers can iterate on fixes before rolling out the upgrade. This process enhances stability and prevents production incidents.

### Supported Packages

Chkk supports Helm, Datadog Operator, Kubernetes YAML, and Kustomize-based deployments. It tracks Helm chart versions, validates Operator CRs, and ensures static manifests stay in sync. Custom Agent images and private registries are fully supported, ensuring consistency across environments. If integrations like kube-state-metrics are used, Chkk verifies their compatibility with Agent versions. This flexibility allows seamless adoption across different Kubernetes setups.

## Common Operational Considerations

* **Configuration Pitfalls:** Features like log collection and APM tracing must be explicitly enabled; default settings might leave critical data unmonitored. Always review `datadog.yaml` settings to ensure required integrations and logs are correctly configured.
* **Resource Utilization & Tuning:** The Agent can consume significant CPU/memory in large clusters, often due to frequent metric polling. Tune `min_collection_interval` and filter unnecessary logs to optimize resource usage.
* **Integration & Autodiscovery:** Properly annotate pods for Autodiscovery to automatically enable monitoring for services like Redis or NGINX. Running kube-state-metrics alongside the Agent ensures full cluster observability.
* **Cluster Agent Usage:** The Datadog Cluster Agent offloads metadata collection, reducing API server strain. Large clusters should deploy it for improved efficiency and scalability.
* **Kubernetes Compatibility:** Kubernetes API changes (e.g., EndpointSlices replacing Endpoints) may require Agent updates and RBAC adjustments. Always verify Kubernetes version compatibility before upgrading.
* **Staying Up-to-Date:** Monthly Agent updates are recommended for security and stability. Avoid deploying versions with known issues, and test in a canary environment before full rollout.

## Additional Resources

* [Datadog Agent Documentation](https://docs.datadoghq.com/agent/)
* [Datadog Agent Releases](https://github.com/DataDog/datadog-agent/releases)

# Elasticsearch

Source: https://docs.chkk.io/projects/application-services/elasticsearch

Chkk coverage for Elasticsearch. We provide curated release notes, preflight/postflight checks, and Upgrade Templates, all tailored to your environment.

## Coverage Matrix

| | |
| --- | --- |
| **Chkk Curated Release Notes** | v6.8.3 to latest |
| **Private Registries** | Covered |
| **Custom Built Images** | Covered |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v7.9.1 to latest |
| **Supported Packages** | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information** | Covered |
| **Version Incompatibility Information** | Covered |
| **Upgrade Templates** | In-Place, Blue-Green |
| **Preverification** | Covered |

## Elasticsearch Overview

Elasticsearch is a distributed search and analytics engine built on Apache Lucene, widely used for indexing, full-text search, and log aggregation. It organizes data across multiple nodes for horizontal scalability and resilience, automatically replicating and distributing shards to handle node failures.
In Kubernetes, Elasticsearch is typically deployed using operators or Helm/Kustomize, enabling easy scaling and updates in a containerized environment. Thanks to its scale-out design, it can index large data volumes while delivering near real-time query performance.

## Chkk Coverage

### Curated Release Notes

Chkk continuously reads Elasticsearch release notes, highlighting the updates that matter most, such as breaking changes, security patches, or performance enhancements. Operators see a concise, actionable summary rather than combing through lengthy upstream docs. This ensures your team remains aware of potential disruptions or improvements to your environment well in advance.

### Preflight & Postflight Checks

Before any upgrade, Chkk's preflight checks spot deprecated APIs or config changes that could break the cluster. Postflight checks ensure everything is healthy afterward, monitoring cluster status, shard allocation, and performance metrics to confirm no regressions or drift have appeared. By catching issues early and validating cluster state post-upgrade, Chkk prevents prolonged downtime and unexpected data loss.

### Version Recommendations

Chkk tracks Elasticsearch's support lifecycle and alerts you when a version nears end-of-life or has known vulnerabilities. It also considers Kubernetes compatibility, recommending stable, fully supported releases that integrate seamlessly with your cluster. Staying on a tested, up-to-date version helps prevent security gaps and performance bottlenecks from unmaintained code.

### Upgrade Templates

Chkk offers in-place and blue-green upgrade paths, helping you minimize downtime and ensure data integrity. In-place rolling updates sequentially upgrade pods, preserving cluster availability. Blue-green deploys a parallel cluster, letting you cut over once the new version is validated and data is synchronized. This flexible approach accommodates both minor version bumps and major structural changes with minimal risk.
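
To make the in-place rolling update concrete, here is a minimal sketch of a health gate that a rollout script might apply between pods. It is not Chkk's implementation. It assumes you have already fetched the JSON body of Elasticsearch's `_cluster/health` API into a dictionary; the function name `safe_to_continue` is hypothetical.

```python
from typing import Mapping


def safe_to_continue(health: Mapping[str, object]) -> bool:
    """Return True when the cluster looks settled enough to upgrade the next pod.

    `health` is assumed to be the parsed `_cluster/health` response. The gate
    requires a green status and no shards that are moving or unassigned.
    """
    return (
        health.get("status") == "green"
        and health.get("relocating_shards", 0) == 0
        and health.get("initializing_shards", 0) == 0
        and health.get("unassigned_shards", 0) == 0
    )


# Illustrative payloads shaped like the health response (values are made up).
settled = {
    "status": "green",
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 0,
}
recovering = {
    "status": "yellow",
    "relocating_shards": 0,
    "initializing_shards": 1,
    "unassigned_shards": 3,
}

for name, payload in (("settled", settled), ("recovering", recovering)):
    action = "proceed" if safe_to_continue(payload) else "wait"
    print(f"{name}: {action}")
```

A stricter or looser gate may suit your cluster; for example, some teams accept a yellow status while shard allocation is deliberately disabled during a node restart.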
### Preverification

For major or complex upgrades, Chkk's preverification feature tests the process in a controlled environment. It checks your plugins, ingest pipelines, and other integrations against the new version, catching potential incompatibilities before you update the live cluster. This safety net helps teams proceed confidently, knowing the upgrade has been validated for their unique setup.

### Supported Packages

Whether you deploy Elasticsearch via Helm, Kustomize, or raw manifests, Chkk recognizes and manages it. By parsing your existing setup, it ensures that recommended changes and rollouts align with your preferred tooling and that custom images or registries remain compatible. This integration lets you maintain standardized workflows while benefiting from Chkk's automated checks and curated guidance.

## Common Operational Considerations

* **Shard Hotspotting:** If shards aren't balanced across nodes, one node can become overloaded, leading to high latency or node failures. Use allocation awareness (e.g. by zone) and shard routing constraints to distribute load evenly, and periodically check `_cat/shards` for uneven distribution.
* **Slow Queries & Cache Misses:** Complex aggregations or wildcard searches can cause high CPU and slow responses. Enable slow logs to pinpoint problem queries, leverage filters where possible for caching, and avoid repetitive "now" time ranges that invalidate cache.
* **JVM Heap Pressure:** Elasticsearch relies heavily on heap for indexing, caching, and aggregations. Over-allocating heap (e.g., >30 GB) disables compressed object pointers, increasing GC overhead. Monitor old-gen usage and tune circuit breakers to prevent out-of-memory incidents.
* **Snapshot & Disk Constraints:** Snapshots consume I/O, often evicting data from the OS cache, which can degrade search performance.
Keep disk usage below watermark thresholds (e.g., under 85%) to prevent shard relocation or write blocks, and schedule snapshots during lower traffic if possible.
* **Master Node Quorum:** Running fewer than three dedicated master nodes risks split-brain events during network partitions. Always ensure quorum-based election (on 6.x, set `discovery.zen.minimum_master_nodes`; from 7.0 onward Elasticsearch manages the voting quorum automatically and ignores that setting) and avoid simultaneous restarts of multiple masters to keep the cluster stable.

## Additional Resources

* [Elasticsearch Documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html)
* [Elasticsearch Releases on GitHub](https://github.com/elastic/elasticsearch/releases)

# Fluent Bit

Source: https://docs.chkk.io/projects/application-services/fluent-bit

Chkk coverage for Fluent Bit. We provide preflight/postflight checks, curated release notes, and Upgrade Templates, designed for seamless upgrades.

## Coverage Matrix

| | |
| --- | --- |
| **Chkk Curated Release Notes** | v1.4.4 to latest |
| **Private Registries** | Covered |
| **Custom Built Images** | Covered |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.7.9 to latest |
| **Supported Packages** | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information** | Covered |
| **Version Incompatibility Information** | Covered |
| **Upgrade Templates** | In-Place, Blue-Green |
| **Preverification** | Covered |

## Fluent Bit Overview

Fluent Bit is a lightweight log processor and forwarder that ingests logs from multiple sources and routes them to various backends. Written in C, it's highly efficient in CPU and memory usage, which is ideal for large clusters or edge environments. Deployed typically as a DaemonSet, it ensures Kubernetes-wide log collection with minimal overhead. It comes with robust plugin support for filtering, parsing, and outputting logs to systems like Elasticsearch or S3.
Since Fluent Bit's defaults can shift between releases, careful upgrades are key to uninterrupted logging.

## Chkk Coverage

### Curated Release Notes

Chkk filters Fluent Bit's upstream release notes to highlight critical fixes, newly deprecated features, and behavior changes affecting your cluster. This helps you quickly identify potential issues like renamed configuration directives, changed default behaviors, or plugin removals. By surfacing only the relevant details, you avoid scanning lengthy upstream logs yourself. Chkk also flags security patches and CVEs tied to new versions. This targeted insight ensures you never miss important updates.

### Preflight & Postflight Checks

Chkk runs preflight checks before you upgrade Fluent Bit to confirm your configuration, plugins, and resource allocations match the requirements for the new release. It looks for deprecated parameters, plugin conflicts, or version gaps that could break logging. Once upgraded, Chkk verifies through postflight checks that the DaemonSet is stable, logs are still flowing, and no errors appear in Fluent Bit metrics or logs. If anything deviates, you get actionable rollback or remediation steps. This two-phase process prevents unnoticed log disruptions.

### Version Recommendations

Chkk continually tracks Fluent Bit's release cycle and flags when your current version nears end-of-life or has security advisories. Suggestions are grounded in proven stability data, rather than just pointing to the latest build. This is critical if certain releases introduce known memory leaks or CPU spikes that haven't been patched yet. By checking multiple sources, Chkk picks the most reliable upgrade path for your environment. You can then align with Fluent Bit's support window and avoid guesswork.

### Upgrade Templates

Chkk offers in-place and blue-green **Upgrade Templates** to preserve logging continuity.
An in-place upgrade leverages rolling updates to sequentially replace pods, while blue-green spins up a parallel DaemonSet for canary testing. Both approaches include pause points to validate logs, plus clear rollback procedures if the new version fails. This structured method reduces risk by ensuring no node runs without a collector. It also fits seamlessly with GitOps, letting you keep audit trails for each step.

### Preverification

Chkk's digital twin concept simulates your Fluent Bit upgrade in a controlled sandbox. It replicates your config, plugins, and resource usage to expose issues like changed defaults, plugin mismatches, or performance bottlenecks before they hit production. You can then address errors and retest until the upgrade path is clean. This preverification significantly boosts confidence for major version jumps or complex deployments. By tackling pitfalls in test, you minimize disruptions and quickly revert if unexpected conflicts arise.

### Supported Packages

Whether you deploy Fluent Bit via Helm, Kustomize, or raw manifests, Chkk identifies your current version, suggests an appropriate target, and generates manifest patches. It respects private registries, custom images, and internal forks. For GitOps workflows, changes are surfaced as minimal diffs that you can version-control. This approach keeps your Fluent Bit installation method consistent and reduces manual overhead in switching tooling.

## Common Operational Considerations

* **High Log Volume & Resource Tuning:** Enable disk-based buffering and configure `Mem_Buf_Limit` if logs spike to avoid data loss or node pressure. Multi-worker outputs and tuned flush intervals can also manage throughput more efficiently.
* **Plugin & Configuration Compatibility:** New Fluent Bit releases can rename or remove config keys, so always review release notes or perform a dry-run before going live. Plugins built for previous versions may break if underlying APIs changed.
* **Monitoring & Troubleshooting:** Expose Fluent Bit metrics to detect bottlenecks, high retries, or memory leaks. Check logs and readiness/liveness probes for early signs of a stalled or misconfigured log pipeline.

## Additional Resources

* [Fluent Bit Documentation](https://docs.fluentbit.io/manual)
* [Fluent Bit Releases](https://github.com/fluent/fluent-bit/releases)

# Grafana

Source: https://docs.chkk.io/projects/application-services/grafana

Chkk coverage for Grafana. We provide preflight/postflight checks, curated release notes, and Upgrade Templates, designed for seamless upgrades.

## Coverage Matrix

| | |
| --- | --- |
| **Chkk Curated Release Notes** | v7.1.2 to latest |
| **Private Registries** | Covered |
| **Custom Built Images** | Covered |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v7.4.0 to latest |
| **Supported Packages** | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information** | Covered |
| **Version Incompatibility Information** | Covered |
| **Upgrade Templates** | In-Place, Blue-Green |
| **Preverification** | Covered |

## Grafana Overview

Grafana is an open-source observability platform used for visualizing metrics, logs, and traces from various backends. It provides an interactive UI for creating dashboards, setting alerts, and correlating data across multiple sources. With its extensible plugin system, Grafana connects to a wide array of databases and services. High availability, scalability, and role-based access controls make it suitable for large production environments. Frequent releases introduce new features, security patches, and occasional breaking changes.

## Chkk Coverage

### Curated Release Notes

Chkk keeps watch over Grafana's upstream releases to highlight security patches, deprecated APIs, and new features relevant to your clusters.
It condenses verbose changelogs into key operational insights, so you can quickly see what requires attention. Any known issues or breaking changes are flagged early, helping you plan upgrades with minimal guesswork. By highlighting direct impacts to your environment, Chkk makes Grafana upgrades easier to track.

### Preflight & Postflight Checks

Before upgrading, Chkk automatically inspects your Kubernetes version, Grafana config, dashboards, and plugins to confirm compatibility with the new release. It warns you if specific flags, authentication settings, or dependencies will break. Post-upgrade, it verifies that Grafana is fully operational, including data source connectivity and alert rules. These checks reduce the risk of discovering issues only after the change has taken effect.

### Version Recommendations

When your current Grafana version nears end-of-life, Chkk proactively recommends stable upgrade paths, weighing factors like recent security advisories and known bugs. It accounts for the maturity of new features, ensuring you don't jump prematurely to versions lacking broad community validation. Urgent patches are flagged separately if they address critical vulnerabilities. This data-driven approach balances novelty against reliability, giving you a well-rounded upgrade strategy.

### Upgrade Templates

Chkk provides repeatable **Upgrade Templates** for both in-place and blue-green strategies. Each template includes pre-upgrade backups, step-by-step instructions, and health checks to minimize downtime. In-place upgrades streamline the process on smaller clusters, while blue-green deployments reduce risk by running two versions in parallel. These templates enable a smooth experience whether you're updating a small Dev cluster or a global production environment.

### Preverification

Chkk can perform a full test run of your Grafana upgrade in a representative "digital twin" environment.
It copies your configuration, dashboards, and data sources to detect any plugin or schema issues before touching production. Failures in the simulated upgrade guide you to fix configurations or resource constraints. By isolating possible pitfalls early, you avoid disruptive surprises on live clusters.

### Supported Packages

Chkk accommodates Grafana deployments via Helm, Kustomize, or straight Kubernetes manifests. It understands custom images, private registries, and specialized builds, so you can maintain existing workflows without compromise. If using GitOps, Chkk can analyze your manifests and automatically propose changes needed for safe upgrades. This unified approach helps ensure consistency and compliance across all environments you manage.

## Common Operational Considerations

* **Performance Optimization:** Use caching or downsampling to prevent slow queries, and allocate sufficient CPU/memory to Grafana. Monitor its internal metrics to detect bottlenecks early.
* **Plugin & Data Source Management:** Keep plugins updated to compatible versions and restrict who can install them. Validate connectivity and credentials periodically, especially after upgrades.
* **Alerting Best Practices:** Create clear alert rules with meaningful thresholds to avoid noise. Ensure redundant notification channels and apply silences during maintenance windows.
* **Storage & Retention:** Use MySQL/PostgreSQL in production for Grafana's database, and back it up routinely. Align retention policies with your observability stack to avoid data mismatches.
* **Scaling Strategies:** Deploy multiple Grafana instances with a shared database for HA, and shard heavy queries or dashboards if needed. Use load balancing to distribute user sessions effectively.
* **Security & RBAC:** Integrate with SSO to enforce centralized authentication, and limit admin roles. Secure data source credentials and network access to protect sensitive observability data.
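
The storage recommendation above can be checked mechanically. The following is a minimal sketch, not a Chkk feature, that reads the text of a `grafana.ini` file and reports which database backend it selects. It assumes the backend is set through the `type` key in the `[database]` section and that Grafana falls back to `sqlite3` when the key is absent; the helper names are hypothetical.

```python
import configparser

# Backends the guidance above treats as production-ready.
PRODUCTION_DB_TYPES = {"mysql", "postgres"}


def database_type(ini_text: str) -> str:
    """Return the configured database type from grafana.ini text."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    # grafana.ini may contain keys before the first section header, which
    # configparser rejects, so park them under a throwaway section.
    parser.read_string("[__root__]\n" + ini_text)
    return parser.get("database", "type", fallback="sqlite3").strip().lower()


def uses_production_database(ini_text: str) -> bool:
    """True when the configured backend is MySQL or PostgreSQL."""
    return database_type(ini_text) in PRODUCTION_DB_TYPES


# Illustrative configuration snippets (values are made up).
shared_db = "app_mode = production\n[database]\ntype = postgres\nhost = db:5432\n"
defaults_only = "[server]\nhttp_port = 3000\n[database]\n;type = mysql\n"

for label, text in (("shared_db", shared_db), ("defaults_only", defaults_only)):
    print(label, database_type(text), uses_production_database(text))
```

This sketch ignores environment-variable overrides, which Grafana also supports, so treat a result from the file alone as a hint and not as proof of the running configuration.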
## References

* [Grafana Documentation](https://grafana.com/docs/)
* [Grafana Releases](https://github.com/grafana/grafana/releases)

# Grafana Loki

Source: https://docs.chkk.io/projects/application-services/grafana-loki

Chkk coverage for Grafana Loki. We provide version recommendations, preflight/postflight checks, and Upgrade Templates, ensuring worry-free operations.

## Coverage Matrix

| | |
| --- | --- |
| **Chkk Curated Release Notes** | v2.4.0 to latest |
| **Private Registries** | Covered |
| **Custom Built Images** | Covered |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.6.1 to latest |
| **Supported Packages** | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information** | Covered |
| **Version Incompatibility Information** | Covered |
| **Upgrade Templates** | In-Place, Blue-Green |
| **Preverification** | Covered |

## Grafana Loki Overview

Grafana Loki is a cost-effective, multi-tenant log aggregation system that indexes only metadata (labels). It uses microservices (distributor, ingester, querier, etc.) that can scale horizontally to handle high-ingestion environments. Logs are compressed and stored in an object store (S3, GCS, etc.), making Loki cheaper to operate than most traditional logging solutions. Deployed alongside Promtail or other agents, Loki can unify and centralize logs across Kubernetes clusters. With minimal indexing overhead and flexible queries via LogQL, it's ideal for large-scale log monitoring.

## Chkk Coverage

### Curated Release Notes

Chkk distills Loki's official release notes into actionable insights for your team. Instead of wading through every detail, you'll get a curated summary highlighting changes that matter most. Critical bug fixes, new features, and security patches are called out clearly. Chkk also flags any deprecations, like removed config fields or schema changes, to show potential impacts on your environment.
Crucial end-of-life announcements and support policy shifts are included as well.

### Preflight & Postflight Checks

Chkk mitigates upgrade risks by running thorough preflight checks before any Loki version upgrade. It verifies Kubernetes compatibility, ensures resources meet the new Loki version's requirements, and flags deprecated fields. After applying the new version, Chkk's postflight checks confirm that log ingestion, storage, and alert rules remain healthy. Any anomalies, such as data corruption or mismatched config keys, are flagged for quick remediation. This automated process acts as a safety net to prevent downtime and missing logs.

### Version Recommendations

Chkk tracks Loki's release lifecycle and identifies stable, well-supported production versions. It alerts you when your current Loki version is risky or nearing EOL, suggesting a suitable upgrade target. By balancing feature adoption with known issues, Chkk helps avoid blindly jumping to releases that may be unstable. These recommendations ensure you stay within official support timelines without missing critical patches. Chkk acts like a watchdog, guiding you to the best version for your cluster.

### Upgrade Templates

Chkk offers detailed **Upgrade Templates** for in-place and blue-green upgrades, each with step-by-step instructions. In-place upgrades roll your existing deployment forward, while blue-green runs a new Loki version in parallel. Both methods include hold points for validation and explicit rollback steps if ingestion or queries fail. This process integrates with your GitOps or CI/CD flow, reducing human error and streamlining major transitions. By following these templates, you can manage Loki upgrades confidently with minimal downtime risk.

### Preverification

Chkk's preverification simulates your Loki upgrade in a staging environment, using your actual config and sample data. It detects schema conflicts, index format mismatches, or increased resource demands before impacting production.
By surfacing errors in a digital twin, you can address them early and refine your upgrade plan. This no-surprises approach reduces downtime and ensures a smoother transition when you finally upgrade in production.

### Supported Packages

Chkk supports Loki deployments through Helm, Kustomize, or plain manifests, adapting checks to your chosen method. If using Helm, it highlights changes in `values.yaml`; if using raw YAML, it identifies which specs to modify. Chkk also accommodates private registries and custom images, preserving consistency across environments. This flexibility ensures you can manage Loki upgrades without restructuring your existing workflows.

## Common Operational Considerations

* **Scaling & Performance:** Deploy distributed Loki components for high ingestion loads. Regularly check ingester capacity and replicate logs to avoid data loss.
* **Storage Optimization:** Use object storage with a suitable backend and retention period. Limit label cardinality to keep indexes efficient.
* **Query Performance:** Encourage narrow LogQL queries and deploy a query frontend for caching and parallelization. Monitor slow queries and adjust resources accordingly.
* **Alerting & Observability:** Keep the Ruler healthy by offloading heavy queries to the query-frontend. Continuously track ingestion, memory usage, and log pipeline errors.
* **Security & Multi-Tenancy:** Enable auth and multi-tenant headers to isolate logs per team. Encrypt data in transit and lock down object store permissions.

## Additional Resources

* [Grafana Loki Documentation](https://grafana.com/docs/loki/latest/)
* [Grafana Loki Releases](https://github.com/grafana/loki/releases)

# HashiCorp Consul OSS

Source: https://docs.chkk.io/projects/application-services/hashicorp-consul-oss

Chkk coverage for HashiCorp Consul OSS. We provide preflight/postflight checks, curated release notes, and Upgrade Templates, designed for seamless upgrades.
## Coverage Matrix

| | |
| --- | --- |
| **Chkk Curated Release Notes** | v1.8.5 to latest |
| **Private Registries** | Covered |
| **Custom Built Images** | Covered |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.9.4 to latest |
| **Supported Packages** | Helm, Kustomize, Kube |
| **End-Of-Life (EOL) Information** | Covered |
| **Version Incompatibility Information** | Covered |
| **Upgrade Templates** | In-Place, Blue-Green |
| **Preverification** | Covered |

## HashiCorp Consul OSS Overview

HashiCorp Consul OSS provides a distributed service mesh and registry, securing and discovering services across multiple environments. It uses lightweight agents on each node to register local services, perform health checks, and join a cluster of server nodes that maintain global state. Consul offers an optional mTLS mesh (Consul Connect) that centralizes zero-trust policies and traffic encryption without requiring code changes. It also includes a simple key-value store for storing and synchronizing configuration data across the cluster. Overall, Consul helps unify service discovery, security, and configuration into a single platform that scales well in both Kubernetes and traditional data center setups.

## Chkk Coverage

### Curated Release Notes

Chkk analyzes Consul's official release notes and flags only the changes relevant to your cluster's configuration, ACLs, or service catalog. This ensures you catch significant updates (like new Connect features, breaking config changes, or policy shifts) without parsing every minor fix. The curated feed also highlights deprecations, so you can address them before they cause production regressions. Ultimately, this saves time and cuts down the risk of missing critical release info.
### Preflight & Postflight Checks

Chkk's preflight checks validate your cluster's readiness for a new Consul version by confirming server quorum, analyzing deprecated config fields, and verifying the required OS/TLS settings. During upgrades, it ensures you follow best practices (like upgrading servers first in small batches). Postflight checks confirm that all servers re-formed consensus, agents reconnected, and each service's health checks remain green. Together, these guardrails minimize downtime and highlight issues (e.g., misconfigured gossip encryption) before they become incidents.

### Version Recommendations

Consul's support lifecycle can be short, so Chkk actively monitors EOL timelines to recommend safe upgrade targets. It weighs factors like your current version, upcoming patch availability, and known plugin or Envoy compatibility. If you're multiple versions behind, it suggests a stable upgrade path rather than a risky jump to the latest. These recommendations focus on security patch continuity and smooth operational transitions, keeping Consul stable and secure.

### Upgrade Templates

Chkk provides in-place and blue-green upgrade templates that follow HashiCorp's documented steps but integrate safety checkpoints and rollbacks. With in-place upgrades, you update one server at a time, confirm cluster health, then proceed to agents. For blue-green, Chkk helps provision a parallel Consul cluster on the new version and migrate data, letting you switch over incrementally and revert if problems surface. These automated workflows reduce human error and standardize the approach for enterprise-scale Consul deployments.

### Preverification

Chkk can run a simulated Consul upgrade in a test environment that mirrors your production cluster's configurations and data. This "digital twin" quickly exposes issues with agent compatibility, ACL replication, or resource usage that standard checks might miss.
By spotting breakages in a safe sandbox, teams can fix them before scheduling the real upgrade. This approach reduces last-minute surprises and dramatically increases confidence in each Consul release. ### Supported Packages Chkk supports Consul installed via Helm, Kustomize, or plain Kubernetes manifests, automatically detecting your deployment method. It adapts its checks and upgrade flow to match how you're currently managing Consul, whether that's GitOps-based YAML or a Helm chart in a private registry. Custom-built or vendor-specific images are also recognized and tracked during upgrades. This flexibility ensures you can keep your existing provisioning workflow while gaining consistent oversight from Chkk. ## Common Operational Considerations * **Multi-Datacenter Consistency:** Ensure WAN federation and peering are carefully planned to avoid latency or partition issues, and configure ACL replication at the start if you use Consul Enterprise. Regularly test failover scenarios, especially if services depend on cross-DC discovery. * **Agent Lifecycle Management:** Automate the joining and removal of Consul client agents to maintain an accurate service registry. Keep configurations consistent (retry\_join, TLS config) so new nodes seamlessly enter the cluster. * **Service Mesh & mTLS Overheads:** Enable Consul Connect for zero-trust security, but plan for increased CPU usage on Envoy sidecars. Regularly rotate certificates (potentially via Vault) to maintain security compliance. * **ACL Governance:** Keep policies in version control, and automate token distribution for agents and services. Monitor ACL replication or manually sync across datacenters if relying on OSS-only functionality. * **KV Store Usage:** Use Consul's key-value store for configs and coordination, but avoid storing large or high-throughput data. Access can be controlled with ACL policies, so confirm your tokens match the intended write/read paths. 
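The KV store guidance above can be checked before a write ever reaches Consul. The following is a minimal Python sketch, not Chkk or Consul functionality: `check_kv_write`, `allowed_prefixes`, and the 64 KiB soft limit are hypothetical names and values chosen for illustration, and the 512 KiB hard cap reflects Consul's commonly documented default limit on a single value, which you should confirm for your version.

```python
from typing import List

# Assumption: Consul's default cap on a single KV value (verify for your version).
MAX_VALUE_BYTES = 512 * 1024
# Assumption: an arbitrary team-chosen threshold for "too large for config data".
SOFT_LIMIT_BYTES = 64 * 1024


def check_kv_write(key: str, value: bytes, allowed_prefixes: List[str]) -> List[str]:
    """Return a list of problems with a planned KV write; empty means it looks reasonable."""
    problems = []

    # The key should fall under a path the writing token's ACL policy is meant to cover.
    if not any(key.startswith(prefix) for prefix in allowed_prefixes):
        problems.append(f"key '{key}' is outside the intended write prefixes")

    # Large values are a sign the KV store is being used as a data store.
    size = len(value)
    if size > MAX_VALUE_BYTES:
        problems.append(f"value is {size} bytes, above the {MAX_VALUE_BYTES}-byte hard cap")
    elif size > SOFT_LIMIT_BYTES:
        problems.append(f"value is {size} bytes, above the {SOFT_LIMIT_BYTES}-byte soft limit")

    return problems


if __name__ == "__main__":
    for issue in check_kv_write("secrets/app", b"x" * (100 * 1024), ["config/"]):
        print(issue)
```

A check like this only mirrors intent on the client side; the ACL policy attached to the token remains the actual enforcement point.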
## Additional Resources * [HashiCorp Consul OSS Documentation](https://developer.hashicorp.com/consul/docs) * [HashiCorp Consul OSS Releases](https://github.com/hashicorp/consul/releases) # HashiCorp Vault Source: https://docs.chkk.io/projects/application-services/hashicorp-vault Chkk coverage for HashiCorp Vault. We provide version recommendations, preflight/postflight checks, and Upgrade Templates—ensuring worry-free operations. ## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v1.7.3 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.8.6 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## HashiCorp Vault Overview HashiCorp Vault is a secrets management platform that securely stores credentials, issues dynamic, short-lived secrets, and offers encryption as a service. Fine-grained ACLs dictate who can access or create secrets, with audit logs capturing every request. It integrates with Kubernetes for automatic secret injection, and supports multi-node high availability, plus enterprise-level disaster recovery and replication. ## Chkk Coverage ### Curated Release Notes Chkk monitors official Vault releases, summarizing essential changes that affect secret storage, auth methods, or policy behavior. This helps teams quickly see whether an update fixes critical security issues, alters CLI flags, or deprecates a particular secrets engine. If a version includes major feature additions—like a new database engine or dynamic secrets capability—Chkk flags those so you can decide whether to adopt them. 
### Preflight & Postflight Checks Before a Vault upgrade, Chkk's preflight checks confirm your setup meets the new version's requirements by validating storage backends, reviewing TLS configs, and checking whether the secrets engines or auth methods you use face deprecation. After upgrading, postflight checks verify that Vault is unsealed, auth workflows succeed, and no errors appear in logs. This automation catches common pitfalls (e.g., missed config changes) that can leave Vault sealed or break tokens. ### Version Recommendations Chkk continuously watches Vault release lifecycles, warning you when your deployed version approaches end-of-life or lacks current security patches. It compares official guidance with your environment (such as your Kubernetes version or provider integrations) to recommend stable releases. By following Chkk's prompts, you stay current on security updates and avoid unsupported features that might endanger your secrets. ### Upgrade Templates For robust Vault upgrades, Chkk supports two upgrade paths: in-place and blue-green. An in-place upgrade performs a rolling update of existing Vault nodes, typically in HA mode: nodes are upgraded and rejoined to the cluster one at a time, preserving active services. Blue-green spins up a parallel Vault cluster (green) at the new version, replicates or copies data, and switches clients once stable. This method keeps downtime near zero and simplifies rollback if issues arise. Both templates detail steps for backing up data, unsealing nodes, verifying health, and rolling back in case of unexpected regressions. ### Preverification For major or sensitive updates, Chkk's preverification simulates the Vault upgrade in a safe environment. It replicates your Vault config (auth backends, secrets engines, policies) and applies the new version to spot potential incompatibilities (e.g., a deprecated config parameter, plugin mismatch).
This preview helps you fix issues early (like adjusting a configuration for a changed API) rather than encountering them mid-upgrade in production. ### Supported Packages Whether Vault is installed via Helm, Kustomize, or raw YAML, Chkk parses your manifests (and values) to orchestrate upgrades. It respects private registries, custom Vault images, and organizational security constraints. This ensures Vault's new version is deployed cleanly without requiring you to switch from your preferred packaging approach. ## Common Operational Considerations * **Unseal & Key Management:** Prefer auto-unseal with a cloud KMS or HSM to eliminate manual key handling delays; if manual unseal is used, securely distribute key shares. Regularly test unseal and rekey procedures to ensure swift recovery during outages. * **Access Control & Policy Enforcement:** Enforce least privilege with finely scoped ACL policies and revoke the root token immediately after initialization. Regularly audit token permissions to prevent over-privileged access and reduce insider risk. ## Additional Resources * [HashiCorp Vault Documentation](https://developer.hashicorp.com/vault/docs) * [HashiCorp Vault Releases](https://github.com/hashicorp/vault/releases) # Karpenter Source: https://docs.chkk.io/projects/application-services/karpenter Chkk coverage for Karpenter. We provide preflight/postflight checks, curated release notes, and Upgrade Templates, all designed for seamless upgrades.
## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------------------- | | **Chkk Curated Release Notes** | v0.21.0 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v0.24.0 to latest | | **Supported Packages** | Helm, Kustomize, Static Manifests | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Karpenter Overview Karpenter is a flexible, high-performance autoscaling solution designed to optimize pod scheduling and node provisioning in Kubernetes. Developed with cloud-native principles, it uses real-time data from pending pods to create or scale nodes that precisely match workload requirements. By leveraging cloud provider APIs directly (e.g., AWS EC2), Karpenter can provision compute resources with minimal startup delays, reducing both over-provisioning and under-utilization. Platform teams benefit from the dynamic nature of Karpenter: as workloads fluctuate, it scales just enough capacity to maintain performance without driving up unnecessary cloud costs. Karpenter's approach also supports spot instances, custom AMIs, and advanced scheduling rules that integrate seamlessly with existing cluster configurations. ## Chkk Coverage ### Curated Release Notes Chkk tracks official Karpenter release notes, aggregating crucial new features, bug fixes, and deprecations. Instead of manually reviewing each release, platform engineers receive a summary of changes likely to affect day-to-day operations—such as newly introduced provisioner fields, improvements in startup latency, or modifications to AWS-specific node templates. 
Chkk also flags any deprecated APIs (for example, if certain spec fields are removed in later versions) so that you can adjust your cluster configuration before issues arise. ### Preflight & Postflight Checks Before upgrading Karpenter, Chkk runs preflight checks that validate whether your cluster and cloud environment meet the new version's requirements. It inspects any Karpenter-specific CRDs, verifying that your provisioner configurations, constraints, and cloud credentials align with the upgrade. For example, if a new release enforces stricter node label formats or changes default instance types, Chkk identifies these configurations in your cluster. After upgrading, postflight checks confirm the new Karpenter controller is healthy, verify that node provisioning logic is working correctly, and look for any pending pods that remain unscheduled due to misconfigurations. This helps ensure you don't discover broken autoscaling behavior only after hitting production load. ### Version Recommendations Chkk tracks Karpenter's release cadence and support windows, highlighting versions nearing end-of-life or known to conflict with certain Kubernetes releases. By mapping your current cluster environment—Kubernetes version, node OS images, or usage of spot vs. on-demand nodes—Chkk suggests a stable upgrade path. It also flags urgent security or performance fixes that may justify moving up in minor versions sooner rather than later. Chkk Upgrade Copilot keeps your autoscaling system modern and secure without the guesswork. ### Upgrade Templates Chkk provides a structured guide for upgrading Karpenter, offering two primary strategies: in-place or blue-green. With an in-place upgrade, you update the Karpenter controller and relevant CRDs in place, then closely monitor provisioning logs and cluster metrics to ensure it's functioning well. 
If you prefer a more conservative approach, the blue-green strategy spins up a new Karpenter controller using a separate deployment or Helm release, allowing you to transition workload provisioning gradually. Chkk's templates detail the step-by-step workflow, roll-back instructions, and recommended checks (like verifying pending pods) at each stage. ### Preverification For major Karpenter upgrades or large multi-node clusters, Chkk's preverification feature runs a simulated upgrade in a controlled environment. This includes reapplying your existing provisioner specs, checking if new or deprecated fields cause conflicts, and ensuring your cloud provider credentials and IAM roles are still valid. The simulation's feedback loop helps spot tricky scenarios—for example, if the new version relies on a capability your current AWS account setup doesn't grant. By exposing these issues in preverification, you can fix them ahead of time rather than struggling mid-upgrade in production. ### Supported Packages Chkk supports installing and upgrading Karpenter via Helm, Kustomize, or raw YAML deployments. Whether you're using a private registry with custom-built Karpenter images or relying on public repositories, Chkk's automation can parse your manifests, confirm your current Karpenter version, and propose a precise upgrade plan. This ensures your GitOps or CI/CD pipeline remains intact—Chkk just enriches it with checks, recommended config changes, and best-practice guidance. ## Common Operational Considerations * **IAM Role Configurations:** Ensure your AWS credentials allow full access to the needed APIs (e.g., EC2, Launch Templates, Auto Scaling). Missing permissions can lead to silent provisioning failures. * **Right-Sizing Node Templates:** Karpenter thrives on accurate resource requests. Overly large or generic instance selections can rack up costs; overly small ones cause scheduling backlogs. Tune your provisioner and node templates to reflect actual workload needs. 
* **Spot vs. On-Demand Balancing:** Karpenter can request spot instances for cost savings. However, be prepared for spot interruptions by configuring pod disruption budgets and ensuring you have an on-demand fallback for critical workloads. * **Synchronized Scale-Downs:** If you're also using Cluster Autoscaler or other scaling logic, coordinate them carefully. Running multiple autoscaling solutions can create conflicting behaviors unless well-configured. * **Validate CRDs & Constraints:** Upgrades often add or remove constraint fields (e.g., selecting instance families or specific zones). After upgrading, confirm your constraints still align with available instance types in your target region. * **Monitor Pending Pods:** A sudden spike in unschedulable pods might indicate Karpenter config changes are out of sync with cluster demands. Use the Kubernetes events or Karpenter logs to debug insufficient instance capacity or misaligned node selectors. ## Additional Resources * [Karpenter Documentation](https://karpenter.sh/docs/) * [AWS Karpenter Provider Releases](https://github.com/aws/karpenter-provider-aws/releases) * [Introducing Karpenter: An Open Source High-Performance Kubernetes Cluster Autoscaler](https://aws.amazon.com/blogs/aws/introducing-karpenter-an-open-source-high-performance-kubernetes-cluster-autoscaler/) # KEDA Source: https://docs.chkk.io/projects/application-services/keda Chkk coverage for KEDA. We provide preflight/postflight checks, curated release notes, and Upgrade Templates—designed for seamless upgrades. 
## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v1.4.1 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v2.1.0 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## KEDA Overview KEDA (Kubernetes Event-driven Autoscaling) extends the native Horizontal Pod Autoscaler (HPA) to scale workloads based on external events such as queue depth, database activity, or cloud metrics. By adding its own operator and metrics adapter, KEDA exposes these external metrics to Kubernetes, allowing workloads to scale up dynamically, or even down to zero when no events are pending. This approach optimizes resource usage, handling spikes automatically and freeing up capacity during idle periods. KEDA supports a wide range of event sources (e.g., Kafka, RabbitMQ, Azure Event Hubs), making it easy to integrate with diverse environments while retaining the standard Kubernetes scaling model. ## Chkk Coverage ### Curated Release Notes Chkk curates official KEDA release notes, highlighting important updates, deprecations, and security advisories. Instead of going through every changelog, platform teams get a concise summary of changes (like dropped support for older Kubernetes versions or new scalers) that may affect autoscaling configurations. Chkk flags breaking changes and notable new features, helping you prioritize upgrades aligned with your operational needs. ### Preflight & Postflight Checks Before a KEDA upgrade, Chkk runs preflight checks to detect potential incompatibilities, such as renamed CRD fields, changed scaler parameters, or missing permissions.
It also ensures your existing ScaledObjects and TriggerAuthentications will remain valid. After the upgrade, postflight checks confirm that the updated KEDA operator is healthy, verifying event triggers, metrics reporting, and successful pod scaling. Any issues—like broken triggers or pods stuck at zero replicas—are flagged for quick remediation. ### Version Recommendations Chkk continuously monitors KEDA's release lifecycle and alerts you if your version is nearing end-of-life or has known vulnerabilities. It cross-references Kubernetes compatibility, patch availability, and bug reports to suggest the next stable version. By following Chkk's guidance, you can keep KEDA up to date without risking untested or unsupported releases. ### Upgrade Templates Chkk offers two main upgrade strategies for KEDA: in-place and blue-green. In-place updates the existing operator, ensuring minimal downtime through a carefully orchestrated restart. Blue-green introduces a parallel "green" deployment of KEDA alongside the old "blue" one, then cuts over once validated—ideal for zero-downtime requirements or major version jumps. Both templates preserve your autoscaling configurations, preventing disruptions to event-driven scaling. ### Preverification For major or potentially disruptive upgrades, Chkk conducts a preverification process in a controlled environment. It deploys the new KEDA version, applies your existing manifests, and simulates the scaling behavior. This test reveals any problems with event source compatibility, authentication, or changed CRD fields before production, allowing you to address them proactively and reduce deployment risks. ### Supported Packages Chkk supports KEDA deployments via Helm, Kustomize, or plain Kubernetes YAML. Regardless of your chosen method, Chkk detects and manages the operator's current version, ensuring that upgrades respect custom images or private registries. 
This integration aligns with typical GitOps workflows, letting you maintain KEDA the same way you handle other Kubernetes resources. ## Common Operational Considerations * **Scale-to-Zero Latency:** KEDA polls external triggers (e.g., Kafka, Prometheus) every 30s by default. A too-long polling interval leads to cold-start delays. Lower it for latency-sensitive workloads, but watch out for excessive API calls. * **Trigger Failures:** If a trigger source (like Azure Service Bus) is unreachable or slow, KEDA returns zero metrics. Configure fallback replicas to avoid unintentional scale-to-zero during outages. Monitor operator logs for repeated "failed to fetch metrics" errors. * **HA Deployment:** Running multiple KEDA operator pods provides failover (single-active leader). The metrics server can also run multiple pods, but only one instance serves external metrics at a time. Give operator and adapter enough CPU/memory for high-volume events. * **CRD & Config Pitfalls:** Old CRDs or typos in TriggerAuthentication can silently break autoscaling. Always update CRDs alongside the operator, and confirm ScaledObject status is Ready. Use KEDA's admission webhook to prevent duplicate HPAs or invalid triggers. * **Upgrade Impact:** When jumping major versions (v1→v2), reapply updated CRDs and rewrite any old fields. For minor releases, watch for changes that might invalidate triggers (like renamed fields). Prefer a test cluster or Chkk's preverification feature before production. ## Additional Resources * [KEDA Documentation](https://keda.sh/docs/latest/) * [KEDA Releases](https://github.com/kedacore/keda/releases) # Keycloak Source: https://docs.chkk.io/projects/application-services/keycloak Chkk coverage for Keycloak. We provide curated release notes, preflight/postflight checks, and Upgrade Templates—all tailored to your environment. 
## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v12.0.3 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v16.1.0 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Keycloak Overview Keycloak is an open-source Identity and Access Management (IAM) solution, providing single sign-on (SSO), identity brokering, and flexible authentication/authorization. It supports OAuth 2.0, OpenID Connect, and SAML 2.0, making it broadly compatible with modern applications and services. Administrators can centrally manage realms, clients, and user policies via a web console, reducing custom code and risk. Keycloak's Quarkus-based runtime simplifies Kubernetes deployments, with clustering support for high availability. Through integration with Kubernetes RBAC, it can secure not just apps but also cluster access. ## Chkk Coverage ### Curated Release Notes Each Keycloak release often has lengthy notes covering new features, bug fixes, and breaking changes. Chkk curates these details into an operational summary, spotlighting security fixes, schema updates, and any removed features. Instead of parsing every upstream note, you receive quick pointers about potential breakage or critical vulnerabilities. This ensures you don't overlook subtle changes like updated hashing algorithms or token lifespans. ### Preflight & Postflight Checks Chkk runs automated checks before and after a Keycloak upgrade to confirm version compatibility and overall system health. Preflight checks validate database readiness, any deprecated API usage, and operator or helm chart compatibility. 
After the upgrade, postflight confirms that the new Keycloak pods, realms, and user flows are functioning correctly. This approach proactively catches common issues—like incomplete schema migrations or leftover outdated configurations—by monitoring logs, session states, and access patterns. As a result, you can upgrade with confidence knowing each stage was thoroughly validated. ### Version Recommendations Chkk tracks Keycloak's rapid release cadence and flags when your version falls behind on security patches or enters EOL. It references official support policies to warn you about major changes, deprecated features, or community support drop-offs. Chkk recommends stable versions that align with your Kubernetes environment, highlighting known issues. By mapping Keycloak's iteration cycles to your upgrade windows, Chkk keeps deployments secure and compliant. ### Upgrade Templates Chkk provides **Upgrade Templates** for both in-place and blue-green upgrades, covering database backups, partial rollouts, and canary checks. These instructions include configuration updates, migration tasks, and post-upgrade verifications. Rollback guidelines—such as reverting images or restoring snapshots—are built in. By following these templates, you minimize human error and ensure a safer upgrade path. ### Preverification Chkk's preverification simulates the upgrade in a separate environment, loading a mirrored database and matching realm configs. It catches issues like incompatible themes, outdated schemas, or broken SPIs before they affect production. The entire upgrade sequence is rehearsed so teams can fix problems in advance. This real-world testing boosts confidence that new Keycloak versions will run smoothly. ### Supported Packages Chkk works with Helm charts, the Keycloak Operator, Kustomize, and raw Kubernetes manifests. It detects your chosen package method and tailors checks to ensure consistency across installation types. 
Custom builds and private registries are also recognized, preserving enterprise workflows. Whether official or custom images, Chkk tracks version compatibility and delivers precise upgrade guidance. ## Common Operational Considerations * **Token Expiration & Refresh:** Configure token lifespans to balance security with usability, and ensure clients handle short-lived tokens appropriately. Monitor refresh rates for anomalies that may indicate client misconfiguration or a need to scale Keycloak. * **Session Management & Clustering:** Always use shared databases or caches for session consistency across Keycloak pods, and confirm the cluster is healthy after rolling updates. Properly tune memory/CPU resources and session-cleanup intervals to reduce performance bottlenecks. * **Configuring Realms:** Plan realm counts and structure from the start, avoiding unnecessary complexity or duplication. Consistently manage roles, groups, and policies at the realm level to maintain clarity and reduce upgrade friction. * **Scaling & Performance:** Scale Keycloak horizontally to handle spikes in authentication load, and ensure the database is equally robust. Use health checks and resource monitoring (CPU/memory) to proactively address performance bottlenecks. * **Kubernetes RBAC Integration:** Configure Keycloak as an OIDC provider for cluster authentication and map realm groups to K8s roles. Keep a fallback admin credential or separate auth flow to handle Keycloak downtime or misconfigurations. ## Additional Resources * [Keycloak Documentation](https://www.keycloak.org/documentation) * [Keycloak Releases](https://github.com/keycloak/keycloak/releases) # Kyverno Source: https://docs.chkk.io/projects/application-services/kyverno Chkk coverage for Kyverno. We provide preflight/postflight checks, curated release notes, and Upgrade Templates—designed for seamless upgrades. 
## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v1.1.7 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.3.1 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Kyverno Overview Kyverno is a Kubernetes-native policy engine for security, compliance, and governance. Rather than requiring a new policy language, it uses standard CRDs/YAML to validate, mutate, and generate resources. Common use cases include enforcing best practices (e.g., image registry checks), injecting default labels, and automatically creating network policies when namespaces appear. By integrating with Kubernetes admission, Kyverno intercepts changes before they're stored, blocking or adjusting disallowed requests. This approach standardizes configurations, reduces drift, and maintains consistent security across workloads. ## Chkk Coverage ### Curated Release Notes Chkk continuously scans Kyverno's release notes and distills the updates into concise briefs, highlighting critical changes—like new policy rule types, security patches, or deprecated fields. If a release includes new validation capabilities (e.g., advanced JSON path matching) or modifies default webhook behavior, Chkk flags it to help you plan accordingly. This allows operators to see at-a-glance if an upgrade might affect existing policies or require reconfiguration. ### Preflight & Postflight Checks Chkk's preflight checks verify that your Kyverno installation and policies remain compatible before upgrading. 
This includes detecting any API or CRD changes that might break existing configurations and ensuring the cluster meets the new version's requirements. After the upgrade, postflight checks confirm Kyverno's admission webhooks are running smoothly, policies are enforced correctly, and no unexpected errors appear in the logs. These checks significantly reduce the risk of policy enforcement gaps during an upgrade window. ### Version Recommendations Chkk monitors Kyverno's releases, comparing each version's support timeline and known compatibility issues. You'll receive alerts when your deployed version nears end-of-life or lacks key security fixes. This ensures your policy engine keeps pace with the broader Kubernetes ecosystem, avoiding outdated features or unpatched vulnerabilities. ### Upgrade Templates Chkk provides two primary templates for upgrading Kyverno: in-place and blue-green. In-place updates your existing Kyverno deployment in a rolling fashion, typically leveraging a Helm chart or new manifests. It carries minimal overhead, though unexpected policy conflicts might appear mid-process. Blue-green stands up a separate Kyverno instance in parallel, letting you test the new version and gradually shift admission controls. This approach offers near-zero downtime and simpler rollback if the new version introduces issues. Both methods include built-in safety checks, rollback guidance, and recommended validation steps, ensuring your cluster's policies stay effective throughout the transition. ### Preverification For major version changes or large-scale policy sets, Chkk's preverification simulates the Kyverno upgrade in a controlled environment. It applies your current policy definitions to the new version and runs tests to detect CRD conflicts, policy logic errors, or altered behaviors. This proactive approach often catches hidden issues (like a generate rule failing) well before production, reducing disruptions to day-to-day operations.
### Supported Packages Chkk adapts to the way you install Kyverno: Helm, Kustomize, or raw YAML. It orchestrates the appropriate upgrades, managing CRDs and manifests so you don't have to change your existing deployment practices. If you rely on private registries or custom images, Chkk respects those references, ensuring consistent security policies across your enterprise environment. ## Common Operational Considerations * **Mutation Pitfalls:** Mutate rules can alter existing workloads, which might trigger rollouts or restarts. Test in audit mode, review Kyverno logs for unexpected patches, and avoid conflicting patches by combining or tightly scoping mutate rules. * **Multi-Tenancy Considerations:** Designate global policies as ClusterPolicies and allow teams to manage namespace-scoped ones, ensuring they don't override org-wide mandates. Use PolicyException resources for nuanced waivers and rely on label or namespace selectors to prevent overlap. * **Race Conditions with Controllers:** Kyverno might collide with GitOps or other controllers, causing endless revert loops or double mutations. Coordinate with these systems (either encode Kyverno mutations in Git or exclude resources from certain controllers) to prevent tug-of-war scenarios. * **Namespace vs. Cluster-Scope:** Leverage cluster-scoped policies for organization-wide rules and namespaced ones for delegated control. Keep in mind that namespaced policies can't override cluster rules, so carefully define scope to avoid unexpected enforcement overlaps.
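The advice on avoiding conflicting mutate rules can be approximated with a quick offline review of your policy manifests. The sketch below is a hedged illustration only, not a Chkk or Kyverno feature: `rule_kinds` and `overlapping_mutations` are hypothetical helpers, and they assume policies are already parsed into Python dicts that follow the usual `spec.rules[].match` layout (`any`/`all` blocks, or the older direct `resources` form). It compares resource kinds only and ignores namespaces, selectors, and `exclude` blocks, so treat its output as candidates to review rather than confirmed conflicts.

```python
from collections import defaultdict
from typing import Dict, List, Set


def rule_kinds(rule: dict) -> Set[str]:
    """Collect resource kinds named in a rule's match block."""
    match = rule.get("match", {})
    blocks = match.get("any", []) + match.get("all", [])
    if "resources" in match:  # older style: match.resources.kinds
        blocks.append(match)
    kinds: Set[str] = set()
    for block in blocks:
        kinds.update(block.get("resources", {}).get("kinds", []))
    return kinds


def overlapping_mutations(policies: List[dict]) -> Dict[str, List[str]]:
    """Map each resource kind to the mutate rules targeting it, keeping kinds hit by 2+ rules."""
    by_kind: Dict[str, List[str]] = defaultdict(list)
    for policy in policies:
        policy_name = policy.get("metadata", {}).get("name", "<unnamed>")
        for rule in policy.get("spec", {}).get("rules", []):
            if "mutate" not in rule:
                continue  # validate/generate rules are out of scope here
            for kind in sorted(rule_kinds(rule)):
                by_kind[kind].append(f"{policy_name}/{rule.get('name', '<unnamed>')}")
    return {kind: rules for kind, rules in by_kind.items() if len(rules) > 1}


if __name__ == "__main__":
    sample = [
        {"metadata": {"name": "add-labels"},
         "spec": {"rules": [{"name": "label-deploy", "mutate": {},
                             "match": {"any": [{"resources": {"kinds": ["Deployment"]}}]}}]}},
        {"metadata": {"name": "set-defaults"},
         "spec": {"rules": [{"name": "default-resources", "mutate": {},
                             "match": {"any": [{"resources": {"kinds": ["Deployment", "Pod"]}}]}}]}},
    ]
    print(overlapping_mutations(sample))
```

Rules flagged this way are good candidates for merging into one rule or for tighter scoping with namespace or label selectors.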
## Additional Resources * [Kyverno Documentation](https://kyverno.io/) * [Kyverno Releases](https://github.com/kyverno/kyverno/releases) # Application Services Source: https://docs.chkk.io/projects/application-services/overview Explore All the Application Services Chkk Covers An [**Application Service**](/misc/glossary#application-service) is a type of [Project](/projects/overview) that provides essential services to the rest of an [Application Stack](/misc/glossary#application-stack). Get a quick overview of every Application Service Chkk covers, complete with curated release notes, private registry and custom image coverage, pre- and post-flight checks, and comprehensive EOL and version compatibility details. Dive deeper into each Application Service to see supported upgrade templates (in-place, blue-green) and preverification. Simply select a card below to learn more about that specific Application Service. | | | | | ----------------------------------------------------------------------- | ----------------------------------------------------------------- | ------------------------------------------------------------------- | | [Alertmanager](/projects/application-services/alertmanager) | [Apache Kafka](/projects/application-services/apache-kafka) | [Apache Zookeeper](/projects/application-services/apache-zookeeper) | | [Argo CD](/projects/application-services/argo-cd) | [Argo Rollouts](/projects/application-services/argo-rollouts) | [Argo Workflows](/projects/application-services/argo-workflows) | | [Crossplane](/projects/application-services/crossplane) | [Datadog Agent](/projects/application-services/datadog-agent) | [Elasticsearch](/projects/application-services/elasticsearch) | | [FluentBit](/projects/application-services/fluent-bit) | [Grafana](/projects/application-services/grafana) | [Grafana Loki](/projects/application-services/grafana-loki) | | [Hashicorp Consul](/projects/application-services/hashicorp-consul-oss) | [Hashicorp 
Vault](/projects/application-services/hashicorp-vault) | [Karpenter](/projects/application-services/karpenter) | | [KEDA](/projects/application-services/keda) | [Keycloak](/projects/application-services/keycloak) | [Kyverno](/projects/application-services/kyverno) | | [Prometheus](/projects/application-services/prometheus) | [RabbitMQ](/projects/application-services/rabbitmq) | [Redis](/projects/application-services/redis) | | ACK IAM Controller | ACK S3 Controller | Actions Runner Controller | | Active Monitor | Adminer | Ambassador Edge Stack | | Apache Airflow | Apache Cassandra | Apache Cassandra Reaper | | Apache HTTP Server | Apollo Router | Argo Events | | Atlassian Jira | Bitbucket | Botkube | | Chaos Mesh | Cloudflare Origin CA Issuer | cloudflared | | CockroachDB | Confluent Platform Kafka | Connaisseur | | Dex | Directus | Edge Delta | | Elastic Beats (Filebeat, Metricbeat) | etcd | Flagger | | FluentBit GKE Exporter | Fluentd | Fluentd Plugin Splunk HEC | | FluxCD | GCP Cloud SQL Auth Proxy | GCP Prometheus Engine | | GKE Events Exporter | GKE Metadata Server | Google CAS Issuer for Cert Manager | | Grafana Agent | Harbor | Harness Delegate | | Hashicorp Vault Agent Injector | Inspektor Gadget | Jenkins | | JFrog Artifactory | JFrog XRay | jsreport | | Kafka UI | Kargo | Kiali | | Kibana | Knative Eventing | Knative Serving | | Kong Gateway OSS | kube-green | Kube RBAC Proxy | | kube-vip | Kubecost | Kubernetes Reflector | | Kubernetes Replicator | Kubernetes Secret Generator | Kubeshark | | KubeXray | Kured | Kyverno Policy Reporter | | Meilisearch | MongoDB | MySQL | | NATS | Nevis | NGINX | | OAuth2 Proxy | OneAgent | Open Policy Agent | | Open Policy Agent Gatekeeper | Ory Hydra | PgBouncer | | Pomerium | PostgreSQL | Prisma Cloud (formerly Twistlock) | | Prom2Teams | Prometheus Adapter | Prometheus Blackbox Exporter | | Prometheus CloudWatch Exporter | Prometheus JMX Exporter | Prometheus Node Exporter | | Prometheus PostgreSQL Exporter | 
Prometheus Redis Metrics Exporter | Prometheus to Microsoft Teams | | Prometheus to Stackdriver | Prometheus Varnish Exporter | Rancher Fleet | | Redpanda Console | SonarQube | Spilo | | Splunk OpenTelemetry (OTEL) Collector | Stakater Reloader | Step Issuer | | Sumo Logic Distribution for OpenTelemetry Collector | Tailscale | Teleport OSS | | Telepresence | Thanos | Trivy | | Trust Manager | Upbound Universal Crossplane (UXP) | Varnish Cache | | Vector | Velero | Ververica | | Wiz Runtime Sensor | | | # Prometheus Source: https://docs.chkk.io/projects/application-services/prometheus Chkk coverage for Prometheus. We provide curated release notes, preflight/postflight checks, and Upgrade Templates—all tailored to your environment. ## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v2.19.1 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v2.25.0 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Prometheus Overview Prometheus is a CNCF-graduated, open-source monitoring system that scrapes metrics from targets via HTTP endpoints. It stores time-series data locally and supports fast, flexible queries through PromQL. With Kubernetes service discovery, Prometheus detects relevant components and application pods automatically. Alertmanager integration adds powerful alerting capabilities, triggering escalations based on custom rules. This seamless observability stack equips platform teams with deep insights into system performance and reliability. 
## Chkk Coverage ### Curated Release Notes Chkk filters Prometheus release notes to highlight critical changes—like deprecated flags, shifting defaults, or newly required configs. By focusing on operationally relevant updates, it lets you quickly see which releases might affect your custom alerts or exporter configurations. Chkk also flags important EOL notices, ensuring you don't unknowingly rely on an unsupported version. ### Preflight & Postflight Checks Before upgrading Prometheus, Chkk's preflight checks confirm your existing setup won't break due to incompatible flags or insufficient resource allocations. It inspects Kubernetes manifests (Helm, Kustomize, or raw YAML) for any known pitfalls and references Prometheus' official support details. Postflight checks then verify that upgraded instances are scraping all targets, that recorded rules remain valid, and that no silent failures have occurred in logs or Alertmanager alerts. ### Version Recommendations Chkk tracks Prometheus' release notes and recommends safe upgrade targets based on LTS availability, community feedback, and alignment with your cluster's Kubernetes version. It identifies upcoming EOL milestones to help you plan timely upgrades, avoiding last-minute scrambles for security patches. If you're multiple versions behind, Chkk surfaces a path that balances new features with proven stability. ### Upgrade Templates Chkk's **Upgrade Templates** detail two primary approaches for Prometheus upgrades—in-place and blue-green. In an in-place scenario, Chkk outlines rolling updates across HA pairs to preserve continuous scraping. For a blue-green strategy, it offers guidance on spinning up a parallel deployment, comparing scraped data between old and new versions, and gradually cutting over to the updated instance. ### Preverification Through preverification, Chkk simulates your Prometheus upgrade in a controlled environment using copies of existing configs and metrics data. 
This process uncovers issues like deprecated syntax, unexpected performance overhead, or storage format mismatches before impacting production. It's a practical safeguard that reduces the risk of losing visibility or encountering runtime errors during real upgrades. ### Supported Packages Chkk works with various packaging methods, including Helm charts (such as the Prometheus Operator), Kustomize overlays, and standard Kubernetes manifests. It respects custom images, private registries, or custom build requirements, so you can maintain consistency with your existing CI/CD pipelines. This unified approach ensures you don't need to change tooling to leverage Chkk's upgrade intelligence. ## Common Operational Considerations * **Avoiding Monitoring Gaps:** For minimal downtime, upgrade Prometheus in an HA pair or use a staggered approach in single-instance deployments. Ensure time-series data and configuration backups are taken to prevent loss during rollouts. * **Maintaining Exporter Compatibility:** Verify that all exporters produce valid metrics adhering to updated content-type requirements. Automate checks to flag exporters likely to fail under stricter parsing or deprecated protocols. * **Cleaning Deprecated Flags:** Removed command-line flags or configuration fields can break Prometheus startup after an upgrade. Review release notes thoroughly and replace obsolete flags with supported syntax to avoid unexpected crashes. * **Kubernetes Integration Updates:** If Prometheus relies on older Kubernetes discovery APIs, a new release may drop that support. Follow official migration guides and switch to supported API versions to maintain stable service discovery. ## Additional Resources * [Prometheus Documentation](https://prometheus.io/docs/introduction/overview/) * [Prometheus Releases](https://github.com/prometheus/prometheus/releases) # RabbitMQ Source: https://docs.chkk.io/projects/application-services/rabbitmq Chkk coverage for RabbitMQ. 
We provide preflight/postflight checks, curated release notes, and Upgrade Templates—designed for seamless upgrades. ## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v3.8.12 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v3.9.1 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## RabbitMQ Overview RabbitMQ is a message broker that uses the AMQP protocol to enable asynchronous communication among distributed services. It relies on queues for storage and buffering, ensuring that producers and consumers remain decoupled. Clustering, mirroring, and quorum queues support high availability, while exchange types and routing keys offer flexible messaging patterns. RabbitMQ also supports multiple protocols such as MQTT, STOMP, and AMQP 1.0, making it a versatile solution for many use cases. Security features include TLS encryption, authentication mechanisms, and policy-based access controls. ## Chkk Coverage ### Curated Release Notes Chkk filters official RabbitMQ release notes to spotlight relevant new features, security updates, and deprecations. It identifies policy or queue-type changes that could affect your environment, saving you from tracking every update. Each note includes recommended actions for addressing potential issues like plugin or Erlang version mismatches. Chkk's summaries focus on operational impact, helping you plan more effectively. This prevents surprises caused by overlooked upstream changes. ### Preflight & Postflight Checks Chkk scans your RabbitMQ clusters and configurations to confirm compatibility before you upgrade. 
It flags potential downtime risks, like deprecated queue types or insufficient resource allocations. Post-upgrade, it verifies cluster health by checking node status, alarms, and queue synchronization. This automated process quickly uncovers misconfigurations, allowing you to remedy them before they escalate. The result is more reliable and consistent RabbitMQ deployments. ### Version Recommendations Chkk tracks RabbitMQ's support lifecycle and warns you when your version nears end-of-life. It draws on the official compatibility matrix, ensuring your Erlang and RabbitMQ versions align. If you're on a risky or deprecated release, Chkk suggests a stable upgrade target and justifies that choice based on known issues. Recommendations account for your current plugin ecosystem and resource constraints. This context-driven approach helps you plan upgrades confidently. ### Upgrade Templates Chkk's **Upgrade Templates** provide guided workflows for both in-place and blue-green RabbitMQ upgrades. In-place upgrades focus on sequential node updates with minimal impact on ongoing traffic. Blue-green strategies spin up a parallel cluster or revision, easing the transition for mission-critical data. Each template includes automated checks and clear rollback instructions in case of failure. This reduces human error and helps maintain seamless messaging operations. ### Preverification To avoid breaking production, Chkk can simulate the entire upgrade in a digital twin environment. It uses your current configuration, queue definitions, and plugins to test the next RabbitMQ version. This reveals issues with memory usage, queue compatibility, or conflicting policies before changes go live. Test results guide any adjustments needed to ensure a smoother production upgrade. By addressing these risks in advance, you maintain a stable messaging layer during real deployments. 
### Supported Packages Chkk recognizes that organizations package RabbitMQ using Helm, Kustomize, or plain Kubernetes manifests. It accommodates these approaches by adjusting its workflows, commands, and validations accordingly. This includes working with custom or private images, as well as specialized vendor forks of RabbitMQ. Chkk tracks and catalogs image references, ensuring your deployments stay consistent across versions. In this way, it fits into your existing GitOps or CI/CD pipeline seamlessly. ## Common Operational Considerations * **Queue Mirroring and High Availability:** Classic mirrored queues can suffer split-brain if not carefully configured; quorum queues offer more robust replication. Always run an odd number of nodes for reliable majority-based fault tolerance. * **Persistent vs. Transient Messages:** Durable queues and persistent messages protect against data loss but reduce throughput. Use transient messages for less critical data to optimize performance. * **Resource Limits and Memory Watermarks:** RabbitMQ stops accepting new messages if it hits its memory or disk threshold. Set memory alarms and Kubernetes resource limits in alignment to avoid OOM kills and blocked publishers. * **Connection Handling and Load Balancing:** Large numbers of client connections can overwhelm file descriptors; raise OS limits if needed. Distribute connections across nodes and prefer connection reuse over frequent opens. * **Shovel and Federation:** Shovel replicates messages from one broker to another; plan for throughput and network issues. Federation links exchanges across clusters, but be mindful of potential duplicates or latency under network stress. * **Erlang and RabbitMQ Version Compatibility:** Each RabbitMQ release supports specific Erlang versions, so mismatched upgrades can break clusters. Check the official compatibility matrix and upgrade Erlang in sync with RabbitMQ. 
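The advice to run an odd number of nodes follows from majority arithmetic. The sketch below is illustrative only: `quorum_tolerance` is a hypothetical helper, not part of RabbitMQ or Chkk, and it assumes a simple majority rule of the kind quorum queues rely on. It shows that adding a fourth node to a three-node cluster does not increase the number of node failures the cluster can survive.

```python
def quorum_tolerance(nodes: int) -> tuple[int, int]:
    """Return (majority_size, tolerated_failures) for a majority-based cluster.

    Hypothetical helper for illustration; assumes a plain majority rule.
    """
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    majority = nodes // 2 + 1          # smallest group that is more than half
    return majority, nodes - majority  # nodes that can fail while a majority remains


# Compare cluster sizes: even sizes add cost without adding fault tolerance.
for n in range(1, 8):
    majority, tolerated = quorum_tolerance(n)
    print(f"{n} nodes: majority={majority}, tolerated failures={tolerated}")
```

Three and four nodes both tolerate a single failure, while five nodes tolerate two, which is why odd cluster sizes are the usual recommendation.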
## Additional Resources

* [RabbitMQ Documentation](https://www.rabbitmq.com/docs)
* [RabbitMQ Releases](https://github.com/rabbitmq/rabbitmq-server/releases)

# Redis

Source: https://docs.chkk.io/projects/application-services/redis

Chkk coverage for Redis. We provide version recommendations, preflight/postflight checks, and Upgrade Templates, ensuring worry-free operations.

## Coverage Matrix

|                                                                 |                       |
| --------------------------------------------------------------- | --------------------- |
| **Chkk Curated Release Notes**                                  | v6.0.14 to latest     |
| **Private Registries**                                          | Covered               |
| **Custom Built Images**                                         | Covered               |
| **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v6.2.5 to latest      |
| **Supported Packages**                                          | Helm, Kustomize, Kube |
| **End-Of-Life(EOL) Information**                                | Covered               |
| **Version Incompatibility Information**                         | Covered               |
| **Upgrade Templates**                                           | In-Place, Blue-Green  |
| **Preverification**                                             | Covered               |

## Redis Overview

Redis is a high-performance, in-memory key-value store widely used for caching, session management, and real-time analytics. It supports various data structures (strings, hashes, sets, sorted sets, streams) and can store data in memory with optional persistence to disk. Replication and clustering features help achieve high availability and horizontal scaling, while integration with Kubernetes operators or Helm/Kustomize deployments ensures automated failover and easier lifecycle management.

## Chkk Coverage

### Curated Release Notes

Chkk condenses official Redis release notes into highlights of breaking changes, security fixes, and newly introduced capabilities that may affect your caching or replication setup. For instance, if a version modifies RDB/AOF persistence or a patch addresses a CVE, Chkk flags it so that teams stay informed about potential impacts and urgent fixes.
### Preflight & Postflight Checks Before upgrading Redis, Chkk runs preflight checks to detect deprecated commands, outdated configurations, and resource constraints that may cause disruption. Afterward, postflight checks validate that replication or cluster operations are intact and that performance metrics remain stable. This two-step process prevents unexpected downtime and verifies the cluster's health once the new version is in place. ### Version Recommendations Chkk continuously monitors Redis's support lifecycle to recommend stable releases that align with security patches, feature maturity, and Kubernetes compatibility. It also tracks EOL milestones, alerting you when your current version will soon lack upstream support. This proactive approach helps you plan upgrades before you encounter unmaintained or vulnerable builds. ### Upgrade Templates Chkk offers two upgrade strategies for Redis: in-place and blue-green. In-place performs a rolling upgrade on each node or replica in sequence, maintaining availability for read/write operations. Blue-green spins up a new Redis deployment in parallel, syncing data and cutting over once validated. This method aims for near-zero downtime and provides a straightforward rollback if issues arise. ### Preverification For risky or complex upgrades, Chkk's preverification simulates the Redis upgrade in a test environment. It replicates your persistent storage and configuration, then applies the new version to spot any compatibility or performance issues. This dry-run approach means you can address mismatches (e.g., changed AOF/RDB formats) safely before deploying to production. ### Supported Packages Regardless of how you install Redis—Helm, Kustomize, or raw manifests—Chkk recognizes and supports it. This ensures custom images, private registries, and operator-based deployments all benefit from Chkk's checks, curated release info, and automated upgrade processes without requiring you to switch tools. 
## Common Operational Considerations

* **Replication Conflicts:** Redis Sentinel or Cluster mode auto-promotes a replica if the master fails, but asynchronous replication can lose writes if the master dies before they propagate. Use `min-replicas-to-write` and `min-replicas-max-lag` to reduce data loss, and always run at least three Sentinels (or multiple Cluster replicas) to avoid split-brain.
* **Persistence Gaps:** RDB captures point-in-time data, while AOF logs every write; with snapshots taken every 5 minutes, a crash just before the next snapshot can lose nearly 5 minutes of data. `appendfsync everysec` narrows that gap at higher I/O cost, and tuning AOF rewrite frequency balances performance with durability.
* **Blocking Commands:** Redis executes commands on a single thread, so commands like `KEYS *` or large `LRANGE` scans can stall other operations; track slow queries via `SLOWLOG`. Use `SCAN` or smaller queries, and offload heavy tasks to replicas or application logic to avoid cluster-wide latency spikes.
* **Eviction Storm:** When `maxmemory` is reached, Redis evicts keys based on its policy (e.g., `allkeys-lru`); sudden write spikes can cause eviction storms and boost CPU usage. Keep the working set in memory, set proper TTLs, and tune `maxmemory-samples` to balance eviction accuracy with performance.
* **Cluster Slot Imbalances:** In Redis Cluster, uneven hashing or distribution can create hot shards; check slot allocation and re-shard if usage or load skews. For multi-key operations, use hash tags (`{key}`) to keep related keys in the same slot and prevent cross-slot errors.
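The hash tag behavior in the last item can be sketched in a few lines. This is a minimal illustration, assuming the key-to-slot rule from the Redis Cluster specification (CRC16/XMODEM of the key modulo 16384, hashing only the text between the first `{` and the next `}` when that text is non-empty). The function names and sample keys are hypothetical, and a production client library should be used for real routing.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (the XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hash_slot(key: str) -> int:
    """Map a key to one of 16384 slots, honoring a non-empty {hash tag}."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag narrows what gets hashed; "{}" is ignored.
        if end != -1 and end > start + 1:
            data = data[start + 1:end]
    return crc16_xmodem(data) % 16384


# Hypothetical keys: the shared tag places both in the same slot, so
# multi-key commands across them avoid cross-slot errors.
print(hash_slot("{user1000}.following"), hash_slot("{user1000}.followers"))
# Without a shared tag, each key is hashed in full and may land in a different slot.
print(hash_slot("user1000.following"), hash_slot("user1000.followers"))
```

Tagging trades balance for locality: every key sharing a tag lands on one shard, so very popular tags can themselves become hot spots.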
## Additional Resources * [Redis Documentation](https://redis.io/docs/latest/) * [Redis Releases](https://github.com/redis/redis/releases) # Operators Source: https://docs.chkk.io/projects/kubernetes-operators/overview Explore All the Kubernetes Operators Chkk Covers A [**Kubernetes Operator**](/misc/glossary#kubernetes-operator) is a type of [Project](/projects/overview) that is responsible for installing and **managing the lifecycle of another [Kubernetes Add-on](/misc/glossary#kubernetes-add-on) or [Application Service](/misc/glossary#application-service)**. Get a quick overview of every Kubernetes Operator Chkk covers, complete with curated release notes, private registry and custom image coverage, pre- and post-flight checks, and comprehensive EOL and version compatibility details. Dive deeper into each Kubernetes Operator to see supported upgrade templates (in-place, blue-green) and preverification. Simply select a card below to learn more about that specific Kubernetes Operator. | | | | | ------------------------------------------------------------------------------- | ---------------------------- | ------------------- | | [Vault Secrets Operator](/projects/kubernetes-operators/vault-secrets-operator) | Bottlerocket Update Operator | Calico Operator | | Cilium Operator | Clickhouse Operator | Datadog Operator | | Dynatrace Operator | Elasticsearch (ECK) Operator | Keycloak Operator | | Kiali Operator | MinIO Operator | Nvidia GPU Operator | | OpenTelemetry (OTEL) Operator | Portworx Operator | Prometheus Operator | | Spark Operator | Strimzi Kafka Operator | Trivy Operator | # Vault Secrets Operator Source: https://docs.chkk.io/projects/kubernetes-operators/vault-secrets-operator Chkk coverage for Vault Secrets Operator. We provide version recommendations, preflight/postflight checks, and Upgrade Templates—ensuring worry-free operations. 
## Coverage Matrix | | | | --------------------------------------------------------------- | --------------------- | | **Chkk Curated Release Notes** | v1.12.0 to latest | | **Private Registries** | Covered | | **Custom Built Images** | Covered | | **Preflight/Postflight Checks** (Safety, Health, and Readiness) | v1.16.4 to latest | | **Supported Packages** | Helm, Kustomize, Kube | | **End-Of-Life(EOL) Information** | Covered | | **Version Incompatibility Information** | Covered | | **Upgrade Templates** | In-Place, Blue-Green | | **Preverification** | Covered | ## Vault Secrets Operator Overview Vault Secrets Operator (VSO) manages secrets in Kubernetes by continuously synchronizing them from HashiCorp Vault. It injects Vault data into Kubernetes Secrets, supports automatic rotation, and audits changes for compliance. Platform engineers benefit from centralized policy controls in Vault while apps consume secrets via native K8s workflows. The operator reduces duplication, increases security, and automates secret lifecycle tasks. It's deployable on multiple Kubernetes distributions and works with a range of Vault secret engines. ## Chkk Coverage ### Curated Release Notes Chkk curates official VSO release notes into short, actionable updates, flagging features like dynamic secret engine support or new CRDs. It calls out deprecations, patches, or behavior shifts—so you know exactly what might affect your existing VaultSecret definitions. Instead of sifting through every upstream detail, you get streamlined highlights and a clear sense of operational impact. This allows you to proactively address changes in roles, policies, or secret formats. ### Preflight & Postflight Checks Before each upgrade, Chkk's preflight checks scan for CRD compatibility, Kubernetes version support, and potential Vault auth misconfigurations. It detects outdated fields in your VaultSecret resources, ensuring you don't encounter sync failures or unresolved references post-upgrade. 
Afterward, the postflight checks inspect operator logs and secret rotation status to confirm a healthy deployment. This prevents hidden issues—like leftover pods or stale secrets—from lingering unnoticed. ### Version Recommendations Chkk constantly tracks Vault Secrets Operator releases and monitors upstream known issues or EOL announcements. If your current version is nearing end-of-support or is incompatible with your Vault version, you receive timely alerts and stable upgrade paths. This ensures you maintain critical security fixes and functional parity with new Kubernetes releases. Chkk also factors in feedback from similar environments to suggest the most reliable target version. ### Upgrade Templates Chkk delivers structured procedures for both in-place and blue-green operator upgrades, mapping out each CRD update, operator pod replacement, and rollback checkpoint. In an in-place scenario, you'll apply updated manifests or Helm charts, then verify secret injections are proceeding correctly. A blue-green deployment spins up a parallel operator instance with the new version, letting you shift secret management gradually. These templates reduce risk and help ensure continuous secure secret delivery during version transitions. ### Preverification Chkk can simulate each step of the upgrade in a test environment, applying your exact VaultSecrets and CRD definitions to confirm they're recognized by the new operator. This dry-run identifies mismatches—like changed default secret paths or required Vault policy updates—long before you touch production. By pinpointing collisions or resource limits in advance, you can adjust configurations or fix them before they disrupt critical apps. This approach is particularly valuable in regulated or large-scale contexts. ### Supported Packages Whether you use Helm, Kustomize, or an Operator Lifecycle Manager (OLM) workflow, Chkk analyzes your manifests and tailors upgrade steps accordingly. 
It supports custom images from private registries or specialized builds, providing the same safety checks and validations regardless of deployment method. Chkk also recognizes if you're using a multi-namespace or single-tenant operator model and accounts for that in its analysis. This flexibility ensures a consistent experience across diverse Kubernetes environments. ## Common Operational Considerations * **Vault Authentication & Roles:** Maintain tightly scoped Vault policies, and ensure the operator's service account has only the minimal required access. Monitor token expiration logs and renewals to prevent sync interruptions. * **Multi-Cluster & Namespaces:** Decide whether a single operator instance or multiple namespace-scoped instances best fits your security and tenancy needs. Restrict each operator's reach via RBAC so it manages only relevant secrets. * **Secret Rotation Behavior:** Short TTLs can lead to frequent pod restarts, so validate rotation strategies against application-level reload requirements. When using mounted secrets, confirm your app processes re-read updated files. * **Vault Outages & Operator Failover:** Any Vault downtime or network disruption can halt secret updates, so use HA Vault deployments and robust retry settings in VSO. Keep an eye on operator logs to spot connectivity issues early. * **CRD Updates & Backward Compatibility:** Validate CRD changes against your existing VaultSecret definitions prior to upgrading. Keep backups of your operator and CRDs in case you need a quick rollback. ## Additional Resources * [Vault Secrets Operator Repository](https://github.com/ricoberger/vault-secrets-operator) * [Vault Secrets Operator Releases](https://github.com/ricoberger/vault-secrets-operator/releases) # Overview Source: https://docs.chkk.io/projects/overview A *Project* is **software that provides some functionality**. 
There are three primary types of Projects that Chkk curates: * [Kubernetes Add-on](/misc/glossary#kubernetes-add-on) * [Application Service](/misc/glossary#application-service) * [Kubernetes Operator](/misc/glossary#kubernetes-operator) # Secure Architecture Source: https://docs.chkk.io/security/architecture Chkk's platform is designed with **security, scalability, and resilience** as core principles. Our architecture ensures **strong isolation, fault tolerance, and secure scalability** while maintaining enterprise-grade security controls. Security is embedded at every level of our infrastructure, following a **defense-in-depth strategy** that applies multiple layers of protection. From the network to application layers, we employ **strict access controls, encryption, continuous monitoring, and automated remediation** to protect against unauthorized access and threats. Our platform operates on a **Zero Trust security model**, where every request is authenticated and authorized before being granted access. This model ensures that access is continuously verified, reducing the attack surface and mitigating potential security risks. All service-to-service communication is secured with **mutual TLS (mTLS)** to ensure encrypted data exchange between internal components. We maintain a **secure software development lifecycle (SDLC)**, integrating security best practices at every stage. All code changes undergo **thorough security reviews, automated vulnerability scanning, and rigorous testing** before deployment. Our infrastructure is regularly tested through **independent penetration testing and security audits** to validate our security posture and proactively identify potential threats. Resilience and availability are critical components of our architecture. Our cloud infrastructure is deployed across multiple **geographically distributed regions**, ensuring high availability and disaster recovery readiness. 
Automated **backup and failover mechanisms** safeguard against data loss and ensure continuity of service, even in the event of an outage. We implement **real-time monitoring and threat detection** to continuously assess security risks and respond proactively to anomalies. Our commitment to security extends beyond infrastructure. We enforce **strict access policies**, ensuring that only authorized personnel can access sensitive systems. All administrative actions are logged, monitored, and reviewed for compliance, further enhancing accountability and reducing the risk of unauthorized modifications. Chkk's security architecture is **continuously evolving** to adapt to new threats and industry best practices. We leverage advanced security technologies and methodologies to provide **a highly secure and resilient platform** for our customers. For a detailed breakdown of our architecture, including network segmentation, authentication mechanisms, and security best practices, visit the [**Chkk Trust Center**](/security/trust-center) to access full documentation and architectural diagrams. # Compliance Source: https://docs.chkk.io/security/compliance Chkk is committed to upholding the highest standards in security and regulatory compliance. Chkk is **SOC 2 Type II** certified, demonstrating that our security, availability, and confidentiality controls are independently assessed and verified. This certification is renewed annually through rigorous audits, ensuring that our security practices remain effective and aligned with industry standards. We conduct **regular penetration testing** and security assessments in partnership with trusted third-party firms. Our internal security team continuously monitors and enhances our security posture through **vulnerability scanning, patch management, and infrastructure hardening**. Additionally, we undergo annual third-party reviews to validate our security measures and identify areas for improvement. 
Chkk also adheres to global data privacy regulations, including **GDPR and CCPA**, ensuring that our data handling practices align with stringent privacy requirements. We offer customers flexibility in **data residency**, allowing them to store and process data in compliance with regional regulations. For a deeper understanding of our security and compliance commitments, visit the [**Chkk Trust Center**](/security/trust-center), where you can access **SOC 2 Type II reports, security documentation, penetration testing summaries, and compliance artifacts**. If your organization has specific compliance requirements, our team is available to provide additional documentation and support tailored to your needs.

# Security-First Culture Source: https://docs.chkk.io/security/culture Chkk has invested in a **security-first culture** since day one. Security is a **top-down priority**, with leadership ensuring that security remains central to our operations, engineering, and customer commitments. Having built **planet-scale, secure services** and run security programs at AWS—including the foundational controls behind some of AWS’ most fundamental security guarantees—our team also brings deep expertise designing and securing AWS networking for highly regulated financial institutions, U.S. Government (GovCloud), and intelligence-grade systems. We have built Chkk on these same security principles, and we believe that every individual—regardless of role—shares responsibility for safeguarding our systems and data. From the earliest design discussions to ongoing production operations, security underpins every decision we make. We align our organization so security teams work hand-in-hand with product and engineering. Weekly reviews with executive leadership highlight key security metrics, connect them to business objectives, and ensure alignment on strategic priorities. This commitment sets the tone that security is never an afterthought—it's an accelerator for customer trust and innovation. At Chkk, strong security practices include thorough threat modeling, well-defined secure development workflows, robust change management, and continuous testing. We operate under a **Zero Trust security model**, ensuring that every request for access is verified and continuously authenticated. Our policies enforce **least-privilege access** and rigorous security testing before new features are shipped. These same principles are applied to every dedicated deployment of Chkk, giving customers a consistent, defense-in-depth posture whether they run our SaaS platform or a private instance. 
We also conduct **independent audits and penetration tests**, challenging our controls and validating our processes to continuously raise the bar. Security is not just a compliance requirement; it is an ongoing operational mandate. Our goal is to provide a platform where **reliability, confidentiality, and integrity** are woven into every layer. If you have any questions, please reach out to our security team at [security@chkk.io](mailto:security@chkk.io), or explore the rest of our Security documentation to learn more about our [data handling](/security/data-handling), [third-party subprocessors](/security/subprocessors), [platform architecture](/security/architecture), and [Trust Center](/security/trust-center) resources. # Data Protection & Handling Source: https://docs.chkk.io/security/data-handling At Chkk, we take a **privacy-first approach** to data protection, ensuring that customer data is safeguarded throughout its lifecycle. Our philosophy is simple: **collect only what is necessary, protect it at all costs, and provide transparency and control to our customers**. ### **Data Collection & Minimization** We design our platform to minimize the data we collect. Chkk primarily analyzes **metadata and configurations** from Kubernetes environments—**not customer application data**. Our in-cluster connector gathers only essential information, such as cluster version details, configuration settings, and security events. By limiting data collection, we reduce risk and simplify compliance obligations. ### **Encryption in Transit & At Rest** All data transmitted between customer environments and Chkk is encrypted using **TLS 1.2+**, ensuring end-to-end protection in transit. Data at rest is safeguarded with **AES-256 encryption**, applied across databases, file storage, and backups. Encryption keys are managed using strict security controls, including **regular key rotation** and storage in secure Key Management Services (KMS). 
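As an illustration of the TLS 1.2+ requirement described above, the following minimal sketch shows how a client could refuse anything older than TLS 1.2 using Python's standard `ssl` module. It is not Chkk's implementation; the function name `make_tls_context` is hypothetical and chosen only for this example.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context that rejects protocol versions below TLS 1.2."""
    # create_default_context() enables certificate verification and hostname checking.
    ctx = ssl.create_default_context()
    # Refuse TLS 1.0 and TLS 1.1 handshakes; TLS 1.2 and newer remain allowed.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


context = make_tls_context()
```

A context built this way can be passed to standard-library clients such as `http.client.HTTPSConnection(host, context=context)`.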
### **Access Controls & Isolation** We enforce **least privilege access** across our platform, ensuring that only authorized users and services can access sensitive data. Role-based access control (RBAC) and multi-factor authentication (MFA) protect administrative access. Our **multi-tenant architecture** ensures complete logical separation of customer data, preventing any unauthorized cross-tenant access. ### **Data Retention & Deletion** Customer data is retained only as long as necessary to deliver our services. We maintain **defined retention policies**, automatically purging outdated or unnecessary data. Upon customer request or contract termination, all associated data is securely deleted using cryptographic erasure techniques to ensure that no residual information remains. ### **Customer Control & Transparency** We empower customers with full visibility into their data usage and provide mechanisms to support **data subject rights requests** under privacy regulations like **GDPR and CCPA**. Customers can access, export, and delete their data as needed, ensuring compliance with evolving privacy expectations. Through these rigorous protections, Chkk ensures that customer data remains secure, private, and fully under your control. For more details, visit the [**Chkk Trust Center**](/security/trust-center) to access FAQs, Security Documentation, Compliance Certificates, Penetration Testing reports, and other security resources. # Privacy Policy Source: https://docs.chkk.io/security/privacy-policy Updated November 2022 At Chkk, Inc. ("Chkk" or "We"), the protection of your personal data is of particular importance to us. We protect your personal data in accordance with applicable data protection laws as well as this Privacy Policy. 
We have prepared this Privacy Policy to inform you of the manner in which we collect, use, disclose, and otherwise process the information we may collect about you from (a) your use of our Website, located at [https://chkk.io](https://chkk.io) and/or our products and services, (b) your interactions with us online and at in-person events, or (c) any other circumstances in which we provide you with a copy of this Privacy Policy. ## Definitions Under this Privacy Policy: * Personal data means any information relating to an identified or identifiable natural person (data subject); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person. * Processing means any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means, such as collection, recording, organization, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction. * Controller means the natural or legal person, public authority, agency or other body which, alone or jointly with others, determines the purposes and means of the processing of personal data. * Processor means a natural or legal person, public authority, agency or other body which processes personal data on behalf of the controller. * Recipient means a natural or legal person, public authority, agency or another body, to which the personal data are disclosed, including both processors and controllers. * Legal basis means a lawful ground for data processing under the GDPR or similar laws. 
## Personal Data We Collect We may collect your personal data when you: * Contact us; * Visit or register with our Website; * Use our products, including by browsing the product, downloading content, or receiving a product demo; * Apply for employment or other positions; * Subscribe or request to attend our webinars, events or workshops, sign up for our Slack channel or blog posts; * Interact with us on our social media profiles (e.g., Facebook, Twitter, LinkedIn); * Provide your personal data to our third party sources, including our service providers; or * Interact with us or our personnel at in-person events. ### Personal Data You Provide to Us Directly We may collect information provided by you directly, including from our Website; from your contacts with us, including through our webpage and on social media; by your creation of a user account; and from your use or trial of our products and services. This information may include your first and last name, email address, username, password, job title, phone number, country of residence, company name, payment information, profile picture and any other information provided by you. We may also collect information provided by you in the course of evaluating or engaging you for employment or other positions. This information may include your first and last name, email address, CV, resume, cover letter and any other information provided by you. ### Personal Data We Collect through Automated Data Collection Technologies We may collect information using Automated Data Collection Technologies from your use of our Website and products. 
This information may include your IP Address, Log Files, Referrer URL, Browser Information, Device Information, and Date and Time of user request, cookies, information reflecting how you searched, browsed, and were directed to the Website, including mouse movement, click, touch, scroll, and keystroke activity, and any other information provided by your use of our Website and products, as further explained in the "Use of Cookies and Other Web Technologies" section below. ### Personal Data We Obtain from Third Parties We may collect information from third party sources such as lead generation companies, data sellers, advertising partners, and Service Providers. This information may include your first and last name, email address, phone number, company name, job title, and country, and other information. ## How We Use Your Personal Data We may use your personal data: * For your creation of a user account or profile to use our products and services; * To provide, maintain, and improve our Website, products and services, including for collaboration within the product, to enhance your user experience, and to understand and save your preferences for future visits; * To monitor our products' performance and implement security measures; * For the performance or preparation of a contract to which you, our customer or service provider are a party; * To communicate with our customers or clients; * To establish and maintain our business relationship with you; * To plan and host events, workshops, and webinars, including to manage our list of attendees; * To send you marketing and other information about our products, services or offerings, including through our publications and on other websites and/or media channels; * To advertise to you on other sites; * To receive, process, and respond to your feedback, requests or queries through our products, Website, or social media; * For compliance with our legal obligations and other internal legal compliance purposes; * To evaluate 
your employment application and assess you as a candidate; and, * For other purposes consistent with the context of the collection of your personal data, or as otherwise disclosed to you prior to the use of your personal data. ## Data Sharing Personal data may be disclosed to third parties in the following circumstances: ### Processors, Service Providers and other companies that work with or on behalf of Chkk Personal data may be disclosed to processors or service providers who act on our behalf in order to process personal data in accordance with the purposes outlined above. This includes the following categories of service providers: * IT service providers * Email marketing providers * Administrative, billing, operations, and payment operators * Cloud and other software service providers Data access by processors or service providers is protected under our contracts with these entities, which limit the processing purposes. The agreement obliges the service providers to process your personal data only on our behalf and upon our instruction. They are prohibited from passing on your personal data to other parties without permission, unless this is required by law. We may also share data with entities that are controllers, such as advertising partners, data sellers, and similar companies, in accordance with the "Use of Cookies and Other Web Technologies" Section below and other sections of this Privacy Policy. ### Sale of Business If, in the future, we sell or transfer, or we consider selling or transferring, some or all of our business, shares or assets to a third party, we will disclose your personal data to such third party (whether actual or potential) in connection with the foregoing events. In the event that we are acquired by, or merged with, a third party entity, or in the event of bankruptcy or a comparable event, we reserve the right to transfer, disclose or assign your personal data in connection with the foregoing events. 
### Legal Purposes We may share your personal data with regulators, courts or competent authorities, to comply with applicable laws, regulations and rules (including, without limitation, federal, state or local laws), and requests of law enforcement, regulatory and other governmental agencies or if we have a good faith belief that the law requires it, such as in response to a search warrant, subpoena, or other legally valid inquiry, order, or process. We may also disclose information to assist us in collecting a debt, or as necessary to exercise our legal rights or defend claims brought against us. ### With Your Consent We may share your personal data where you have provided your consent to us sharing or transferring your personal data (e.g., where you provide us with marketing consents or opt-in to optional additional services or functionality). ## Your Rights Depending on the circumstances, you may be entitled to exercise some or all of the following rights: * Obtain confirmation as to whether or not your personal data is being processed and access to a copy of your personal data undergoing processing * Require (i) access to and/or duplicates of your personal data retained, (ii) receive the personal data concerning you, which you have provided to us, in a structured, commonly used and machine-readable format and (iii) to transmit those personal data to another controller without hindrance from our side; where technically feasible you shall have the right to have the personal data transmitted directly from us to another controller * Request rectification, removal or restriction of your personal data * Where the data processing is based on your consent, refuse to provide and - without impact to data processing activities that have taken place before such withdrawal - withdraw your consent to processing of your personal data at any time * Take legal actions in relation to any potential breach of your rights regarding the processing of your personal data, as well as to 
lodge complaints before the competent data protection regulators * Not to be subject to any automated decision making, including profiling (automatic decisions based on data processing by automatic means, for the purpose of assessing several personal aspects) which produce legal effects on you or affect you with similar significance Further, you may be entitled to object, on grounds relating to your particular situation, at any time to processing of personal data concerning you, including to object to direct marketing and automated individual decision-making including profiling. In this case, please provide us with information about your particular situation. After the assessment of the facts presented by you we will either stop processing your personal data or present to you our compelling legitimate grounds for ongoing processing. You can exercise your rights by submitting a request at priv. Subject to legal and other permissible considerations, we will make every reasonable effort to honor your request promptly in accordance with applicable law or inform you if we require further information in order to fulfill your request. When processing your request, we may ask you for additional information to confirm or verify your identity and for security purposes, before processing and/or honoring your request. We reserve the right to charge a fee where permitted by law, for instance if your request is manifestly unfounded or excessive. In the event that your request would adversely affect the rights and freedoms of others (for example, would impact the duty of confidentiality we owe to others) or if we are legally entitled to deal with your request in a different way than initially requested, we will address your request to the maximum extent possible, all in accordance with applicable law. Please see the "California Residents" Section below for information on rights under California law. 
## Legal Basis Where applicable under the GDPR or similar laws, the legal basis for our collection and use of your personal data may include any of the following: * Performance of a contract: We process your personal data as necessary to perform our obligations under any contract with you, such as to provide our Website or services to you or complete transactions. * Consent: We may ask for your consent to use your personal data, including if we need your consent to engage in certain marketing activities. If we obtain your consent as a legal basis for processing, you may withdraw your consent at any time. * Legitimate interests: We have a legitimate interest in using your personal data for our business purposes, including operating, improving, and marketing our business, Website and services. * Compliance with a legal obligation: We may need to use your personal data to comply with applicable legal requirements. ## Data Storage and Transfers Where applicable under the GDPR or similar laws, we have implemented appropriate cross-border transfer mechanisms when transferring your personal data to a country outside of your home jurisdiction, including, where relevant, the EU Standard Contractual Clauses. ## Interaction With Third Parties We may link to or otherwise enable you to interact with a third party Website, mobile software applications and products or services that are not owned or controlled by us (each a "Third Party Service"). We are not responsible for the privacy practices or the content of such Third Party Services. Please be aware that Third Party Services can collect personal data from you. Accordingly, we encourage you to read the terms and conditions and privacy policies of each Third Party Service. ## Data Retention We retain your personal data as long as reasonably necessary for the respective purpose. 
In determining the criteria by which to retain or dispose of your personal data, we consider the type, sensitivity, context, and purpose of collecting the information. Chkk may additionally delete your personal data in response to a valid data subject request, as described below. ## Security of Your Information We maintain administrative, technical, and physical safeguards designed to protect against unauthorized access, use, modification, and disclosure of your personal data in our custody and control. No data, on the Internet or otherwise, can be guaranteed to be 100% secure. While we strive to protect your information from unauthorized access, use, or disclosure, Chkk cannot and does not ensure or warrant the security of your personal data. ## Children's Privacy Chkk does not knowingly collect or process personal data from children under the age of 13. The Website is not directed at children under the age of 13. In the event that we learn that we have collected personal data of a child under the age of 13 without parental consent, we will promptly take steps to delete that information. If you believe that we may have collected personal data from a child under 13, please contact us using the contact details outlined in this policy. ## No Processing for Automated Individual Decision-making Including Profiling We do not knowingly collect or process personal data for automated individual decision-making including profiling. ## Cookie Policy ### Use of Cookies and Other Web Technologies If your browser is configured to accept cookies, we may collect non-personally identifiable information passively using "cookies" and "page tags". It is Chkk's policy to respect your privacy regarding any information we may collect while operating our Website. Please read this policy carefully to understand how we handle and treat your personal data. 
### Cookies "Cookies" are small text files that can be placed on your computer or mobile device in order to identify your Web browser and the activities of your computer on the Chkk Service and other Websites. We use cookies to personalize your experience on the Chkk Website (such as dynamically generating content on webpages specifically designed for you), to assist you in using the Chkk Service (such as saving time by not having to reenter your name each time you use the Chkk Service), to allow us to statistically monitor how you are using the Chkk Service so that we can improve our offerings, and to determine the popularity of certain content. By using cookies and page tags together, we are able to improve the Chkk Service and measure the effectiveness of our advertising and marketing campaigns. ### Page Tags "Page tags," also known as web beacons or gif tags, are a web technology used to help track Website or email usage information, such as how many times a specific page or email has been viewed. Page tags are invisible to you, and any portion of the Chkk Service, including content, or email sent on our behalf, may contain page tags. ### Do I Have To Accept Them You do not have to accept cookies to use the Chkk Website or services. If you reject cookies, certain features or resources of the Chkk Website may not work properly or at all and you may have a degraded experience. Although most browsers are initially set to accept cookies, you can change your browser settings to notify you when you receive a cookie or to reject cookies generally. To learn more about how to control privacy settings and cookie management, click the link for your browser below. 
* [Microsoft Internet Explorer](https://support.microsoft.com/en-us/windows/change-security-and-privacy-settings-for-internet-explorer-11-9528b011-664c-b771-d757-43a2b78b2afe) * [Mozilla Firefox](https://support.mozilla.org/en-US/kb/delete-browsing-search-download-history-firefox#w_clear-cookies-and-data-for-a-specific-website) * [Google Chrome](https://support.google.com/accounts/answer/61416) * [Apple Safari](https://support.apple.com/en-us/105082) To learn more about cookies; how to control, disable or delete them, please visit [http://www.aboutcookies.org](http://www.aboutcookies.org). Some third party advertising networks, like Google, allow you to opt out of or customize preferences associated with your internet browsing. For more information on how Google lets you customize these preferences, see their documentation. All cookies, on our Website and everywhere else on the web, fall into one of five categories: * Essential * Advertising * Analytics & Customization * Performance & Functionality, and * Social Networking You are able to see the specific cookies we use and exercise choices about the types of cookies and other technologies you want to accept by selecting the "Manage Cookie Preferences" section of our website ([https://chkk.io](https://chkk.io)). ## Log Files We collect non-personal data through our Internet log files, which record data such as browser types, domain names, and other anonymous statistical data involving the use of the Chkk services. This information may be used to analyze trends, to administer the Chkk services, to monitor the use of the Chkk services, and to gather general demographic information. We may link this information to personal data for these and other purposes such as personalizing your experience on the Chkk services and evaluating the Chkk services in general. 
## Do Not Track (DNT) Settings We do not currently respond or take any action with respect to web browser "do not track" signals or other mechanisms that provide consumers the ability to exercise choice regarding the collection of personally identifiable information about an individual consumer's online activities over time and across third-party web sites or online services. We may allow third parties, such as companies that provide us with analytics tools, to collect personally identifiable information about an individual consumer's online activities over time and across different websites when a consumer uses the Services. ## California Residents If you are a California resident, your personal data may be covered by the California Consumer Privacy Act (CCPA). The below disclosures apply to the extent the CCPA applies to your personal data, subject to any applicable exemptions. ## "Personal Information" We Collect The categories of "personal information," as defined in the CCPA, that we collect include: Identifiers; Personal information categories listed in the California Customer Records statute (Cal. Civ. Code § 1798.80(e)); Commercial Information; Internet or other electronic network activity information; Audio, electronic, and visual information; Professional or employment-related information; and Inferences drawn from other personal information. Chkk may obtain, use, and share these data categories as detailed in the "Personal Data We Collect," "How We Use Your Personal Data," and "Data Sharing" sections of this Privacy Policy, above. ## Data Subject Rights You may be entitled to exercise some or all of the following rights under the CCPA: ### (i) Right to Know About Personal Data Collected, Disclosed, or Sold You may have the right to request that we provide certain information to you about our collection and use of your personal data over the past twelve (12) months. 
Specifically, you may have the right to request disclosure of: * The specific pieces of personal data we collected about you * The categories of personal data we collected about you * The categories of sources from which personal data was collected * Our business or commercial purpose for collecting or disclosing personal data, and * The categories of third parties with whom we shared personal data. ### (ii) Right to Request Deletion of Personal Data You may also have the right to request that we delete any of your personal data that we collected or maintain about you, subject to certain exceptions. ### (iii) Right to Correct Inaccurate Personal Data You may also have the right to request that we correct inaccurate personal data we maintain. ### (iv) Right to Non-Discrimination for the Exercise of a Consumer's Privacy Rights We will not unlawfully discriminate against you for exercising any of your applicable privacy rights. ### (v) Right to Opt Out of the Sale or Sharing of your Personal Data Chkk uses third party cookies and similar technologies to deliver targeted advertisements, also known as data "sharing" and/or "selling" under the CCPA, as further detailed in the "Cookie Policy" section above. You can opt out of these practices by turning off advertising cookies in the "Manage Cookie Preferences" section of our website ([https://chkk.io/](https://chkk.io/)). ## Exercise Your Rights You can exercise your rights by submitting a request at [privacy@chkk.io](mailto:privacy@chkk.io) or modifying your cookie preferences on [https://chkk.io/](https://chkk.io/). ## Response Timing and Format We will make our best effort to respond to a verifiable consumer request within 45 days of its receipt. If we require more time (up to 90 days), we will inform you of the reason and extension period in writing. Within ten (10) days of receiving the request, we will confirm receipt and provide information about our verification and processing of the request. 
Chkk will maintain records of consumer requests made pursuant to the CCPA as well as our response to said requests for a period of at least twenty-four (24) months. ## Your Rights Under Other California Statutes In addition to your rights under the CCPA, California Civil Code Section 1798.83 permits California residents to request information regarding our disclosure, if any, of their personal data to third parties for their direct marketing purposes. If this applies, you may obtain the categories of personal data shared and the names and addresses of all third parties that received personal data for their direct marketing purposes during the immediately prior calendar year. If you are a California resident under the age of 18 and a registered user, California Business and Professions Code Section 22581 permits you to remove content or personal data you have publicly posted. If you wish to remove such content or personal data please submit a request here and if you specify which content or personal data you wish to be removed, we will do so in accordance with applicable law. Please be aware that after removal you may not be able to restore removed content. In addition, such removal does not ensure complete or comprehensive removal of the content or personal data you have posted and that there may be circumstances in which the law does not require us to enable removal of content. You may submit this request by contacting us at [privacy@chkk.io](mailto:privacy@chkk.io). ## Updates to This Policy We may update this Privacy Policy from time to time. If we modify our Privacy Policy, we will post the revised version here, with an updated revision date. You may visit these pages periodically to be aware of and review any such revisions. If we make material changes to our Privacy Policy, we may also notify you by other means prior to the changes taking effect, such as by posting a notice on our Website or sending you a direct notification. 
## Contact Us Please feel free to contact us at any time if you have any questions or comments about this Privacy Policy. Contact our Data Protection Officer at: [privacy@chkk.io](mailto:privacy@chkk.io) Contact the Controller for the processing of this Website at: Chkk, 440 North Wolfe Road, Sunnyvale, CA, 94085 ([privacy@chkk.io](mailto:privacy@chkk.io)) # Subprocessors and Third-Party Security Source: https://docs.chkk.io/security/subprocessors Chkk partners with carefully selected providers to deliver a high-quality, secure service. These subprocessors support various functions, including **infrastructure hosting, analytics, and payment processing**. Before onboarding any vendor that will store or process customer data, we conduct a rigorous due diligence process as outlined in our **Third-Party Management Policy**. This evaluation assesses the vendor's security measures, data protection capabilities, and compliance posture. Once approved, each subprocessor signs a **Data Protection Agreement (DPA)** that defines their responsibilities regarding confidentiality and security. We continuously monitor our subprocessors to ensure they uphold our security standards. This includes reviewing their **certifications (such as ISO 27001 and SOC 2), assessing their internal security posture, and evaluating their incident response readiness**. We also require subprocessors to implement **least-privilege access controls**, ensuring they only have access to the minimal data necessary for their function. Additionally, we encourage frequent security reviews and audits to quickly identify and mitigate potential vulnerabilities. Where feasible, we implement **data segregation measures** to limit third-party access to only the data required for their role. For a comprehensive list of approved subprocessors and the services they provide, visit the [**Chkk Trust Center**](/security/trust-center). 
There, you will find details on their **geographic location, types of personal data processed, and relevant compliance credentials**. In the event of significant changes to our subprocessor list, we will notify customers promptly to maintain transparency and trust. If you have any questions about a specific vendor or would like more details on our third-party risk management process, reach out to us at [privacy@chkk.io](mailto:privacy@chkk.io) or consult the Chkk Trust Center. # Terms of Service Source: https://docs.chkk.io/security/tos ## Master Subscription Agreement Updated October 2023 This Chkk Master Subscription Agreement ("MSA") is effective as of the effective date of an applicable signed order form ("Order Form" and such date the "Effective Date") and is by and between Chkk Inc., a Delaware corporation with a place of business at 440 North Wolfe Road, Sunnyvale, CA, 94085 ("Chkk"), and the customer (i) set forth on the Order Form or (ii) who registers for the Services on a free trial basis ("Trial Services"), and in each case, accepts this MSA (each, a "Customer") (each a "Party" and together the "Parties"). In the event of any inconsistency or conflict between the terms of the MSA and the terms of any Order Form, the terms of the Order Form control. If Customer is provided with access to the Services on a free trial basis, the section of this Agreement entitled "Free Trial Services" will govern such access and, unless otherwise indicated on an applicable Order Form, certain of Chkk's obligations under this MSA will not apply, as further described below. ### Section 1. Services The "Services" mean the products and services that are ordered by Customer from Chkk in an Order Form referencing this MSA or, if applicable, the Trial Services that are made available to Customer. Services exclude any products or services provided by third parties, even if Customer has connected those products or services to the Services. 
Subject to the terms and conditions of this MSA, Chkk will make the Services available to Customer during the Term. ### Section 2. Fees and Payment. **2.1. Fees.** Customer will pay the fees specified in the Order Form (the "Fees"). **2.2. Payment; Taxes.** Customer shall keep a payment method on file with Chkk for payment of Fees. Chkk shall invoice Customer for Fees, either within the Services or directly, within thirty (30) days of the Effective Date, the start of the Renewal Term (as defined below), or otherwise as specified in the Order Form. Customer shall pay all invoiced Fees (i) charged automatically via the payment method associated with Customer's account for the Services or (ii) if agreed otherwise in writing by both parties, upon receipt of such invoice. In the event of non-payment of Fees by Customer for thirty (30) days after the due date of an invoice, Customer's access to the Services may be immediately suspended and Customer must pay the entire remaining balance of Fees to regain access to the Services. Fees do not include local, state, or federal taxes or duties of any kind and any such taxes will be assumed and paid by Customer, except for taxes on Chkk based on Chkk's income or receipts. **2.3. Price Changes.** Chkk may change prices for the Services from time to time, in its sole discretion. Any price changes will be effective upon the commencement of Customer's next Renewal Term; provided, that Chkk shall provide Customer with reasonable notice of any such fee increase prior to the expiration of the Term or any Renewal Term. **2.4. Discounts and Promotional Pricing.** Prices specified in the Order Form may include discounts or promotional pricing. These discounts or promotional pricing amounts may be temporary and may expire upon the commencement of a Renewal Term, without additional notice. Chkk reserves the right to discontinue or modify any promotion, sale or special offer at its sole and reasonable discretion. 
**2.5 Free Trial Services.** If Customer is granted access to Trial Services, Chkk will make the applicable Trial Services available to Customer pursuant to this MSA starting from the time that Customer registers and is approved for such Trial Services until the earlier of: (a) the end of the Trial Services period communicated to Customer; (b) the start date of any Order Form executed by Customer for Service(s) in exchange for payment; or (c) termination by Chkk in its sole discretion. ANY CUSTOMER INFORMATION THAT CUSTOMER PROVIDES OR MAKES AVAILABLE TO CHKK DURING THE PROVISION OF TRIAL SERVICES MAY BE PERMANENTLY DELETED, AT CHKK'S DISCRETION, UNLESS CUSTOMER EXECUTES AN ORDER FORM FOR THE SAME SERVICES AS THOSE COVERED BY THE TRIAL SERVICES OR EXPORTS SUCH CUSTOMER INFORMATION BEFORE THE END OF THE TRIAL SERVICES PERIOD. NOTWITHSTANDING THE "REPRESENTATIONS, WARRANTIES AND DISCLAIMERS" SECTION AND "INDEMNIFICATION" SECTION BELOW, FREE TRIAL SERVICES ARE PROVIDED "AS-IS" WITHOUT ANY WARRANTY AND CHKK SHALL HAVE NO INDEMNIFICATION OBLIGATIONS NOR LIABILITY OF ANY TYPE WITH RESPECT TO THE TRIAL SERVICES UNLESS SUCH EXCLUSION OF LIABILITY IS NOT ENFORCEABLE UNDER APPLICABLE LAW IN WHICH CASE CHKK'S LIABILITY WITH RESPECT TO THE TRIAL SERVICES SHALL NOT EXCEED \$1,000.00. NOTWITHSTANDING ANYTHING TO THE CONTRARY IN THE "LIMITATION OF LIABILITY" SECTION BELOW, CUSTOMER SHALL BE FULLY LIABLE UNDER THIS AGREEMENT TO CHKK AND ITS AFFILIATES FOR ANY DAMAGES ARISING OUT OF CUSTOMER'S USE OF THE TRIAL SERVICES, ANY BREACH BY CUSTOMER OF THIS AGREEMENT AND ANY OF CUSTOMER'S INDEMNIFICATION OBLIGATIONS HEREUNDER. ### Section 3. Term and Termination. **3.1. 
Term and Renewal.** This MSA commences on the Effective Date and will remain in effect through the term specified in the Order Form (or, in the case of Trial Services, for the period of time as agreed upon between Chkk and Customer), and will renew as specified in the Order Form unless otherwise terminated in accordance with this Section (collectively the "Term"). If the Order Form does not specify, the Term will be one year and will automatically renew for successive one-year periods unless Customer provides Chkk with notice of termination at least thirty (30) days prior to the end of the Term (a "Renewal Term"). **3.2. Termination for Cause.** A Party may terminate this MSA for cause (a) immediately upon notice to the other Party of a material breach if such breach remains uncured after thirty (30) days from the date of the breaching Party's receipt of such notice; (b) immediately upon notice to the other Party if the other Party becomes the subject of a petition in bankruptcy or any other proceeding relating to insolvency, receivership, liquidation or assignment for the benefit of creditors; or (c) immediately upon written notice by Chkk to Customer for any use of the Services in violation of Section 4.5 (Prohibited Uses) below. Non-payment of Fees by Customer for sixty (60) days after issuance of an invoice, and any violation of the Prohibited Uses clause below will be considered material breaches of this MSA. **3.3. Effect of Termination and Survival.** Upon termination of an Order Form or this MSA (a) with respect to termination of the entire MSA, all Order Forms will concurrently terminate, (b) Customer will have no further right to, and shall cease and ensure its Authorized Users cease, use the Services under the terminated or cancelled Order Forms and Chkk will remove Customer's access to same, and (c) unless otherwise specified in writing, Customer will not be entitled to any refund of fees paid. 
The following Sections will survive termination: Section 2 (Fees and Payment), Section 4 (Ownership), Section 5 (Confidentiality), Section 7.3 (Disclaimers), Section 8 (Indemnification), Section 9 (Limitation of Liability), and Section 10 (Miscellaneous). Termination of this MSA will not limit a Party's liability for obligations accrued as of or prior to such termination or for any breach of this MSA. ### Section 4. Ownership, License, and Use of the Services. **4.1. Ownership.** Each Party will retain all rights, title and interest in any of its patents, inventions, copyrights, trademarks, domain names, trade secrets, know-how and any other intellectual property and/or proprietary rights ("Intellectual Property Rights"). Chkk will retain Intellectual Property Rights in the Services and all components of, or used to, provide the Services or created by the Services or by Chkk in the course of providing the Services (the "Services Information"). Customer will retain Intellectual Property Rights in all information it provides to Chkk as part of this MSA (other than Feedback as described below), including but not limited to in the course of its use of the Services (the "Customer Information"). **4.2. Feedback.** Customer may, under this MSA, provide suggestions, enhancement requests, recommendations about the Services, or other feedback to Chkk (the "Feedback"). Customer provides Chkk a fully paid-up, royalty-free, worldwide, transferable, sub-licensable (through multiple layers), assignable, irrevocable and perpetual license to implement, use, modify, commercially exploit, incorporate into the Services, or otherwise use any Feedback. Chkk also reserves the right to seek intellectual property protection for any features, functionality or components that may be based on or that were initiated by such Feedback. **4.3. 
Licenses.** Chkk hereby grants Customer a non-exclusive, non-transferable (except as set forth in Section 10.2), non-sublicensable right to and license during the Term to access and use, and permit Authorized Users to access and use, the Services as set forth in the Order Form or on a Trial Services basis all subject to the terms and conditions of this MSA and the Order Form (if applicable). Customer hereby grants Chkk a non-exclusive, non-transferable (except as set forth in Section 10.2) right and license to use, including through the use of subcontractors, the Customer Information solely to provide the Services to Customer. **4.4. Authorized Users.** Customer may designate and provide access to the Services to employees, agents, or authorized contractors (each an "Authorized User"). Customer is responsible for all use and misuse of the Services by Authorized Users and for adherence to all terms of this MSA by any Authorized Users, and references to Customer herein will be deemed to apply to Authorized Users as necessary and applicable. Customer agrees to promptly notify Chkk of any unauthorized access or use of which Customer becomes aware. Authorized Users are strictly prohibited from sharing their accounts or account passwords and their doing so is a material breach of this MSA by Customer. **4.5. 
Prohibited Uses.** Customer and Authorized Users will not: (a) "frame," distribute, resell, or permit access to the Services by any third party other than as allowed by the features and functionality of the Services; (b) use the Services in violation of applicable laws; (c) interfere with, disrupt, or gain unauthorized access to the Services; (d) successfully or otherwise, attempt to: decompile, disassemble, reverse engineer, discover the underlying source code or structure of, or copy the Services; (e) provide Chkk any Customer Information or Feedback that is unlawful, defamatory, harassing, discriminatory, or infringing of third party intellectual property rights; (f) transfer to the Services or otherwise use on the Services any code, exploit, or undisclosed feature that is designed to delete, disable, deactivate, interfere with or otherwise harm or provide unauthorized access to the Services; (g) use any robot, spider, data scraping, or extraction tool or similar mechanism with respect to the Services; (h) provide access to the Services to an individual associated with a Chkk Competitor (defined below); (i) extract information from the Services in furtherance of competing with Chkk; (j) encumber, sublicense, transfer, rent, lease, time-share or use the Services in any service bureau arrangement or otherwise for the benefit of any third party; (k) copy, distribute, manufacture, adapt, create derivative works of, translate, localize, port or otherwise modify any aspect of the Services; (l) introduce into the Services any software containing a virus, worm, "back door," Trojan horse or similarly harmful code; or (m) permit any third party to engage in any of the foregoing proscribed acts. A "Chkk Competitor" is any entity that provides the same or similar goods and services to those provided by Chkk, as would be determined by a commercially reasonable individual. 
Customer will promptly notify Chkk of any violations of the above prohibited uses by an Authorized User or a third party and require such Authorized User or third party to immediately cease any such use. Chkk reserves the right to suspend Customer and/or Authorized User's access to the Services in the event Chkk suspects Customer or an Authorized User is in breach of this MSA. **4.6 Usage Data.** Chkk may collect Usage Data and use it to operate, improve, and support the Services, and for other lawful business practices; however, Chkk will not disclose Usage Data externally unless it is (a) de-identified so that it does not identify Customer, its Authorized Users or any other person and (b) aggregated with data across other customers. "Usage Data" means technical logs, metrics and performance data, which may be derived from or include Customer Information (or part thereof) relating to the operation, delivery and use of the Services. ### Section 5. Confidentiality. "Confidential Information" of a Party (the "Disclosing Party") means all financial, technical, or business information of the Disclosing Party that the Disclosing Party designates as confidential at the time of disclosure to the other Party (the "Receiving Party") or that the Receiving Party reasonably should understand to be confidential based on the nature of the information or the circumstances surrounding its disclosure. For the avoidance of doubt, Confidential Information of: (1) Chkk shall include the Services Information, and (2) Customer shall include Customer Information. Except as expressly permitted in this MSA, the Receiving Party will not disclose, duplicate, publish, transfer or otherwise make available Confidential Information of the Disclosing Party in any form to any person or entity without the Disclosing Party's prior written consent. 
The Receiving Party will not use the Disclosing Party's Confidential Information except to perform its obligations under this MSA, such obligations including, in the case of Chkk, to provide the Services. Notwithstanding the foregoing, the Receiving Party may disclose Confidential Information to the extent required by law, provided that the Receiving Party: (a) gives the Disclosing Party prior written notice of such disclosure so as to afford the Disclosing Party a reasonable opportunity to appear, object, and obtain a protective order or other appropriate relief regarding such disclosure (if such notice is not prohibited by applicable law); (b) uses diligent efforts to limit disclosure and to obtain confidential treatment or a protective order; and (c) allows the Disclosing Party to participate in the proceeding. Further, Confidential Information does not include any information that: (i) is or becomes generally known to the public without the Receiving Party's breach of any obligation owed to the Disclosing Party; (ii) was independently developed by the Receiving Party without use or reference to the Disclosing Party's confidential information and without the Receiving Party's breach of any obligation owed to the Disclosing Party; or (iii) is received from a third party who obtained such Confidential Information without any third party's breach of any obligation owed to the Disclosing Party. ### Section 6. Privacy and Security Practices. Chkk's current security practices (the "Security Statement"), privacy, and data protection practices are set forth at [https://www.chkk.io/privacy-policy](https://www.chkk.io/privacy-policy) (the "Privacy Policy"). ### Section 7. Representations, Warranties, and Disclaimers. **7.1. Authority.** Each Party represents that it has validly entered into this MSA and has the legal power to do so. **7.2. 
Warranties.** Chkk warrants that during an applicable Term (a) the Security Statement accurately describes the applicable administrative, physical, and technical safeguards for protection of the security, confidentiality, and integrity of Customer Information; and (b) the Services will perform materially in accordance with any applicable documentation provided to Customer. For any breach of a warranty in this section, Customer's sole and exclusive remedy is Customer's right to terminate this Agreement for Chkk's uncured material breach in accordance with Section 3.2(a) (Term and Termination) herein. **7.3. Disclaimers.** EXCEPT AS SPECIFICALLY SET FORTH IN THIS SECTION, THE SERVICES, INCLUDING ALL SERVER AND NETWORK COMPONENTS, ARE PROVIDED ON AN "AS IS" AND "AS AVAILABLE" BASIS, WITHOUT ANY WARRANTIES OF ANY KIND TO THE FULLEST EXTENT PERMITTED BY LAW, AND CHKK EXPRESSLY DISCLAIMS ANY AND ALL WARRANTIES, WHETHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, ANY IMPLIED WARRANTIES OF MERCHANTABILITY, TITLE, FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT. CUSTOMER ACKNOWLEDGES THAT CHKK DOES NOT WARRANT THAT THE SERVICES WILL BE UNINTERRUPTED, TIMELY, SECURE, ERROR-FREE, OR FREE FROM VIRUSES OR OTHER MALICIOUS SOFTWARE, AND NO INFORMATION OR ADVICE OBTAINED BY CUSTOMER FROM CHKK OR THROUGH THE SERVICES SHALL CREATE ANY WARRANTY NOT EXPRESSLY STATED IN THIS MSA. THE PARTIES ADDITIONALLY AGREE THAT CHKK WILL HAVE NO LIABILITY OR RESPONSIBILITY FOR CLIENT'S VARIOUS COMPLIANCE PROGRAMS, AND THAT THE SERVICES, TO THE EXTENT APPLICABLE, ARE ONLY TOOLS FOR ASSISTING CLIENT IN MEETING THE VARIOUS COMPLIANCE OBLIGATIONS FOR WHICH IT SOLELY IS RESPONSIBLE. ### Section 8. Indemnification. **8.1. 
Indemnification by Chkk.** Chkk will indemnify and hold Customer harmless from any third party claim against Customer alleging that Customer's use of the Services as authorized in this MSA infringes or misappropriates a third party's valid patent, copyright, trademark, or trade secret. Chkk will, at its expense, defend such claim and pay damages finally awarded against Customer in connection therewith, including the reasonable fees and expenses of the attorneys engaged by Chkk for such defense, provided that (a) Customer promptly notifies Chkk of the threat or notice of such claim (provided that, a delay in providing notice does not excuse Chkk's obligations unless Chkk is prejudiced by such delay); (b) Chkk will have the sole and exclusive control and authority to select defense attorneys, and defend and/or settle any such claim (however, Chkk will not settle or compromise any claim that results in liability or admission of any liability by Customer without prior written consent); and (c) at Chkk's request and expense, Customer fully cooperates with Chkk in connection therewith. Customer may participate and retain its own counsel at its own expense. If use of a Service by Customer has become, or, in Chkk's opinion, is likely to become, the subject of any such claim, Chkk may, at its option and expense, (i) procure for Customer the right to continue using the Service(s) as set forth hereunder; (ii) replace or modify a Service to make it non-infringing; or (iii) if options (i) or (ii) are not commercially reasonable or practicable as determined by Chkk, terminate this MSA and repay, on a pro-rata basis, any Fees previously paid to Chkk for the corresponding unused portion of the Term for related Services. 
Chkk will have no liability or obligation under this Section with respect to any claim if such claim is caused in whole or in part by (x) compliance with designs, data, instructions or specifications provided by Customer; (y) modification of the Services by anyone other than Chkk; or (z) the combination, operation or use of the Services with other hardware or software where the Services would not otherwise be infringing. The provisions of this Section state the sole, exclusive, and entire liability of Chkk to Customer and constitute Customer's sole remedy with respect to an infringement claim brought by reason of access to or use of a Service by Customer or Authorized Users. Notwithstanding anything to the contrary herein, Chkk shall have no obligation under this Section 8.1 with respect to Trial Services. **8.2. Indemnification by Customer.** Customer will indemnify and hold Chkk harmless against any third party claim arising out of (a) Prohibited Uses in breach of this MSA as set forth above; or (b) alleging that Customer Information infringes or misappropriates a third party's valid patent, copyright, trademark, or trade secret; provided (i) Chkk promptly notifies Customer of the threat or notice of such claim (provided that, a delay in providing notice does not excuse the Customer's obligations unless the Customer is prejudiced by such delay); (ii) Customer will have the sole and exclusive control and authority to select defense attorneys, and defend and/or settle any such claim (however, Customer will not settle or compromise any claim that results in liability or admission of any liability by Chkk without prior written consent); and (iii) at Customer's request and expense, Chkk fully cooperates in connection therewith. Chkk may participate and retain its own counsel at its own expense. ### SECTION 9. LIMITATION OF LIABILITY. 
TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, AND EXCEPT FOR A PARTY'S INDEMNIFICATION OBLIGATIONS AND DEFENSE OBLIGATIONS, BREACHES OF CONFIDENTIALITY, OR FOR DAMAGES DUE TO PROHIBITED USES (COLLECTIVELY, "EXCLUDED CLAIMS"), UNDER NO CIRCUMSTANCES AND UNDER NO LEGAL THEORY (WHETHER IN CONTRACT, TORT, NEGLIGENCE OR OTHERWISE) WILL EITHER PARTY TO THIS MSA, OR THEIR AFFILIATES, OFFICERS, DIRECTORS, EMPLOYEES, AGENTS, SERVICE PROVIDERS, SUPPLIERS OR LICENSORS BE LIABLE TO THE OTHER PARTY OR ANY AFFILIATE FOR ANY LOST PROFITS, LOST SALES OR BUSINESS, LOST DATA (BEING DATA LOST IN THE COURSE OF TRANSMISSION VIA CUSTOMER'S SYSTEMS OR OVER THE INTERNET THROUGH NO FAULT OF CHKK), BUSINESS INTERRUPTION, LOSS OF GOODWILL, COSTS OF COVER OR REPLACEMENT, OR FOR ANY TYPE OF INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, CONSEQUENTIAL OR PUNITIVE LOSS OR DAMAGES, OR ANY OTHER INDIRECT LOSS OR DAMAGES INCURRED BY THE OTHER PARTY OR ANY AFFILIATE IN CONNECTION WITH THIS MSA OR THE SERVICES REGARDLESS OF WHETHER SUCH PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF OR COULD HAVE FORESEEN SUCH DAMAGES. NOTWITHSTANDING ANYTHING TO THE CONTRARY IN THIS MSA, EITHER PARTY'S AGGREGATE LIABILITY TO THE OTHER PARTY OR ANY THIRD PARTY ARISING OUT OF THIS MSA OR THE SERVICES WILL IN NO EVENT EXCEED THE FEES PAID OR PAYABLE BY CUSTOMER DURING THE TWELVE (12) MONTHS PRIOR TO THE FIRST EVENT OR OCCURRENCE GIVING RISE TO SUCH LIABILITY; PROVIDED THAT A PARTY'S LIABILITY ARISING FROM EXCLUDED CLAIMS WILL NOT IN THE AGGREGATE EXCEED TWO TIMES THAT AMOUNT. FOR CLARITY, NOTHING IN THIS MSA WILL LIMIT OR EXCLUDE EITHER PARTY'S LIABILITY FOR GROSS NEGLIGENCE OR INTENTIONAL MISCONDUCT OF A PARTY. CUSTOMER ACKNOWLEDGES AND AGREES THAT THE ESSENTIAL PURPOSE OF THIS SECTION IS TO ALLOCATE THE RISKS UNDER THIS MSA BETWEEN THE PARTIES AND LIMIT POTENTIAL LIABILITY GIVEN THE FEES, WHICH WOULD HAVE BEEN SUBSTANTIALLY HIGHER IF CHKK WERE TO ASSUME ANY FURTHER LIABILITY OTHER THAN AS SET FORTH HEREIN. 
CHKK HAS RELIED ON THESE LIMITATIONS IN DETERMINING WHETHER TO PROVIDE CUSTOMER WITH THE RIGHTS TO ACCESS AND USE THE SERVICES PROVIDED FOR IN THIS MSA. THE DISCLAIMERS, EXCLUSIONS, AND LIMITATIONS OF LIABILITY UNDER THIS AGREEMENT WILL NOT APPLY TO THE EXTENT PROHIBITED BY APPLICABLE LAW. ### Section 10. Miscellaneous. **10.1. Entire Agreement.** This MSA and any active Order Forms, if applicable, constitute the entire agreement, and supersede all prior agreements, between Chkk and Customer regarding the subject matter hereof. **10.2. Assignment.** Except as otherwise expressly permitted in this MSA, neither party may assign its rights or obligations under this MSA without the other party's prior written consent (not to be unreasonably withheld, conditioned or delayed). Either Party may, without the consent of the other Party, assign this MSA to any affiliate or in connection with any merger, change of control, or the sale of all or substantially all of such Party's assets provided that (1) the other Party is provided prior notice of such assignment and (2) any such successor agrees to fulfill its obligations pursuant to this MSA. Subject to the foregoing restrictions, this MSA will be fully binding upon, inure to the benefit of and be enforceable by the Parties and their respective successors and assigns. Any attempted assignment or transfer of this MSA in contravention of the foregoing shall be null and void. **10.3. Severability.** If any provision in this MSA is held by a court of competent jurisdiction to be unenforceable, such provision will be modified by the court and interpreted so as to best accomplish the original provision to the fullest extent permitted by law, and the remaining provisions of this MSA will remain in effect. **10.4. Relationship of the Parties.** The Parties are independent contractors. This MSA does not create a partnership, franchise, joint venture, agency, fiduciary, or employment relationship between the Parties. **10.5. 
Notices.** All notices provided by Chkk to Customer under this MSA may be delivered in writing (a) by nationally recognized overnight delivery service ("Courier") or U.S. mail to the contact mailing address provided by Customer on the Order Form; or (b) electronic mail to the electronic mail address provided for Customer's account owner. Customer must give notice to Chkk either in writing by Courier or U.S. mail to 440 North Wolfe Road, Sunnyvale, CA, 94085 Attn: Legal Department or by email to [legal@chkk.io](mailto:legal@chkk.io). All notices shall be deemed to have been given immediately upon delivery by electronic mail; or, if otherwise delivered upon the earlier of receipt or two (2) business days after being deposited in the mail or with a Courier as permitted above. **10.6. Governing Law, Jurisdiction, Venue.** This MSA will be governed by the laws of the State of California, without reference to conflict of laws principles. Any disputes under this MSA shall be resolved in a court of general jurisdiction in San Francisco County, California. Customer hereby expressly agrees to submit to the exclusive personal jurisdiction and venue of such courts for the purpose of resolving any dispute relating to this MSA or access to or use of the Services by Customer, its agents, or Authorized Users. **10.7. Export Compliance.** The Services and other software or components of the Services that Chkk may provide or make available to Customer are subject to U.S. export control and economic sanctions laws as administered and enforced by the Office of Foreign Assets and Control of the United States Department of Treasury. Customer will not access or use the Services if Customer or any Authorized Users are located in any jurisdiction in which the provision of the Services, software, or other components is prohibited under U.S. 
or other applicable laws or regulations (a "Prohibited Jurisdiction") and Customer will not provide access to the Services to any government, entity, or individual located in any Prohibited Jurisdiction. Customer represents and warrants that (a) it is not named on any U.S. government list of persons or entities prohibited from receiving U.S. exports, or transacting with any U.S. person; (b) it is not a national of, or a company registered in, any Prohibited Jurisdiction; (c) it will not permit any individuals under its control to access or use the Services in violation of any U.S. or other applicable export embargoes, prohibitions or restrictions; and (d) it will comply with all applicable laws regarding the transmission of technical data exported from the United States and the countries in which it and Authorized Users are located. **10.8. Anti-Corruption.** Customer represents and agrees that it has not received or been offered any illegal or improper bribe, kickback, payment, gift, or thing of value from any of Chkk's employees or agents in connection with this MSA. Reasonable gifts and entertainment provided in the ordinary course of business do not violate the above restriction. If Customer learns of any violation of the above restriction, Customer will use reasonable efforts to promptly give notice to Chkk. **10.9. Publicity and Marketing.** Chkk may use Customer's name, logo, and trademarks solely to identify Customer as a client of Chkk on Chkk's website and other marketing materials and in accordance with Customer's trademark usage guidelines. Chkk may share aggregated and/or anonymized information regarding use of the Services with third parties for marketing purposes to develop and promote Services. Chkk will never disclose aggregated and/or anonymized information to a third party in a manner that would identify Customer or any identifiable individual as the source of the information. **10.10. 
Amendments.** Chkk may amend this MSA from time to time, in which case the new MSA will supersede prior versions. Chkk will notify Customer not less than ten (10) days prior to the effective date of any such amendment and Customer's continued use of the Services following the effective date of any such amendment may be relied upon by Chkk as consent to any such amendment. **10.11. Waiver.** Chkk's failure to enforce at any time any provision of this MSA does not constitute a waiver of that provision or of any other provision of this MSA. # Trust Center and Transparency Source: https://docs.chkk.io/security/trust-center Chkk is committed to transparency in our security practices, ensuring customers have access to clear, comprehensive information about how we protect their data. The [**Chkk Trust Center**](https://trust.chkk.io/) serves as a central resource where customers can review security documentation, compliance reports, and details about our security programs. We provide up-to-date **compliance certifications, penetration testing reports, and security questionnaires**, giving customers the assurance that our platform meets industry standards. Our commitment to transparency extends to **third-party security audits**, which validate our adherence to best practices and regulatory requirements. Customers can access detailed documentation on **encryption practices, access controls, and incident response procedures**, ensuring they understand the security measures in place. Our Trust Center also offers **frequently asked questions (FAQs) and best practice guides**, providing guidance on securely integrating Chkk into enterprise environments. For organizations with specific compliance requirements, we facilitate security assessments and provide necessary artifacts under appropriate agreements. Our security team is available to address concerns and collaborate on security reviews. 
To explore the full range of security documentation, compliance resources, and best practices, visit the [**Chkk Trust Center**](https://trust.chkk.io/) for the latest updates and insights into our security posture. Our Trust Center provides customers with direct access to key security and compliance documents, offering a comprehensive view into our security posture. These include: * **SOC 2 Type II Reports** * **Penetration Testing Summary** * **CAIQv4.0.3** * **Product Security Overview Documents** * **Chkk Architecture and Dataflow Diagram** * **W9** * **Certificate of Liability Insurance** * *And more* These resources provide customers with a transparent view into how Chkk safeguards data and upholds security best practices. Visit the [**Chkk Trust Center**](https://trust.chkk.io/) to explore these documents and stay informed on our latest security updates. Chkk Trust Center is only accessible to existing and prospective customers. # Avoid 6x Extended Support Fees Source: https://docs.chkk.io/usecases/avoid-extended-support-fees The 500% surcharge for running outdated Kubernetes versions on services like EKS, AKS, and GKE can have a significant financial impact on organizations, and Chkk directly addresses this issue through its platform. ## Impact of the 500% Surcharge * **Hefty cost increases**: Starting in 2024, organizations using services such as Amazon EKS, Google GKE, and Azure AKS are subject to surcharges of up to 500% for running outdated Kubernetes versions. This means that if an organization fails to upgrade to the latest supported version of Kubernetes, their costs can increase dramatically. * **Compliance risk**: Outdated software versions pose security and operational safety risks, and can lead to non-compliance with industry standards. 
* **Resource strain**: Upgrading Kubernetes clusters and tens of add-ons, application services, and Kubernetes operators running in a cluster is a complex and resource-intensive task, and without proper tools, teams may struggle to keep up with the required upgrades, leading to higher costs. ## How Chkk Addresses the Surcharge Issue Chkk's Operational Safety Platform is designed to help organizations avoid these surcharges and maintain their systems efficiently. * **Streamlined upgrade process**: Chkk automates many tasks associated with Kubernetes upgrades, such as researching dependencies and release notes across hundreds of add-on, application service, and Kubernetes operator versions, which cuts down research and planning time by up to 8x. Chkk's Upgrade Copilot automates the tedious pre-work, delivers Preverified Upgrade Plans tested on a digital twin of the infrastructure, and provides detailed execution steps with automated checks. * **Reduced effort and time**: By automating pre-work, Chkk reduces upgrade efforts from what could take 10 FTE-days over four weeks to less than 2 FTE-days over two weeks. This efficiency means that upgrades can be completed more quickly, preventing teams from falling behind and incurring surcharges. * **Timely upgrades**: Chkk ensures that users perform timely upgrades, thus avoiding surcharges. It does this by providing a comprehensive technical roadmap for each cluster and delivering Preverified Upgrade Plans tailored to the current state of the system. * **Proactive alerts**: Chkk maintains an accurate inventory of all clusters, add-ons, application services, and Kubernetes operators, and alerts users to existing and upcoming End-of-Life (EOL) software. By identifying outdated software, Chkk enables organizations to upgrade before incurring the 500% surcharge. * **Cost savings**: By avoiding the surcharges and streamlining the upgrade process, Chkk can save organizations a significant amount of money. 
Chkk can save up to \$450,000 annually for every 100 clusters by avoiding these surcharges.
* **Enhanced compliance**: By performing timely upgrades, Chkk helps organizations avoid vulnerabilities, ensure vendor support, and stay compliant with industry standards, thus avoiding additional costs and risks.

Chkk directly addresses the financial impact of the 500% surcharge for outdated Kubernetes versions by providing tools and features that enable timely and efficient upgrades, thereby avoiding unnecessary costs, compliance issues, and operational risks. Chkk can save organizations considerable resources and help them stay ahead of upgrade deadlines.

# Delegate, Parallelize, and Standardize Workflows
Source: https://docs.chkk.io/usecases/delegate-parallelize-standardize-workflows

Delegate, Parallelize, and Standardize Workflows: Standardization and delegation are critical for efficient Kubernetes operations, and Chkk's platform is designed to support these concepts, leading to enhanced organizational productivity and agility.

## The Importance of Standardization

* **Consistency and Reduced Errors**: Standardizing workflows and processes ensures that all teams use the same methods and tools, reducing the likelihood of errors and inconsistencies. When every team follows the same standardized procedures for upgrades, for example, there are fewer chances of human error, misconfigurations, or omissions.
* **Knowledge Sharing and Reuse**: Standardization enables the sharing of best practices, templates, and knowledge across an organization. Chkk provides customized templates and contextualized insights that can be shared organization-wide, preventing redundant effort and ensuring that everyone is on the same page.
* **Efficiency**: With standardized processes, teams can execute tasks more efficiently. For instance, Chkk standardizes upgrade tooling and processes, leading to reduced research and planning time.
This prevents the duplication of effort that often occurs when different teams work in isolation.
* **Centralized System of Record**: A standardized approach allows for the creation of a centralized system of record, where all work and knowledge are documented and easily accessible. This is crucial during reorganizations or team changes, as it helps to retain institutional knowledge and reduces the time required to find information, which minimizes context switching.

## The Importance of Delegation

* **Empowering Teams**: Delegation enables organizations to distribute tasks and responsibilities across different team members, thus empowering teams. When tasks are clearly defined, team members can take ownership and make decisions within their assigned roles, which enhances efficiency.
* **Focus on Strategic Initiatives**: By delegating routine tasks, experts can be freed up to focus on more strategic initiatives, which include innovation projects and problem-solving. Chkk's platform streamlines and simplifies complex tasks, allowing teams to delegate these tasks confidently.
* **Scalability**: Delegation is essential for scaling operations, as it allows organizations to handle a larger workload without overburdening their experts. Standardization creates repeatable workflows that are easier to delegate and scale.
* **Reduced Bottlenecks**: When experts can delegate tasks, it reduces bottlenecks that can delay operations. Standardized processes make it possible for a larger number of team members to carry out upgrades and maintenance, which results in parallel workflows.

## How Chkk's Platform Supports Standardization and Delegation

* **Upgrade Copilot**: Chkk's Upgrade Copilot provides preverified upgrade plans with detailed steps that are standardized and tested. These plans are tested on a digital twin of the infrastructure, which ensures that the upgrade process is consistent and safe.
They include automated pre-flight and post-flight checks which standardize the execution and verification of the upgrade process. These plans can be used across teams, ensuring that all upgrades are performed in the same way.
* **Artifact Register**: Chkk's Artifact Register helps to standardize the way assets are tracked across different clusters and clouds. By providing a centralized view of all components, container images, repositories, and tools, it eliminates the need for manual and error-prone methods of tracking.
* **Knowledge Sharing**: Chkk facilitates knowledge sharing by providing a system of record for upgrades and maintenance. This helps to reduce context switching and improve productivity. Chkk captures best practices, release notes, and other relevant information in its Risk Signature Database, enabling the sharing of knowledge organization-wide.
* **Simplified Workflows**: The platform standardizes workflows, making it easier to delegate tasks to any team member. It simplifies complex tasks by automating tedious pre-work and providing step-by-step guidance, enabling the delegation of tasks that previously required expert knowledge.
* **Templates and Best Practices**: Chkk allows the sharing of customized templates and contextualized insights. This ensures that all teams adhere to the same best practices, improving overall operational safety and consistency.
* **Reduced Errors**: By standardizing tasks and delegating them to other team members, the likelihood of human errors is reduced. Chkk automates and verifies each step of the upgrade, which helps to prevent errors.

By supporting standardization and delegation, Chkk enhances organizational productivity and agility. Teams can work more efficiently, reduce the risk of errors, and focus on innovation, leading to better outcomes overall. Chkk's platform facilitates a more consistent and scalable approach to Kubernetes operations.
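
The pattern of wrapping an upgrade step in automated pre-flight and post-flight checks can be illustrated with a small sketch. This is not Chkk's implementation: `Check`, `run_step`, `upgrade_control_plane`, the sample checks, and the version strings are hypothetical names and values chosen for illustration, and the cluster state is a plain dictionary rather than a live cluster.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical cluster state; a real check would query the cluster instead.
ClusterState = Dict[str, object]


@dataclass
class Check:
    """A named predicate over the cluster state (illustrative only)."""
    name: str
    passes: Callable[[ClusterState], bool]


def failed_checks(checks: List[Check], state: ClusterState) -> List[str]:
    """Return the names of all checks that do not pass."""
    return [c.name for c in checks if not c.passes(state)]


def run_step(
    state: ClusterState,
    action: Callable[[ClusterState], ClusterState],
    preflight: List[Check],
    postflight: List[Check],
) -> ClusterState:
    """Apply `action` only if every pre-flight check passes, then verify."""
    blocked = failed_checks(preflight, state)
    if blocked:
        raise RuntimeError(f"pre-flight failed: {blocked}")
    new_state = action(state)
    broken = failed_checks(postflight, new_state)
    if broken:
        raise RuntimeError(f"post-flight failed: {broken}")
    return new_state


def upgrade_control_plane(state: ClusterState) -> ClusterState:
    """Stand-in for a real upgrade action; only bumps a version field."""
    return {**state, "version": "next"}


PREFLIGHT = [
    Check("all nodes ready", lambda s: s["ready_nodes"] == s["total_nodes"]),
    Check("no deprecated APIs in use", lambda s: not s["deprecated_apis"]),
]
POSTFLIGHT = [
    Check("version updated", lambda s: s["version"] == "next"),
    Check("all nodes ready", lambda s: s["ready_nodes"] == s["total_nodes"]),
]
```

Because the checks are data rather than tribal knowledge, the same `PREFLIGHT` and `POSTFLIGHT` lists can be handed to any team member, which is the delegation benefit this section describes.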
# Enhance Operational Safety
Source: https://docs.chkk.io/usecases/enhance-operational-safety

Chkk's Operational Safety Platform significantly enhances the operational safety posture of organizations using Kubernetes by offering a range of tangible technical benefits. The platform is designed to proactively identify, manage, and remediate risks, ensuring a more stable, secure, and efficient operational environment.

## Key Technical Benefits for Enhanced Operational Safety

* **Proactive Risk Detection**: Chkk identifies operational risks before they cause breakages, moving from a reactive to a proactive approach. The platform scans environments for configuration mistakes, incompatibilities, deprecations, and other risk factors. By detecting issues like feature flag renames in add-ons, application services, and Kubernetes operators, the platform alerts teams to potential problems before they escalate into incidents. This proactive identification is facilitated by Chkk's Risk Ledger, which is tailored specifically for identifying contextualized Operational Risks within Kubernetes infrastructures.
* **Risk Signature Database (RSig DB) and Knowledge Graph**: At the core of Chkk's proactive approach is the RSig DB, which acts like a CVE database for operational risks. This database, along with a Knowledge Graph, captures relationships across different artifacts like issues, release notes, and breaking changes. The platform continuously sources information from the internet, release notes, bug reports, and user feedback to populate the RSig DB, ensuring that customers learn from a wide array of sources and experiences. This enables Chkk to convert these learnings into "Risk Signatures" which are then streamed to customers to be scanned against their specific infrastructures, identifying potential risks before they cause disruptions.
* **Preverified Upgrade Templates and Plans**: Chkk provides preverified upgrade templates and plans that include a detailed sequence of steps for upgrades and remediations. These plans are tested on a digital twin of the customer's infrastructure to validate their effectiveness before implementation. By automating the pre-work, such as researching dependencies and curating release notes, Chkk cuts down research and planning time by up to 8x. These plans also include automated pre-flight and post-flight checks that enhance the safety and reliability of upgrades by validating system health at every stage.
* **Reduced Breakages and Downtime**: By identifying and fixing operational risks proactively, Chkk helps customers avoid costly downtime. The platform helps to offset 500+ breakages for every 100 clusters. This not only saves money but also maintains a high level of operational reliability and service availability.
* **Minimized Human Error**: Chkk reduces the chance of human errors and omissions through its standardized workflows and simplified tasks. By automating and verifying each step of the upgrade, Chkk ensures that upgrades are executed consistently. The platform's ability to delegate tasks to any team member further minimizes risks associated with the dependence on expert knowledge.
* **Compliance with Standards**: The platform helps in maintaining compliance by ensuring timely upgrades and avoiding outdated software versions. Chkk alerts users to existing and upcoming End-of-Life (EOL) software versions. This ensures that organizations adhere to industry standards and avoid vulnerabilities. Additionally, avoiding outdated Kubernetes versions helps prevent the hefty surcharges imposed by services like Amazon EKS, Google GKE, and Azure AKS.
* **Staying Ahead with Collective Learning**: Chkk's platform uses Collective Learning, which is based on a large database of risks and their resolutions learned from many sources.
By learning from incidents, reports, tickets, issues, and discussions from many sources, Chkk enhances its ability to identify and prevent future risks proactively. This means that organizations adopting Chkk benefit not only from the platform's existing capabilities but also, continuously, from new learnings that proactively prevent breakages.

## Impact on Operational Safety Posture

By integrating these technical benefits, Chkk enables organizations to operate Kubernetes environments more safely and efficiently. Chkk's platform allows organizations to achieve a higher degree of operational safety with less manual effort and reduced risk of downtime. The proactive risk detection, combined with preverified upgrade plans and a centralized view of assets, significantly enhances the operational safety posture of any organization leveraging Kubernetes. By using Chkk, teams can move from reactive problem-solving to proactive risk management, resulting in improved system stability, minimized disruptions, and greater overall reliability.

***

Kubernetes upgrades introduce risk, but Chkk ensures you can detect and fix potential issues before they cause breakages. With Chkk's automated risk detection, teams can offset 500+ potential breakages annually for every 100 clusters, preventing disruptions before they happen. This proactive approach saves organizations 1000s of hours of break-fix effort, allowing teams to focus on innovation rather than firefighting issues post-upgrade.

***

# Improve Resource Efficiency
Source: https://docs.chkk.io/usecases/improve-resource-efficiency

Chkk enhances resource efficiency in Kubernetes environments by streamlining operations, reducing manual effort, and preventing costly issues. The platform's features and approach contribute to better resource utilization, cost savings, and improved productivity.

Many organizations waste 1000s of hours on repetitive upgrade planning and research.
Chkk eliminates this multiplicative effort by unifying tooling across teams, ensuring that insights and processes don't have to be reinvented every upgrade cycle. By consolidating upgrade workflows, organizations can recover 1000s of hours that would otherwise be lost to duplicate work.

## Key Ways Chkk Improves Resource Efficiency:

* **Elimination of Redundant Work**: Chkk eliminates redundant research and planning tasks that are often duplicated across teams. It provides standardized upgrade tooling and processes, ensuring everyone is on the same page and preventing unnecessary work. By sharing customized templates, insights, and best practices, Chkk saves time and ensures consistent processes.
* **Minimized Break-Fix Efforts**: By proactively identifying and resolving operational risks, Chkk helps to prevent breakages and costly downtime. This reduces the need for reactive problem-solving and saves thousands of hours of break-fix effort, even for modestly sized infrastructure. Chkk's proactive approach allows teams to focus on strategic initiatives rather than fighting fires.
* **Improved Team Productivity**: By standardizing workflows and simplifying tasks, Chkk enables organizations to delegate work and reduce the chances of human errors. This allows experts to focus on innovation rather than routine tasks. Chkk provides a system of record, maintaining the history of work, which facilitates knowledge sharing and reduces time-to-find information, improving productivity by over 20%.
* **Efficient Use of Expertise**: Chkk allows for the delegation of upgrade tasks, freeing up expert team members to focus on strategic projects. The platform's detailed upgrade templates standardize workflows, making it easy to assign tasks confidently to any team member.
* **Faster Innovation**: By streamlining upgrades and freeing up team resources, Chkk enables organizations to accelerate innovation and achieve business goals faster.
* **Reduced Operational Toil**: Chkk's standardization of workflows and upgrade planning reduces the repetitive and manual tasks that are the source of operational toil.

Chkk's features like the **Upgrade Copilot**, **Artifact Register**, and **Risk Ledger** help streamline operations and reduce wasted effort. The platform's ability to automate, standardize, and provide proactive risk management helps organizations achieve greater efficiency and optimal use of their resources. By preventing downtime and avoiding surcharges, organizations using Chkk can optimize their resource usage and minimize unnecessary expenditures. Overall, Chkk enables teams to do more with less by freeing up resources for innovation, increasing productivity, and reducing the time and effort spent on routine tasks.

# Speedup and Derisk Upgrades
Source: https://docs.chkk.io/usecases/speed-derisk-upgrades

Upgrading Kubernetes can be complex and time-consuming, but Chkk makes it faster and safer. For each cluster, Chkk generates a detailed Upgrade Plan that covers all components: control plane, node versions, add-ons, application services, Kubernetes operators, and dependencies. It proactively highlights required changes, including recommended add-on, application service, and Kubernetes operator versions, breaking changes, deprecated APIs that need updates, and misconfigured Pod Disruption Budgets. Instead of manually piecing together upgrade requirements from release notes, teams get a clear, actionable upgrade path with explanations. Chkk's automation cuts upgrade preparation time by 3x-5x, turning weeks of planning into just days.
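
To make "misconfigured Pod Disruption Budgets" concrete: a budget that permits zero voluntary evictions will stall node drains during an upgrade. The sketch below shows one way such a check could be written. It is not Chkk's detection logic; the function names are hypothetical, `pdb_spec` is assumed to be a dictionary holding the `minAvailable` or `maxUnavailable` field of a PodDisruptionBudget spec, and all replicas are assumed to be healthy.

```python
import math
from typing import Union

IntOrPercent = Union[int, str]


def resolve(value: IntOrPercent, replicas: int) -> int:
    """Turn an int or a percentage string such as "50%" into a pod count.

    Percentages are rounded up, following how the Kubernetes documentation
    describes percentage handling for PodDisruptionBudgets.
    """
    if isinstance(value, str):
        percent = int(value.rstrip("%"))
        return math.ceil(percent * replicas / 100)
    return value


def allowed_disruptions(pdb_spec: dict, replicas: int) -> int:
    """Voluntary evictions the budget permits when every replica is healthy."""
    if "maxUnavailable" in pdb_spec:
        allowed = resolve(pdb_spec["maxUnavailable"], replicas)
    else:
        allowed = replicas - resolve(pdb_spec["minAvailable"], replicas)
    # Clamp to the valid range: never negative, never more than the replicas.
    return max(0, min(allowed, replicas))


def blocks_node_drain(pdb_spec: dict, replicas: int) -> bool:
    """True when the budget leaves no room to evict even a single pod."""
    return allowed_disruptions(pdb_spec, replicas) == 0
```

For example, `blocks_node_drain({"minAvailable": 3}, replicas=3)` flags a budget that requires all three replicas to stay up, while `{"maxUnavailable": 1}` on the same workload passes.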