Production-ready service mesh: Ensuring Linkerd can run in any Kubernetes environment

William Morgan

Apr 11, 2024

How can we ensure that Linkerd works on any Kubernetes cluster, regardless of Kubernetes version, distribution, architecture, and provider? While Kubernetes has become a standard for cloud applications in part because of its promise of universality, for a service mesh like Linkerd, the reality is that Kubernetes is far from uniform. Differences in CNIs, default network or security policies, underlying hardware, and accessibility of the Kubernetes API can mean that Linkerd can work just fine in one situation but fail to even start in another.

In this blog post, we describe how we tackle this challenge in Buoyant Enterprise for Linkerd (BEL), our production-ready distribution of Linkerd. Before publishing, we automatically test every BEL release across a variety of environments using Replicated's Compatibility Matrix. This allows us to provide strong compatibility guarantees for specific Kubernetes versions, distributions, providers, and architectures, including, for the first time ever, a list of supported Kubernetes platforms and versions for Linkerd.

Production readiness testing for Linkerd

Buoyant Enterprise for Linkerd (BEL) is our production-ready distribution of Linkerd. One of the big differences between BEL and open source Linkerd (and a big part of what "production-ready" means) is our investment in the build pipeline. That pipeline allows us to do things in BEL that we could never do in open source Linkerd, including proactive image scanning, hot patch releases, and a massive investment in testing infrastructure that covers performance testing, regression testing, security scanning, and even "live fire" testing in production environments. It also includes the subject of this blog post: compatibility testing, which validates that BEL can always run in specific Kubernetes environments and versions.

In the past, this kind of testing has been difficult for Linkerd as an open source project, which has only limited access to specific Kubernetes environments. As a result, the project has relied on the user community to detect issues with specific Kubernetes versions or distributions, report them, and wait for a fix. These fixes were also difficult to validate, since the project itself didn't have direct access to the environment.

While this worked, for BEL we needed stronger guarantees. To call BEL production-ready, we wanted to know ahead of time, before shipping a release, that it would work in a specific set of Kubernetes environments. We especially wanted to ensure that it would work across the Kubernetes providers (AKS, EKS, OpenShift, etc.), architectures (amd64, arm64), and Kubernetes versions (1.24, 1.28, etc.) that our customers use today.

This type of testing can easily balloon into a combinatorial explosion of environments to maintain. Happily, we found a great answer to this challenge: Replicated's Compatibility Matrix.
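To make the combinatorial growth concrete, here is a minimal Python sketch that counts the environments in a full cross product of providers, architectures, and versions. The axis values are illustrative assumptions based on the examples named above, not the actual BEL test matrix.

```python
from itertools import product

# Illustrative axis values only; not the real BEL test matrix.
providers = ["AKS", "EKS", "GKE", "OpenShift"]
architectures = ["amd64", "arm64"]
versions = ["1.24", "1.25", "1.26", "1.27", "1.28"]

# Every provider x architecture x version combination is one environment
# that would need to be provisioned, tested, and maintained.
full_matrix = list(product(providers, architectures, versions))

print(len(full_matrix))  # 4 * 2 * 5 = 40 environments

# Adding a single new Kubernetes version adds one environment per
# provider/architecture pair, not just one environment overall.
added_by_new_version = len(providers) * len(architectures)
print(added_by_new_version)  # 8
```

Even with these small axes, the full product is already 40 clusters, which is why a curated set of rows is usually easier to maintain than every possible combination.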

Compatibility testing without the tears

Replicated and Buoyant go back many years, with Buoyant using Replicated's Distribution Platform to deploy Linkerd 1.0 in specific customer environments as far back as 2017. Today, the Compatibility Matrix gives us turn-key management of the full test matrix of Kubernetes distributions, architectures, and versions for BEL.

In our BEL build pipeline, we make use of Compatibility Matrix via GitHub Actions, allowing us to perform this testing on a per-commit basis. For example, we have used Compatibility Matrix to test a recent BEL commit against seven different cluster configurations across AKS, EKS, GKE, and OpenShift, on three Kubernetes versions, with both amd64 and arm64 architectures.

Any time we need to add a new configuration, we simply add another row to the matrix, giving us the ability to easily extend our set of validated environments for BEL. And in cases where our Compatibility Matrix tests do not pass, Replicated provides easy kubectl access to the clusters that were spun up in CI. There is no need to reproduce the CI environment after the fact: simply run `replicated cluster kubeconfig CLUSTER_ID` and your local `kubectl` environment has direct access to the cluster that failed in CI.
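The "one row per configuration" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ClusterConfig` type, the `add_config` helper, and the rows themselves are hypothetical, and this is not the actual BEL pipeline or the Compatibility Matrix API. It shows a curated matrix as a list of rows, extended by appending a row, and serialized to JSON in the `include` shape that a CI job matrix could consume.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ClusterConfig:
    """One row of a (hypothetical) compatibility matrix."""
    distribution: str
    version: str
    arch: str


def add_config(matrix, config):
    """Return a new matrix with config appended, skipping exact duplicates."""
    if config in matrix:
        return list(matrix)
    return [*matrix, config]


# Hypothetical starting rows; not the real set of BEL environments.
matrix = [
    ClusterConfig("eks", "1.28", "amd64"),
    ClusterConfig("aks", "1.28", "amd64"),
]

# Extending coverage is a one-row change.
matrix = add_config(matrix, ClusterConfig("eks", "1.28", "arm64"))
# Adding the same row again leaves the matrix unchanged.
matrix = add_config(matrix, ClusterConfig("eks", "1.28", "arm64"))

# Serialize the rows so a CI system could fan out one job per row.
matrix_json = json.dumps({"include": [asdict(c) for c in matrix]})
print(matrix_json)
```

Keeping the matrix as explicit rows rather than a full cross product means each supported environment is a deliberate choice, and the list of rows doubles as the list of environments that were validated.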

Linkerd everywhere, all the time

This investment in testing represents a significant leap forward for Linkerd, ensuring that BEL is compatible across the diverse ecosystem of Kubernetes platforms. For projects like Linkerd, the goal of universal compatibility across Kubernetes platforms is critical for the success of the project. By automating and expanding the testing landscape, Buoyant Enterprise for Linkerd can continue to offer a robust, reliable, and efficient service mesh solution, ensuring seamless operation across the ever-growing universe of Kubernetes deployments.

This includes, for the first time ever, a list of supported Kubernetes platforms and versions for Linkerd.

Try Linkerd today!

Buoyant Enterprise for Linkerd is brought to you by the creators and maintainers of Linkerd. It is completely free to try and takes only minutes to get started. BEL is the distribution of Linkerd that we run ourselves, and anyone can download and try it. Happy meshing!
