My Biggest (Kubernetes) Lesson of 2019 – K8s Clusters Need a Management Plane

by Vamsi Chemitiganti

“No plan of operations extends with any certainty beyond the first contact with the main hostile force.” – Helmuth von Moltke ‘the Elder’ (Chief of Staff of the Prussian Army from 1857 to 1871)


By now, it’s apparent to every CXO and Chief Architect that their containerized workloads will eventually be orchestrated by Kubernetes. The competing container orchestrators are gone. The few implementations still running on them will gradually be migrated to k8s, and vendors will eventually end-of-life (EOL) their non-k8s orchestration platforms. The migration may be difficult for customers on those platforms, but over time they will move to Kubernetes.
Welcome to the Kubernetes world; we will all be living in it soon. In this post, I capture my biggest lesson of 2019 from the perspective of driving successful enterprise k8s initiatives – the need for a management plane.

In the beginning, you had a cluster running an application orchestrated by Kubernetes

A Kubernetes cluster is a set of machines, called nodes, that run containerized applications managed by Kubernetes; the applications themselves run in groups of containers called pods. A cluster has at least one worker node and at least one master node.

The k8s master is the control plane of the architecture. It is responsible for scheduling deployments, acting as the gateway for the API, and for overall cluster management. As depicted in the illustration below, it consists of several components, such as the API server, the scheduler, and the controller manager. The master handles global, cluster-level scheduling of pods and responds to cluster events; multiple masters can be set up for high availability and load balancing. The API server, which runs on the master, exposes a RESTful service that is queried to read and maintain the desired state of the cluster and its workloads. The scheduler works in conjunction with the API server to place workloads on the pods running on the worker nodes. It is important to note that the administrative path always goes through the master; management tooling never accesses the worker nodes directly.
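To make that admin path concrete, here is a minimal sketch using the official Kubernetes Python client. It assumes a reachable cluster and a standard ~/.kube/config; every call goes to the master's API server, and the worker nodes are never contacted directly.

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig (assumes ~/.kube/config points at the cluster).
config.load_kube_config()

v1 = client.CoreV1Api()  # all calls below go through the master's API server

# Cluster state is read via the RESTful API, never from the workers directly.
for node in v1.list_node().items:
    print("node:", node.metadata.name, node.status.node_info.kubelet_version)

for pod in v1.list_pod_for_all_namespaces().items:
    print("pod:", pod.metadata.namespace, pod.metadata.name, pod.status.phase)
```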

Those wanting a deeper look at the K8s platform can read the blog below, from which the above overview is reproduced.

Why Linux Containers and Docker are the Runtime for the Software Defined Data Center (SDDC)..(4/7)

Depending on the size and complexity of the clusters needed, there are a few variations of the deployment architecture used to scale out the control plane (CP). Most CP installations place all of the above components (especially etcd) on every master node. However, not all components need to be active on every master: the scheduler, for example, runs on each master but only one instance is active per cluster at any given point. Other components, such as the controller manager, can run on every master node, and the API server can be fronted with a load balancer when traffic volumes are high.
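As a small illustration of the "only one active scheduler" point: on recent Kubernetes versions that use Lease-based leader election (an assumption here; older releases recorded the leader on annotated Endpoints instead), you can inspect which control-plane instance currently holds the scheduler and controller-manager leases with the Python client.

```python
from kubernetes import client, config

config.load_kube_config()
coordination = client.CoordinationV1Api()

# kube-scheduler and kube-controller-manager record their active leader in Lease
# objects in the kube-system namespace (Lease-based leader election assumed).
for name in ("kube-scheduler", "kube-controller-manager"):
    lease = coordination.read_namespaced_lease(name, "kube-system")
    print(f"{name}: active leader is {lease.spec.holder_identity}")
```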
The topic of control plane scaling deserves its own blog which will follow in a few weeks.

Day 2 begins

Simple enough. Enterprise IT teams or forward-looking developers build a cluster or two for an MVP-type implementation, and it ends up being successful.

Suddenly, more developers and infrastructure teams want to leverage this cluster.

The fun starts when organizations move from one or two sample applications to a whole fleet of them managed by K8s. The last couple of posts on this blog focused on multi-tenancy, a huge challenge that bedevils every medium to large enterprise Kubernetes implementation. Multi-tenancy, however, is just one item in a whole gamut of challenges.

The typical technical reason containerized implementations fail is that the enterprise “must haves” shown on the right side of the illustration below are neglected at the start of the project or inadequately planned for.

Successful Containerization = Two K8s Planes + One More

The components that make up a Kubernetes cluster can be classified into either the control plane or the data plane, as captured in the illustration below –

Control Plane: When enterprise architects and developers talk about the K8s control plane, they are referring to the master node(s) – which host the API server, the scheduler, etcd, and so on. The worker nodes host the pods that run the actual applications. Multiple master nodes are used to give a cluster failover and high availability. The components of the control plane make decisions about scheduling pods and respond to events in the data plane layer, such as starting up new pods or deploying new applications.

Data Plane: These are bare-metal servers, VMs, containers, and apps in the customer data center or public cloud. Data plane infrastructure is provisioned, de-provisioned, monitored and managed using the control plane.
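To illustrate the control plane reacting to data-plane events, here is a minimal sketch (Python client, arbitrary ten-second window chosen for the example) that watches pod events as the API server reports them; this is the same style of event stream the controllers and scheduler act on when reconciling the data plane toward the desired state.

```python
from kubernetes import client, config, watch

config.load_kube_config()
v1 = client.CoreV1Api()

# Stream pod events from the API server for ~10 seconds. Control plane components
# consume a stream like this to reconcile the data plane toward the desired state.
w = watch.Watch()
for event in w.stream(v1.list_pod_for_all_namespaces, timeout_seconds=10):
    pod = event["object"]
    print(event["type"], pod.metadata.namespace, pod.metadata.name, pod.status.phase)
```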

Kubernetes Implementation Landscape in 2019

About 70% of the containerization initiatives I have seen so far have not performed to expectations. With that in mind, I would like to introduce another key aspect: the management plane. The illustration below captures what a management plane brings to the table in addition to the control and data planes. Before the management plane, the functions shown below were handled either by sysadmins or by piles of automation code, typically written by the Professional Services (PS) teams supplied by vendors.

Let’s define the management plane and propose what its generic capabilities should be.

Management Plane: Components in this plane consist of management instances that enable on-premises or public cloud orchestration of bare-metal servers, virtual machines, and containers. The Management plane supports the “enterprise must-haves” needed to make K8s truly multi-tenant and accessible across multiple clouds.

The functions of this Management Plane –

  1. Manage Kubernetes clusters running on multiple hypervisors, public clouds, and bare-metal regions. Ensure 99.9% uptime of these clusters. Support a “single pane of glass”. Enable scaling and upgrades of each region independently of the others.
  2. Provide self-service (via UI and API) that lets users across different tenant groups manage the full multi-tenancy lifecycle – provision k8s clusters, scale clusters up or down across different IaaS providers, and deploy applications from a certified catalog of images, including runtimes such as Prometheus.
  3. Offer repeatable installation and configuration of clusters, automatic scaling, and monitoring, abstracting away many of the control plane scaling issues discussed above.
  4. Solve a key pain point by automating Kubernetes upgrades across all underlying regions, so teams keep up with the latest security fixes and bug fixes and benefit from new features as they are released. Support heterogeneity across regions. This is especially important when teams have installed an outdated version (for example, v1.9) or want to automate the upgrade process to always stay on the latest supported version (see the version-check sketch after this list).
  5. Provide a service catalog with certified images of runtimes that can be used for CI/CD, and automate their deployment to the underlying pods via an ITIL-compliant system such as ServiceNow.
  6. Provide developer and Ops self-service via both the UI and an API.
  7. Keep overall cost low without adding any Professional Services overhead, by handling the above functions through automation rather than by throwing SREs/sysadmins at the problem.
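As a hedged sketch of the version-heterogeneity point in item 4: using the Python client and whatever cluster contexts happen to exist in the local kubeconfig (an assumption – a real management plane would keep its own inventory of clusters), one can survey the Kubernetes version running in each region and flag the ones that lag behind.

```python
from kubernetes import client, config

# Enumerate every cluster context in the local kubeconfig (a stand-in for a real
# management plane's cluster inventory, which would live in its own datastore).
contexts, _ = config.list_kube_config_contexts()

for ctx in contexts:
    name = ctx["name"]
    api_client = config.new_client_from_config(context=name)
    version = client.VersionApi(api_client).get_code()
    print(f"{name}: Kubernetes {version.git_version}")
    # A management plane would compare this against the latest supported release
    # and schedule a rolling upgrade wherever a region lags behind.
```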

Conclusion

Badly planned containerization strategies introduce far too much technical debt and too many cost overruns. With the passage of time, these challenges often become too expensive and too varied to overcome.

The decision to self-operate your own large-scale Kubernetes infrastructure is problematic in two ways. Firstly, managing and upgrading k8s clusters does not fundamentally add any business value; secondly, making incorrect or suboptimal enterprise choices results in cost overruns while hurting application performance, scalability, and availability. Applications are king, and Kubernetes is just another lever in IT helping businesses serve customers, albeit a key one.

With that thought, here is wishing good luck to everyone’s Kubernetes platforms in 2020!

Discover more at Industry Talks Tech: your one-stop shop for upskilling in different industry segments!
