Kubernetes Architecture
The following Kubernetes architecture diagram shows all the components of the Kubernetes cluster and how external systems connect to the Kubernetes cluster.
The first thing to understand about Kubernetes is that it is a distributed system: it has multiple components spread across different servers over a network. These servers can be virtual machines or bare-metal servers. Together, they form a Kubernetes cluster.
A Kubernetes cluster consists of control plane nodes and worker nodes.
Control Plane
The control plane is responsible for container orchestration and maintaining the desired state of the cluster. It has the following components: kube-apiserver, etcd, kube-scheduler, kube-controller-manager, and cloud-controller-manager.
Worker Node
The worker nodes are responsible for running containerized applications. Each worker node has the following components: kubelet, kube-proxy, and a container runtime.
Kubernetes Control Plane Components
First, let’s take a look at each control plane component and the important concepts behind each component.
1. kube-apiserver
The kube-apiserver is the central hub of the Kubernetes cluster and exposes the Kubernetes API.
End users and other cluster components talk to the cluster via the API server. Monitoring systems and third-party services may also talk to the API server to interact with the cluster, although this is less common.
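So when you use kubectl to manage the cluster, behind the scenes you are communicating with the API server through HTTP REST APIs. Internal cluster components such as the scheduler and the controller manager talk to the same REST API, typically with protobuf-encoded payloads. gRPC is used on other paths, such as between the API server and etcd, and between the kubelet and the container runtime.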
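To make the REST interface more concrete, here is a minimal sketch of how resource URL paths are laid out on the API server. The helper name `resource_path` is hypothetical and exists only for illustration; the path layout itself (`/api/<version>` for the core group, `/apis/<group>/<version>` for named groups) follows the Kubernetes API convention.

```python
def resource_path(resource, namespace=None, group="", version="v1", name=None):
    """Build the REST path under which the API server serves a resource.

    Hypothetical helper for illustration; not part of any Kubernetes client.
    """
    # Core-group resources (pods, nodes, services) live under /api/<version>.
    # Resources in a named API group (apps, batch, ...) live under
    # /apis/<group>/<version>.
    prefix = f"/api/{version}" if group == "" else f"/apis/{group}/{version}"
    parts = [prefix]
    if namespace is not None:
        # Namespaced resources are nested under their namespace.
        parts.append(f"namespaces/{namespace}")
    parts.append(resource)
    if name is not None:
        parts.append(name)
    return "/".join(parts)


if __name__ == "__main__":
    # Roughly what "kubectl get pods -n default" requests:
    print(resource_path("pods", namespace="default"))
    # A single Deployment named "web" in the apps/v1 group:
    print(resource_path("deployments", namespace="default", group="apps", name="web"))
    # Nodes are cluster-scoped, so there is no namespace segment:
    print(resource_path("nodes"))
```

Running kubectl with a higher verbosity flag (for example `kubectl get pods -v=8`) prints the actual request URLs, which is a handy way to compare against this sketch.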
The communication between the API server and other components in the cluster happens over TLS to prevent unauthorized access to the cluster.
Kubernetes api-server is responsible for the following
Note: To reduce the cluster's attack surface, it is crucial to secure the API server. The Shadowserver Foundation conducted an experiment that discovered 380,000 publicly accessible Kubernetes API servers.
2. etcd
Kubernetes is a distributed system and it needs an efficient distributed database like etcd that supports its distributed nature. It acts as both a backend service discovery and a database. You can call it the brain of the Kubernetes cluster.
etcd is an open-source, strongly consistent, distributed key-value store. So what does it mean?
etcd uses the Raft consensus algorithm for strong consistency and availability. It works in a leader-member fashion: a write is committed only after a majority of members acknowledge it, which gives high availability and lets the cluster withstand node failures.
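The majority requirement in Raft explains why etcd clusters are usually sized with an odd number of members. The following sketch works through the arithmetic; the function names are illustrative and not part of etcd.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(f"members={n} quorum={quorum(n)} tolerated_failures={fault_tolerance(n)}")
```

Note that going from 3 to 4 members raises the quorum from 2 to 3 without tolerating any additional failure, which is why an even member count brings no availability benefit.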
So how does etcd work with Kubernetes?
To put it simply, when you use kubectl to get Kubernetes object details, the data ultimately comes from etcd by way of the API server. Likewise, when you deploy an object like a pod, an entry gets created in etcd.
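As a rough mental model, the sketch below uses an in-memory dictionary as a stand-in for etcd. It assumes the default `/registry` key prefix and a `/registry/<resource>/<namespace>/<name>` key layout; the function names are hypothetical, and the values are stored as JSON purely for readability (the real API server writes its own serialized form of each object).

```python
import json

# In-memory stand-in for etcd: a flat mapping of key -> serialized object.
store = {}


def object_key(resource, namespace, name, prefix="/registry"):
    """Key under which an object is stored (assumed default layout)."""
    return f"{prefix}/{resource}/{namespace}/{name}"


def create(resource, namespace, name, spec):
    """Roughly what happens when an object is created: one new entry."""
    key = object_key(resource, namespace, name)
    store[key] = json.dumps(spec)
    return key


def list_objects(resource, namespace):
    """Roughly what a 'get' of all objects in a namespace reads: a key prefix."""
    prefix = object_key(resource, namespace, "")
    return sorted(key for key in store if key.startswith(prefix))


if __name__ == "__main__":
    create("pods", "default", "web", {"image": "nginx"})
    create("pods", "default", "api", {"image": "example/api"})
    create("pods", "kube-system", "dns", {"image": "example/dns"})
    print(list_objects("pods", "default"))
```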
In a nutshell, here is what you need to know about etcd.
Also, etcd is the only stateful component in the control plane.
3. kube-scheduler
The kube-scheduler is responsible for scheduling Kubernetes pods on worker nodes.
When you deploy a pod, you specify the pod requirements such as CPU, memory, affinity, taints or tolerations, priority, persistent volumes (PV), etc. The scheduler’s primary task is to identify the create request and choose the best node for a pod that satisfies the requirements.
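The idea of choosing the best node that satisfies a pod's requirements can be sketched as a two-step process: filter out nodes that cannot run the pod, then score the remaining ones. The code below is a deliberately simplified illustration, not the actual kube-scheduler logic. The `Node` and `Pod` classes, the taint model (plain string sets), and the scoring rule (most free CPU left after placement) are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_m: int                      # free CPU in millicores
    free_mem_mi: int                     # free memory in MiB
    taints: set = field(default_factory=set)


@dataclass
class Pod:
    name: str
    cpu_m: int                           # requested CPU in millicores
    mem_mi: int                          # requested memory in MiB
    tolerations: set = field(default_factory=set)


def feasible(pod: Pod, node: Node) -> bool:
    """Filtering step: the node must have room and every taint must be tolerated."""
    fits = pod.cpu_m <= node.free_cpu_m and pod.mem_mi <= node.free_mem_mi
    tolerated = node.taints <= pod.tolerations
    return fits and tolerated


def score(pod: Pod, node: Node) -> int:
    """Scoring step (toy rule): prefer the node with the most CPU left over."""
    return node.free_cpu_m - pod.cpu_m


def schedule(pod: Pod, nodes: list):
    """Return the name of the chosen node, or None if the pod must stay pending."""
    candidates = [node for node in nodes if feasible(pod, node)]
    if not candidates:
        return None
    return max(candidates, key=lambda node: score(pod, node)).name


if __name__ == "__main__":
    nodes = [
        Node("node-a", free_cpu_m=500, free_mem_mi=1024),
        Node("node-b", free_cpu_m=2000, free_mem_mi=4096, taints={"gpu"}),
        Node("node-c", free_cpu_m=1500, free_mem_mi=2048),
    ]
    print(schedule(Pod("web", cpu_m=1000, mem_mi=512), nodes))
```

In this example, node-a is filtered out for lack of CPU and node-b because its taint is not tolerated, so node-c is selected.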
The following image shows a high-level overview of how the scheduler works.
In a Kubernetes cluster, there will be more than one worker node. So how does the scheduler select the node out of all worker nodes?
Here is how the scheduler works.
Here is what you need to know about the scheduler.
4. Kube Controller Manager
What is a controller? Controllers are programs that run infinite control loops: they run continuously and watch the actual and desired state of objects. If the actual state differs from the desired state, the controller acts to bring the Kubernetes resource/object to the desired state.
As per the official documentation,
In Kubernetes, controllers are control loops that watch the state of your cluster, then make or request changes where needed. Each controller tries to move the current cluster state closer to the desired state.
Let’s say you want to create a deployment. You specify the desired state in the manifest YAML file (declarative approach), for example, 2 replicas, one volume mount, a configmap, etc. The built-in deployment controller ensures that the deployment is in the desired state all the time. If a user updates the deployment to 5 replicas, the deployment controller notices the change and works until 5 replicas are running.
The kube-controller-manager is a component that runs the built-in Kubernetes controllers. Kubernetes resources/objects like pods, namespaces, jobs, and replicasets are managed by their respective controllers. The kube-scheduler follows the same control-loop pattern, but it runs as a separate control plane component rather than inside the kube-controller-manager.
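The replica example can be expressed as a small reconcile function. This is only a sketch of the control-loop idea: it collapses the real chain of Deployment, ReplicaSet, and Pod into a single step, and the function name and the `pod-<n>` naming scheme are made up for illustration.

```python
def reconcile(desired_replicas, running_pods):
    """Compare desired and actual state and return (new_pods, actions).

    One pass of a control loop: create pods while there are too few,
    delete pods while there are too many, do nothing when they match.
    """
    pods = list(running_pods)
    actions = []
    next_id = 0
    while len(pods) < desired_replicas:
        name = f"pod-{next_id}"
        next_id += 1
        if name in pods:
            continue  # name already taken, try the next one
        pods.append(name)
        actions.append(("create", name))
    while len(pods) > desired_replicas:
        actions.append(("delete", pods.pop()))
    return pods, actions


if __name__ == "__main__":
    pods, actions = reconcile(2, [])
    print(pods, actions)
    # The user edits the manifest from 2 to 5 replicas; the next pass converges.
    pods, actions = reconcile(5, pods)
    print(pods, actions)
    # Nothing changed since the last pass, so there is nothing to do.
    print(reconcile(5, pods)[1])
```

A real controller runs this kind of comparison repeatedly, triggered by changes it observes through the API server, rather than once.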
Following is the list of important built-in Kubernetes controllers.
Here is what you should know about the Kube controller manager.
5. Cloud Controller Manager (CCM)
When Kubernetes is deployed in cloud environments, the cloud controller manager acts as a bridge between cloud platform APIs and the Kubernetes cluster.
This way, the core Kubernetes components can work independently, and cloud providers can integrate with Kubernetes using plugins (for example, an interface between a Kubernetes cluster and the AWS cloud API).
Cloud controller integration allows a Kubernetes cluster to provision cloud resources like instances (for nodes), load balancers (for services), and storage volumes (for persistent volumes).
The cloud controller manager contains a set of cloud platform-specific controllers that ensure the desired state of cloud-specific components (nodes, load balancers, storage, etc). Following are the three main controllers that are part of the cloud controller manager.
Following are some of the classic examples of cloud controller manager.
Overall, the cloud controller manager manages the lifecycle of cloud-specific resources used by Kubernetes.
Kubernetes Worker Node Components
Now let’s look at each of the worker node components.
1. Kubelet
Kubelet is an agent component that runs on every node in the cluster. It does not run as a container; instead, it runs as a daemon managed by systemd.
It is responsible for registering worker nodes with the API server and working with the podSpec (Pod specification – YAML or JSON) primarily from the API server. podSpec defines the containers that should run inside the pod, their resources (e.g. CPU and memory limits), and other settings such as environment variables, volumes, and labels.
It then brings the podSpec to the desired state by creating containers.
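For reference, here is a minimal pod specification written as a Python dictionary and printed as JSON, one of the two formats mentioned above. The field names follow the Kubernetes Pod API; the concrete values (pod name, image tag, environment variable, limits) are placeholder assumptions, and `container_images` is a hypothetical helper showing the kind of information the kubelet reads from the spec.

```python
import json

# A minimal Pod manifest. Values are illustrative placeholders.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web", "labels": {"app": "web"}},
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25",
                "env": [{"name": "MODE", "value": "demo"}],
                "resources": {"limits": {"cpu": "500m", "memory": "128Mi"}},
                "volumeMounts": [{"name": "cache", "mountPath": "/cache"}],
            }
        ],
        "volumes": [{"name": "cache", "emptyDir": {}}],
    },
}


def container_images(manifest):
    """List the images that would need to be pulled to run this pod."""
    return [container["image"] for container in manifest["spec"]["containers"]]


if __name__ == "__main__":
    print(json.dumps(pod_manifest, indent=2))
    print(container_images(pod_manifest))
```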
To put it simply, kubelet is responsible for the following.
Kubelet is also a controller that watches for pod changes and utilizes the node’s container runtime to pull images, run containers, etc.
Other than PodSpecs from the API server, kubelet can accept podSpec from a file, HTTP endpoint, and HTTP server. A good example of “podSpec from a file” is Kubernetes static pods.
Static pods are controlled by the kubelet, not the API server.
This means you can create pods by providing a pod YAML location to the Kubelet component. However, static pods created by Kubelet are not managed by the API server.
Here is a real-world example use case of the static pod.
While bootstrapping the control plane, kubelet starts the kube-apiserver, scheduler, and controller manager as static pods from podSpecs located at /etc/kubernetes/manifests.
Following are some of the key things about kubelet.
2. Kube proxy
To understand Kube proxy, you need to have a basic knowledge of Kubernetes Service & endpoint objects.
A Service in Kubernetes is a way to expose a set of pods internally or to external traffic. When you create a Service object, it gets a virtual IP assigned to it, called the ClusterIP. It is only accessible within the Kubernetes cluster.
The Endpoint object contains all the IP addresses and ports of pod groups under a Service object. The endpoints controller is responsible for maintaining a list of pod IP addresses (endpoints). The service controller is responsible for configuring endpoints to a service.
You cannot ping the ClusterIP because it is only used for service discovery, unlike pod IPs which are pingable.
Now let’s understand Kube Proxy.
Kube-proxy is a daemon that runs on every node as a daemonset. It is a proxy component that implements the Kubernetes Services concept for pods (a single DNS name for a set of pods, with load balancing). It primarily proxies UDP, TCP, and SCTP and does not understand HTTP.
When you expose pods using a Service (ClusterIP), kube-proxy creates network rules to send traffic to the backend pods (endpoints) grouped under the Service object. In other words, load balancing and service discovery for Services are handled by kube-proxy.
So how does Kube-proxy work?
Kube proxy talks to the API server to get the details about the Service (ClusterIP) and respective pod IPs & ports (endpoints). It also monitors for changes in service and endpoints.
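Conceptually, this amounts to keeping a table that maps each Service address to its current backend pods and picking one backend per connection. The sketch below models that table in plain Python. It is an illustration only: the real kube-proxy programs kernel-level rules instead of handling traffic in a Python process, and the class name, method names, and IP addresses here are made up.

```python
import random


class ServiceTable:
    """Toy model of the Service -> endpoints rules kept on one node."""

    def __init__(self):
        # (cluster_ip, port) -> list of (pod_ip, pod_port) backends
        self.rules = {}

    def sync(self, cluster_ip, port, endpoints):
        """Apply the latest Service/endpoints state received from the API server."""
        key = (cluster_ip, port)
        if endpoints:
            self.rules[key] = list(endpoints)
        else:
            # No ready pods behind the Service: drop the rule.
            self.rules.pop(key, None)

    def route(self, cluster_ip, port, rng=random):
        """Pick a backend for a new connection, or None if none exists."""
        backends = self.rules.get((cluster_ip, port))
        if not backends:
            return None
        return rng.choice(backends)


if __name__ == "__main__":
    table = ServiceTable()
    table.sync("10.96.0.10", 80, [("10.244.1.5", 8080), ("10.244.2.7", 8080)])
    print(table.route("10.96.0.10", 80))
    # A pod is removed; the next sync carries the updated endpoint list.
    table.sync("10.96.0.10", 80, [("10.244.2.7", 8080)])
    print(table.route("10.96.0.10", 80))
```

The random choice here is a stand-in for whatever balancing the configured proxy mode applies.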
Kube-proxy then uses any one of the following modes to create/update rules for routing traffic to pods behind a Service
If you would like to understand the performance difference between kube-proxy IPtables and IPVS mode, read this article.
Also, you can run a Kubernetes cluster without kube-proxy by replacing it with Cilium.
1.29 Alpha Feature: kube-proxy has a new nftables-based backend. nftables is the successor to IPtables and is designed to be simpler and more efficient.
3. Container Runtime
You probably know about Java Runtime (JRE). It is the software required to run Java programs on a host. In the same way, container runtime is a software component that is required to run containers.
Container runtime runs on all the nodes in the Kubernetes cluster. It is responsible for pulling images from container registries, running containers, allocating and isolating resources for containers, and managing the entire lifecycle of a container on a host.
To understand this better, let’s take a look at two key concepts:
Kubernetes supports multiple container runtimes (CRI-O, Docker Engine, containerd, etc) that are compliant with Container Runtime Interface (CRI). This means, all these container runtimes implement the CRI interface and expose gRPC CRI APIs (runtime and image service endpoints).
So how does Kubernetes make use of the container runtime?
As we learned in the Kubelet section, the kubelet agent is responsible for interacting with the container runtime using CRI APIs to manage the lifecycle of a container. It also gets all the container information from the container runtime and provides it to the control plane.
Let’s take the CRI-O container runtime as an example. Here is a high-level overview of how a container runtime works with Kubernetes.
Kubernetes Cluster Addon Components
Apart from the core components, a Kubernetes cluster needs addon components to be fully operational. Which addons you choose depends on the project requirements and use cases.
Following are some of the popular addon components that you might need on a cluster.
1. CNI Plugin
First, you need to understand the Container Networking Interface (CNI).
It is a plugin-based architecture with vendor-neutral specifications and libraries for creating network interfaces for containers.
It is not specific to Kubernetes. With CNI, container networking can be standardized across container orchestration tools like Kubernetes, Mesos, CloudFoundry, Podman, Docker, etc.
When it comes to container networking, companies might have different requirements such as network isolation, security, encryption, etc. As container technology advanced, many network providers created CNI-based solutions for containers with a wide range of networking capabilities. These are called CNI plugins.
This allows users to choose a networking solution that best fits their needs from different providers.
How does the CNI Plugin work with Kubernetes?
Following are high-level functionalities provided by CNI plugins.
Some popular CNI plugins include:
Kubernetes Native Objects
So far, we have learned about the core Kubernetes components and how each component works.
All these components work towards managing the following key Kubernetes objects.
Also, Kubernetes is extensible using custom resource definitions (CRDs) and custom controllers. So the cluster components also manage the objects created using custom controllers and custom resource definitions.
Kubernetes Architecture FAQs
What is the main purpose of the Kubernetes control plane?
The control plane is responsible for maintaining the desired state of the cluster and the applications running on it. It consists of components such as the API server, etcd, Scheduler, and controller manager.
What is the purpose of the worker nodes in a Kubernetes cluster?
Worker nodes are the servers (either bare-metal or virtual) that run the containers in the cluster. They are managed by the control plane and receive instructions from it on how to run the containers that are part of pods.
How is communication between the control plane and worker nodes secured in Kubernetes?
Communication between the control plane and worker nodes is secured using PKI certificates and communication between different components happens over TLS. This way, only trusted components can communicate with each other.
What is the purpose of the etcd key-value store in Kubernetes?
etcd primarily stores the Kubernetes objects, cluster information, node information, and configuration data of the cluster, such as the desired state of the applications running on the cluster.
What happens to Kubernetes applications if the etcd goes down?
Running applications are not affected if etcd experiences an outage, but it is not possible to create or update any objects without a functioning etcd.
Conclusion
Understanding Kubernetes architecture helps you with day-to-day Kubernetes implementation and operations.
When implementing a production-level cluster setup, having the right knowledge of Kubernetes components will help you run and troubleshoot applications.