Kubernetes control plane scale testing with Kubemark
This is a continuation of Michael McCune's (@elmiko) notes on Setting Up a Development Environment for the Cluster API Kubemark Provider, Automating My Hollow Kubernetes Test Rig, and the DevConf.cz 2022 talk Testing at Scale with Cluster API and Kubemark (demo).
Kubemark is a performance testing tool that lets users run experiments on simulated clusters by creating “hollow” Kubernetes nodes. These nodes do not actually run containers or attach storage, but they behave as if they did, with updates to etcd and all the trimmings. At the same time, hollow nodes are extremely lightweight (<30 MiB each).
The primary use case of Kubemark is scalability testing, as simulated clusters can be much bigger than real ones. The objective is to expose problems with the master components (API server, controller manager, or scheduler) that appear only on bigger clusters (e.g. small memory leaks).
Architecture
At a very high level, a Kubemark cluster consists of two parts: real master components and a set of “hollow” nodes. The prefix “hollow” means an implementation/instantiation of a component with all the moving parts mocked out. The best example is HollowKubelet, which pretends to be an ordinary Kubelet but does not start any containers or mount any volumes; it just claims that it does.
Currently the master components run on one or more dedicated machines, while the HollowNodes run on an external management Kubernetes cluster. This design has the advantage of completely isolating master resources from everything else.
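Under the hood, each hollow node is just the kubemark binary from the Kubernetes repository pretending to be a kubelet (or a kube-proxy). As a purely illustrative sketch, an invocation looks roughly like the line below; the flag names are taken from the upstream hollow-node command and should be treated as assumptions, since they may differ between releases:
$ kubemark --morph=kubelet --name=hollow-node-0 --kubeconfig=/path/to/workload-kubeconfig
The --morph flag selects which component to mock, and the kubeconfig points at the cluster under test, so the hollow node registers itself with the real API server like any other node.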
Integration with Cluster API
Kubernetes Cluster API (CAPI) is a project focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters. It uses Kubernetes-style APIs and patterns to automate cluster lifecycle management for platform operators. The supporting infrastructure (virtual machines, networks, load balancers, VPCs) as well as the Kubernetes cluster configuration are all defined in the same declarative way that application developers use to deploy and manage their workloads. This enables consistent and repeatable cluster deployments across a wide variety of infrastructure environments.
The Cluster API community has developed a Cluster API Kubemark Provider, allowing users to deploy fake, Kubemark-backed machines to their clusters. This is useful in a variety of scenarios, such as load testing and simulation testing.
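As with other providers, the Kubemark provider plugs into the standard Cluster API objects: worker machines simply reference a KubemarkMachineTemplate instead of, say, a DockerMachineTemplate. A minimal sketch of such a template is shown below; the apiVersion is an assumption and depends on the provider release you deploy:
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: KubemarkMachineTemplate
metadata:
  name: kubemark-workload-md-0
spec:
  template:
    spec: {}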
Let’s get to work
On the Docker host (we will be using a fresh Ubuntu 22.04 virtual machine) we will use kind (Kubernetes in Docker, containers running the necessary Kubernetes pieces) to create the CAPI management cluster. Next, we will use the clusterctl tool to create a second cluster (also based on kind) for the Kubemark workload: the cluster under test. Lastly, we want to add new nodes to the Kubemark control plane cluster (the cluster under test), and Kubemark requires that these hollow nodes run as pods on a cluster that can reach the control plane they are joining. The Cluster API Kubemark provider therefore creates pods in the CAPI management cluster which join the Kubemark control plane cluster (the cluster under test, i.e. the workload cluster) as nodes.
For the demo we will be using an Ubuntu 22.04 virtual machine with 4 vCPUs, 4 GiB of memory, and a 100 GiB disk.
Environment setup
I will be using Lima (Linux virtual machines) to create and manage the VM:
$ limactl start --name=ubuntu22.04 template://ubuntu-lts
$ limactl shell ubuntu22.04
We will use the cluster-api-kubemark-ansible playbooks to automate the deployment of:
- Golang
- Build tools
- Docker
- Docker local registry
- Kind
- Kubectl
- Kustomize
- Kubebuilder
- Cluster API
- Cluster API Kubemark provider
Prepare the host to run ansible:
- Install ansible (not ansible-core) and set up passwordless SSH to localhost:
$ sudo apt install ansible
$ ssh-keygen
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
- Clone and prepare the playbooks:
$ git clone https://github.com/elmiko/cluster-api-kubemark-ansible.git
$ cd cluster-api-kubemark-ansible
$ echo -e "[defaults]\nallow_world_readable_tmpfiles=true" > ansible.cfg
- Update inventory/hosts if you need to change addresses and/or users, and run the first playbook:
$ ansible-playbook -i inventory/hosts setup_devel_environment.yaml
Once it has finished you will be able to log in to the host as the devel user listed in the hosts file, with all the development tools ready to use.
- Run the second playbook to build the clusterctl binary and all the controller images, and to push the images to the local registry:
$ ansible-playbook -i inventory/hosts build_clusterctl_and_images.yaml
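When the playbook finishes, one quick way to confirm that the controller images actually landed in the local registry is to query the catalog API exposed by the registry:2 container (this assumes the registry listens on 127.0.0.1:5000, as shown in the next section):
$ curl -s http://localhost:5000/v2/_catalog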
Creating the cluster
We will use the scripts in the capi-hacks repo to help with the Kubemark deployment.
Ensure the Docker local registry was created in the previous steps; if not, use the 00-start-localregistry.sh script:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7064a4208e15 registry:2 "/entrypoint.sh /etc…" 4 days ago Up 46 minutes 127.0.0.1:5000->5000/tcp kind-registry
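If the registry is missing, the script follows the standard kind local registry recipe; a minimal sketch of the equivalent command (container name and port taken from the output above) would be:
$ docker run -d --restart=always -p 127.0.0.1:5000:5000 --name kind-registry registry:2
The registry container typically also needs to be attached to the kind Docker network (docker network connect kind kind-registry) once the kind cluster exists.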
Clone the capi-hacks repo:
$ git clone https://github.com/elmiko/capi-hacks.git
$ cd capi-hacks
Create the CAPI management cluster. This cluster will host the CAPI components and Kubemark hollow nodes:
$ ./01-start-mgmt-cluster.sh
$ kind get clusters
mgmt
Wait for the node to become ready and configure the management cluster to use the local registry:
$ kubectl get node
NAME STATUS ROLES AGE VERSION
mgmt-control-plane Ready control-plane,master 44s v1.23.6
$ ./02-apply-localregistryhosting-configmap.sh
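The script publishes the standard LocalRegistryHosting ConfigMap (KEP-1755) so that tooling running on the cluster knows where the local registry lives. It looks roughly like this (host value assumed from the registry above):
apiVersion: v1
kind: ConfigMap
metadata:
  name: local-registry-hosting
  namespace: kube-public
data:
  localRegistryHosting.v1: |
    host: "localhost:5000"
    help: "https://kind.sigs.k8s.io/docs/user/local-registry/"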
Deploy the Cluster API (capi) and Cluster API Kubemark Provider (capk) components and wait for their pods to become ready:
$ ./03-clusterctl-init.sh
$ kubectl get deploy -A | grep cap
capd-system capd-controller-manager 1/1 1 1 10m
capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-controller-manager 1/1 1 1 11m
capi-kubeadm-control-plane-system capi-kubeadm-control-plane-controller-manager 1/1 1 1 11m
capi-system capi-controller-manager 1/1 1 1 11m
capk-system capk-controller-manager 1/1 1 1 10m
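For reference, the script essentially boils down to a clusterctl init call that installs the kubeadm bootstrap and control-plane providers plus the Docker and Kubemark infrastructure providers; treat the following as a sketch, since the exact provider versions and any local image overrides come from the repo's clusterctl configuration:
$ clusterctl init --infrastructure docker,kubemark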
Create a new kind (Docker provider) cluster for the control plane under test:
$ kubectl apply -f kubemark/kubemark-workload-control-plane.yaml
Wait for the machine to transition from provisioning to running state:
$ kubectl get machine
NAME CLUSTER NODENAME PROVIDERID PHASE AGE VERSION
kubemark-workload-control-plane-lvkcv kubemark-workload kubemark-workload-control-plane-lvkcv docker:////kubemark-workload-control-plane-lvkcv Running 3m31s v1.23.6
$ kubectl get clusters
NAME PHASE AGE VERSION
kubemark-workload Provisioned 4m4s
$ kind get clusters
kubemark-workload
mgmt
Let’s take a look at the new kubemark-workload kind cluster that will host the control plane under test. As you can see, the node is in NotReady state (because no CNI is deployed yet) and the CNI-dependent pods are in Pending state:
$ ./get-kubeconfig.sh kubemark-workload
$ kubectl get node --kubeconfig=kubeconfig.kubemark-workload
NAME STATUS ROLES AGE VERSION
kubemark-workload-control-plane-lvkcv NotReady control-plane,master 46m v1.23.6
$ kubectl get po -A --kubeconfig=kubeconfig.kubemark-workload
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-79dc848587-8qbgk 0/1 Pending 0 6m31s
kube-system coredns-79dc848587-n9428 0/1 Pending 0 6m31s
kube-system etcd-kubemark-workload-control-plane-lvkcv 1/1 Running 0 6m39s
kube-system kube-apiserver-kubemark-workload-control-plane-lvkcv 1/1 Running 0 6m39s
kube-system kube-controller-manager-kubemark-workload-control-plane-lvkcv 1/1 Running 0 6m39s
kube-system kube-proxy-skgc9 1/1 Running 0 6m31s
kube-system kube-scheduler-kubemark-workload-control-plane-lvkcv 1/1 Running 0 6m39s
Let’s deploy OVN-Kubernetes on the cluster (more information on how to deploy OVN-K on a preexisting kind cluster can be found in this past blog post). OVN-Kubernetes is a CNI plugin for Kubernetes based on the Open Virtual Network (OVN) project:
$ ./deploy-cni-ovn.sh $(pwd)/kubeconfig.kubemark-workload kubemark-workload
Check that the node and the CNI-dependent pods have transitioned to Ready/Running state and that the OVN pods are present:
$ kubectl get node --kubeconfig=kubeconfig.kubemark-workload
NAME STATUS ROLES AGE VERSION
kubemark-workload-control-plane-lvkcv Ready control-plane,master 78m v1.23.6
$ kubectl get po -A --kubeconfig=kubeconfig.kubemark-workload
NAMESPACE NAME READY STATUS RESTARTS AGE
default test2 1/1 Running 0 3m4s
kube-system coredns-79dc848587-8qbgk 1/1 Running 0 78m
kube-system coredns-79dc848587-n9428 1/1 Running 0 78m
kube-system etcd-kubemark-workload-control-plane-lvkcv 1/1 Running 0 78m
kube-system kube-apiserver-kubemark-workload-control-plane-lvkcv 1/1 Running 0 78m
kube-system kube-controller-manager-kubemark-workload-control-plane-lvkcv 1/1 Running 0 78m
kube-system kube-proxy-skgc9 1/1 Running 0 78m
kube-system kube-scheduler-kubemark-workload-control-plane-lvkcv 1/1 Running 0 78m
ovn-kubernetes ovnkube-db-7d8fdc7dfb-2pf8m 2/2 Running 0 6m42s
ovn-kubernetes ovnkube-master-6dbd568bb5-89s7c 2/2 Running 0 6m41s
ovn-kubernetes ovnkube-node-7s7r5 3/3 Running 0 6m33s
ovn-kubernetes ovs-node-gnpv9 1/1 Running 0 6m41s
At this point we are ready to deploy Kubemark hollow nodes in the management cluster. This step will create 4 Kubemark hollow nodes:
$ kubectl apply -f kubemark/kubemark-workload-md0.yaml
Let’s check things from the management cluster’s perspective first:
$ kubectl get machine
NAME CLUSTER NODENAME PROVIDERID PHASE AGE VERSION
kubemark-workload-control-plane-lvkcv kubemark-workload kubemark-workload-control-plane-lvkcv docker:////kubemark-workload-control-plane-lvkcv Running 84m v1.23.6
kubemark-workload-md-0-764cb59d5-8c62j kubemark-workload kubemark-workload-md-0-v7592 kubemark://kubemark-workload-md-0-v7592 Running 57s v1.23.6
kubemark-workload-md-0-764cb59d5-bb2p4 kubemark-workload kubemark-workload-md-0-4955k kubemark://kubemark-workload-md-0-4955k Running 57s v1.23.6
kubemark-workload-md-0-764cb59d5-hwlh7 kubemark-workload kubemark-workload-md-0-m82cf kubemark://kubemark-workload-md-0-m82cf Running 57s v1.23.6
kubemark-workload-md-0-764cb59d5-jrmgt kubemark-workload kubemark-workload-md-0-82m9j kubemark://kubemark-workload-md-0-82m9j Running 57s v1.23.6
$ kubectl get po
NAME READY STATUS RESTARTS AGE
kubemark-workload-md-0-4955k 1/1 Running 0 90s
kubemark-workload-md-0-82m9j 1/1 Running 0 90s
kubemark-workload-md-0-m82cf 1/1 Running 0 90s
kubemark-workload-md-0-v7592 1/1 Running 0 90s
Finally, let’s check things from the perspective of the cluster under test:
$ kubectl get node --kubeconfig=kubeconfig.kubemark-workload
NAME STATUS ROLES AGE VERSION
kubemark-workload-control-plane-lvkcv Ready control-plane,master 84m v1.23.6
kubemark-workload-md-0-4955k Ready <none> 2m11s v1.23.6
kubemark-workload-md-0-82m9j Ready <none> 2m6s v1.23.6
kubemark-workload-md-0-m82cf Ready <none> 2m10s v1.23.6
kubemark-workload-md-0-v7592 Ready <none> 2m9s v1.23.6
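Hollow nodes look like ordinary nodes from the API server's point of view: they report capacity, conditions, and heartbeats even though nothing is actually running behind them. You can inspect one just like a real node:
$ kubectl describe node kubemark-workload-md-0-4955k --kubeconfig=kubeconfig.kubemark-workload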
Creating resources on the workload cluster
Let’s create a simple pod and service:
$ kubectl run test --image nginx --kubeconfig=kubeconfig.kubemark-workload
pod/test created
$ kubectl get po -o wide --kubeconfig=kubeconfig.kubemark-workload
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test 1/1 Running 0 100s 192.168.192.168 kubemark-workload-md-0-m82cf <none> <none>
$ kubectl expose po/test --port 5000 --kubeconfig=kubeconfig.kubemark-workload
service/test exposed
$ kubectl get service --kubeconfig=kubeconfig.kubemark-workload
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 192.168.122.1 <none> 443/TCP 87m
test ClusterIP 192.168.122.93 <none> 5000/TCP 7s
Let’s check OVN databases:
$ POD=$(kubectl get pod -n ovn-kubernetes -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' --kubeconfig=kubeconfig.kubemark-workload | grep ovnkube-db-) ; kubectl exec -ti $POD -n ovn-kubernetes -c nb-ovsdb --kubeconfig=kubeconfig.kubemark-workload -- bash
[root@kubemark-workload-control-plane-lvkcv ~]# ovn-nbctl ls-list
712ca431-ff74-4aef-af8d-00acee6e40dd (ext_kubemark-workload-control-plane-lvkcv)
95755675-c275-4d04-bd35-713ba7597c0c (join)
d7264e2c-4e4e-44fe-9eae-5b99facca098 (kubemark-workload-control-plane-lvkcv)
ee3c0a20-7df2-421b-8e9e-b676080d6976 (kubemark-workload-md-0-4955k)
b4f230a6-9151-44cd-8fa9-4f489799274e (kubemark-workload-md-0-82m9j)
a27e961a-6aaf-4e33-999c-9e7fd73611fa (kubemark-workload-md-0-m82cf)
48b096d8-42c5-4d18-b226-924ec60af0c5 (kubemark-workload-md-0-v7592)
[root@kubemark-workload-control-plane-lvkcv ~]# ovn-nbctl lb-list
UUID LB PROTO VIP IPs
8ffbeb8b-c2ba-4549-9a5b-5ac9577c4271 Service_default/ tcp 192.168.122.1:443 172.18.0.5:6443
e4b5bceb-3b51-48e9-be67-7b45fb966caf Service_default/ tcp 192.168.122.93:5000 192.168.192.168:5000
654c5590-a2b7-4a6e-bf04-d8c1c78b0267 Service_default/ tcp 192.168.122.1:443 169.254.169.2:6443
ca23b927-4b87-4fdd-b16c-f8c3d824e6e6 Service_kube-sys tcp 192.168.122.10:53 10.244.0.3:53,10.244.0.4:53
tcp 192.168.122.10:9153 10.244.0.3:9153,10.244.0.4:9153
699e0b39-1be8-4db7-953f-dbc836d42faf Service_kube-sys udp 192.168.122.10:53 10.244.0.3:53,10.244.0.4:53
[root@kubemark-workload-control-plane-lvkcv ~]# ovn-sbctl list port_binding default_test
_uuid : 26050c0d-0e5d-4496-b0ee-0b3df1bb40c9
additional_chassis : []
additional_encap : []
chassis : []
datapath : 1ac0b646-9d4d-432e-9e59-64db6520973f
encap : []
external_ids : {namespace=default, pod="true"}
gateway_chassis : []
ha_chassis_group : []
logical_port : default_test
mac : ["0a:58:0a:f4:02:03 10.244.2.3"]
mirror_rules : []
nat_addresses : []
options : {iface-id-ver="b505da18-8294-41ac-a25e-ffeeb5d3b7fb", requested-chassis=kubemark-workload-md-0-m82cf}
parent_port : []
port_security : ["0a:58:0a:f4:02:03 10.244.2.3"]
requested_additional_chassis: []
requested_chassis : []
tag : []
tunnel_key : 2
type : ""
up : false
virtual_parent : []
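Note that chassis is empty and up is false: the pod was scheduled to a hollow node, which never runs a real container runtime or an ovn-controller, so the logical switch port is never actually bound to a chassis. From the same shell you can confirm which chassis did register with the southbound database (only the real control plane node should show up):
[root@kubemark-workload-control-plane-lvkcv ~]# ovn-sbctl list chassis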
Scaling the cluster
Let’s check how many resources the Kubemark hollow nodes consume (<30 MiB each, compared to the 650 MiB of a normal ovnkube worker):
$ kubectl top pod
NAME CPU(cores) MEMORY(bytes)
kubemark-workload-md-0-4955k 38m 28Mi
kubemark-workload-md-0-82m9j 36m 28Mi
kubemark-workload-md-0-m82cf 45m 29Mi
kubemark-workload-md-0-v7592 41m 28Mi
In our 4 GiB VM we have about 1 GiB available:
$ free -h
total used free shared buff/cache available
Mem: 3.8Gi 2.5Gi 170Mi 25Mi 1.2Gi 1.0Gi
Let’s scale up to a total of 30 Kubemark hollow nodes:
$ kubectl patch --type merge MachineDeployment kubemark-workload-md-0 -p '{"spec":{"replicas":30}}'
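Since MachineDeployment exposes the scale subresource, the same thing can also be done with kubectl scale (shown here only as an equivalent alternative):
$ kubectl scale machinedeployment kubemark-workload-md-0 --replicas=30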
$ kubectl get machine | grep kubemark-workload-md-0 | grep Running | wc -l
30
$ kubectl get po | grep kubemark-workload | grep Running | wc -l
30
$ free -h
total used free shared buff/cache available
Mem: 3.8Gi 3.2Gi 112Mi 28Mi 548Mi 347Mi
Stressing the cluster
Let’s use kube-burner to stress our workload cluster. Kube-burner is a tool aimed at stressing Kubernetes clusters by creating and deleting objects declared in jobs.
Let’s install kube-burner:
$ wget https://github.com/cloud-bulldozer/kube-burner/releases/download/v1.2/kube-burner-1.2-Linux-x86_64.tar.gz
$ tar xzf kube-burner-1.2-Linux-x86_64.tar.gz
$ sudo install -o root -g root -m 0755 kube-burner /usr/local/bin/kube-burner
$ kube-burner version
Version: 1.2
Git Commit: 563bc92b9262582391e5dffb8941b914ca19d2d3
Build Date: 2023-01-13T10:18:17Z
Go Version: go1.19.4
OS/Arch: linux amd64
Let’s take a look at the configuration file kubeburner/cfg.yaml:
---
global:
  writeToFile: false
  indexerConfig:
    enabled: false
jobs:
  - name: kubelet-density
    preLoadImages: false
    jobIterations: 100
    qps: 20
    burst: 20
    namespacedIterations: false
    namespace: kubelet-density
    waitWhenFinished: true
    podWait: false
    objects:
      - objectTemplate: pod.yaml
        replicas: 1
        inputVars:
          containerImage: gcr.io/google_containers/pause-amd64:3.0
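The pod.yaml object template is referenced relative to the config file. With kube-burner, object templates are Go templates in which the inputVars (here containerImage) and built-in variables such as Iteration are available; a hypothetical minimal version, for illustration only, could look like this:
apiVersion: v1
kind: Pod
metadata:
  name: kubelet-density-{{.Iteration}}
  labels:
    app: kubelet-density
spec:
  containers:
  - name: kubelet-density
    image: {{.containerImage}}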
Let’s create some pods on the cluster:
$ KUBECONFIG=kubeconfig.kubemark-workload kube-burner init -c kubeburner/cfg.yaml
INFO[2023-01-17 15:21:25] 🔥 Starting kube-burner (1.2@563bc92b9262582391e5dffb8941b914ca19d2d3) with UUID def1da7b-a5db-4c05-bb17-167d889ef33b
INFO[2023-01-17 15:21:25] 📈 Creating measurement factory
INFO[2023-01-17 15:21:25] Job kubelet-density: 100 iterations with 1 Pod replicas
INFO[2023-01-17 15:21:25] QPS: 20
INFO[2023-01-17 15:21:25] Burst: 20
INFO[2023-01-17 15:21:25] Triggering job: kubelet-density
INFO[2023-01-17 15:21:26] Running job kubelet-density
INFO[2023-01-17 15:21:32] Waiting up to 3h0m0s for actions to be completed
INFO[2023-01-17 15:21:51] Actions in namespace kubelet-density completed
INFO[2023-01-17 15:21:51] Finished the create job in 23s
INFO[2023-01-17 15:21:51] Verifying created objects
INFO[2023-01-17 15:21:52] pods found: 100 Expected: 100
INFO[2023-01-17 15:21:52] Job kubelet-density took 26.88 seconds
INFO[2023-01-17 15:21:52] Finished execution with UUID: def1da7b-a5db-4c05-bb17-167d889ef33b
INFO[2023-01-17 15:21:52] 👋 Exiting kube-burner def1da7b-a5db-4c05-bb17-167d889ef33b
$ kubectl get po -n kubelet-density | grep Running | wc -l
100
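Once you are done, everything created by this job lives in the kubelet-density namespace, so cleaning up the workload cluster is a single delete (kube-burner also provides a destroy subcommand that removes the objects labelled with the run UUID):
$ kubectl delete ns kubelet-density --kubeconfig=kubeconfig.kubemark-workload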