What and Why
Node draining is the process of safely evicting pods from a Kubernetes node.
Kubernetes has the drain command for safely evicting all your pods from a node before you perform maintenance on it (e.g. a kernel upgrade or hardware maintenance), or when you want to move your services from one node to another without introducing downtime or disruption.
By using kubectl drain you also give the pods a chance to terminate gracefully, and the eviction will respect any PodDisruptionBudgets you have specified.
For more information about the drain command and the flags you can use, check the official kubectl reference documentation.
PodDisruptionBudget
PodDisruptionBudget (PDB) is a resource in Kubernetes that ensures a certain number or percentage of pods of a specified service will not be voluntarily evicted, protecting it from frequent disruptions.
You can create a PDB for your application to limit the number of pods of a replicated service that are down simultaneously due to voluntary disruptions.
Note: According to the Kubernetes docs, voluntary disruptions include both actions initiated by the application owner and those initiated by a cluster administrator,
e.g.:
- deleting the deployment that manages the pod
- updating a deployment's pod template causing a restart
- directly deleting a pod
An example PDB object looks like this:
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: fastify-budget
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: fastify_server
Here we specify the name of the PDB as fastify-budget, and in the spec we can set either maxUnavailable or minAvailable. You can specify only one of maxUnavailable and minAvailable in a single PodDisruptionBudget, and values for those fields can be expressed as integers or as percentages (e.g. 50%).
Finally, we set a selector to specify the set of pods to which the PDB applies.
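As a sketch of the alternative form, a PDB that keeps at least half of the matching pods available could look like this (the percentage is illustrative, and you would use it instead of the object above, not alongside it):
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: fastify-budget
spec:
  minAvailable: "50%"   # percentage chosen for illustration
  selector:
    matchLabels:
      app: fastify_server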
Example
I am going to use minikube with a multi-node cluster to show how to safely evict pods.
You can deploy the same example to a managed Kubernetes offering like AWS EKS or GKE with no changes to the YAML files.
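As a quick sketch, a multi-node cluster whose profile name matches the node names shown later (multinode) can be started like this; the node count is illustrative:
minikube start --nodes 3 -p multinode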
In my example I have a Deployment running a simple Node.js Fastify server in 10 pods.
fastify-server-77476f7bc4-78zkv 1/1 Running 0 45m
fastify-server-77476f7bc4-9h6bf 1/1 Running 0 45m
fastify-server-77476f7bc4-cnh9r 1/1 Running 0 45m
fastify-server-77476f7bc4-cqs2z 1/1 Running 0 45m
fastify-server-77476f7bc4-fn5nn 1/1 Running 0 45m
fastify-server-77476f7bc4-nvnkl 1/1 Running 0 45m
fastify-server-77476f7bc4-pt5xz 1/1 Running 0 45m
fastify-server-77476f7bc4-r2btz 1/1 Running 0 45m
fastify-server-77476f7bc4-r92b7 1/1 Running 0 45m
fastify-server-77476f7bc4-xstrj 1/1 Running 0 45m
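For reference, a minimal Deployment producing this setup could look like the sketch below; the name, label, and replica count match the example, while the container image and port are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fastify-server
spec:
  replicas: 10
  selector:
    matchLabels:
      app: fastify_server
  template:
    metadata:
      labels:
        app: fastify_server
    spec:
      containers:
        - name: fastify-server
          image: my-registry/fastify-server:latest   # assumed image name
          ports:
            - containerPort: 3000                    # Fastify commonly listens on 3000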
We can drain a node by running
kubectl drain <node-name>
The drain command has many flags, like --grace-period or --ignore-daemonsets, to parametrize the draining process.
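For example, a drain that skips DaemonSet-managed pods and gives each pod up to 60 seconds to shut down (both values are illustrative) could look like:
kubectl drain multinode-m02 --ignore-daemonsets --grace-period=60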
With this command two things happen. First, the node is cordoned and marked as unschedulable for new pods:
multinode-m02 Ready,SchedulingDisabled <none> 10h v1.20.0
Second, the eviction process starts, but you will notice messages in the terminal like:
error when evicting pods/"fastify-server-77476f7bc4-t9rgs" -n "node-drain" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
error when evicting pods/"fastify-server-77476f7bc4-t4rsm" -n "node-drain" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
error when evicting pods/"fastify-server-77476f7bc4-qf7gq" -n "node-drain" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
meaning the eviction process is respecting the PDB we have put in place for our deployment.
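While the drain is in progress you can also check the PDB status to see how many disruptions are currently allowed, e.g.:
kubectl get pdb fastify-budget -n node-drain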
After a period (depending on your deployment and how long it takes to replace the old pods with new ones) the kubectl drain command will finish, and you can verify your node is empty (except perhaps for some DaemonSet pods) by running
kubectl describe node <node-name>
In the output you will see a Non-terminated Pods section, where only system pods are running.
Non-terminated Pods:  (2 in total)
  Namespace    Name              CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------    ----              ------------  ----------  ---------------  -------------  ---
  kube-system  kindnet-2gjjw     100m (2%)     100m (2%)   50Mi (1%)        50Mi (1%)      10h
  kube-system  kube-proxy-54tlm  0 (0%)        0 (0%)      0 (0%)           0 (0%)         10h
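Alternatively, you can list the pods still scheduled on the node directly, e.g.:
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=<node-name>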
Finally, you can do maintenance on your cordoned node or replace it. If you wish to put the node back into use, you just need to mark it as schedulable again by running
kubectl uncordon <node-name>
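You can then confirm the node is schedulable again by listing the nodes; the SchedulingDisabled status should be gone:
kubectl get nodes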