In the previous blog, I explained that EKS Auto Mode is now supported by the terraform-aws-eks module and showed how to create a new cluster with EKS Auto Mode.
In this blog, we’ll learn how to enable EKS Auto Mode on existing clusters and migrate workloads from EKS Managed Node Groups to EKS Auto nodes with ZERO DOWNTIME using my Terraform code.
I have also added a BONUS section that explains how to control whether pods are scheduled on EKS Auto Mode nodes or on other compute types.
Motivation
- terraform-provider-aws released a new version, v5.8.1, which allows enabling EKS Auto Mode with built-in NodePools on an existing cluster.
- terraform-aws-eks released a new version, v20.31.1, which allows using custom NodeClass/NodePools when EKS Auto Mode is enabled without built-in NodePools.
I want this blog to be short, crisp and efficient, so let's jump into the actual steps!
Deploy EKS cluster without EKS Auto Mode
We first need the starting point for this use case: an existing cluster WITHOUT EKS Auto Mode that runs on an EKS Managed Node Group (MNG).
Use this repository code to deploy an EKS cluster with a managed node group.
Note: I am attaching policies to the node IAM role for the EKS MNG. This is too permissive; it is better to use EKS Pod Identity (or IRSA, but EKS Pod Identity is preferred). Feel free to send a PR to the repo :)
Deploy workload or pods
- We will automate this as well: the workload YAML is deployed with Terraform's `kubectl_manifest` resource.
Note: During cluster creation, the test workload (pods) is not deployed because the kubectl context is not yet set locally. Once the cluster is created, run the following command to set the kubectl context and run `terraform apply` again.
aws eks --region us-east-1 update-kubeconfig --name eks-existing-cluster-tf-test --profile <your-profile-name> ; terraform apply
Current state of EKS cluster before EKS Auto Mode
Let's verify the current state of the EKS cluster while EKS Auto Mode is not enabled.
- EKS Auto Mode is disabled.
- The EKS Managed Node Group created by me is running.
- Pods are running on the EKS Managed Node Group.
Enable EKS Auto Mode on Existing Cluster
- Uncomment the following code in eks.tf and run `terraform apply` to enable EKS Auto Mode:
bootstrap_self_managed_addons = true
cluster_compute_config = {
  enabled = true
}
- `bootstrap_self_managed_addons = true` is very important; otherwise you will hit an error where Terraform tries to recreate the cluster. I literally cried over this.
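For context, here is a minimal sketch of where these two settings might sit in eks.tf. It assumes the cluster is defined with the terraform-aws-modules/eks/aws module on a v20.31.x release; the module label, the variables, the Kubernetes version and the node group values are illustrative placeholders, not the actual contents of the repository.

```hcl
# Illustrative inputs; the real repository may wire these differently.
variable "vpc_id" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.31" # assumption: a v20 release that supports cluster_compute_config

  cluster_name    = "eks-existing-cluster-tf-test"
  cluster_version = "1.31" # assumption: any version supported by EKS Auto Mode

  vpc_id     = var.vpc_id
  subnet_ids = var.subnet_ids

  # Existing managed node group that currently runs the workload (placeholder values).
  eks_managed_node_groups = {
    default = {
      instance_types = ["t3.medium"]
      min_size       = 1
      max_size       = 3
      desired_size   = 2
    }
  }

  # Keep this set to true on an existing cluster so Terraform does not plan a replacement.
  bootstrap_self_managed_addons = true

  # Turns on EKS Auto Mode for the cluster.
  cluster_compute_config = {
    enabled = true
  }
}
```

Before applying, it is worth reading the `terraform plan` output and confirming that the cluster is updated in place rather than destroyed and recreated.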
Current state of EKS cluster after EKS Auto Mode
- As expected, the built-in NodePools are empty.
Migrate workload(pods) from EKS MNG to EKS Auto Node
- There are a couple of ways to smoothly migrate existing workloads from the MNG to EKS Auto Mode with minimal disruption while maintaining the application’s availability throughout the migration.
Note: Copy the EKS MNG node group name.
Using eksctl tool
- The following command cordons all nodes in the node group and evicts all of their pods; the evicted pods are then rescheduled onto nodes managed by EKS Auto Mode.
eksctl drain nodegroup --cluster=<clusterName> --name=<copiedNodegroupName> --region us-east-1 --profile=<profile>
In my testing, eksctl evicted pods one at a time, so application availability was maintained.
If you still want to be 100% sure, follow the best practice of defining a PodDisruptionBudget. We automate this with Terraform as well, so run `terraform apply`:
resource "kubectl_manifest" "test_pdb" {
  yaml_body = <<YAML
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: test-pdb
  labels:
    environment: test
spec:
  minAvailable: 1
  selector:
    matchLabels:
      environment: test
YAML
}
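One caveat worth spelling out: a PodDisruptionBudget with `minAvailable: 1` only keeps the application available during a drain if the workload runs more than one replica. With a single replica, the eviction is refused and the drain cannot make progress. The sketch below shows a Deployment that this PDB would protect. The resource label `test_app`, the Deployment name and the container image are hypothetical and are not taken from the repository; only the `environment: test` label comes from the PDB above. It also assumes the kubectl provider is already configured, as it must be for the PDB resource.

```hcl
# Hypothetical workload matched by the test-pdb selector (environment: test).
resource "kubectl_manifest" "test_app" {
  yaml_body = <<YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-app
  labels:
    environment: test
spec:
  # At least two replicas, so one pod can be evicted while minAvailable: 1 still holds.
  replicas: 2
  selector:
    matchLabels:
      environment: test
  template:
    metadata:
      labels:
        environment: test
    spec:
      containers:
        - name: web
          image: nginx:1.27 # placeholder image
          ports:
            - containerPort: 80
YAML
}
```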
- After migrating, if we want to allow pods to be scheduled on the EKS MNG again, we need to uncordon its nodes with the command below; otherwise we can delete the node group.
eksctl drain nodegroup --cluster=<clusterName> --name=<copiedNodegroupName> --region us-east-1 --profile=<profile> --undo
Using kubectl
- We can also drain the nodes with kubectl using the following command:
kubectl drain --ignore-daemonsets <node name>
- Once it returns (without giving an error), you can delete the node. If you instead want to tell Kubernetes that it can resume scheduling new pods onto the node, uncordon it:
kubectl uncordon <node name>
[ BONUS ] How to schedule Pods always on EKS Auto Nodes?
There are 2 options to achieve this:
- Delete the node group and let EKS Auto Mode handle scheduling on EKS Auto nodes.
- Use labels and nodeAffinity.
Control if a workload is deployed on EKS Auto Mode nodes
There is a concept called a mixed-mode cluster, where you run both EKS Auto Mode and other compute types, such as self-managed Karpenter provisioners or EKS Managed Node Groups.
In mixed-mode clusters, a Deployment is typically scheduled onto existing EKS MNG nodes that have capacity, not onto EKS Auto nodes.
In such cases we can use labels and nodeAffinity.
Using NodeSelector label
- Use the label `eks.amazonaws.com/compute-type: auto` when you want a workload to be deployed to EKS Auto nodes.
- This nodeSelector value is only relevant if you are running a cluster in mixed mode, with node types not managed by EKS Auto Mode.
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      nodeSelector:
        eks.amazonaws.com/compute-type: auto
- I have added the above configuration in the sample_app_on_eks_auto_nodes.tf file. We are automating using Terraform, so uncomment it and run `terraform apply`.
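As a fuller illustration, a complete Deployment pinned to EKS Auto nodes could be wrapped in a `kubectl_manifest` resource roughly like this. This is only a sketch: the resource label, Deployment name, `app` label and image are hypothetical and may differ from what sample_app_on_eks_auto_nodes.tf actually contains. The key point is that `nodeSelector` belongs to the pod template spec, not to the top-level Deployment spec.

```hcl
# Hypothetical Deployment that is only scheduled onto EKS Auto Mode nodes.
resource "kubectl_manifest" "sample_app_on_auto" {
  yaml_body = <<YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app-auto
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample-app-auto
  template:
    metadata:
      labels:
        app: sample-app-auto
    spec:
      # Require nodes that carry the EKS Auto Mode compute-type label.
      nodeSelector:
        eks.amazonaws.com/compute-type: auto
      containers:
        - name: web
          image: nginx:1.27 # placeholder image
YAML
}
```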
Using nodeAffinity
- You can add this nodeAffinity to Deployments or other workloads to require Kubernetes to not schedule them onto EKS Auto Mode nodes
- I have added a workload with this nodeAffinity in sample_app_not_on_eks_auto_nodes.tf. We are automating using Terraform, so uncomment it and run `terraform apply`.
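For reference, a nodeAffinity rule that keeps a workload off EKS Auto Mode nodes can be sketched as follows. The `NotIn` expression on the `eks.amazonaws.com/compute-type` label is the relevant part; the resource label, Deployment name, `app` label and image are hypothetical and may not match the file in the repository.

```hcl
# Hypothetical Deployment that must NOT be scheduled onto EKS Auto Mode nodes.
resource "kubectl_manifest" "sample_app_not_on_auto" {
  yaml_body = <<YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app-not-auto
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample-app-not-auto
  template:
    metadata:
      labels:
        app: sample-app-not-auto
    spec:
      affinity:
        nodeAffinity:
          # Hard requirement: skip nodes whose compute-type label is "auto".
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: eks.amazonaws.com/compute-type
                    operator: NotIn
                    values:
                      - auto
      containers:
        - name: web
          image: nginx:1.27 # placeholder image
YAML
}
```

Because `NotIn` also matches nodes that do not carry the label at all, this rule leaves managed node group nodes and other non-Auto compute eligible.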
From DevOps, IaC Perspective
We saw how to enable EKS Auto Mode for existing clusters with built-in NodePools using the terraform-aws-eks module.
We also saw how to migrate existing workloads from an EKS Managed Node Group to EKS Auto nodes without downtime, as pod eviction during draining respects PodDisruptionBudgets.
We also saw how to use nodeSelector labels and nodeAffinity to control where workloads are deployed in mixed-mode EKS clusters.
Currently EKS Auto Mode deploys EC2 instances of type c6a.large, which can be customized using NodeClass and NodePool; we will see that in the next blog. Follow me on LinkedIn or on dev.to so that you get timely updates on what I share.
Feel free to reach out to me on LinkedIn or X if you face any errors migrating your existing workloads to EKS Auto Mode nodes using Terraform.