Hi everyone,
This is the second installment of the Kubernetes series. You can find the previous article at the link below
Introduction to Kubernetes and AWS EKS — Part 1
Tasks we are going to do in this article:
Create a Docker image of a web app and upload it to AWS ECR
Pull the Docker image from AWS ECR and deploy it in an EKS cluster
Scale the application based on traffic
Walk through different deployment strategies
Let’s dive into the article
Creating a docker image and uploading to AWS ECR:
I am using an open-source HTML/CSS website as the application to deploy in this article
You can download the application using this link: codewithsadee/portfolio: Fully responsive personal portfolio website
This is a web application. We're going to deploy it in our Kubernetes cluster
To deploy it, we need to build a Docker image of the app
So clone the project using the below command
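Assuming the GitHub repository linked above, the clone command would look like this:

```shell
# Clone the portfolio app (codewithsadee/portfolio, linked above)
git clone https://github.com/codewithsadee/portfolio.git
cd portfolio
```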
Create a Dockerfile with the below code at the root directory of the application
FROM nginx:alpine
COPY . /usr/share/nginx/html
EXPOSE 80
Here I am using the nginx:alpine image for the application deployment. We will serve the web application using the Nginx server
The second line copies every file from the current folder to the specified path, /usr/share/nginx/html
This is the default directory of the Nginx server; it serves the files inside that folder on port 80
The third line exposes port 80 from the container
Let’s build the docker image and upload it to the AWS ECR
Visit the ECR service from the AWS search bar
Click on the Create Repository button
Provide a name for the repository, keep the remaining fields as they are, and click on Create
Once the repo is created, you can see the View push commands button in the top right corner. Click on it and follow the instructions one by one
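The push commands shown in the console typically look like the following. The region us-east-1 and the repository name sample are placeholders here; use the exact commands ECR shows you:

```shell
# Authenticate Docker to your ECR registry (region and account ID are placeholders)
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com

# Build the image from the Dockerfile in the current directory
docker build -t sample .

# Tag it for the ECR repository and push
docker tag sample:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/sample:latest
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/sample:latest
```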
After running all the commands, you can see the list of images you uploaded, like this
Deploying docker image to EKS cluster from ECR:
Create an EKS cluster and a node group
You can find how to create an EKS cluster and a node group in my previous article, linked at the start of this one
The only change I am making here is the EC2 instance type in the node group: t3.small instead of t3.medium
I am keeping the desired and minimum node count at 1 and the maximum size at 2.
Point your kubectl to use the EKS cluster we created using the following command
aws eks --region <region-code> update-kubeconfig --name <cluster-name>
To check whether kubectl is correctly configured, use the following command
kubectl get nodes
The command will return the nodes list created in the cluster.
Installing the Metrics Server for autoscaling:
The Metrics Server is a lightweight Kubernetes add-on that provides resource usage metrics
Use the below command to install the Metrics Server in the cluster
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
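To confirm the Metrics Server is up before relying on it, a quick sanity check (assuming it installed into the default kube-system namespace):

```shell
# The metrics-server Deployment should show READY 1/1
kubectl get deployment metrics-server -n kube-system

# Once metrics are being collected, this prints CPU/memory per node
kubectl top nodes
```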
Once the Metrics Server is ready, let's deploy the application
Create a deployment.yaml file with the following code. This code will deploy our application in the cluster
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample
spec:
  replicas: 1 # Start with one replica
  selector:
    matchLabels:
      app: sample
  template:
    metadata:
      labels:
        app: sample
    spec:
      containers:
        - name: my-app-container
          image: <account-id>.dkr.ecr.us-east-1.amazonaws.com/sample:latest
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "100m" # Minimum CPU requested for the pod
            limits:
              cpu: "500m" # Maximum CPU allowed for the pod
Run the following command to deploy the application
kubectl apply -f deployment.yaml
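You can watch the rollout and confirm the pod comes up with:

```shell
# Blocks until the rollout finishes (or reports a failure)
kubectl rollout status deployment/sample

# The pod should reach STATUS Running
kubectl get pods -l app=sample
```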
Let’s create a service load balancer to expose the application outside of the cluster
Create another file named service.yaml. It will create a load balancer that exposes port 80 to the outside and routes incoming traffic to port 80 on the pods
apiVersion: v1
kind: Service
metadata:
  name: sample-service
spec:
  type: LoadBalancer
  selector:
    app: sample
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
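Deploy the service the same way as the deployment:

```shell
kubectl apply -f service.yaml
```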
Once the service is also deployed, use the below command to see the list of deployed services
kubectl get services
Output will look something like this
Copy the EXTERNAL-IP of the service we deployed and try to access the application in the browser. It should display the web page of the application we deployed
If it's not loading, wait till the load balancer comes online and is in a working state
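You can also grab the load balancer address and test it from the terminal. The jsonpath below assumes AWS populates a hostname (which is what its load balancers return) rather than an IP:

```shell
# Extract the load balancer hostname for the service
LB=$(kubectl get service sample-service \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

# Should return HTTP 200 once the load balancer is healthy
curl -I "http://$LB"
```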
Scaling the deployment:
Horizontal Pod AutoScaler(HPA):
It is a Kubernetes feature that adjusts the number of pod replicas based on metrics like CPU usage
It uses the Metrics Server to collect metrics and, based on them, adjusts the pod replica count
Create a file named scale.yaml with the following code
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: sample-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
This deploys the pod autoscaler with our deployment as the target, a minimum of 1 and a maximum of 10 replicas, scaling up when average CPU utilization exceeds 50%
Deploy this HPA using the following command
kubectl apply -f scale.yaml
Wait a few minutes till it is deployed. After that, run the following command to check the HPA deployment
kubectl get hpa
The output should look like this
Let's hit the endpoint with multiple requests and see whether autoscaling works
I am using JMeter to create the requests
I am creating 10k requests in 10 seconds to drive up CPU utilization so that autoscaling triggers and creates more replicas
Use the following command to check the CPU utilization percentage while generating the requests
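If you don't have JMeter handy, a throwaway busybox pod can generate similar load from inside the cluster (a common pattern from the Kubernetes HPA walkthrough; sample-service is the service we created above):

```shell
# Hammer the service in a tight loop; Ctrl+C stops it and --rm removes the pod
kubectl run load-generator --rm -i --tty --image=busybox --restart=Never \
  -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://sample-service; done"
```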
kubectl get hpa
Once the load peaks, you can see utilization exceed the 50% threshold
Let's see how many pods are running using the kubectl get pods command
You can see that 3 pods are running to handle the traffic. Wait at least five minutes so that the HPA scales the pod replicas back down
After 5 minutes, I ran the kubectl get pods command again to check the pod count
That's it. Like this, we can scale the pod replicas up and down to handle the incoming traffic
Deployment Strategies:
Kubernetes supports different deployment strategies for rolling out updates
Some of the methods are
Rolling update (default):
It replaces the pods gradually for a smooth transition
Recreate:
It terminates all currently running pods and then creates new, updated pods
Canary deployment:
It deploys the new version to a subset of users, redirecting some users to the new deployment while keeping the rest on the old deployment until everything is stable
Blue-green deployment:
Run two complete environments: blue for the old version and green for the new. Once everything is verified in the green environment, traffic is redirected from blue to green
These are all the different deployment methods that Kubernetes supports for rolling out the updates
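As an illustration, the default rolling update behavior can be tuned in the Deployment spec; the maxSurge and maxUnavailable values below are example settings, not the only valid ones:

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # At most 1 extra pod above the desired count during an update
      maxUnavailable: 0  # Never take a pod down before its replacement is ready
```

This snippet goes under the existing Deployment's spec; switching to the Recreate strategy is just a matter of setting type: Recreate instead.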
I will try to practice these deployments in an upcoming article. That's it for this article
Please let me know if there are any mistakes or suggestions for improvement. I am open to suggestions
Thanks, and have a good day
Note: Before leaving, delete the cluster, node group, and load balancer you created to avoid unnecessary charges
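A minimal cleanup from the CLI, using the files we created (the node group and cluster can then be deleted from the EKS console, where we created them):

```shell
# Deleting the Service tears down the AWS load balancer it provisioned
kubectl delete -f service.yaml
kubectl delete -f scale.yaml
kubectl delete -f deployment.yaml
```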