AWS DevOps Project: Advanced Automated CI/CD Pipeline with Infrastructure as Code, Microservices, Service Mesh, and Monitoring

H A R S H H A A - Aug 31 - - Dev Community

Introduction

In this advanced AWS DevOps project, we will build a sophisticated CI/CD pipeline to automate the deployment of a microservices-based application on AWS. We'll leverage key DevOps practices such as Infrastructure as Code (IaC) using Terraform, container orchestration with Amazon EKS (Elastic Kubernetes Service), service mesh integration with Istio, and comprehensive monitoring and logging using Prometheus, Grafana, and Fluentd.

This project is designed to cover complex, real-world scenarios that AWS & DevOps engineers frequently encounter, providing you with an in-depth understanding of how to architect, deploy, and manage modern cloud-native applications.

Table of Contents

  1. Project Overview
  2. Prerequisites
  3. Architecture Diagram
  4. Step-by-Step Guide
  5. Testing and Validation
  6. Conclusion
  7. References

1. Project Overview

This project will guide you through setting up an advanced CI/CD pipeline that automates the deployment of a microservices-based application on AWS using Kubernetes (EKS), Istio for service mesh, and Jenkins for CI/CD. We’ll cover the full lifecycle from infrastructure provisioning using Terraform to deploying and managing microservices in a secure, scalable, and observable environment.

2. Prerequisites

  • AWS Account: Administrative access to an AWS account.
  • AWS CLI: Installed and configured with appropriate credentials.
  • Terraform: Installed on your local machine.
  • kubectl: Installed for managing Kubernetes clusters.
  • Jenkins: Installed on an EC2 instance or set up via Docker.
  • Docker: Installed and running on your local machine.
  • Git: Installed and configured.
  • Helm: Installed for Kubernetes package management.

3. Architecture Diagram

Image description
The architecture consists of a VPC with public and private subnets, an EKS cluster deployed across multiple Availability Zones, Jenkins deployed in a highly available configuration on EC2 with Auto-Scaling, RDS for persistent data storage, and Istio for service mesh integration. Monitoring is handled by Prometheus and Grafana, while Fluentd centralizes logging.

Architecture Diagram Overview

  • 1. Developer Workstations:

    • Developers commit code to a GitHub repository.
  • 2. Continuous Integration:

    • Jenkins pulls the latest code from GitHub.
    • Jenkins builds and tests the application.
    • Docker is used to create Docker images of the application.
    • Built images are pushed to a Docker Registry (e.g., Amazon ECR).
  • 3. Infrastructure as Code:

    • Terraform is used to provision the AWS infrastructure, including:
    • VPC for networking.
    • IAM Roles and Policies for security.
    • EKS Cluster for Kubernetes.
    • RDS for persistent storage.
  • 4. Continuous Deployment:

    • Jenkins deploys Docker images to Amazon EKS using Kubernetes manifests or Helm charts.
    • Istio (Service Mesh) is used to manage traffic between microservices.
  • 5. Service Mesh:

    • Istio provides traffic management, security (mTLS), and observability.
    • Istio Ingress Gateway manages external access to the services.
  • 6. Monitoring and Logging:

    • Prometheus collects metrics from the Kubernetes cluster and Istio.
    • Grafana visualizes the metrics and provides dashboards.
    • Fluentd is used for centralized logging, sending logs to Amazon CloudWatch or Elasticsearch.
    • Alertmanager integrates with Prometheus for alerting.
  • 7. Continuous Feedback:

    • Alertmanager sends notifications to Slack or email.
    • Jenkins provides build and deployment feedback to developers.

Diagram Components

  1. GitHub: Version control and source code repository.
  2. Jenkins: CI/CD server for automating builds, tests, and deployments.
  3. Docker: Containerizes the application for consistent environments.
  4. Terraform: Automates the provisioning of AWS infrastructure.
  5. Amazon ECR: Docker image repository.
  6. Amazon EKS: Kubernetes service for deploying and managing containerized applications.
  7. Istio: Service mesh for managing service-to-service communication.
  8. Prometheus & Grafana: Monitoring stack for metrics and dashboards.
  9. Fluentd: Centralized logging solution.
  10. Alertmanager: Manages alerts based on metrics collected by Prometheus.
  11. Amazon CloudWatch: Cloud-based logging and monitoring service.

Architecture Flow

  1. Code Commit: Developers push code to GitHub.
  2. CI Pipeline: Jenkins detects code changes, builds the project, runs tests, and builds Docker images.
  3. Image Push: Jenkins pushes the Docker images to Amazon ECR.
  4. Infrastructure Deployment: Terraform provisions the necessary infrastructure on AWS.
  5. CD Pipeline: Jenkins deploys the application to Amazon EKS, with Istio managing traffic routing.
  6. Service Mesh Management: Istio handles service communication, security, and monitoring.
  7. Monitoring & Logging: Prometheus collects metrics, Grafana visualizes them, and Fluentd forwards logs to CloudWatch.
  8. Alerting: Alertmanager sends alerts based on Prometheus metrics.

4. Step-by-Step Guide

Infrastructure as Code with Terraform

Step:-1 Setting Up the VPC and Networking

  • Create a new directory for your Terraform files:
  mkdir advanced-aws-devops-project
  cd advanced-aws-devops-project
Enter fullscreen mode Exit fullscreen mode
  • Initialize your Terraform project:
  terraform init
Enter fullscreen mode Exit fullscreen mode
  • Create a main.tf file with the following content to define your VPC, subnets, and networking components:
  provider "aws" {
    region = "us-west-2"
  }

  resource "aws_vpc" "main_vpc" {
    cidr_block = "10.0.0.0/16"
    enable_dns_support = true
    enable_dns_hostnames = true
    tags = {
      Name = "advanced-devops-vpc"
    }
  }

  resource "aws_subnet" "public_subnet" {
    vpc_id            = aws_vpc.main_vpc.id
    cidr_block        = "10.0.1.0/24"
    map_public_ip_on_launch = true
    availability_zone = "us-west-2a"
    tags = {
      Name = "public-subnet"
    }
  }

  resource "aws_subnet" "private_subnet" {
    vpc_id            = aws_vpc.main_vpc.id
    cidr_block        = "10.0.2.0/24"
    availability_zone = "us-west-2b"
    tags = {
      Name = "private-subnet"
    }
  }

  resource "aws_internet_gateway" "igw" {
    vpc_id = aws_vpc.main_vpc.id
    tags = {
      Name = "main-igw"
    }
  }

  resource "aws_route_table" "public_route_table" {
    vpc_id = aws_vpc.main_vpc.id

    route {
      cidr_block = "0.0.0.0/0"
      gateway_id = aws_internet_gateway.igw.id
    }

    tags = {
      Name = "public-route-table"
    }
  }

  resource "aws_route_table_association" "public_subnet_association" {
    subnet_id      = aws_subnet.public_subnet.id
    route_table_id = aws_route_table.public_route_table.id
  }
Enter fullscreen mode Exit fullscreen mode
  • Apply the Terraform configuration:
  terraform apply
Enter fullscreen mode Exit fullscreen mode

Step 2: IAM Roles and Policies

  • Create a iam.tf file to define IAM roles and policies for EKS, RDS, and other AWS services:
  resource "aws_iam_role" "eks_role" {
    name = "eks_role"
    assume_role_policy = jsonencode({
      "Version": "2012-10-17",
      "Statement": [{
        "Action": "sts:AssumeRole",
        "Effect": "Allow",
        "Principal": {
          "Service": "eks.amazonaws.com"
        }
      }]
    })
  }

  resource "aws_iam_role_policy_attachment" "eks_policy_attachment" {
    role       = aws_iam_role.eks_role.name
    policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  }

  resource "aws_iam_role" "rds_role" {
    name = "rds_role"
    assume_role_policy = jsonencode({
      "Version": "2012-10-17",
      "Statement": [{
        "Action": "sts:AssumeRole",
        "Effect": "Allow",
        "Principal": {
          "Service": "rds.amazonaws.com"
        }
      }]
    })
  }

  resource "aws_iam_role_policy_attachment" "rds_policy_attachment" {
    role       = aws_iam_role.rds_role.name
    policy_arn = "arn:aws:iam::aws:policy/AmazonRDSFullAccess"
  }
Enter fullscreen mode Exit fullscreen mode
  • Apply the IAM role configuration:
  terraform apply
Enter fullscreen mode Exit fullscreen mode

Step 3: Creating an EKS Cluster

  • Add the following to your main.tf to create an EKS cluster with auto-scaling node groups:
  module "eks" {
    source          = "terraform-aws-modules/eks/aws"
    cluster_name    = "advanced-eks-cluster"
    cluster_version = "1.24"
    vpc_id          = aws_vpc.main_vpc.id
    subnets         = [aws_subnet.public_subnet.id, aws_subnet.private_subnet.id]

    node_groups = {
      eks_nodes = {
        desired_capacity = 3
        max_capacity     = 5
        min_capacity     = 2

        instance_type = "t3.medium"
        key_name      = "your-key-pair"
        tags = {
          Name = "advanced-eks-node"
        }
      }
    }
  }
Enter fullscreen mode Exit fullscreen mode
  • Apply the EKS configuration:
  terraform apply
Enter fullscreen mode Exit fullscreen mode

Step 4: Integrating with RDS for Persistent Storage

  • Create an RDS instance for persistent storage in your rds.tf file:
  resource "aws_db_instance" "db_instance" {
    allocated_storage    = 20
    engine               = "mysql"
    engine_version       = "8.0"
    instance_class       = "db.t3.micro"
    name                 = "microservicesdb"
    username             = "admin"
    password             = "password"
    parameter_group_name = "default.mysql8.0"
    skip_final_snapshot  = true

    vpc_security_group_ids = [aws_security_group.eks_sg.id]
    db_subnet_group_name   = aws_db_subnet_group.rds_subnet_group.name
  }

  resource "aws_db_subnet_group" "rds_subnet_group" {
    name       = "rds-subnet-group"
    subnet_ids = [aws_subnet.private_subnet.id]

    tags = {
      Name = "RDS Subnet Group"
    }
  }
Enter fullscreen mode Exit fullscreen mode
  • Apply the RDS configuration:
  terraform apply
Enter fullscreen mode Exit fullscreen mode

Deploying Jenkins in a Highly Available Configuration

Step 5: Setting Up Jenkins on EC2 with Auto-Scaling

  • Launch Jenkins on an EC2 instance with Auto-Scaling enabled:


 resource "aws_launch_configuration" "jenkins_lc" {
    image_id      = "ami-0abcdef1234567890"
    instance_type = "t3.medium"
    key_name      = "your-key-pair"
    security_groups = [aws_security_group.jenkins_sg.id]

    lifecycle {
      create_before_destroy = true
    }

    user_data = <<-EOF
              #!/bin/bash
              sudo yum update -y
              sudo yum install -y java-1.8.0-openjdk
              sudo wget -O /etc/yum.repos.d/jenkins.repo \
                https://pkg.jenkins.io/redhat-stable/jenkins.repo
              sudo rpm --import https://pkg.jenkins.io/redhat-stable/jenkins.io.key
              sudo yum install -y jenkins
              sudo systemctl start jenkins
              sudo systemctl enable jenkins
              EOF
  }

  resource "aws_autoscaling_group" "jenkins_asg" {
    launch_configuration = aws_launch_configuration.jenkins_lc.id
    min_size             = 2
    max_size             = 4
    desired_capacity     = 2
    vpc_zone_identifier  = [aws_subnet.public_subnet.id]

    tags = [
      {
        key                 = "Name"
        value               = "Jenkins Instance"
        propagate_at_launch = true
      },
    ]
  }
Enter fullscreen mode Exit fullscreen mode
  • Apply the Jenkins EC2 setup:
  terraform apply
Enter fullscreen mode Exit fullscreen mode

Step 6: Configuring Jenkins Master-Slave Architecture

The Jenkins Master-Slave (or Controller-Agent) architecture allows you to distribute the workload of building, testing, and deploying applications across multiple machines. This architecture enhances performance and provides better resource utilization, especially in large-scale environments. Here's a detailed guide to setting up Jenkins in this architecture.

Prerequisites
  1. Jenkins Master: The main Jenkins server, also known as the controller, which orchestrates the build tasks and distributes them to the agent nodes.
  2. Jenkins Agents: Additional servers (slaves) that execute the build tasks assigned by the Jenkins master.
Step 6.1: Setting Up Jenkins Master

You’ve already set up Jenkins on an EC2 instance in the previous steps. This instance will serve as your Jenkins master. Ensure the following are in place:

  • Java Installation: Jenkins requires Java to run. Verify that Java is installed:
  java -version
Enter fullscreen mode Exit fullscreen mode

If not installed, you can do so with:

  sudo yum install -y java-1.8.0-openjdk
Enter fullscreen mode Exit fullscreen mode
  • Jenkins Installation: Jenkins should be running and accessible via a web browser. Confirm by navigating to http://<Jenkins_Master_IP>:8080 in your browser.
Step 6.2: Preparing Jenkins Agent Nodes

You need additional EC2 instances that will act as Jenkins agents. Here's how to set them up:

  1. Launch EC2 Instances for Jenkins Agents:

    • Use the AWS Management Console to launch new EC2 instances. You can select an Amazon Linux 2 AMI (or any other Linux distribution).
    • Ensure these instances have sufficient resources (e.g., t2.medium) and are in the same VPC as the Jenkins master.
  2. Security Groups:

    • Ensure the security groups allow SSH access from the Jenkins master.
    • Allow communication on ports required for Jenkins agent-master communication (default SSH port 22).
  3. Install Java on Agent Nodes:

    • SSH into each agent node and install Java:
     sudo yum update -y
     sudo yum install -y java-1.8.0-openjdk
    
  4. Create Jenkins User on Agent Nodes:

    • Create a user dedicated to running Jenkins on the agent nodes:
     sudo useradd jenkins
    
  • Set up SSH access for this user so the Jenkins master can connect:

     sudo mkdir /home/jenkins/.ssh
     sudo chown jenkins:jenkins /home/jenkins/.ssh
     sudo chmod 700 /home/jenkins/.ssh
    
  • Copy the SSH public key from the Jenkins master to each agent's /home/jenkins/.ssh/authorized_keys file.

  1. Install Jenkins Agent on Agent Nodes:

    • Download the Jenkins agent (also called slave.jar) from the Jenkins master. On the Jenkins master:
     wget http://<Jenkins_Master_IP>:8080/jnlpJars/agent.jar
    
  • Transfer this agent.jar file to each agent node, placing it in a directory like /home/jenkins.
Step 6.3: Configuring Jenkins Master to Use Agents
  1. Add New Nodes in Jenkins:

    • On the Jenkins dashboard, go to Manage Jenkins > Manage Nodes and Clouds > New Node.
    • Enter a name for the new node (e.g., Agent-1), select Permanent Agent, and click OK.
  2. Configure Node Settings:

    • Remote root directory: Specify the directory on the agent where Jenkins should operate, e.g., /home/jenkins.
    • Labels: Assign labels to the node for easy job allocation (e.g., linux-agent).
    • Launch method: Choose Launch agents via SSH.
    • Host: Enter the private IP address of the agent node.
    • Credentials: Add SSH credentials (username and private key) for the jenkins user.
  3. Test the Connection:

    • Click on Save and then Launch agent via SSH.
    • Jenkins will attempt to connect to the agent node via SSH. If successful, you will see the agent status change to "Connected".
  4. Repeat for Additional Agents:

    • Repeat the above steps to configure any additional Jenkins agent nodes.
Step 6.4: Configuring Jenkins Jobs to Use Specific Agents

Now that your Jenkins master is connected to agent nodes, you can configure jobs to run on specific agents:

  1. Create or Configure a Jenkins Job:

    • Go to New Item on the Jenkins dashboard or select an existing job.
  2. Specify Node Usage:

    • Under the job configuration, look for the section Restrict where this project can be run.
    • Enter the label of the agent (e.g., linux-agent) where you want the job to run.
  3. Save and Build:

    • Save the job configuration and trigger a build.
    • The job should now execute on the specified Jenkins agent node.
Step 6.5: Monitoring and Managing Jenkins Agents
  • Node Monitoring:

    • Jenkins provides built-in monitoring for each agent node. You can view this by navigating to Manage Jenkins > Manage Nodes and Clouds, where you’ll see the status, idle time, and load for each agent.
  • Scaling Agents:

    • For large-scale environments, you may configure auto-scaling for your agent nodes using AWS Auto Scaling Groups, allowing Jenkins to automatically scale the number of agents based on workload.
  • Agent Availability:

    • Ensure agents are always available by configuring them with robust health checks, such as CPU, memory, and disk usage monitoring. Integration with monitoring tools like CloudWatch or Prometheus can help automate this.

Building and Dockerizing Microservices

Step 7: Microservices Application Overview

  • Assume you have a microservices application with the following components:

    • User Service: Manages user authentication and profiles.
    • Order Service: Handles order processing and management.
    • Payment Service: Integrates with payment gateways to process payments.
  • Create Dockerfiles for each service:

User Service Dockerfile:

  FROM openjdk:8-jdk-alpine
  VOLUME /tmp
  ARG JAR_FILE=target/user-service.jar
  COPY ${JAR_FILE} user-service.jar
  ENTRYPOINT ["java","-jar","/user-service.jar"]
Enter fullscreen mode Exit fullscreen mode

Order Service Dockerfile:

  FROM openjdk:8-jdk-alpine
  VOLUME /tmp
  ARG JAR_FILE=target/order-service.jar
  COPY ${JAR_FILE} order-service.jar
  ENTRYPOINT ["java","-jar","/order-service.jar"]
Enter fullscreen mode Exit fullscreen mode

Payment Service Dockerfile:

  FROM openjdk:8-jdk-alpine
  VOLUME /tmp
  ARG JAR_FILE=target/payment-service.jar
  COPY ${JAR_FILE} payment-service.jar
  ENTRYPOINT ["java","-jar","/payment-service.jar"]
Enter fullscreen mode Exit fullscreen mode
  • Build and push the Docker images to Amazon ECR:
  docker build -t user-service ./user-service
  docker build -t order-service ./order-service
  docker build -t payment-service ./payment-service

  # Push images to ECR
  aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin <aws_account_id>.dkr.ecr.us-west-2.amazonaws.com
  docker tag user-service:latest <aws_account_id>.dkr.ecr.us-west-2.amazonaws.com/user-service:latest
  docker push <aws_account_id>.dkr.ecr.us-west-2.amazonaws.com/user-service:latest
Enter fullscreen mode Exit fullscreen mode

Implementing Service Mesh with Istio

Step 8: Installing Istio on EKS

  • Install Istio on your EKS cluster using Helm:
  helm repo add istio https://istio-release.storage.googleapis.com/charts
  helm install istio-base istio/base -n istio-system --create-namespace
  helm install istiod istio/istiod -n istio-system
  helm install istio-ingress istio/gateway -n istio-system
Enter fullscreen mode Exit fullscreen mode

Step 9: Configuring Traffic Management, Security, and Observability

  • Define Istio VirtualService and DestinationRule for traffic management:
  apiVersion: networking.istio.io/v1alpha3
  kind: VirtualService
  metadata:
    name: user-service
  spec:
    hosts:
      - user-service
    http:
      - route:
          - destination:
              host: user-service
              subset: v1

  apiVersion: networking.istio.io/v1alpha3
  kind: DestinationRule
  metadata:
    name: user-service
  spec:
    host: user-service
    subsets:
      - name: v1
        labels:
          version: v1
Enter fullscreen mode Exit fullscreen mode
  • Apply security policies using Istio's Mutual TLS (mTLS):
  apiVersion: security.istio.io/v1beta1
  kind: PeerAuthentication
  metadata:
    name: default
    namespace: istio-system
  spec:
    mtls:
      mode: STRICT
Enter fullscreen mode Exit fullscreen mode
  • Enable observability by integrating Istio telemetry with Prometheus and Grafana.

Setting Up Jenkins CI/CD Pipeline

Step 10: Integrating GitHub, Jenkins, Docker, and EKS

  • Create a Jenkins pipeline script to automate the build, test, and deployment process:
  pipeline {
    agent any
    stages {
      stage('Checkout') {
        steps {
          git 'https://github.com/your-repo/microservices-app.git'
        }
      }
      stage('Build') {
        steps {
          sh 'mvn clean package'
        }
      }
      stage('Docker Build & Push') {
        steps {
          script {
            docker.build("user-service:latest").push("${env.BUILD_TAG}")
          }
        }
      }
      stage('Deploy to EKS') {
        steps {
          script {
            sh 'kubectl apply -f k8s/deployment.yaml'
            sh 'kubectl apply -f k8s/service.yaml'
          }
        }
      }
    }
  }
Enter fullscreen mode Exit fullscreen mode

Step 11: Implementing Blue-Green Deployments

  • Define Kubernetes manifests for blue-green deployment:
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: user-service-green
  spec:
    replicas: 3
    selector:
      matchLabels:
        app: user-service
        version: green
    template:
      metadata:
        labels:
          app: user-service
          version: green
      spec:
        containers:
        - name: user-service
          image: <aws_account_id>.dkr.ecr.us-west-2.amazonaws.com/user-service:green
          ports:
          - containerPort: 8080
Enter fullscreen mode Exit fullscreen mode
  • Apply the green deployment and switch traffic using Istio:
  kubectl apply -f k8s/user-service-green.yaml
  kubectl apply -f k8s/virtualservice.yaml
Enter fullscreen mode Exit fullscreen mode

Step 12: Automated Canary Deployments with Istio

  • Implement canary deployment strategy in your Istio VirtualService:
  apiVersion: networking.istio.io/v1alpha3
  kind: VirtualService
  metadata:
    name: user-service
  spec:
    hosts:
      - user-service
    http:
      - route:
          - destination:
              host: user-service
              subset: v1
            weight: 90
          - destination:
              host: user-service
              subset: v2
            weight: 10
Enter fullscreen mode Exit fullscreen mode

Monitoring, Logging, and Alerting

Step 13: Setting Up Prometheus and Grafana

  • Install Prometheus and Grafana via Helm:
  helm install prometheus prometheus-community/prometheus
  helm install grafana grafana/grafana
Enter fullscreen mode Exit fullscreen mode
  • Access Grafana and add Prometheus as a data source, then import Istio dashboards for monitoring.

Step 14: Configuring Fluentd for Centralized Logging

  • Set up Fluentd DaemonSet on your EKS cluster to aggregate logs from all microservices:
  apiVersion: apps/v1
  kind: DaemonSet
  metadata:
    name: fluentd
  spec:
    selector:
      matchLabels:
        app: fluentd
    template:
      metadata:
        labels:
          app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.11
        env:
          - name: FLUENT_ELASTICSEARCH_HOST
            value: "elasticsearch"
          - name: FLUENT_ELASTICSEARCH_PORT
            value: "9200"
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
Enter fullscreen mode Exit fullscreen mode

Step 15: Setting Up Alertmanager for Incident Response

  • Configure Prometheus Alertmanager to send alerts based on metrics and thresholds:
  global:
    smtp_smarthost: 'smtp.gmail.com:587'
    smtp_from: 'alertmanager

@example.com'
    smtp_auth_username: 'example@gmail.com'
    smtp_auth_password: 'password'
  route:
    receiver: 'email-alert'
  receivers:
  - name: 'email-alert'
    email_configs:
    - to: 'your-email@example.com'
Enter fullscreen mode Exit fullscreen mode
  • Integrate Alertmanager with Slack for real-time incident notifications.

Conclusion:

This project showcases an advanced AWS DevOps setup, incorporating a multi-service architecture, Jenkins CI/CD pipelines, Docker, Kubernetes, and Istio service mesh. The detailed steps provided cover infrastructure provisioning, automated deployments, monitoring, logging, and alerting, offering a comprehensive guide for AWS & DevOps engineers. This setup ensures high availability, scalability, security, and observability, making it a robust solution for managing microservices in a cloud-native environment.


👤 Author

banner

Join Our Telegram Community || Follow me on GitHub for more DevOps content!

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .