Introduction
In this advanced AWS DevOps project, we will build a sophisticated CI/CD pipeline to automate the deployment of a microservices-based application on AWS. We'll leverage key DevOps practices such as Infrastructure as Code (IaC) using Terraform, container orchestration with Amazon EKS (Elastic Kubernetes Service), service mesh integration with Istio, and comprehensive monitoring and logging using Prometheus, Grafana, and Fluentd.
This project is designed to cover complex, real-world scenarios that AWS & DevOps engineers frequently encounter, providing you with an in-depth understanding of how to architect, deploy, and manage modern cloud-native applications.
Table of Contents
- Project Overview
- Prerequisites
- Architecture Diagram
-
Step-by-Step Guide
-
Infrastructure as Code with Terraform
- Setting Up the VPC and Networking
- IAM Roles and Policies
- Creating an EKS Cluster
- Integrating with RDS for Persistent Storage
-
Deploying Jenkins in a Highly Available Configuration
- Setting Up Jenkins on EC2 with Auto-Scaling
- Configuring Jenkins Master-Slave Architecture
- Building and Dockerizing Microservices
-
Implementing Service Mesh with Istio
- Installing Istio on EKS
- Configuring Traffic Management, Security, and Observability
-
Setting Up Jenkins CI/CD Pipeline
- Integrating GitHub, Jenkins, Docker, and EKS
- Implementing Blue-Green Deployments
- Automated Canary Deployments with Istio
-
Monitoring, Logging, and Alerting
- Setting Up Prometheus and Grafana
- Configuring Fluentd for Centralized Logging
- Setting Up Alertmanager for Incident Response
-
Infrastructure as Code with Terraform
- Testing and Validation
- Conclusion
- References
1. Project Overview
This project will guide you through setting up an advanced CI/CD pipeline that automates the deployment of a microservices-based application on AWS using Kubernetes (EKS), Istio for service mesh, and Jenkins for CI/CD. We’ll cover the full lifecycle from infrastructure provisioning using Terraform to deploying and managing microservices in a secure, scalable, and observable environment.
2. Prerequisites
- AWS Account: Administrative access to an AWS account.
- AWS CLI: Installed and configured with appropriate credentials.
- Terraform: Installed on your local machine.
- kubectl: Installed for managing Kubernetes clusters.
- Jenkins: Installed on an EC2 instance or set up via Docker.
- Docker: Installed and running on your local machine.
- Git: Installed and configured.
- Helm: Installed for Kubernetes package management.
3. Architecture Diagram
The architecture consists of a VPC with public and private subnets, an EKS cluster deployed across multiple Availability Zones, Jenkins deployed in a highly available configuration on EC2 with Auto-Scaling, RDS for persistent data storage, and Istio for service mesh integration. Monitoring is handled by Prometheus and Grafana, while Fluentd centralizes logging.
Architecture Diagram Overview
-
1. Developer Workstations:
- Developers commit code to a GitHub repository.
-
2. Continuous Integration:
- Jenkins pulls the latest code from GitHub.
- Jenkins builds and tests the application.
- Docker is used to create Docker images of the application.
- Built images are pushed to a Docker Registry (e.g., Amazon ECR).
-
3. Infrastructure as Code:
- Terraform is used to provision the AWS infrastructure, including:
- VPC for networking.
- IAM Roles and Policies for security.
- EKS Cluster for Kubernetes.
- RDS for persistent storage.
-
4. Continuous Deployment:
- Jenkins deploys Docker images to Amazon EKS using Kubernetes manifests or Helm charts.
- Istio (Service Mesh) is used to manage traffic between microservices.
-
5. Service Mesh:
- Istio provides traffic management, security (mTLS), and observability.
- Istio Ingress Gateway manages external access to the services.
-
6. Monitoring and Logging:
- Prometheus collects metrics from the Kubernetes cluster and Istio.
- Grafana visualizes the metrics and provides dashboards.
- Fluentd is used for centralized logging, sending logs to Amazon CloudWatch or Elasticsearch.
- Alertmanager integrates with Prometheus for alerting.
-
7. Continuous Feedback:
- Alertmanager sends notifications to Slack or email.
- Jenkins provides build and deployment feedback to developers.
Diagram Components
- GitHub: Version control and source code repository.
- Jenkins: CI/CD server for automating builds, tests, and deployments.
- Docker: Containerizes the application for consistent environments.
- Terraform: Automates the provisioning of AWS infrastructure.
- Amazon ECR: Docker image repository.
- Amazon EKS: Kubernetes service for deploying and managing containerized applications.
- Istio: Service mesh for managing service-to-service communication.
- Prometheus & Grafana: Monitoring stack for metrics and dashboards.
- Fluentd: Centralized logging solution.
- Alertmanager: Manages alerts based on metrics collected by Prometheus.
- Amazon CloudWatch: Cloud-based logging and monitoring service.
Architecture Flow
- Code Commit: Developers push code to GitHub.
- CI Pipeline: Jenkins detects code changes, builds the project, runs tests, and builds Docker images.
- Image Push: Jenkins pushes the Docker images to Amazon ECR.
- Infrastructure Deployment: Terraform provisions the necessary infrastructure on AWS.
- CD Pipeline: Jenkins deploys the application to Amazon EKS, with Istio managing traffic routing.
- Service Mesh Management: Istio handles service communication, security, and monitoring.
- Monitoring & Logging: Prometheus collects metrics, Grafana visualizes them, and Fluentd forwards logs to CloudWatch.
- Alerting: Alertmanager sends alerts based on Prometheus metrics.
4. Step-by-Step Guide
Infrastructure as Code with Terraform
Step:-1 Setting Up the VPC and Networking
- Create a new directory for your Terraform files:
mkdir advanced-aws-devops-project
cd advanced-aws-devops-project
- Initialize your Terraform project:
terraform init
- Create a
main.tf
file with the following content to define your VPC, subnets, and networking components:
provider "aws" {
region = "us-west-2"
}
resource "aws_vpc" "main_vpc" {
cidr_block = "10.0.0.0/16"
enable_dns_support = true
enable_dns_hostnames = true
tags = {
Name = "advanced-devops-vpc"
}
}
resource "aws_subnet" "public_subnet" {
vpc_id = aws_vpc.main_vpc.id
cidr_block = "10.0.1.0/24"
map_public_ip_on_launch = true
availability_zone = "us-west-2a"
tags = {
Name = "public-subnet"
}
}
resource "aws_subnet" "private_subnet" {
vpc_id = aws_vpc.main_vpc.id
cidr_block = "10.0.2.0/24"
availability_zone = "us-west-2b"
tags = {
Name = "private-subnet"
}
}
resource "aws_internet_gateway" "igw" {
vpc_id = aws_vpc.main_vpc.id
tags = {
Name = "main-igw"
}
}
resource "aws_route_table" "public_route_table" {
vpc_id = aws_vpc.main_vpc.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.igw.id
}
tags = {
Name = "public-route-table"
}
}
resource "aws_route_table_association" "public_subnet_association" {
subnet_id = aws_subnet.public_subnet.id
route_table_id = aws_route_table.public_route_table.id
}
- Apply the Terraform configuration:
terraform apply
Step 2: IAM Roles and Policies
- Create a
iam.tf
file to define IAM roles and policies for EKS, RDS, and other AWS services:
resource "aws_iam_role" "eks_role" {
name = "eks_role"
assume_role_policy = jsonencode({
"Version": "2012-10-17",
"Statement": [{
"Action": "sts:AssumeRole",
"Effect": "Allow",
"Principal": {
"Service": "eks.amazonaws.com"
}
}]
})
}
resource "aws_iam_role_policy_attachment" "eks_policy_attachment" {
role = aws_iam_role.eks_role.name
policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
}
resource "aws_iam_role" "rds_role" {
name = "rds_role"
assume_role_policy = jsonencode({
"Version": "2012-10-17",
"Statement": [{
"Action": "sts:AssumeRole",
"Effect": "Allow",
"Principal": {
"Service": "rds.amazonaws.com"
}
}]
})
}
resource "aws_iam_role_policy_attachment" "rds_policy_attachment" {
role = aws_iam_role.rds_role.name
policy_arn = "arn:aws:iam::aws:policy/AmazonRDSFullAccess"
}
- Apply the IAM role configuration:
terraform apply
Step 3: Creating an EKS Cluster
- Add the following to your
main.tf
to create an EKS cluster with auto-scaling node groups:
module "eks" {
source = "terraform-aws-modules/eks/aws"
cluster_name = "advanced-eks-cluster"
cluster_version = "1.24"
vpc_id = aws_vpc.main_vpc.id
subnets = [aws_subnet.public_subnet.id, aws_subnet.private_subnet.id]
node_groups = {
eks_nodes = {
desired_capacity = 3
max_capacity = 5
min_capacity = 2
instance_type = "t3.medium"
key_name = "your-key-pair"
tags = {
Name = "advanced-eks-node"
}
}
}
}
- Apply the EKS configuration:
terraform apply
Step 4: Integrating with RDS for Persistent Storage
- Create an RDS instance for persistent storage in your
rds.tf
file:
resource "aws_db_instance" "db_instance" {
allocated_storage = 20
engine = "mysql"
engine_version = "8.0"
instance_class = "db.t3.micro"
name = "microservicesdb"
username = "admin"
password = "password"
parameter_group_name = "default.mysql8.0"
skip_final_snapshot = true
vpc_security_group_ids = [aws_security_group.eks_sg.id]
db_subnet_group_name = aws_db_subnet_group.rds_subnet_group.name
}
resource "aws_db_subnet_group" "rds_subnet_group" {
name = "rds-subnet-group"
subnet_ids = [aws_subnet.private_subnet.id]
tags = {
Name = "RDS Subnet Group"
}
}
- Apply the RDS configuration:
terraform apply
Deploying Jenkins in a Highly Available Configuration
Step 5: Setting Up Jenkins on EC2 with Auto-Scaling
- Launch Jenkins on an EC2 instance with Auto-Scaling enabled:
resource "aws_launch_configuration" "jenkins_lc" {
image_id = "ami-0abcdef1234567890"
instance_type = "t3.medium"
key_name = "your-key-pair"
security_groups = [aws_security_group.jenkins_sg.id]
lifecycle {
create_before_destroy = true
}
user_data = <<-EOF
#!/bin/bash
sudo yum update -y
sudo yum install -y java-1.8.0-openjdk
sudo wget -O /etc/yum.repos.d/jenkins.repo \
https://pkg.jenkins.io/redhat-stable/jenkins.repo
sudo rpm --import https://pkg.jenkins.io/redhat-stable/jenkins.io.key
sudo yum install -y jenkins
sudo systemctl start jenkins
sudo systemctl enable jenkins
EOF
}
resource "aws_autoscaling_group" "jenkins_asg" {
launch_configuration = aws_launch_configuration.jenkins_lc.id
min_size = 2
max_size = 4
desired_capacity = 2
vpc_zone_identifier = [aws_subnet.public_subnet.id]
tags = [
{
key = "Name"
value = "Jenkins Instance"
propagate_at_launch = true
},
]
}
- Apply the Jenkins EC2 setup:
terraform apply
Step 6: Configuring Jenkins Master-Slave Architecture
The Jenkins Master-Slave (or Controller-Agent) architecture allows you to distribute the workload of building, testing, and deploying applications across multiple machines. This architecture enhances performance and provides better resource utilization, especially in large-scale environments. Here's a detailed guide to setting up Jenkins in this architecture.
Prerequisites
- Jenkins Master: The main Jenkins server, also known as the controller, which orchestrates the build tasks and distributes them to the agent nodes.
- Jenkins Agents: Additional servers (slaves) that execute the build tasks assigned by the Jenkins master.
Step 6.1: Setting Up Jenkins Master
You’ve already set up Jenkins on an EC2 instance in the previous steps. This instance will serve as your Jenkins master. Ensure the following are in place:
- Java Installation: Jenkins requires Java to run. Verify that Java is installed:
java -version
If not installed, you can do so with:
sudo yum install -y java-1.8.0-openjdk
-
Jenkins Installation: Jenkins should be running and accessible via a web browser. Confirm by navigating to
http://<Jenkins_Master_IP>:8080
in your browser.
Step 6.2: Preparing Jenkins Agent Nodes
You need additional EC2 instances that will act as Jenkins agents. Here's how to set them up:
-
Launch EC2 Instances for Jenkins Agents:
- Use the AWS Management Console to launch new EC2 instances. You can select an Amazon Linux 2 AMI (or any other Linux distribution).
- Ensure these instances have sufficient resources (e.g., t2.medium) and are in the same VPC as the Jenkins master.
-
Security Groups:
- Ensure the security groups allow SSH access from the Jenkins master.
- Allow communication on ports required for Jenkins agent-master communication (default SSH port 22).
-
Install Java on Agent Nodes:
- SSH into each agent node and install Java:
sudo yum update -y sudo yum install -y java-1.8.0-openjdk
-
Create Jenkins User on Agent Nodes:
- Create a user dedicated to running Jenkins on the agent nodes:
sudo useradd jenkins
-
Set up SSH access for this user so the Jenkins master can connect:
sudo mkdir /home/jenkins/.ssh sudo chown jenkins:jenkins /home/jenkins/.ssh sudo chmod 700 /home/jenkins/.ssh
Copy the SSH public key from the Jenkins master to each agent's
/home/jenkins/.ssh/authorized_keys
file.
-
Install Jenkins Agent on Agent Nodes:
- Download the Jenkins agent (also called
slave.jar
) from the Jenkins master. On the Jenkins master:
wget http://<Jenkins_Master_IP>:8080/jnlpJars/agent.jar
- Download the Jenkins agent (also called
- Transfer this
agent.jar
file to each agent node, placing it in a directory like/home/jenkins
.
Step 6.3: Configuring Jenkins Master to Use Agents
-
Add New Nodes in Jenkins:
- On the Jenkins dashboard, go to Manage Jenkins > Manage Nodes and Clouds > New Node.
- Enter a name for the new node (e.g.,
Agent-1
), select Permanent Agent, and click OK.
-
Configure Node Settings:
-
Remote root directory: Specify the directory on the agent where Jenkins should operate, e.g.,
/home/jenkins
. -
Labels: Assign labels to the node for easy job allocation (e.g.,
linux-agent
). - Launch method: Choose Launch agents via SSH.
- Host: Enter the private IP address of the agent node.
-
Credentials: Add SSH credentials (username and private key) for the
jenkins
user.
-
Remote root directory: Specify the directory on the agent where Jenkins should operate, e.g.,
-
Test the Connection:
- Click on Save and then Launch agent via SSH.
- Jenkins will attempt to connect to the agent node via SSH. If successful, you will see the agent status change to "Connected".
-
Repeat for Additional Agents:
- Repeat the above steps to configure any additional Jenkins agent nodes.
Step 6.4: Configuring Jenkins Jobs to Use Specific Agents
Now that your Jenkins master is connected to agent nodes, you can configure jobs to run on specific agents:
-
Create or Configure a Jenkins Job:
- Go to New Item on the Jenkins dashboard or select an existing job.
-
Specify Node Usage:
- Under the job configuration, look for the section Restrict where this project can be run.
- Enter the label of the agent (e.g.,
linux-agent
) where you want the job to run.
-
Save and Build:
- Save the job configuration and trigger a build.
- The job should now execute on the specified Jenkins agent node.
Step 6.5: Monitoring and Managing Jenkins Agents
-
Node Monitoring:
- Jenkins provides built-in monitoring for each agent node. You can view this by navigating to Manage Jenkins > Manage Nodes and Clouds, where you’ll see the status, idle time, and load for each agent.
-
Scaling Agents:
- For large-scale environments, you may configure auto-scaling for your agent nodes using AWS Auto Scaling Groups, allowing Jenkins to automatically scale the number of agents based on workload.
-
Agent Availability:
- Ensure agents are always available by configuring them with robust health checks, such as CPU, memory, and disk usage monitoring. Integration with monitoring tools like CloudWatch or Prometheus can help automate this.
Building and Dockerizing Microservices
Step 7: Microservices Application Overview
-
Assume you have a microservices application with the following components:
- User Service: Manages user authentication and profiles.
- Order Service: Handles order processing and management.
- Payment Service: Integrates with payment gateways to process payments.
Create Dockerfiles for each service:
User Service Dockerfile:
FROM openjdk:8-jdk-alpine
VOLUME /tmp
ARG JAR_FILE=target/user-service.jar
COPY ${JAR_FILE} user-service.jar
ENTRYPOINT ["java","-jar","/user-service.jar"]
Order Service Dockerfile:
FROM openjdk:8-jdk-alpine
VOLUME /tmp
ARG JAR_FILE=target/order-service.jar
COPY ${JAR_FILE} order-service.jar
ENTRYPOINT ["java","-jar","/order-service.jar"]
Payment Service Dockerfile:
FROM openjdk:8-jdk-alpine
VOLUME /tmp
ARG JAR_FILE=target/payment-service.jar
COPY ${JAR_FILE} payment-service.jar
ENTRYPOINT ["java","-jar","/payment-service.jar"]
- Build and push the Docker images to Amazon ECR:
docker build -t user-service ./user-service
docker build -t order-service ./order-service
docker build -t payment-service ./payment-service
# Push images to ECR
aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin <aws_account_id>.dkr.ecr.us-west-2.amazonaws.com
docker tag user-service:latest <aws_account_id>.dkr.ecr.us-west-2.amazonaws.com/user-service:latest
docker push <aws_account_id>.dkr.ecr.us-west-2.amazonaws.com/user-service:latest
Implementing Service Mesh with Istio
Step 8: Installing Istio on EKS
- Install Istio on your EKS cluster using Helm:
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm install istio-base istio/base -n istio-system --create-namespace
helm install istiod istio/istiod -n istio-system
helm install istio-ingress istio/gateway -n istio-system
Step 9: Configuring Traffic Management, Security, and Observability
- Define Istio VirtualService and DestinationRule for traffic management:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: user-service
spec:
hosts:
- user-service
http:
- route:
- destination:
host: user-service
subset: v1
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
name: user-service
spec:
host: user-service
subsets:
- name: v1
labels:
version: v1
- Apply security policies using Istio's Mutual TLS (mTLS):
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
name: default
namespace: istio-system
spec:
mtls:
mode: STRICT
- Enable observability by integrating Istio telemetry with Prometheus and Grafana.
Setting Up Jenkins CI/CD Pipeline
Step 10: Integrating GitHub, Jenkins, Docker, and EKS
- Create a Jenkins pipeline script to automate the build, test, and deployment process:
pipeline {
agent any
stages {
stage('Checkout') {
steps {
git 'https://github.com/your-repo/microservices-app.git'
}
}
stage('Build') {
steps {
sh 'mvn clean package'
}
}
stage('Docker Build & Push') {
steps {
script {
docker.build("user-service:latest").push("${env.BUILD_TAG}")
}
}
}
stage('Deploy to EKS') {
steps {
script {
sh 'kubectl apply -f k8s/deployment.yaml'
sh 'kubectl apply -f k8s/service.yaml'
}
}
}
}
}
Step 11: Implementing Blue-Green Deployments
- Define Kubernetes manifests for blue-green deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: user-service-green
spec:
replicas: 3
selector:
matchLabels:
app: user-service
version: green
template:
metadata:
labels:
app: user-service
version: green
spec:
containers:
- name: user-service
image: <aws_account_id>.dkr.ecr.us-west-2.amazonaws.com/user-service:green
ports:
- containerPort: 8080
- Apply the green deployment and switch traffic using Istio:
kubectl apply -f k8s/user-service-green.yaml
kubectl apply -f k8s/virtualservice.yaml
Step 12: Automated Canary Deployments with Istio
- Implement canary deployment strategy in your Istio VirtualService:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: user-service
spec:
hosts:
- user-service
http:
- route:
- destination:
host: user-service
subset: v1
weight: 90
- destination:
host: user-service
subset: v2
weight: 10
Monitoring, Logging, and Alerting
Step 13: Setting Up Prometheus and Grafana
- Install Prometheus and Grafana via Helm:
helm install prometheus prometheus-community/prometheus
helm install grafana grafana/grafana
- Access Grafana and add Prometheus as a data source, then import Istio dashboards for monitoring.
Step 14: Configuring Fluentd for Centralized Logging
- Set up Fluentd DaemonSet on your EKS cluster to aggregate logs from all microservices:
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentd
spec:
selector:
matchLabels:
app: fluentd
template:
metadata:
labels:
app: fluentd
spec:
containers:
- name: fluentd
image: fluent/fluentd-kubernetes-daemonset:v1.11
env:
- name: FLUENT_ELASTICSEARCH_HOST
value: "elasticsearch"
- name: FLUENT_ELASTICSEARCH_PORT
value: "9200"
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
Step 15: Setting Up Alertmanager for Incident Response
- Configure Prometheus Alertmanager to send alerts based on metrics and thresholds:
global:
smtp_smarthost: 'smtp.gmail.com:587'
smtp_from: 'alertmanager
@example.com'
smtp_auth_username: 'example@gmail.com'
smtp_auth_password: 'password'
route:
receiver: 'email-alert'
receivers:
- name: 'email-alert'
email_configs:
- to: 'your-email@example.com'
- Integrate Alertmanager with Slack for real-time incident notifications.
Conclusion:
This project showcases an advanced AWS DevOps setup, incorporating a multi-service architecture, Jenkins CI/CD pipelines, Docker, Kubernetes, and Istio service mesh. The detailed steps provided cover infrastructure provisioning, automated deployments, monitoring, logging, and alerting, offering a comprehensive guide for AWS & DevOps engineers. This setup ensures high availability, scalability, security, and observability, making it a robust solution for managing microservices in a cloud-native environment.
👤 Author
Join Our Telegram Community || Follow me on GitHub for more DevOps content!