In this blog post I am going to show you how to minimise downtime when performing minor or major engine upgrades on Amazon RDS using Blue/Green deployments.
This blog post assumes that you are familiar with the RDS service.
What is Blue/Green?
- Blue/Green is a deployment technique where you run two separate, identical environments, with changes made in one replicated to the other. Your current application runs against the "Blue" environment, while the "Green" environment can be used to perform upgrades, maintenance etc. Once you are happy, you simply "switch over", allowing you to run your application(s) against what was the "Green" environment.
Why would we want to use Blue/Green?
- Allows you to perform your minor or major RDS upgrade in-hours, which reduces the engineering effort needed out of hours
- Allows you to perform time-consuming maintenance such as vacuum or reindex while no application is connected, which avoids I/O-related performance bottlenecks and avoids having to take the system offline
- Application upgrades could be tested on a "proper" production environment to reduce the likelihood of issues once live
In this example, my source database is running PostgreSQL 15.8 and I intend to upgrade it to PostgreSQL 16.4 with minimal downtime.
Firstly, if you are performing a major upgrade, you need to create a new custom parameter group for the new engine version.
In my example I have created a PostgreSQL 16 RDS parameter group
The CLI can also be used:
aws rds create-db-parameter-group \
--db-parameter-group-name rds-postgresql-v16 \
--db-parameter-group-family postgres16 \
--description rds-postgresql-v16
The next step is to enable rds.logical_replication in both the source and target parameter groups. The target parameter group is the one you have just created, and the source is the one the RDS instance is currently using.
The CLI can be used:
aws rds modify-db-parameter-group \
--db-parameter-group-name rds-postgresql-v16 \
--parameters ParameterName=rds.logical_replication,ParameterValue=1,ApplyMethod=pending-reboot
aws rds modify-db-parameter-group \
--db-parameter-group-name rds-postgresql-v15 \
--parameters ParameterName=rds.logical_replication,ParameterValue=1,ApplyMethod=pending-reboot
You will then have to reboot the source RDS instance if this parameter was not already set, as it's a static parameter.
aws rds reboot-db-instance \
--db-instance-identifier rds-postgresql
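If you would rather confirm from the command line that the reboot has finished and the parameter change has been picked up, something along these lines should work. It is a minimal sketch: the function name check_param_group_status is my own, and it assumes the instance has a single parameter group attached.

```shell
# Wait for the instance to come back, then print the parameter group apply
# status. "in-sync" means no static parameter is still waiting for a reboot;
# "pending-reboot" means the reboot has not yet applied the change.
# check_param_group_status is a hypothetical helper name.
check_param_group_status() {
  instance_id="$1"
  aws rds wait db-instance-available \
    --db-instance-identifier "$instance_id" || return 1
  aws rds describe-db-instances \
    --db-instance-identifier "$instance_id" \
    --query 'DBInstances[0].DBParameterGroups[0].ParameterApplyStatus' \
    --output text
}
```

Call it as check_param_group_status rds-postgresql. You can also connect with psql and run SHOW wal_level; which should report logical once rds.logical_replication is active.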
The next step is for you to create a Blue/Green configuration:
Here you can opt to create the deployment using the same engine version as the source instance, or specify a new engine version. In this example I am selecting the same version as the source and I'll perform the upgrade manually.
Ensure that you select the correct parameter group for the configuration to use and, once you are happy, select Create.
You are then presented with the below
Depending on how large your dataset is, this stage could take a while to complete.
The CLI can also be used:
aws rds create-blue-green-deployment \
--blue-green-deployment-name rds-postgresql-blue-green-v15-to-v16 \
--source-arn arn:aws:rds:eu-west-1:123456789:db:rds-postgresql \
--target-db-parameter-group-name rds-postgresql-v15 \
--target-db-instance-class db.m7g.large
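Because provisioning can take a while, you may want to poll the deployment status instead of refreshing the console. The sketch below looks the deployment up by name and waits until it reaches the status you ask for. The function names and the 60 second default interval are my own choices, not anything RDS prescribes.

```shell
# Print the current status of a Blue/Green deployment, looked up by name.
bgd_status() {
  aws rds describe-blue-green-deployments \
    --filters "Name=blue-green-deployment-name,Values=$1" \
    --query 'BlueGreenDeployments[0].Status' \
    --output text
}

# Poll until the deployment reaches the wanted status (e.g. AVAILABLE).
# Returns 1 if the lookup fails or the deployment lands in a failed state.
wait_for_bgd_status() {
  name="$1"
  wanted="$2"
  interval="${3:-60}"   # seconds between polls (assumed default)
  while :; do
    status="$(bgd_status "$name")" || return 1
    echo "$name: $status"
    case "$status" in
      "$wanted") return 0 ;;
      INVALID_CONFIGURATION|SWITCHOVER_FAILED) return 1 ;;
    esac
    sleep "$interval"
  done
}
```

For this example you would run wait_for_bgd_status rds-postgresql-blue-green-v15-to-v16 AVAILABLE and carry on once it returns.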
Once the deployment has been created, this is what you are left with:
You are now in a position to work on the "Green" instance. This could be maintenance such as VACUUM FULL/REINDEX, or it could be an application upgrade etc.
I am going to upgrade this instance to the latest version of RDS PostgreSQL
From here, modify the "Green" instance and select the new RDS engine version and the new target parameter group. Once happy, apply the modification.
The CLI can also be used:
aws rds modify-db-instance \
--db-instance-identifier rds-postgresql-green-7fh0sf \
--engine-version 16.4 \
--db-parameter-group-name rds-postgresql-v16 \
--allow-major-version-upgrade \
--apply-immediately
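Two small checks can be handy around this command: one to run beforehand to confirm that RDS lists your target version as a valid upgrade from the current one, and one to see which version the instance reports afterwards. This is a sketch with helper names of my own, assuming the engine is postgres as in this example.

```shell
# Succeeds if RDS lists version $2 as a valid upgrade target for version $1.
# can_upgrade_to is a hypothetical helper name.
can_upgrade_to() {
  targets="$(aws rds describe-db-engine-versions \
    --engine postgres \
    --engine-version "$1" \
    --query 'DBEngineVersions[0].ValidUpgradeTarget[].EngineVersion' \
    --output text)" || return 1
  for t in $targets; do
    [ "$t" = "$2" ] && return 0
  done
  return 1
}

# Print the engine version an instance currently reports.
current_engine_version() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].EngineVersion' \
    --output text
}
```

For example, can_upgrade_to 15.8 16.4 && echo "ok to upgrade" before modifying, and current_engine_version rds-postgresql-green-7fh0sf once the upgrade has finished.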
You can see that the instance is now upgrading
The beauty of Blue/Green is that whilst this maintenance is occurring your application is still happily running against the Blue instance and is serving traffic as normal.
It essentially allows you to perform the upgrade ahead of time and allows you to 'extend' any maintenance windows you may have.
Once the upgrade has completed, you can see that the "Green" instance here is now running PostgreSQL 16 and my "Blue" instance is still running PostgreSQL 15.
From here, you can connect to the "Green" instance and run any recommended post-upgrade steps such as ANALYZE, VACUUM, REINDEX, extension updates etc.
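As a rough illustration of those post-upgrade steps, the helper below just prints the SQL so you can review it before piping it into psql. The function name is my own, and which extensions you pass in depends entirely on what your database has installed; pg_stat_statements in the usage line is only an example.

```shell
# Print post-upgrade SQL: update each named extension, then refresh the
# planner statistics, which a major version upgrade does not carry over.
# post_upgrade_sql is a hypothetical helper name.
post_upgrade_sql() {
  for ext in "$@"; do
    printf 'ALTER EXTENSION "%s" UPDATE;\n' "$ext"
  done
  printf 'ANALYZE VERBOSE;\n'
}
```

You could then run post_upgrade_sql pg_stat_statements | psql with the connection details of your "Green" endpoint. Printing first and executing second is a deliberate choice here, as it lets you check exactly what will run against the instance.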
This instance will stay in sync with your "Blue" instance, so you can leave it replicating changes until you are able to take the outage to "switch over" the instances.
Once you are in a position to "switch over", the process will be:
- Stop/pause the application(s)
- Switch over the RDS instances.
- Start/resume the application(s)
In this example I don't have any applications connected, so I will not be performing the stop and start stages.
To switch over the RDS instances the process is as follows:
- In the RDS console, select the Blue/Green Deployment. Click actions, switch over
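You specify a timeout setting, which is the amount of time the switchover has to complete in before it is classed as failed and rolled back. I am leaving it at the default.
The CLI can also be used. Note that --blue-green-deployment-identifier expects the generated resource ID of the deployment (it begins with bgd-), not the deployment name, so replace the placeholder below with your own ID:
aws rds switchover-blue-green-deployment \
--blue-green-deployment-identifier bgd-xxxxxxxxxxxxxxxx \
--switchover-timeout 300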
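The switchover command identifies the deployment by its generated resource ID (the value beginning bgd-) rather than by the name you gave it. If you only have the name to hand, a sketch like the following can look the ID up first. The helper names are my own, and the 300 second default simply mirrors the timeout used above.

```shell
# Print the resource ID (bgd-...) of a Blue/Green deployment, given its name.
bgd_id_by_name() {
  aws rds describe-blue-green-deployments \
    --filters "Name=blue-green-deployment-name,Values=$1" \
    --query 'BlueGreenDeployments[0].BlueGreenDeploymentIdentifier' \
    --output text
}

# Look the deployment up by name, then start the switchover.
# switchover_by_name is a hypothetical helper name.
switchover_by_name() {
  name="$1"
  timeout="${2:-300}"   # seconds, same value as the example above
  id="$(bgd_id_by_name "$name")" || return 1
  case "$id" in
    bgd-*) ;;
    *) echo "no Blue/Green deployment named $name" >&2; return 1 ;;
  esac
  aws rds switchover-blue-green-deployment \
    --blue-green-deployment-identifier "$id" \
    --switchover-timeout "$timeout"
}
```

For this example that would be switchover_by_name rds-postgresql-blue-green-v15-to-v16.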
Switchover actions:
The instances will change to 'switching over' status
As you can see now, the instance that the upgrade was performed on is now labelled as 'New Blue'
You are now in a position to connect your application(s) and test
The 'Old Blue' instance will stay there until you manually remove it, so it is important to remove this instance once you are happy and will not be rolling back, as it will cost you money to keep hold of.
Rollback - if required
If you find yourself in a position where you need to roll back to the previous instance, you will need to do the following:
- Delete the Blue/Green deployment
- Either rename the instances so that they match the previous endpoint
- Or switch the Route 53 DNS record for your application to point to the 'Old Blue' instance
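For the rename route, the sketch below shows one way it could be scripted once the Blue/Green deployment has been deleted. Treat it as an outline under a few assumptions: the helper names are mine, the "parked" name for the upgraded instance is whatever unused identifier you choose, and the old instance name should be checked in the console (RDS typically appends a suffix such as -old1 after a switchover). Keep in mind that anything written to the new instance after the switchover is not copied back to 'Old Blue'.

```shell
# Rename an RDS instance; its endpoint changes along with the identifier.
rename_instance() {
  aws rds modify-db-instance \
    --db-instance-identifier "$1" \
    --new-db-instance-identifier "$2" \
    --apply-immediately
}

# Poll until the instance reports "available" under the given identifier.
# Lookup errors are ignored because a freshly renamed instance may not be
# found under its new name straight away.
wait_until_available() {
  id="$1"
  attempts="${2:-40}"
  interval="${3:-30}"
  while [ "$attempts" -gt 0 ]; do
    status="$(aws rds describe-db-instances \
      --db-instance-identifier "$id" \
      --query 'DBInstances[0].DBInstanceStatus' \
      --output text 2>/dev/null)"
    [ "$status" = "available" ] && return 0
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  return 1
}

# Park the upgraded instance under a new name, then give the old instance
# the original name back so the previous endpoint points at it again.
rollback_by_rename() {
  current="$1"
  old="$2"
  parked="$3"
  rename_instance "$current" "$parked" || return 1
  wait_until_available "$parked" || return 1
  rename_instance "$old" "$current"
}
```

With the names in this post that might look like rollback_by_rename rds-postgresql rds-postgresql-old1 rds-postgresql-v16, where the last two names are assumptions you should adjust to match your account.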
Once happy, you can remove the previous instance and the Blue/Green deployment to leave you with just one RDS instance running the new version of RDS PostgreSQL
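The clean-up can also be done from the CLI. The following is a minimal sketch with a helper name of my own; it takes a final snapshot of the old instance before deleting it, which is my choice rather than a requirement. The deployment ID, old instance name and snapshot name are all values you supply.

```shell
# Delete the Blue/Green deployment, then delete the old instance while
# keeping a final snapshot of it.
# cleanup_after_switchover is a hypothetical helper name.
cleanup_after_switchover() {
  bgd_id="$1"
  old_instance="$2"
  snapshot_id="$3"
  aws rds delete-blue-green-deployment \
    --blue-green-deployment-identifier "$bgd_id" || return 1
  aws rds delete-db-instance \
    --db-instance-identifier "$old_instance" \
    --final-db-snapshot-identifier "$snapshot_id"
}
```

If the instance delete is rejected, it may be that the deployment has not finished deleting yet or that deletion protection is enabled on the old instance, so check those before retrying.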
Congratulations - you've managed to perform a major RDS upgrade whilst minimising downtime.
Conclusion
RDS Blue/Green deployments offer an efficient way to perform RDS upgrades (minor/major) and to apply hardware-intensive maintenance, parameter changes and application changes without impacting the current instance that is serving production. This enhances security and reliability and allows you to be fully confident in the changes you're proposing before deploying them to production.
I hope this blog post helped, and happy building
Feel free to connect