Restore AWS RDS Snapshots using Terraform

Xavier Gourmandin - Sep 14 '23 - Dev Community

The problem

A common use case for AWS accounts is the creation of ephemeral platforms.
Typically, for development or integration environments, we want to optimize costs and therefore shut down services when they are not needed.

In our case, this process is managed by Terraform, which creates and destroys the platform on a schedule driven by our CI/CD tool.

The database problem

However, because of what these platforms are used for, customers often want the database to be pre-filled with test data that helps them run their integration and functional tests.

Fortunately, RDS can take a snapshot of the database just before deletion, and you can restore from it the next time the platform is created.

And this is where things are gonna get messy.

The Terraform RDS resource problem

Let's look at how the AWS RDS resource works in Terraform.
There are two ways to create an RDS instance:

  • Without a Snapshot
resource "aws_db_instance" "dbname" {
  allocated_storage         = 10
  identifier                = "db-instance-id"
  db_name                   = "dbname"
  engine                    = "postgres"
  engine_version            = data.aws_rds_engine_version.pg_version.version
  instance_class            = "db.t3.micro"
  username                  = "adminuser"
  password                  = random_password.admin.result
  skip_final_snapshot       = false
  final_snapshot_identifier = "${terraform.workspace}-${formatdate("YYYYMMDDhhmmss", timestamp())}"
  storage_encrypted         = true

  backup_retention_period = 5
  backup_window           = "07:00-09:00"

  maintenance_window = "Tue:05:00-Tue:07:00"

  vpc_security_group_ids = [
    aws_security_group.allow_postgres[0].id
  ]

  db_subnet_group_name = var.subnet_db_name

}
  • With a Snapshot
resource "aws_db_instance" "dbname" {
  identifier                = "db-instance-id"
  db_name                   = "dbname"
  instance_class            = "db.t3.micro"
  skip_final_snapshot       = false
  final_snapshot_identifier = "${terraform.workspace}-${formatdate("YYYYMMDDhhmmss", timestamp())}"
  snapshot_identifier       = data.aws_db_snapshot.latest_snapshot.id
  storage_encrypted         = true

  backup_retention_period = 5
  backup_window           = "07:00-09:00"

  maintenance_window = "Tue:05:00-Tue:07:00"

  vpc_security_group_ids = [
    aws_security_group.allow_postgres[0].id
  ]

  db_subnet_group_name = var.subnet_db_name

}

As we can see, the same resource is configured differently depending on whether the snapshot_identifier property is set.

Before the first Terraform destroy

Before the first Terraform destroy, there is no snapshot to restore from, so the first apply has to use the first definition above. After the first destroy, a snapshot becomes available, and the RDS resource should use the second definition.

Can we come up with a single RDS definition that works in all cases?
It turns out we can, with a few Terraform tricks.

The solution

Terraform Data should point to existing resources

The first thing to note is that the snapshot identifier to restore from comes from a Terraform data source:

snapshot_identifier       = data.aws_db_snapshot.latest_snapshot.id

which is defined like this:

data "aws_db_snapshot" "latest_snapshot" {
  db_instance_identifier = "db-instance-id"
  most_recent            = true
}

However, a Terraform data source fails the whole run when the object it looks up does not exist, so we need to read the snapshot ID only if a snapshot already exists.
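
To check by hand what this data source would resolve to, the lookup can be approximated with the AWS CLI. The sketch below is an illustration, not part of the original setup: the function name latest_snapshot_id and the AWS_CLI override variable are hypothetical, and it assumes every returned snapshot carries a SnapshotCreateTime and that the results fit in one page.

```shell
#!/bin/bash
# Sketch: print the identifier of the most recent snapshot of an RDS instance.
# AWS_CLI is a hypothetical override hook so the function can be exercised
# without AWS credentials; it defaults to the real `aws` binary.

latest_snapshot_id() {
  db_id=$1
  # Sort snapshots by creation time and keep the identifier of the last one.
  "${AWS_CLI:-aws}" rds describe-db-snapshots \
    --db-instance-identifier "${db_id}" \
    --query 'sort_by(DBSnapshots, &SnapshotCreateTime)[-1].DBSnapshotIdentifier' \
    --output text
}
```

Calling `latest_snapshot_id db-instance-id` before the first destroy should not return any snapshot identifier, which is exactly the situation the data source cannot handle.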

Reading data only if it exists

We first need to check whether a snapshot exists before reading it with Terraform.
So we make the following changes:

data "external" "rds_final_snapshot_exists" {
  program = [
    "./check-rds-snapshot.sh",
    # Must match the db_instance_identifier queried by the snapshot data source below
    "db-instance-id"
  ]
}


data "aws_db_snapshot" "latest_snapshot" {
  count                  = data.external.rds_final_snapshot_exists.result.db_exists ? 1 : 0
  db_instance_identifier = "db-instance-id"
  most_recent            = true
}

And the content of the check-rds-snapshot.sh script:

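#!/bin/bash
# Prints {"db_exists": "true"} or {"db_exists": "false"} for the Terraform external data source.

db_id=$1

if [ -z "${db_id}" ]; then
  # Send the usage message to stderr (">2" would write to a file named "2")
  echo "usage: $0 <db_id>" >&2
  exit 1
fi

# With --output text, each snapshot row is expected to start with the DBSNAPSHOTS tag
RESULT=($(aws rds describe-db-snapshots --db-instance-identifier "${db_id}" --output text 2> /dev/null))
aws_result=$?

if [ ${aws_result} -eq 0 ] && [[ ${RESULT[0]} == "DBSNAPSHOTS" ]]; then
  result='true'
else
  result='false'
fi

# The external data source expects a JSON object of string values on stdout
jq -n --arg exists "${result}" '{"db_exists": $exists}'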

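The external data source uses the AWS CLI to check whether a snapshot exists, and the count argument on the snapshot data source prevents Terraform from reading it when none exists. The script returns db_exists as a string, because the external data source only accepts string values; Terraform converts "true" or "false" to a boolean in the count condition.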
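
As a variation on the script above, the existence check can also be written as a snapshot count rather than a match on the row tag of the text output. This is a sketch under assumptions, not the author's script: snapshot_exists_json and the AWS_CLI override are hypothetical names, and it assumes the snapshots fit in a single page of CLI results.

```shell
#!/bin/bash
# Sketch: emit the JSON object expected by the external data source,
# based on how many snapshots exist for the given instance.
# AWS_CLI is a hypothetical override hook for trying the logic offline;
# it defaults to the real `aws` binary.

snapshot_exists_json() {
  db_id=$1
  # JMESPath length() returns the number of snapshots for this instance.
  count=$("${AWS_CLI:-aws}" rds describe-db-snapshots \
    --db-instance-identifier "${db_id}" \
    --query 'length(DBSnapshots)' --output text 2> /dev/null)

  case "${count}" in
    '' | *[!0-9]*) exists=false ;; # CLI error, empty or non-numeric output
    0)             exists=false ;; # the instance has no snapshot yet
    *)             exists=true ;;
  esac

  # String values only, as required by the external data source.
  printf '{"db_exists": "%s"}\n' "${exists}"
}
```

To use it as the external program, the script would end with a call such as `snapshot_exists_json "$1"`. Any CLI failure is treated as "no snapshot", so Terraform falls back to creating an empty database rather than failing.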

Now we only need to combine the two RDS declarations into one that works every time!

resource "aws_db_instance" "dbname" {
  allocated_storage         = 10
  identifier                = "db-instance-id"
  db_name                   = "dbname"
  engine                    = "postgres"
  engine_version            = data.aws_rds_engine_version.pg_version.version
  instance_class            = "db.t3.micro"
  username                  = "adminuser"
  password                  = random_password.admin.result
  skip_final_snapshot       = false
  final_snapshot_identifier = "${terraform.workspace}-${formatdate("YYYYMMDDhhmmss", timestamp())}"
  snapshot_identifier       = try(data.aws_db_snapshot.latest_snapshot[0].id, null)
  storage_encrypted         = true

  backup_retention_period = 5
  backup_window           = "07:00-09:00"

  maintenance_window = "Tue:05:00-Tue:07:00"

  vpc_security_group_ids = [
    aws_security_group.allow_postgres[0].id
  ]

  db_subnet_group_name = var.subnet_db_name

  lifecycle {
    ignore_changes = [
      snapshot_identifier,
      final_snapshot_identifier
    ]
  }
}

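And there you have it: a Terraform configuration that creates an RDS database and restores the latest snapshot if one exists. The lifecycle block matters here: timestamp() produces a new final_snapshot_identifier on every run, and a newer snapshot_identifier would otherwise show up as a change that can force the instance to be replaced, so both attributes are ignored once the instance has been created.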
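
The scheduled create/destroy cycle that drives this configuration could be wrapped as follows. It is a minimal sketch: run_cycle and the TERRAFORM_BIN override are hypothetical names, and the real CI/CD job may look quite different.

```shell
#!/bin/bash
# Sketch: bring the ephemeral platform up or down from a scheduled job.
# TERRAFORM_BIN is a hypothetical override hook for dry runs;
# it defaults to the real `terraform` binary.

run_cycle() {
  action=$1
  tf="${TERRAFORM_BIN:-terraform}"
  case "${action}" in
    # "up" restores the latest snapshot when one exists, else creates an empty database
    up)   "${tf}" init -input=false && "${tf}" apply -auto-approve -input=false ;;
    # "down" takes the final snapshot before deleting the instance
    down) "${tf}" init -input=false && "${tf}" destroy -auto-approve -input=false ;;
    *)    echo "usage: run_cycle up|down" >&2; return 1 ;;
  esac
}
```

A scheduler would then call `run_cycle down` in the evening and `run_cycle up` in the morning, with the same Terraform workspace selected for both.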

Thanks for reading! I’m Xavier, Cloud Developer at Stack Labs.

If you want to join an enthusiastic cloud dev team, please contact us.
