A plan is in place and the infrastructure is provisioned. Now what?
Well, I've decided to take a step back and review the distributed system holistically, before surging forward with application development.
Which system-level improvements can I make to increase the speed with which I can iterate on my designs, test my code, and alert myself to potential bottlenecks?
Here’s a sneak peek into the globally-distributed bookstore that I’ve been working on. Users are able to simulate connecting from 6 different locations across the globe: Los Angeles, Washington, D.C., São Paulo, London, Mumbai and Sydney. Multiple YugabyteDB Managed database clusters are deployed to support this application. So far, I've deployed 3 clusters: a single-region cluster, a multi-region cluster with read replicas, and a geo-partitioned cluster. Users may choose which database the application connects to, in order to highlight the latency discrepancies between database configurations.
In this blog, I’ll explain how multiple Node.js servers are deployed across these geographies to make this possible.
Changing It Up
The main benefit of Terraform is the ability to easily change cloud infrastructure when plans change.
Well, plans have changed!
I have decided to scrap Google's Container-Optimized OS in favor of Ubuntu.
In the short term, this will increase my productivity. I'll be able to update my code and manipulate the environment without needing to push builds to Google Container Registry. Change can be hard, but with Terraform, it's incredibly easy.
# main.tf
boot_disk {
  initialize_params {
    # image = "cos-cloud/cos-stable" // OLD
    image = "ubuntu-os-cloud/ubuntu-2004-lts" // NEW
  }
}
This is the only configuration change required to provision new infrastructure with the OS of my choosing. The beauty of automation is that the rest of the system (the networking, the instance sizes, and the locations) remains unchanged. A little upfront configuration saved me a ton of time and a lot of headaches.
Automating Application Deployment
However, it's not all rainbows and butterflies!
You might recall that in my previous post, I outlined how Google's Container-Optimized OS automatically pulls and runs container images upon provisioning infrastructure.
Now that we're no longer running containers on our VMs, we'll need to deploy and run our code another way.
Fear not, there are many tools out there which make this a breeze. I've chosen to use Ansible for my code automation.
Let's dive into it, starting with some configuration.
# inventory.yml
[Private]
10.168.0.2 #Los Angeles
10.154.0.2 #London
10.160.0.2 #Mumbai
10.158.0.2 #Sao Paulo
10.152.0.2 #Sydney
10.150.0.2 #Washington, D.C.
[Private:vars]
ansible_ssh_common_args= '-o ProxyCommand="ssh -W %h:%p -q [USERNAME]@[IP_ADDRESS]"'
ansible_ssh_extra_args= '-o StrictHostKeyChecking=no'
Here, I'm setting the IP addresses of the 6 application instances we provisioned using Terraform, plus some variables to be used by our Ansible playbook.
You might have noticed that I've set internal IP addresses, which typically cannot be accessed from outside of the private network. This is correct, and I'll be covering how this works in my next blog in this series (hint: SSH tunneling). For now, just assume these are publicly-accessible addresses.
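Before running a playbook against an inventory like this, it can help to confirm which hosts a group actually contains. The following is a minimal sketch, not part of the original project: `list_hosts` is a hypothetical helper that reads an INI-style inventory on standard input and prints the addresses listed under one group, ignoring the inline comments and the `:vars` section. Ansible's own `ansible-inventory` command is the more complete way to inspect an inventory.

```shell
#!/bin/bash
# list_hosts GROUP
# Hypothetical helper: reads an INI-style inventory on stdin and prints
# the first field (the host address) of every line in the [GROUP] section.
list_hosts() {
  awk -v group="[$1]" '
    /^\[/ { in_group = ($0 == group); next }   # a new section header starts
    in_group && NF && $1 !~ /^#/ { print $1 }  # host line: keep the address only
  '
}
```

Assuming the inventory above is saved as inventory.yml, `list_hosts Private < inventory.yml` would print the six internal IP addresses, one per line.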
Now, on to the playbook…
# playbook.yml
---
- hosts: all
  become: yes
  vars:
    server_name: "{{ ansible_default_ipv4.address }}"
    document_root: /home/
    app_root: ../api_service/
  tasks:
    - name: Copy api service to /home directory
      synchronize:
        src: "{{ app_root }}"
        dest: "{{ document_root }}"
        rsync_opts:
          - "--no-motd"
          - "--exclude=node_modules"
          - "--rsh='ssh {{ ansible_ssh_common_args }} {{ ansible_ssh_extra_args }}'"
    - name: Install project dependencies
      command: sudo npm install --ignore-scripts
      args:
        chdir: /home/
    - name: Kill Node Server
      command: sudo pm2 kill
      args:
        chdir: /home/
    - name: Restart Node Server
      shell: NODE_ENV=$NODE_ENV NODE_APP_INSTANCE=$NODE_APP_INSTANCE sudo pm2 start index.js --name node-app
      args:
        chdir: /home/
This script lays out the steps required to provision our code and run our Node.js server.
We start by reading the hosts from our inventory.yml file and using rsync to push code to each of them.
Our application dependencies are installed, the server is stopped if currently running, and then restarted with environment variables set on each machine. This is all remarkably easy to set up, understand, and replicate.
We now essentially have just two buttons to push: one to spin up our VMs and another to provision our application code.
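Those two buttons could be wrapped in a small helper script. This is only a sketch under a few assumptions: the script and its function names are hypothetical, it assumes Terraform and Ansible are run from the directory holding main.tf, inventory.yml and playbook.yml, and it defaults to a dry run so that nothing is changed by accident.

```shell
#!/bin/bash
# deploy.sh (hypothetical helper, not part of the original project)
# DRY_RUN defaults to 1: commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "[dry-run] $*"
  else
    "$@"
  fi
}

# Button one: create or update the VMs described in the Terraform config.
provision_infrastructure() {
  run terraform apply -auto-approve
}

# Button two: push the application code and restart the Node.js server.
deploy_application() {
  run ansible-playbook -i inventory.yml playbook.yml
}
```

Calling `provision_infrastructure` followed by `deploy_application` with `DRY_RUN=0` would run both steps for real. Keeping them as separate functions means the code can be redeployed without touching the infrastructure.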
Starting Fresh
You might be wondering where those environment variables are being set on our Ubuntu VMs. Don't we need to configure these instances and install system-level dependencies?
We sure do!
We can add a startup script to our Terraform config which will run whenever an instance starts or reboots.
# startup_script.sh
#!/bin/bash
if [ ! -f "/etc/initialized_on_startup" ]; then
    # First boot: install system-level dependencies
    echo "Launching the VM for the first time."
    sudo apt-get update
    curl -fsSL https://deb.nodesource.com/setup_16.x | sudo -E bash -
    sudo apt-get install -y nodejs
    sudo npm install pm2 -g
    # Read the instance_id attribute from the instance metadata
    NODE_APP_INSTANCE=$(curl http://metadata.google.internal/computeMetadata/v1/instance/attributes/instance_id -H "Metadata-Flavor: Google")
    echo "NODE_APP_INSTANCE=${NODE_APP_INSTANCE}" | sudo tee -a /etc/environment
    echo $NODE_APP_INSTANCE
    echo "NODE_ENV=production" | sudo tee -a /etc/environment
    source /etc/environment
    # Marker file so the setup above only runs once
    sudo touch /etc/initialized_on_startup
else
    # Executed on restarts
    source /etc/environment
    cd /home/api_service && NODE_ENV=$NODE_ENV NODE_APP_INSTANCE=$NODE_APP_INSTANCE sudo pm2 start index.js --name node-app
fi
The startup script installs Node.js and PM2 and reads from our instance metadata to set the NODE_APP_INSTANCE environment variable.
This environment variable will be used in our application to determine where our application is running, and which database nodes the API service should connect to. I’ll cover this in more detail in a future post.
If our VM requires a restart, this script will re-run our Node server.
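The first-boot versus restart behavior hinges on the marker file. Here is a stripped-down sketch of just that pattern, which can be tried locally without root access. The `MARKER` path and the `on_boot` function are hypothetical stand-ins, and the echo lines stand in for the real install and start commands.

```shell
#!/bin/bash
# Hypothetical marker path; the real script uses /etc/initialized_on_startup.
MARKER="${MARKER:-/tmp/initialized_on_startup_demo}"

on_boot() {
  if [ ! -f "$MARKER" ]; then
    # No marker yet: this is the first boot, so do the one-time setup.
    echo "first boot: install dependencies"
    touch "$MARKER"
  else
    # Marker exists: setup already happened, only start the server.
    echo "restart: start the server"
  fi
}
```

Calling `on_boot` twice in a row takes the first branch once and the second branch on every later call, which mirrors what happens across VM reboots. Deleting the marker file forces the one-time setup to run again.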
# main.tf
variable "instances" {
  type = map(object({
    name           = string
    zone           = string
    network_ip     = string
    metadata       = object({ instance_id = string })
    startup_script = string
  }))
  default = {
    "los_angeles" = {
      metadata       = { instance_id = "los-angeles" }
      startup_script = "startup_script.sh"
      name           = "los-angeles"
      zone           = "us-west2-a" // Los Angeles, CA
      network_ip     = "10.168.0.2"
    },
    ...
  }
}
resource "google_compute_instance" "vm_instance" {
  ...
  metadata_startup_script = file("${path.module}/${each.value.startup_script}")
  ...
}
After writing this basic startup script, I've added it to the Terraform config, so that it's run on all provisioned API service instances.
I now have a great foundation with which to build out my application logic. My infrastructure and application code have been deployed in a way that is easily replicable.
I'm looking forward to wrapping up the application logic, so I can unveil the final product. I guess I better get coding…
Follow along for more updates!