Upgrade your CockroachDB cluster using Ansible

Fabio Ghirardello - Nov 29 '21 - Dev Community

Following my previous blog on how to deploy CockroachDB using Ansible, in this post we will test CockroachDB cluster upgrades using the same CockroachDB Collection for the following upgrade types, starting from a cluster with v21.1.6:

  1. Minor version upgrade, to v21.1.10
  2. Minor version downgrade, back to v21.1.9
  3. Major version upgrade, to 21.2.0
  4. Major version downgrade, back to 21.1.12 (will fail)
  5. Major version downgrade, forced back to 21.1.12 (will succeed)

Setup

So, without further ado, follow the instructions in my previous blog post to create your Ansible Control Machine and spin up a 3-node cluster on your favorite cloud.
You can run the cluster in insecure mode if you want: this is just a test, so you don't need Kerberos or certificates.

Below are the important bits in my config/sample.yml file:

# COCKROACHDB
cockroachdb_deployment: standard
cockroachdb_version: v21.1.6
cockroachdb_upgrade_delay: 60
cockroachdb_autofinalize: no

This translates to a standard installation of v21.1.6.
The last 2 variables do not apply to a fresh installation; they only come into play during upgrades.
Once the playbook completes, these are my hosts:

PLAY [print Ansible inventory for reference] ***************

TASK [PRINT GROUPS MAGIC VARIABLE] *************************
2021-11-23 10:28:13 (0:00:00.044)       0:01:10.237 ******** 
ok: [localhost] => 
  msg:
  - all:
    - 35.226.241.43
    - 35.224.206.2
    - 34.67.60.69
    cockroachdb:
    - 35.226.241.43
    - 35.224.206.2
    - 34.67.60.69
    demo-0:
    - 35.226.241.43
    - 35.224.206.2
    - 34.67.60.69
    ungrouped: []

From here I pick any of the 3 hosts and point my browser at port 8080 to open the DB Console, for example http://35.226.241.43:8080, to confirm the cluster is up and running with the correct version.
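If you prefer a script over a browser, the same check can be sketched in Python. This is a minimal sketch, assuming an insecure cluster (plain http) and the default HTTP port 8080 used in this post; it relies on the node's /health endpoint with the ready=1 parameter. The helper names health_url and node_is_ready are my own, not part of the Ansible Collection.

```python
import urllib.error
import urllib.request


def health_url(host: str, port: int = 8080, ready: bool = True) -> str:
    """Build the URL of a node's HTTP health endpoint (assumes plain http)."""
    suffix = "?ready=1" if ready else ""
    return f"http://{host}:{port}/health{suffix}"


def node_is_ready(host: str, port: int = 8080, timeout: float = 5.0) -> bool:
    """Return True if the node answers the readiness check with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx answer: treat as not ready.
        return False
```

Calling node_is_ready("35.226.241.43") for each host in the inventory gives a quick yes/no answer per node. It does not tell you the running version; for that, the DB Console or SELECT version() is still the place to look.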


Test 1: Minor version upgrade, to v21.1.10

We're ready for our first test.
In the config/sample.yml file, change the variables as shown below and re-run the playbook:

# COCKROACHDB
cockroachdb_deployment: upgrade
cockroachdb_version: v21.1.10
cockroachdb_upgrade_delay: 60
cockroachdb_autofinalize: no

You will see the playbook cycle through each of the 3 hosts sequentially and atomically, performing these 3 simple operations:

  1. Install the new binary, replacing the old
  2. Restart the CockroachDB service
  3. Wait 60 seconds (the value of cockroachdb_upgrade_delay)

That's it! That's what a CockroachDB upgrade is, very simple!
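To make the three steps concrete, here is a minimal Python sketch of the same rolling loop. It is not how the Ansible Collection is implemented: rolling_upgrade and its install_binary and restart_service callbacks are hypothetical names, and you would plug in your own code (for example, an SSH call) for each step.

```python
import time
from typing import Callable, Iterable, List


def rolling_upgrade(
    hosts: Iterable[str],
    install_binary: Callable[[str], None],
    restart_service: Callable[[str], None],
    delay_seconds: float = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Upgrade hosts strictly one at a time and return the hosts completed.

    Any exception raised by a callback propagates, so the loop stops at the
    first failing node and leaves the remaining nodes untouched.
    """
    done: List[str] = []
    for host in hosts:
        install_binary(host)    # 1. replace the old binary with the new one
        restart_service(host)   # 2. restart the CockroachDB service
        sleep(delay_seconds)    # 3. wait before moving on to the next node
        done.append(host)
    return done
```

The sleep function is passed in as a parameter only so the loop can be exercised without actually waiting; in real use the default time.sleep applies.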


Test 2: Minor version downgrade, back to v21.1.9

CockroachDB supports downgrades between minor (patch) versions of the same major version, that is, you can downgrade from 21.1.10 to 21.1.9.
As long as your desired version's major version is the same as the current version's major version - 21.1, in this case - you can freely upgrade and downgrade.
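The "same major version" rule is easy to express in code. The sketch below is illustrative only, with hypothetical helper names parse_version and same_major; it assumes version strings of the form v21.1.10, as used in this post.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Turn 'v21.1.10' (or '21.1.10') into the tuple (21, 1, 10)."""
    year, release, patch = version.lstrip("v").split(".")
    return int(year), int(release), int(patch)


def same_major(current: str, target: str) -> bool:
    """True when both versions belong to the same major release, e.g. 21.1."""
    return parse_version(current)[:2] == parse_version(target)[:2]
```

With these, same_major("v21.1.10", "v21.1.9") is True, while same_major("v21.1.9", "v21.2.0") is False. Parsing into integers matters: compared as plain strings, "v21.1.10" would sort before "v21.1.9".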

Just as you've done before, change the cockroachdb_version variable and re-run the playbook.
Here, I've also changed the pause time between each node upgrade so we proceed faster.
In Production, I suggest you set this value to 300.

# COCKROACHDB
cockroachdb_deployment: upgrade
cockroachdb_version: v21.1.9
cockroachdb_upgrade_delay: 1
cockroachdb_autofinalize: no

As expected, you've downgraded successfully to v21.1.9.


Test 3: Major version upgrade, to 21.2.0

While Ansible can automate the upgrade for you, you should always check the doc page related to the major upgrade, particularly the "Review Breaking Changes" section.

Also, you need to decide if you want to preserve the downgrade option or auto-finalize, explained in Step 3 of the same document.

For this test, we opt to preserve the downgrade option by setting cockroachdb_autofinalize: no. Run the playbook:

# COCKROACHDB
cockroachdb_deployment: upgrade
cockroachdb_version: v21.2.0
cockroachdb_upgrade_delay: 1
cockroachdb_autofinalize: no

Log into any of the nodes and check at the SQL prompt:

root@35.226.241.43:26257/defaultdb> SELECT version();
                                          version
-------------------------------------------------------------------------------------------
  CockroachDB CCL v21.2.0 (x86_64-unknown-linux-gnu, built 2021/11/15 13:58:04, go1.16.6)
(1 row)

Time: 1ms total (execution 0ms / network 1ms)

root@35.226.241.43:26257/defaultdb> SHOW CLUSTER SETTING cluster.preserve_downgrade_option;       
  cluster.preserve_downgrade_option
-------------------------------------
  21.1
(1 row)

Time: 1ms total (execution 0ms / network 0ms)

As expected, we upgraded to 21.2 and kept the downgrade option open.
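For reference, the cluster setting queried above is managed with plain SQL: you SET it to the major version you are upgrading from to hold off finalization, and you RESET it when you are ready to finalize. The small Python sketch below only builds those two statements; the function names are hypothetical, and I am assuming (not asserting) that the Collection issues something equivalent when cockroachdb_autofinalize is set to no.

```python
def preserve_downgrade_sql(current_version: str) -> str:
    """SQL that holds off finalization, given the version you upgrade FROM."""
    year, release = current_version.lstrip("v").split(".")[:2]
    return (
        "SET CLUSTER SETTING cluster.preserve_downgrade_option = "
        f"'{year}.{release}';"
    )


def finalize_sql() -> str:
    """SQL that removes the hold so the cluster can finalize the upgrade."""
    return "RESET CLUSTER SETTING cluster.preserve_downgrade_option;"
```

For the cluster in this test, preserve_downgrade_sql("v21.1.9") yields the statement that sets the option to '21.1', which matches the value returned by SHOW CLUSTER SETTING.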

Test 4: Major version downgrade, back to 21.1.12

Major version downgrades are not supported after finalization.
We haven't finalized, but the Ansible Collection is designed to prevent you from downgrading nonetheless.

Try downgrading by updating the vars as below and running the playbook again:

# COCKROACHDB
cockroachdb_deployment: upgrade
cockroachdb_version: v21.1.12
cockroachdb_upgrade_delay: 1
cockroachdb_autofinalize: no

As expected, the playbook errors out:

TASK [cockroachdb : fail as we don't support downgrades] **************
2021-11-23 12:08:46 (0:00:00.038)       0:00:21.364 ******************* 
fatal: [35.226.241.43]: FAILED! => changed=false 
  msg: 'Downgrades are not supported: from v21.2.0 down to v21.1.12'

This is a built-in safety measure, but if you know what you're doing, then you can override it.
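A guard of this kind can be sketched in a few lines of Python. To be clear, this is my own illustration under the assumption that only downgrades to an older major release are blocked (Test 2 showed that patch downgrades go through); check_downgrade and its force parameter are hypothetical names, not the Collection's actual code.

```python
from typing import Tuple


def _series(version: str) -> Tuple[int, int]:
    """Return the major release of a version: 'v21.2.0' -> (21, 2)."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def check_downgrade(current: str, target: str, force: bool = False) -> None:
    """Raise if target is an older major release than current, unless forced."""
    if _series(target) < _series(current) and not force:
        raise ValueError(
            f"Downgrades are not supported: from {current} down to {target}"
        )
```

Calling check_downgrade("v21.2.0", "v21.1.12") raises the error, while passing force=True lets the same call through, which mirrors the behaviour of an explicit override flag.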

Test 5: Major version downgrade, forced back to 21.1.12

Run the playbook again, this time passing the Ansible extra variable iknowwhatiamdoing=True, for example:

ansible-playbook site.yml -e @config/sample.yml --extra-vars=iknowwhatiamdoing=True

And back you are to v21.1.12!

root@35.226.241.43:26257/defaultdb> SELECT version();
                                           version
---------------------------------------------------------------------------------------------
  CockroachDB CCL v21.1.12 (x86_64-unknown-linux-gnu, built 2021/11/15 16:01:13, go1.15.14)
(1 row)

Time: 1ms total (execution 0ms / network 0ms)

Conclusion

CockroachDB upgrades are very simple, and the key for a smooth and successful upgrade is automation.

I conclude with a quick recap of the takeaway points:

  1. The upgrade is as simple as replacing the binary with the new one, and systemctl restart cockroachdb.
  2. Make sure that the upgrade runs sequentially and atomically, that is, upgrade one node at a time, slowly, and upgrade all nodes.
  3. For major upgrades, always read the docs first, and
  4. Disable auto-finalization, as doing so leaves the door open for a downgrade.

Good luck on your next upgrade!
