How to Develop and Test an Automated CI/CD Workflow with Cassandra

Jim Dickinson - Jun 21 '22 - Dev Community


In this post, we’ll show you how to develop a CI/CD workflow using Apache Cassandra™ with a GitHub Actions runner. See for yourself how much time and effort you can save by deploying Cassandra cloud-natively while you test and deploy your cloud-native applications!

If you have projects that depend on Apache Cassandra™ and you want to develop an automated continuous integration and continuous delivery (CI/CD) flow, you’re going to need to create Cassandra clusters dynamically for your tests to make sure that your app works after each code change. DataStax does this every day: we run Cassandra in Kubernetes to power Astra DB, and we continuously test our Cassandra deployments to make sure Astra DB works reliably.

In this post, we’d like to show you how you can develop and test your own CI/CD workflows with Cassandra using a GitHub Actions runner.

Challenges that vex developers building CI/CD workflows

Say you want to support any number of workflows (within reason, of course), all running at the same time. If the test environment they depend on is broken, being migrated, or scaled down to save costs, or you hit any of the other common situations that developers have to work around, all of your great automation is wrecked.

A big “gotcha” when you’re implementing continuous integration is that you need a real database for your app to talk to. Historically, a DevOps team would provision a static test Cassandra environment on cloud-based virtual machines (VMs), which was time-consuming and took considerable effort. This process doesn’t scale well if you’re running feature branch environments or have multiple teams sharing a Kubernetes cluster. Now, you’re probably deploying your apps in a more cloud-native way with containers, and it would be best to get your database deployed cloud-natively, too.

You might think all you need is a container running Cassandra. However, it can be more challenging than it looks to get Cassandra going. If you’re going to do all this with containers, it’s better to take advantage of the best parts of a container orchestration system like Kubernetes. Then you can deploy your app AND database with close-to-production configuration, test it, and tear everything down at the end, reducing costs. No magical, came-from-the-DevOps-team dependencies or expensive test environment databases to maintain!

Let’s try it out without leaving GitHub!!

We’ve built out a GitHub repo to show you how you can configure and deploy your app and database and test it in an ephemeral way, leveraging a GitHub Actions runner and Kubernetes-in-Docker (kind).

At the time of writing, standard GitHub-hosted Actions runner VMs come with two cores and 7 GB of RAM. That is enough capacity to install a three-node Kubernetes-in-Docker (kind) cluster that mimics three distinct physical servers. That allows you to bring up a real RF=3 Cassandra cluster, your frontend pods, and your backend pods. Now, you don’t need mock storage code, any special connections, or a VPN.
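As a rough illustration, a multi-node kind cluster is described by a small config file. The sketch below builds one in Python and prints it as JSON, which YAML parsers generally accept as well. The post does not say whether the control-plane node counts among the three nodes; this sketch assumes one control-plane node plus three workers so that each Cassandra pod can land on its own worker. The function name is hypothetical.

```python
import json


def kind_cluster_config(workers: int = 3) -> dict:
    """Build a kind cluster config: one control-plane node plus `workers` workers.

    Assumption: three workers, so each of three Cassandra pods can be
    scheduled onto a separate node. The example repo may use another layout.
    """
    nodes = [{"role": "control-plane"}]
    nodes += [{"role": "worker"} for _ in range(workers)]
    return {
        "kind": "Cluster",
        "apiVersion": "kind.x-k8s.io/v1alpha4",
        "nodes": nodes,
    }


if __name__ == "__main__":
    # Print the config so it can be redirected to a file, e.g. kind-config.json
    print(json.dumps(kind_cluster_config(), indent=2))
```

Saved to a file, a config like this could be passed to `kind create cluster --config <file>`, or to the `config` input of the kind-action.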

With this setup, you can bring Cassandra to where your app is running more easily than ever before. You can insert some fake data, and then your tests can assert that your family of apps behaves correctly, without the risks of testing against your live application.
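To make the fake-data step a little more concrete, here is a minimal sketch of what a test setup might prepare for an RF=3 cluster. The keyspace, table, datacenter name, and rows are all hypothetical, and the code only builds CQL strings; it does not connect to a cluster.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def create_keyspace_cql(keyspace: str, datacenter: str, replication_factor: int = 3) -> str:
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    The names are expected to be trusted test constants, not user input.
    """
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'NetworkTopologyStrategy', "
        f"'{datacenter}': {replication_factor}}};"
    )


# Hypothetical seed data: a parameterized INSERT plus the rows to bind to it.
SEED_INSERT = "INSERT INTO app_test.users (id, name) VALUES (%s, %s)"
SEED_ROWS = [(1, "alice"), (2, "bob")]


if __name__ == "__main__":
    print(create_keyspace_cql("app_test", "dc1"))
    print("quorum for RF=3:", quorum(3))  # 3 // 2 + 1 == 2
    for row in SEED_ROWS:
        print(SEED_INSERT, row)
```

With RF=3 a quorum is two replicas, so tests that read and write at QUORUM exercise real replication, which a single-container Cassandra cannot do. A driver session would typically execute the statement with each row as bound parameters.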

This process also works with self-hosted GitHub runners, but you have to be more careful. If you’re on a private repo and have somewhere to host a private runner (e.g. a private AWS account), that’s fine. But if the runner sits on private infrastructure, make sure it isn’t attached to a public repo, because anyone could open a pull request and start running code behind your firewall. The GitHub documentation on self-hosted runner security with public repositories (listed under Resources) covers these risks in more detail.

GitHub-hosted runners provide plenty of RAM to support the learning experience and show you what you can do. And if you’re working on a private project, it isn’t difficult to port your workflow to a self-hosted runner after testing it on GitHub-hosted runners. So, let’s stick with those.

The Basic Steps

Here’s a brief rundown of the basic steps you can use for building and testing an automated CI/CD workflow in GitHub:

  1. Install a three-node Kubernetes-in-Docker (kind) environment to simulate a fuller Kubernetes cluster. We’ll use the kind-action by helm in the Actions Marketplace.
  2. Deploy cass-operator and a three-node Cassandra cluster.
  3. Deploy the frontend app and the backend app.
  4. Load up your data and run your tests.
  5. Fold the whole thing up when your test run is over. We get this for free by using helm’s kind-action: the teardown is automatic.
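The middle steps above can be sketched as a manifest plus an ordered list of commands. This is a dry-run illustration under stated assumptions, not the workflow from the example repo: the manifest paths, the test script, the namespace, the Cassandra version, and the `standard` storage class are placeholders, and nothing here actually invokes `kubectl`.

```python
import json
import shlex


def cassandra_datacenter(name: str = "dc1", cluster: str = "ci-cluster", size: int = 3) -> dict:
    """Build a CassandraDatacenter resource for cass-operator (values are assumptions)."""
    return {
        "apiVersion": "cassandra.datastax.com/v1beta1",
        "kind": "CassandraDatacenter",
        "metadata": {"name": name},
        "spec": {
            "clusterName": cluster,
            "serverType": "cassandra",
            "serverVersion": "4.0.1",  # assumed version; use one your operator supports
            "size": size,  # number of Cassandra nodes
            "storageConfig": {
                "cassandraDataVolumeClaimSpec": {
                    "storageClassName": "standard",  # assumed storage class name
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": "1Gi"}},
                }
            },
        },
    }


def ci_plan(namespace: str = "cass-operator") -> list:
    """Return steps 2-4 as argv lists; every file path here is hypothetical."""
    return [
        ["kubectl", "apply", "-f", "manifests/cass-operator.yaml"],
        ["kubectl", "apply", "-n", namespace, "-f", "manifests/cassandra-dc.json"],
        # Note: `kubectl wait` fails if no pods exist yet, so a real
        # workflow may need to retry or poll before this step.
        ["kubectl", "wait", "--for=condition=Ready", "pod", "--all",
         "-n", namespace, "--timeout=600s"],
        ["kubectl", "apply", "-f", "manifests/backend.yaml"],
        ["kubectl", "apply", "-f", "manifests/frontend.yaml"],
        ["./run-tests.sh"],
    ]


if __name__ == "__main__":
    print(json.dumps(cassandra_datacenter(), indent=2))
    for argv in ci_plan():
        print(shlex.join(argv))
```

Cluster creation and teardown (steps 1 and 5) are left out on purpose, since the kind-action handles them. In a real workflow each command would become its own step so that a failure is easy to locate in the run log.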

The approach to developing and testing CI/CD workflows that we’ve described here is one that DataStax uses routinely to test its Astra DB workflows (though not necessarily on GitHub). In production, we run our CI flows with GitHub, Jenkins.io, and Harness.io.

Check out our GitHub repo to try it today!

Follow the DataStax Tech Blog for more developer stories. Check out our YouTube channel for tutorials, and follow DataStax Developers on Twitter for the latest news about our developer community.

Resources

  1. Apache Cassandra
  2. DataStax
  3. Kubernetes
  4. Astra DB
  5. GitHub Documentation: About Self-Hosted Runners
  6. GitHub Documentation: Self-Hosted Runner Security with Public Repositories
  7. Quick Start — kind — Kubernetes
  8. DataStax Documentation: What is Cass Operator?
  9. GitHub repo for the DataStax Cassandra CI/CD Example
  10. Jenkins.io
  11. Harness.io