Developing and releasing software is a complex process. A common way to manage that complexity is to use an automated system for testing, validating, and publishing software.
We’re going to look at what this means, how this is typically implemented, and learn about a platform that helps us to create these software pipelines. We’ll see the features in context by digging into how Fastly has implemented them for one of its own services.
Here is a breakdown of the content:
- Tools and terminology
- Coordinating complex pipelines
- The 10,000-foot view
- Storing your workflows
- Handling events
- A ‘Hello World’ job
- Validating against a matrix
- Third-party actions
- Accessing contextual data
- Defining environment variables
- Managing secrets
- Flow control
- Persisting data
- Reusable workflows
- Conclusion
Tools and terminology
First, let’s understand what it is we’re going to be talking about: CI/CD.
CI (Continuous Integration) is the process of automating the merging of new or updated code into an existing code base and utilising supplementary tools, such as code linters and unit tests, to help make the integration as correct as possible.
CD (Continuous Deployment) is an extension of CI, focused on automating the deployment of a piece of software once the CI process has been completed successfully.
There are many CI/CD solutions available, such as Jenkins and Travis CI (both of which Fastly has used in the past) but here I’m going to be talking about GitHub Actions, which has proven to be a powerful and flexible tool that has enabled us to simplify our content publishing pipeline.
The GitHub Actions platform is managed by GitHub (the largest online collaboration platform where developers and companies build and maintain their software). This means GitHub Actions exists alongside our code, which can enable better integration with our internal repositories compared to external solutions such as Jenkins or Travis CI.
Different CI/CD platforms use different terminology to describe their service model. With GitHub Actions, everything starts with a ‘workflow’. A workflow is a YAML configuration file that consists of:
- Jobs: a job represents a collection of 'steps'.
- Steps: each ‘step’ executes logic to achieve some goal.
- Events: an event determines when your jobs are run.
- Runner: a virtual machine that runs your jobs.
GitHub provides a nice visualisation of this structure...
Coordinating complex pipelines
We use CI/CD across most projects at Fastly, both internal and public. I’m going to detail some examples of how we configure this for our Developer Hub (DevHub).
The DevHub is a static website generated using Gatsby, and it provides Fastly's developer community with information on, amongst other things:
- Fastly public APIs, and language references for VCL and Compute@Edge services.
- Fastly tools that simplify the setup, testing, validation and deployment of your code.
- A plethora of solutions, examples and architectural concept information.
We’re going to dig into DevHub's multi-layered CI/CD process and see how we have utilised the various features of the GitHub Actions platform to support some of its more complex requirements.
The 10,000-foot view
Let’s start with a visualisation of what we’re dealing with when it comes to DevHub’s CI/CD pipeline. Below is a “summary view” of our DevHub pipeline that the GitHub Actions UI provides. It is an interactive graph that lets us select each of the available jobs and view its individual progress and status.
Here we can see we start with two consecutive jobs, i.e. one must complete before the other starts:
- Gatsby Build
- Deploy to GCS
These are consecutive because we have to build our application before we can deploy it.
This pipeline is the same for both our development and production environments. The only difference is the subdomain to which the Gatsby application is deployed.
When a PR is opened on our DevHub GitHub repository, the code is deployed to a dynamically generated subdomain for QA. When the PR is merged the deployment is to our production endpoint developer.fastly.com.
After the first two consecutive jobs, we have multiple jobs that are run in parallel, handling things like checking for stale content that needs to be reviewed, updating search metadata, and purging the Fastly CDN to ensure users see the latest content.
HINT: Jobs are run in parallel by default. If a job has a dependency on another job, this can be made explicit by using jobs.<job_id>.needs.
We then introduce a blocking job called “Post-deployment checks” that validates whether any updated files are related to our Compute@Edge services, or if any of the code examples for the various Compute@Edge SDK languages have been updated. If they have, the final set of jobs will be executed, which deploys the updated service code, and validates our code examples are correct.
Some of the jobs don’t need to run every time. For example, there are jobs defined that only run on a predetermined schedule and others that only need to run when we're building the production environment (i.e. are skipped as part of the PR review process). We’ll see how this is handled later on.
Storing your workflows
Defining pipelines with GitHub Actions requires defining jobs within a workflow file. You can have any number of workflow files, which are stored in a .github/workflows folder within the root directory of your project repository.
A tree view of this directory structure might look like this:
├── .github
│   └── workflows
│       ├── my-first-workflow.yaml
│       ├── my-second-workflow.yaml
│       └── my-third-workflow.yaml
Handling events
The jobs you define within a workflow file are triggered based on specific events. GitHub Actions offers a rich set of events that let you fine-tune your workflow using the on key for numerous use cases. Some common examples of triggering a workflow are:
When pushing changes to any branch
on: push
When pushing changes to specific branches
on:
  push:
    branches:
      - "my-feature-branch"
      - "my-other-feature-branch"
When pushing a tag
on:
  push:
    tags:
      - 'v*'
Scheduling execution at a specific time or interval
on:
  schedule:
    - cron: "0 0 1 * *" # https://crontab.guru/every-month
    - cron: "*/5 * * * *" # https://crontab.guru/every-5-minutes
HINT: https://crontab.guru/ makes dealing with the cron syntax easy.
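A workflow can also listen for several events at once by listing them under the same on key. The following is a minimal sketch rather than the DevHub's actual configuration: the branch name main and the cron schedule are assumptions chosen purely for illustration.

```yaml
# Hypothetical trigger block combining several events in one workflow.
on:
  push:
    branches:
      - "main"            # assumed default branch name
  pull_request:           # pull request activity, using the default activity types
  schedule:
    - cron: "0 6 * * 1"   # assumed schedule: 06:00 UTC every Monday
  workflow_dispatch:      # allows the workflow to be started manually
```

Any one of these events is enough to start a run, which is how a single workflow file can serve both PR review and production deployments.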
A 'Hello World' job
Before we jump into more detailed examples, let’s look at a simplified workflow file.
name: My Workflow
on: push
jobs:
  a-simple-job:
    runs-on: ubuntu-latest
    steps:
      - name: Say Hello
        run: echo 'hello'
      - run: |
          echo 'foo'
          echo 'bar'
          echo 'baz'
We’ve already seen the on key, but now we have the basic building blocks that you’ll see in nearly all types of workflows.
- name (optional): the name of your workflow file.
- jobs: a collection of jobs we want to have executed.
- jobs.<job_id>: a unique job identifier (a-simple-job in our example).
- jobs.<job_id>.runs-on: indicates which virtual machine to run our job on.
- jobs.<job_id>.steps: a collection of steps we want to have executed.
- jobs.<job_id>.steps[*].name (optional): the name of your step.
- jobs.<job_id>.steps[*].run: a command-line program to execute.
Each run key represents a new process and shell. In the above example, we can see two separate steps defined, and the latter step uses a pipe character | to configure multiline mode. This means each of the echo commands is run within the same shell instance.
NOTE: the use of the multiline pipe character requires the following lines to be indented.
The caveat to using multiline mode is that any errors that occur during runtime execution become a lot harder to identify and debug compared to using individual steps. By splitting multiple commands across separate steps, any failure that occurs will be associated with a specific step.
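To make the "new process and shell" point concrete, here is a small sketch. The job name and the GREETING variable are hypothetical, and the example assumes the default bash shell on an Ubuntu runner.

```yaml
name: Shell scope demo
on: push
jobs:
  shell-scope-demo:
    runs-on: ubuntu-latest
    steps:
      - name: Set a shell variable
        run: GREETING=hello
      - name: Read it from a separate step
        # GREETING was set in a different shell, so the fallback text is printed.
        run: echo "greeting is '${GREETING:-unset}'"
      - name: Set and read it within one multiline step
        # Both commands share one shell, so the variable is still set.
        run: |
          GREETING=hello
          echo "greeting is '${GREETING}'"
```

If the multiline step fails, the log points at that whole step rather than at the individual command, which is the debugging trade-off described above.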
There are many other steps[*] keys available, and we’ll see examples of them as we dig more into the DevHub’s example configuration.
When defining multiple jobs within a workflow we’re able to control the order of the jobs. By defining jobs.<job_id>.needs and providing it with the relevant jobs.<job_id> we’re able to inform the GitHub runner that we need a specific job to have been run and completed successfully before the current job can begin.
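As a rough illustration of job ordering, the sketch below chains three jobs with needs. The job names and echo commands are placeholders, not the DevHub's real jobs.

```yaml
name: Ordered jobs
on: push
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: echo 'building'
  deploy:
    needs: build              # waits for build to complete successfully
    runs-on: ubuntu-latest
    steps:
      - run: echo 'deploying'
  notify:
    needs: [build, deploy]    # a list is accepted when there are several dependencies
    runs-on: ubuntu-latest
    steps:
      - run: echo 'done'
```

Without the needs keys all three jobs would start at the same time.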
Validating against a matrix
For the DevHub we have jobs.<job_id>.runs-on set to run our workflow on an Ubuntu virtual machine, but this particular key is much more powerful when coupled with jobs.<job_id>.strategy, which enables you to configure your job to be run on multiple virtual machines. An example of this configuration would be:
jobs:
  example_matrix:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
    runs-on: ${{ matrix.os }}
The runs-on key example above is using an Expression syntax we’ve not yet discussed. It’s a feature you’ll use a lot when writing your own workflows. In summary, it lets you dynamically reference other keys and exposed data, and we’ll cover this in more detail in the section on “Accessing contextual data”.
You can even extend this approach to support a multi-dimensional matrix, enabling you to validate a job not only against multiple platforms but also using different software versions across those platforms. We use this approach in our open-source Fastly CLI pipeline to run the CLI test suite, because the CLI needs to produce an executable binary that works across all three major operating systems (Linux, macOS, Windows).
Refer to the documentation for further examples of this type of use case.
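A multi-dimensional matrix might be sketched as follows. This is not the Fastly CLI's actual workflow: the Go versions, the job name and the test command are assumptions, and the action version tags simply mirror the ones used elsewhere in this article.

```yaml
name: Matrix tests
on: push
jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        go-version: ['1.17.x', '1.18.x']   # assumed versions, for illustration only
    # One job is created per combination, so 3 operating systems x 2 versions = 6 jobs.
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-go@v2
        with:
          go-version: ${{ matrix.go-version }}
      - run: go test ./...
```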
Third-party actions
Actions are a collection of steps that can be reused. You can create your own custom actions but it’s more common to use third-party community actions. For example, Fastly provides actions for building and deploying to Compute@Edge. To use an action, the workflow job needs a step that includes the jobs.<job_id>.steps[*].uses key set to an appropriate configuration path.
A common example is using actions/checkout, which lets a job access the project code inside the repository running the workflow.
jobs:
  example:
    runs-on: ubuntu-latest
    steps:
      - run: ls # no files/directories available
      - uses: actions/checkout@v2
      - run: ls # lists all root project files/directories
The DevHub makes heavy use of third-party actions. Here are just a few of them:
- actions/checkout: access repository files.
- actions/cache: cache dependencies and build outputs.
- actions/setup-go: install and configure a specific version of Go.
- actions/setup-node: install and configure a specific version of Node.js.
- actions-rs/toolchain: install and configure a specific version of Rust.
WARNING: All of the above actions, with the exception of the last one, are officially maintained by GitHub. Fastly trusts using these actions as they are developed by the same organisation providing the platform. Any non-official third-party actions should be audited first before using as they can access an automatically generated GitHub token which will enable an action to carry out API requests that otherwise require authentication. See “Managing secrets” for more information.
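One way to limit what any action can do with the automatically generated token is the workflow-level permissions key. The sketch below is an assumption about how that might look, not a description of the DevHub workflow; the read-only scope is an illustrative choice and a real workflow may need more.

```yaml
name: Restricted token example
on: push
# Applies to every job in this workflow: the generated token may only read repository contents.
permissions:
  contents: read
jobs:
  example:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: ls
```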
Accessing contextual data
GitHub Actions offers a collection of ‘context’ objects that provide information for things like your workflow environment, GitHub repository, job and step output, secrets data and more.
We use a lot of these context objects in the DevHub workflow to generate cache keys, persist data between jobs, access secrets and other environment variables so we can configure the shell scripts that we run, and identify when a particular flow control should be applied.
An example of this is when we need to authorise access to private NPM packages hosted on GitHub’s package repository:
- run: echo "//npm.pkg.github.com/:_authToken=${{ secrets.GITHUB_TOKEN }}" > ~/.npmrc
Here we use the secrets context to access an authentication token and persist it temporarily into a ~/.npmrc file used by NPM (we’ll revisit this example later when discussing secrets in more detail).
Another example is when we use the github context to access the root directory of our checked-out project so we can access a Google authentication configuration file:
env:
  GOOGLE_APPLICATION_CREDENTIALS: ${{ github.workspace }}/.google.json
Defining environment variables
Environment variables are typically used in CI/CD to inject values at runtime. GitHub Actions lets you configure environment variables in multiple sections within the workflow file, depending on the scope they should have.
- Globally: available to all jobs (and all steps within those jobs).
- Job level: available to all steps within the specific job.
- Step level: available for a single step.
Here is an example that demonstrates all three scope levels. Each uses the env key to contain the variables being declared:
env:
  GLOBAL: foo
jobs:
  my-job:
    runs-on: ubuntu-latest
    env:
      LITERAL: bar
      INTERPOLATION: ${{ github.ref_name }}
      EXPRESSION: $(echo ${{ github.ref_name }} | perl -pe 's/[^a-zA-Z0-9]+/-/g')
    steps:
      - name: Print Global Environment Variables
        run: echo $GLOBAL
      - name: Print Job Environment Variables
        run: |
          echo ${{ env.LITERAL }}
          echo ${{ env.INTERPOLATION }}
          echo ${{ env.EXPRESSION }}
      - name: Print Step Environment Variables
        env:
          STEPVAR: my step
        run: echo ${{ env.STEPVAR }}
We can see at the top of the example we have an env key where we define an environment variable called GLOBAL. This environment variable is declared in the top-level global scope and so it’s available at all other levels including job and step levels.
Next, we can see an env key defined within the job itself, and within that are three separate environment variables declared: LITERAL, INTERPOLATION and EXPRESSION. These variables are available for all the steps within that specific job.
Lastly, we can see an env key defined within the final step, and within it is a single environment variable declared, STEPVAR, which is only available to that step and no other.
You’ll notice we’ve used two different ways to access the environment variables:
- $NAME
- ${{ env.NAME }}
The first was using a syntax familiar to developers who work within a terminal environment: echo $NAME. This causes the shell instance (that the echo command runs inside) to evaluate $NAME and look up its value from the containing environment.
The other approach is to use the interpolation feature we looked at earlier: echo ${{ env.NAME }}, and although the result is the same, the way the values are acquired is subtly different. With interpolation, the value is acquired by looking up the relevant key within the env context object, and then the value is injected into the shell command to be executed (as if the command was literally echo my step), while the more traditional shell resolution approach works by the shell instance looking up the environment variable to access the value.
HINT: I would tend towards using the interpolation approach as it’s more explicit and makes setting environment variables much more ‘visible’ when scanning long workflow files.
In the previous example, you might have expected the EXPRESSION variable to hold the result of the expression (i.e. inspect the github.ref_name context and then use Perl to normalise the value). Instead it holds the literal text that was defined, which is what echo "$EXPRESSION" would print. Interpolating it with echo ${{ env.EXPRESSION }} only appears to work because that literal text is pasted into the shell command, where the shell then evaluates it.
This is because the env key does not evaluate shell commands like jobs.<job_id>.steps[*].run does, and that causes problems when we need an environment variable to contain a dynamically generated value. So how do we do that?
Well, let’s consider the scenario we had with the DevHub. We were using the third-party action setup-node to install and configure the Node.js programming language. This action lets you specify the node version to install but it can’t be a dynamically acquired value. You either have to hardcode it or interpolate the value.
In the DevHub we have a .nvmrc file that indicates the supported node version. We want to read the version contained in this file and pass that to the action. There are a few ways to achieve this but the simplest one was to store the value in an environment variable and interpolate the value in the action’s node-version input field using the env context object.
steps:
  - run: echo "NODE_VERSION=$(cat .nvmrc)" >> "$GITHUB_ENV"
  - uses: actions/setup-node@v2
    with:
      node-version: "${{ env.NODE_VERSION }}"
The way this works is we use a shell command to read the node version from the .nvmrc file, assign it to a variable called NODE_VERSION, and finally append that generated string to a file that the GitHub Actions runner uses to generate the environment variables. The file path is provided via the default environment variable GITHUB_ENV.
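The same technique can produce the normalised branch name that the earlier EXPRESSION variable was reaching for. This is a sketch under assumptions: SAFE_REF is a hypothetical variable name, and the example relies on the default GITHUB_REF_NAME environment variable and on perl being available on the runner.

```yaml
name: Dynamic environment variable
on: push
jobs:
  my-job:
    runs-on: ubuntu-latest
    steps:
      - name: Compute a normalised branch name
        run: |
          # printf avoids the trailing newline that echo would feed into perl
          SAFE_REF="$(printf '%s' "$GITHUB_REF_NAME" | perl -pe 's/[^a-zA-Z0-9]+/-/g')"
          echo "SAFE_REF=${SAFE_REF}" >> "$GITHUB_ENV"
      - name: Use the value in a later step
        # Values appended to GITHUB_ENV are visible to subsequent steps, not the step that wrote them.
        run: echo "${{ env.SAFE_REF }}"
```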
Managing secrets
A GitHub repository can be configured to store secrets. These secrets are exposed to your GitHub Actions workflow via the secrets context object. We saw an example of this earlier when we referenced secrets.GITHUB_TOKEN as a way to access our private NPM packages hosted on GitHub’s NPM package repository.
That particular secret is a special case - it's always available, since it's provided by the GitHub Actions runner. To quote GitHub directly…
At the start of each workflow run, GitHub automatically creates a unique GITHUB_TOKEN secret to use in your workflow. When you enable GitHub Actions, GitHub installs a GitHub App on your repository. The GITHUB_TOKEN secret is a GitHub App installation access token. You can use the installation access token to authenticate on behalf of the GitHub App installed on your repository. The token’s permissions are limited to the repository that contains your workflow.
That last sentence is the important bit. It means yes, you can use secrets.GITHUB_TOKEN to make GitHub API calls on behalf of your repository, but you can’t use it to access any other private repository in your organisation or the private packages those repositories publish to GitHub’s package repository. For that we need a PAT (Personal Access Token).
This was a real problem we stumbled into and so for the DevHub it meant we had to create a PAT with the appropriate access permissions, and add it as a repository secret that we could reference via the secrets context.
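Swapping the automatically generated token for a PAT might look like the sketch below. PACKAGES_READ_PAT is a hypothetical secret name and NPM_AUTH_TOKEN a hypothetical variable name; neither is taken from the DevHub configuration.

```yaml
steps:
  - name: Authenticate with GitHub's package repository using a PAT
    env:
      # PACKAGES_READ_PAT is a hypothetical repository secret that holds the PAT
      NPM_AUTH_TOKEN: ${{ secrets.PACKAGES_READ_PAT }}
    run: echo "//npm.pkg.github.com/:_authToken=${NPM_AUTH_TOKEN}" > ~/.npmrc
```

Passing the secret through a step-level env key keeps it scoped to the one step that needs it.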
Flow control
GitHub Actions provides a mechanism for preventing a job from running unless a specific condition is met. To do this you must assign an expression to jobs.<job_id>.if.
jobs:
  example-job:
    if: ${{ github.ref_name == 'main' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo hello
The above example looks up github.ref_name and only runs the job if the branch being run is main. The DevHub uses conditional execution for multiple use cases, such as:
- Only run gatsby build (an expensive operation) if we have a cache miss.
- Only recompile the DevHub Compute@Edge service if it has been updated.
- Only run Compute@Edge language tests if relevant code examples were updated.
- Only publish API updates to Postman if it’s a scheduled production event.
HINT: The if expression can omit the outer ${{ }} but I tend to include them because I prefer the visual explicitness.
A more complex conditional used in the DevHub is when the build artifacts are deployed. We check if there were any changes to specific files, and if there were changes, we execute a job to validate those files. We do this because there’s no point in running scripts to validate files that haven’t been changed.
if: ${{ contains(needs.post-deployment-checks.outputs.data, 'RUST_CODE_SAMPLE_CHECKS=true') }}
This example uses the contains function to inspect data exposed by another job (post-deployment-checks). We’ll learn about how the persisting of data between jobs can be achieved in the next section.
To learn more about the available operators, like ==, != and &&, refer to the operators documentation.
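Operators can be combined within a single condition. The sketch below is loosely modelled on the "scheduled production event" case mentioned earlier; the job names, the schedule and the branch name main are assumptions rather than the DevHub's real configuration.

```yaml
name: Conditional jobs
on:
  push:
  schedule:
    - cron: "0 0 1 * *"
jobs:
  publish-api-updates:
    # Runs only for scheduled events on the main branch.
    if: ${{ github.event_name == 'schedule' && github.ref_name == 'main' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo 'publishing'
  push-only-checks:
    # Runs for any event that is not a scheduled one.
    if: ${{ github.event_name != 'schedule' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo 'checking'
```

String literals inside an expression use single quotes.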
Persisting data
Each workflow job is executed within its own runner. This means any files or data that are created will be lost once the job finishes. There are a few ways to persist data:
- Caching: actions/cache.
- Artifacts: actions/upload-artifact, actions/download-artifact.
- Outputs: jobs.<job_id>.outputs.
Caching
Caching is the simplest, and most common, way to persist data but it can result in confusing failure scenarios. This has been experienced with the DevHub pipeline on a few occasions. Typically a subtle workflow configuration change will cause the cache action to look up the wrong key and subsequently we either get back stale data or, in some cases, no data. Both can cause confusing errors in the pipeline.
Caching requires two stages of configuration. You first need to define a step that indicates what to cache and what the cache key should be, then you need a separate step in another job to extract the persisted files from the cache.
# cache our content
- uses: actions/cache@v2
  with:
    path: path/to/be/cached
    key: ${{ runner.os }}-my-cache-key
# restore from cache (in a separate job)
- uses: actions/cache@v2
  with:
    path: path/to/be/cached
    key: ${{ runner.os }}-my-cache-key
HINT: You can specify multiple paths to be cached/restored using the multiline character |.
Whenever the action step is executed it will attempt to look up the given key in the cache, and if cached content is found the content is restored to the given path(s), otherwise it does nothing. When the job completes there is a ‘post run’ hook that each installed action can execute, and for the cache action, when the hook is triggered, it will look at the given path and store whatever it finds there into the cache.
The DevHub uses the cache action for caching our node_modules directory (installing node modules is a very slow operation and so the less we have to do that the better), as well as for caching build configuration and the resulting artifacts.
Artifacts
Artifacts are much slower at storing files than caching because the files need to be uploaded and downloaded from GitHub’s servers. Unlike caching you can only download a file that was uploaded as part of the same workflow run, but like caching they have two distinct steps that need to be defined:
# upload our files
- uses: actions/upload-artifact@v3
  with:
    name: my-artifact
    path: my_file.txt
# download our files (in a separate job)
- uses: actions/download-artifact@v3
  with:
    name: my-artifact
GitHub’s own recommendation for which approach to take (caching or artifacts) is:
Use caching when you want to reuse files that don't change often between jobs or workflow runs, such as build dependencies from a package management system.
Use artifacts when you want to save files produced by a job to view after a workflow run has ended, such as built binaries or build logs.
Outputs
Jobs have a jobs.<job_id>.outputs key which can produce data for another job to consume. This approach is used heavily in the DevHub workflow to better support caching of expensive resources such as installing node dependencies or running a Gatsby build.
The DevHub workflow dynamically generates cache keys for resources that we would like subsequent jobs to use, and it uses a job’s output key to persist those cache keys to the next job so that it can look up the cached content using the same keys.
An example of this is:
# job 1
example-job:
  runs-on: ubuntu-latest
  outputs:
    cache_key: ${{ steps.example-build.outputs.cache_key }}
  steps:
    - name: Run a build
      run: make build # generates a ./build directory
    - id: example-build
      run: echo "::set-output name=cache_key::${{ hashFiles('./build') }}"
    - name: Cache the build
      uses: actions/cache@v2
      with:
        path: ./build
        # within the same job, read the step output directly (needs only exposes other jobs)
        key: ${{ runner.os }}-${{ steps.example-build.outputs.cache_key }}
# job 2
other-job:
  runs-on: ubuntu-latest
  needs: example-job
  steps:
    - name: Extract the build from the cache
      uses: actions/cache@v2
      with:
        path: ./build
        key: ${{ runner.os }}-${{ needs.example-job.outputs.cache_key }}
In the first job (example-job) we see the outputs key is defined with a nested cache_key set to evaluate the expression steps.example-build.outputs.cache_key. This expression resolves to the output from the step with id: example-build.
The outputs.cache_key in the expression is a reference to the output from the example-build step. The step’s run key executes the echo command and produces output that is structured in a specific format that GitHub Actions recognises, which allows the command to communicate with the runner machine.
In this case, the command outputs a string containing the ::set-output workflow command followed by name=cache_key and assigns the named key a value which is the result of hashing the ./build directory (generated in the previous step name: Run a build) using the hashFiles function.
The digest that is generated is assigned to the example-build step output and is exposed to other steps (and the job) using the name cache_key. The job itself ensures the cache_key value from the step is assigned to its own outputs.cache_key.
The second job (other-job) states that it needs: example-job. Once that dependency is declared, other-job can access the output from example-job using the needs context object referenced using the expression syntax ${{ … }}. In this case, other-job acquires the persisted cache key with needs.example-job.outputs.cache_key and passes the digest value to the actions/cache action so it may look up the ./build directory and restore it.
Reusable workflows
The DevHub pipeline has a bunch of ‘post-deployment’ jobs that validate the content produced by the Gatsby build system. For example we run scripts that check for broken links, and scripts that run test suites against the different Compute@Edge programming language examples provided on the DevHub site.
Each validation process is defined as a separate job. The steps for these validation jobs are typically the same but the script that is executed will be unique to the specific job, and this means all the steps that are required to enable the validation process have to be duplicated across each validation job. To help reduce the duplication of steps we define a set of reusable workflows.
A reusable workflow is a YAML file that has a structure very similar to the standard workflow YAML file we’ve seen so far. For example, you define a jobs key, and each job within it determines what platform it should run on, any environment variables that need to exist, and a set of steps that should be executed.
NOTE: Reusable workflows don’t inherit the parent workflow environment.
The only difference is that a reusable workflow file is defined as being a ‘template’ and it has access to an extra ‘event’ called workflow_call. Within the workflow_call key you can define a collection of inputs and (optionally) secrets.
The workflow_call event is triggered by the main workflow file when it defines a job that references the reusable workflow via the uses key. The main workflow file is able to pass its own values for the defined inputs and secrets.
An example reusable workflow might look something like this:
on:
  workflow_call:
    inputs:
      script:
        required: true
        type: string
    secrets:
      api_key:
        required: true
jobs:
  template:
    runs-on: ubuntu-latest
    env:
      ...
    steps:
      - ...
The inputs and secrets can be referenced within the reusable workflow’s steps using the expression syntax ${{ inputs.script }} and ${{ secrets.api_key }}.
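To show those references in place, here is one way the elided env and steps sections could be filled in. The steps and the API_KEY variable name are hypothetical and are not taken from the DevHub's real template.

```yaml
on:
  workflow_call:
    inputs:
      script:
        required: true
        type: string
    secrets:
      api_key:
        required: true
jobs:
  template:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Run the validation script supplied by the caller
        # The secret is exposed only to this step, as a hypothetical API_KEY variable.
        env:
          API_KEY: ${{ secrets.api_key }}
        run: ${{ inputs.script }}
```

A caller that passes script: ./foo.sh would cause this job to check out the repository and then execute ./foo.sh with API_KEY set.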
The skeleton outline of the DevHub’s main workflow, with a call to the reusable workflow, looks like this:
jobs:
  build:
    ...
  deploy:
    ...
  validate-foo:
    needs: deploy
    uses: <org>/<repo>/.github/workflows/<filename>@<branch>
    with:
      script: ./foo.sh
    secrets:
      api_key: secret_squirrel
  validate-bar:
    needs: deploy
    uses: <org>/<repo>/.github/workflows/<filename>@<branch>
    with:
      script: ./bar.sh
    secrets:
      api_key: secret_squirrel
Reusable workflows help reduce boilerplate by abstracting away common steps. Without this feature the DevHub post-deployment validation jobs would easily have caused the overall workflow file to quadruple in size and complexity. It would also have been a maintenance nightmare: whenever a set of steps needed to change, we would have had to remember to change them in multiple places instead of in just one reusable workflow file.
Conclusion
GitHub Actions is a CI/CD platform that offers a very flexible and expressive configuration model and supports a wide-ranging ecosystem of community-driven third-party tools. It is free to use for public repositories and is directly integrated with where your code likely already exists, on the largest code repository platform available today. Try it out for yourself and see how you might simplify your existing CI workflows and take advantage of its rich feature set.