Warning: In a public repository, scheduled workflows are automatically disabled when no repository activity has occurred in 60 days.
TL;DR
https://github.com/sanity-io/github-action-sanity#backup-routine
Introduction
Sanity is a hosted content platform that offers many features going beyond what we expect from a traditional CMS. One of its core focuses is the developer experience. Thanks to the Sanity CLI, it takes only two commands to bootstrap a project and deploy it to a free subdomain. Data are stored safely and in multiple copies on Google Cloud servers. Thanks to the document history, we can restore our documents to a previous state. However, deleted documents and datasets cannot be recovered.
Even if those scenarios are unlikely to happen, it is worth creating a simple backup routine, just in case. The method I'm going to show you here is easy to set up, won't cost you any money, and doesn't require you to register with a third-party service (I assume that all my readers have a GitHub account 🙃).
Ways to back up Sanity datasets
There are three ways to back up datasets:
- cURL request to an export URL endpoint
- Using the Sanity CLI
- Using the @sanity/export npm package
In another blog post, I explained how to use the @sanity/export npm package inside a serverless function to back up content to Google Drive or Dropbox:
Use Netlify cloud function to back up data to Google Drive
Jérôme Pott ・ Sep 13 '19
There is, however, an easier way: GitHub Actions (GA). Here are their advantages:
- Backup files are stored alongside your studio code.
- They only require a few lines of YAML config.
- They support CRON jobs.
- They are cheap (execution time + storage).
- We can make use of the GitHub ecosystem (notifications for failed workflows, access management, etc.)
Going all in on GitHub Actions
There is a GitHub Action that wraps the Sanity CLI. Basically, it means that we can run sanity dataset export inside our GA workflow.
Before we can export the dataset, we need to generate a read token from the Sanity project dashboard and store it as a secret in the GitHub repository.
This is what the first workflow step looks like:
- name: Export dataset
  uses: sanity-io/github-action-sanity@v0.2-alpha
  env:
    SANITY_AUTH_TOKEN: ${{ secrets.SANITY_AUTH_TOKEN }}
  with:
    args: dataset export production backups/backup.tar.gz
Then we need to upload the generated backup file so that it will be available for download as a workflow artifact. For this, we use the upload-artifact action and we specify the same path as above: backups/backup.tar.gz.
By default, this step passes even if GitHub cannot find our generated backup file. That is why I recommend setting the if-no-files-found option to error.
And here are the details of the step:
- name: Upload backup.tar.gz
  uses: actions/upload-artifact@v2
  with:
    name: backup-tarball
    path: backups/backup.tar.gz
    # Fails the workflow if no files are found; defaults to 'warn'
    if-no-files-found: error
In addition to running the backup routine on a schedule, you can also add an option to trigger the backup process manually from the GA dashboard. This can be useful in various situations, e.g. right after content editors have added a large amount of data, or right before manipulating datasets.
Here's an example of a workflow triggered manually or by a CRON job:
on:
  schedule:
    # Runs at 04:00 UTC on the 1st and 17th of every month
    - cron: '0 4 */16 * *'
  workflow_dispatch:
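For reference, here is a sketch of how the triggers and the two steps could fit together in a single workflow file. The workflow name, the job ID `backup-dataset`, the `ubuntu-latest` runner, and the checkout step are my assumptions for illustration and are not prescribed by the snippets in this post; adjust them to your repository.

```yaml
# .github/workflows/backup.yml (file name is an assumption)
name: Backup Routine

on:
  schedule:
    # Runs at 04:00 UTC on the 1st and 17th of every month
    - cron: '0 4 */16 * *'
  # Adds a manual trigger in the Actions tab
  workflow_dispatch:

jobs:
  backup-dataset: # hypothetical job ID
    runs-on: ubuntu-latest
    steps:
      # Assumed step: check out the studio code so the Sanity CLI
      # runs inside the project directory
      - uses: actions/checkout@v2

      - name: Export dataset
        uses: sanity-io/github-action-sanity@v0.2-alpha
        env:
          SANITY_AUTH_TOKEN: ${{ secrets.SANITY_AUTH_TOKEN }}
        with:
          args: dataset export production backups/backup.tar.gz

      - name: Upload backup.tar.gz
        uses: actions/upload-artifact@v2
        with:
          name: backup-tarball
          path: backups/backup.tar.gz
          # Fails the workflow if no files are found; defaults to 'warn'
          if-no-files-found: error
```

The steps run in order within one job, so the upload step sees the file written by the export step.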
Conclusion
We now have a solid backup routine in place. You can of course tune the frequency of the backups to your needs. Make sure to also read the latest information about pricing, size limits and file retention from GitHub. For example, as of writing this, backup files are automatically deleted after 90 days in public repositories. I personally think that 90 days is long enough, maybe even too long. If you want to keep backup files for a shorter time, you can do so in the repository settings under Actions.
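The retention period can also be set per artifact instead of repository-wide. The sketch below assumes a release of actions/upload-artifact that supports the `retention-days` input; the value of 30 is only an illustrative choice, and it cannot exceed the limit configured for the repository.

```yaml
- name: Upload backup.tar.gz
  uses: actions/upload-artifact@v2
  with:
    name: backup-tarball
    path: backups/backup.tar.gz
    if-no-files-found: error
    # Assumed value: delete this artifact after 30 days
    retention-days: 30
```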
Finally, if you would like to see the workflow described in this post along with the generated artifacts, you can visit this page: https://github.com/mornir/movies-studio/actions/workflows/main.yml