Introduction
In this blog post, we are going to write a cloud function that will create a backup every time new content is published inside a CMS. The data will be stored in a GZIP file and then uploaded to Google Drive.
The backend (CMS) is managed and hosted by Sanity. Their CMS, the Sanity Studio, is an open-source React application that you should check out. They also offer a service to quickly bootstrap a new Sanity project with your favorite front-end framework.
Since Sanity is a hosted service, everything is managed for you and your data is safe. The folks at Sanity have their own backup routine in place, but you don't have access to the backup files. They are used in case of data loss on Sanity's end. If it happens on your end (e.g. accidentally deleting the database), you'd better have your own backups at hand. (Note that within the studio you can always restore a document to a previous version and undo delete actions. The risk of data loss is hence quite low.)
As for Netlify, you probably know it already. It is an amazing platform with plenty of useful services, such as cloud functions, which allow you to easily execute server-side code. They recently launched Netlify Dev, which lets you easily test your cloud functions locally. Perfect for our use case, so let's get started!
Setting up the Google Drive API
Authenticating to Google was harder than I expected, which is why I dedicated a separate post to it:
Create a service account to authenticate with Google (Jérôme Pott, Aug 25 '19)
You should now have a JSON file with your Drive API credentials and the ID of the shared folder.
Note about installing npm packages:
Install all the dependencies of your cloud function in your main package.json. In Netlify's official examples, each cloud function has its own package.json, but I noticed that Netlify sometimes fails to install the dependencies specified there.
Setting up Netlify
I am assuming that your front-end is hosted at Netlify. First add the shared folder ID and the content of the JSON file in two Netlify env variables (e.g. CREDENTIALS and FOLDER_ID) using the Netlify dashboard. Since your Drive API credentials are now stored as a string, we'll read them with JSON.parse(process.env.CREDENTIALS).
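To make that round trip concrete, here is a tiny sketch. The credential values below are dummies for illustration, not a real key:

```javascript
// Sketch: the service-account key file was pasted verbatim into the
// CREDENTIALS env variable, so JSON.parse turns it back into the object
// the Google auth client expects. The values below are made up.
process.env.CREDENTIALS = JSON.stringify({
  client_email: 'backup@my-project.iam.gserviceaccount.com',
  private_key: '-----BEGIN PRIVATE KEY-----\nMIIE...dummy...\n-----END PRIVATE KEY-----\n',
})

const credentials = JSON.parse(process.env.CREDENTIALS)
console.log(credentials.client_email)
// → backup@my-project.iam.gserviceaccount.com
```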
Then add this line to your netlify.toml under build:
[build]
functions = "functions"
This line tells Netlify which folder you keep your cloud functions in. Create this folder and add a JS file to it. This file will be our serverless function. Usually the name of the file doesn't matter, but in our case it is important to name it deploy-succeeded.js. A cloud function with this exact name is triggered automatically whenever a deployment succeeds. You can find other triggers here.
Now install netlify-cli globally (if you haven't already) and run netlify dev in your project. It should automatically detect the framework used (Nuxt, Next, etc.). If it doesn't, make sure you haven't changed the framework's default port (e.g. 3000 is the default port for Nuxt).

Now if you visit localhost:8888/.netlify/functions/deploy-succeeded, you can trigger the function manually. Best of all, you have access to your environment variables under process.env!
Exporting the data
The npm packages @sanity/client and @sanity/export make the export process extremely easy. Add them to your main package.json file.
const path = require('path')
const sanity = require('@sanity/client')
const exportDataset = require('@sanity/export')

const DATASET = process.env.DATASET
const sanityClient = sanity({
projectId: process.env.PROJECT_ID,
dataset: DATASET,
token: process.env.SANITY_TOKEN,
useCdn: false,
})
// exportDataset returns a promise, so we await it (inside our async backup function)
await exportDataset({
// Instance of @sanity/client configured to your project ID and dataset
client: sanityClient,
// Name of dataset to export
dataset: DATASET,
// Path to write zip-file to
outputPath: path.join('/tmp', `${DATASET}.tar.gz`),
// Whether or not to export assets
assets: false,
// Exports documents only
raw: true,
// Whether or not to export drafts
drafts: false,
})
Notes:
- All the environment variables are saved in the Netlify dashboard.
- We don't back up the assets (images, videos, etc.) or the drafts. If you want to back up assets, you need a different upload method than the one described below. Also keep in mind that Google Drive's free tier is limited to 15 GB.
- The /tmp path is a special location that lets you store files temporarily.
Uploading data dump to Google Drive
Now we can bring in the Google Drive API:
const fs = require('fs')
const path = require('path')
const { google } = require('googleapis')

const FOLDER_ID = process.env.FOLDER_ID
const client = await google.auth.getClient({
credentials: JSON.parse(process.env.CREDENTIALS),
scopes: 'https://www.googleapis.com/auth/drive.file',
})
const drive = google.drive({
version: 'v3',
auth: client,
})
await drive.files.create({
requestBody: {
name: `${DATASET}.tar.gz`,
mimeType: 'application/gzip',
parents: [FOLDER_ID],
},
media: {
mimeType: 'application/gzip',
body: fs.createReadStream(path.join('/tmp', `${DATASET}.tar.gz`)),
},
})
// Delete oldest if more than 5 files
// Get list of backup files inside folder with specified id
const res = await drive.files.list({
fields: 'files(id, parents, createdTime)',
q: `'${FOLDER_ID}' in parents`,
orderBy: 'createdTime',
})
// Keep max. 5 backups
if (res.data.files.length >= 5) {
// Delete oldest backup
  // Awaiting the call surfaces any API error instead of failing silently
  await drive.files.delete({ fileId: res.data.files[0].id })
}
I think the code is rather self-explanatory. I like how the async/await syntax makes the code more readable.

We create an upload request by reading from the /tmp location, then make sure we don't keep more than 5 backup files: we list all the files in the shared folder and, if there are already 5 or more, delete the oldest one. Since the list is ordered by createdTime, the oldest backup is the first entry.
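The pruning rule itself can be sketched against a mocked file list (remember that orderBy: 'createdTime' returns the oldest file first; IDs and dates below are made up):

```javascript
// Mocked shape of res.data.files, oldest first,
// as drive.files.list returns it with orderBy: 'createdTime'.
const files = [
  { id: 'aaa', createdTime: '2019-08-01T10:00:00Z' },
  { id: 'bbb', createdTime: '2019-08-08T10:00:00Z' },
  { id: 'ccc', createdTime: '2019-08-15T10:00:00Z' },
  { id: 'ddd', createdTime: '2019-08-22T10:00:00Z' },
  { id: 'eee', createdTime: '2019-08-29T10:00:00Z' },
]

const MAX_BACKUPS = 5
// The list already includes the backup we just uploaded, so having
// MAX_BACKUPS or more entries means the oldest one has to go.
const idToDelete = files.length >= MAX_BACKUPS ? files[0].id : null
console.log(idToDelete)
// → aaa
```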
Netlify Handler Method
Each JavaScript file deployed as a cloud function must export a handler. In this handler, you should invoke the callback method, passing either null together with a response object if no error occurred, or the caught error. In the following snippet, we assume that we have a function named backup that contains our backup logic.
exports.handler = function(event, context, callback) {
backup()
.then(() => {
callback(null, {
statusCode: 200,
body: 'Backup completed successfully!',
})
})
.catch(e => {
callback(e)
})
}
The message "Backup completed successfully!" or the error will be printed in the console under the Functions tab of the Netlify dashboard.
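You can exercise this handler shape locally without deploying anything. Here backup is stubbed with a resolved promise, so the callback receives the success response:

```javascript
// Stand-in for the real backup logic, which would export the
// dataset and upload it to Drive. Here it simply resolves.
const backup = () => Promise.resolve()

const handler = function (event, context, callback) {
  backup()
    .then(() => {
      callback(null, {
        statusCode: 200,
        body: 'Backup completed successfully!',
      })
    })
    .catch(e => callback(e))
}

// Invoke it the way Netlify would, with dummy event/context objects.
let result
handler({}, {}, (err, res) => {
  result = { err, res }
  console.log(res.body)
  // → Backup completed successfully!
})
```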
Conclusion and caveat
I am using this backup function in production for very small websites managed by one or two people. It's enough for my needs, but it clearly has some limitations:
- Assets and drafts are not saved.
- If the website is deployed five times in a short period of time, the oldest backup will be quite recent.
I think that this first draft is a good starting point to build upon. We could also trigger this cloud function via a cron job. There are many possibilities.
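For the cron-job idea, Netlify also offers scheduled functions; the netlify.toml configuration looks roughly like this (a feature I haven't used in this setup, and the function name "backup" is a placeholder, so check the current docs):

```toml
# Assumption: using Netlify's scheduled functions feature.
# Runs a function named "backup" once a day.
[functions."backup"]
  schedule = "@daily"
```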
Here's the complete function used in a personal website of mine: https://github.com/mornir/copywork-portfolio/blob/master/functions/deploy-succeeded.js