When it comes to managing our infrastructure, as developers we often face a double standard. Our code is version-controlled, subject to code reviews, checked periodically by test harnesses. We don't risk it.
However, the infrastructure running our code, often the most critical part of our production systems, is treated differently. Log onto your AWS account, a few clicks, provision a machine. With the advent of Docker, at least environments have become more reproducible, but still too many of us spend our days clicking around with a cloud superuser account to terraform the world.
Declarative DSLs
Speaking of which, there is... well, Terraform! And other solutions too. The notion of Infrastructure As Code is just starting to become the new default in the DevOps mindset. Declaratively defining your infrastructure, running some basic health checks and having everything version-controlled is a huge step forward.
Declarative DSLs are an improvement, but we can do better. In our experience at Codegram, often you can spend 80% of your time figuring out the right incantation of your cloud provider's module, with very little help from tooling. That means the debugging cycle is painfully slow, and often very frustrating. Even more so in a serverless context, where most of the code is infrastructure glue --queues, storage buckets, and lambdas all need to talk to each other in just the right ways, with the right ACL permissions.
Pulumi was built to leverage the power of general-purpose programming languages to solve these very shortcomings.
Enter Pulumi
Disclaimer: Codegram is not affiliated with pulumi.com in any way. None of the benefits detailed in this post require a subscription to pulumi.com --you can just use the open-source library.
Pulumi is an open-source library that lets you use a general-purpose programming language to build your infrastructure on your cloud provider of choice. At Codegram we usually choose TypeScript as our language for this task.
As an example, here's all the code it takes to create an S3 bucket and spin up an AWS Lambda that triggers on every document uploaded to it:
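import * as aws from "@pulumi/aws";

// Create the bucket that will hold the uploaded documents.
const docsBucket = new aws.s3.Bucket("docs");

// Run a Lambda every time an object is created in the bucket.
docsBucket.onObjectCreated("docsHandler", (e) => {
  for (const rec of e.Records || []) {
    console.log(`Hello from Lambda -- got an S3 Object: ${rec.s3.object.key}`);
  }
});

// Expose the generated bucket name as a stack output.
export const docsBucketName = docsBucket.bucketName;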
All this glue code is type-checked by the TypeScript compiler --from the records inside the S3 event, to the properties of the created S3 Bucket should we want to inspect them. And it goes way beyond serverless --whether you're deploying to Kubernetes, or plain Docker containers, you're good to go.
In my opinion, the advantages of this approach go well beyond mere convenience. Read on.
Extensibility via Terraform providers
Even though Pulumi doesn't depend on Terraform, the Pulumi CLI often leverages Terraform providers, and there is nice tooling to generate a Pulumi library from an existing Terraform provider. In our case, this was tremendously useful as we needed to manage DNS records with DNSimple through Pulumi. Even though it was my first time using the code generation tool, it took only an hour or two to get the DNSimple integration working.
Repeatable environments
When you keep all your infrastructure alongside your code, spinning up a staging environment, or a test environment in a different cloud region, is trivial. We find that Pulumi helps us bridge the gap between starting to test out a piece of code and having a deployed, repeatable environment provisioned from our Continuous Deployment pipeline.
For an agency like us, this becomes even more critical --often, projects start out in our cloud accounts, only to then be moved over to the client's --without the certainty that environments are completely repeatable and nothing is left to point and click interfaces, it would be a daunting task to ensure a smooth handover (and believe me, it used to be).
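One way to picture this, as a minimal sketch in plain TypeScript with no Pulumi imports: describe an environment once, as a function of its name and region, so that staging, test, and production can only differ in the inputs they are given. The names `EnvironmentSpec` and `describeEnvironment`, the `shop` project, and the sizing rule are all hypothetical, chosen only for illustration.

```typescript
// Hypothetical sketch: every environment is derived from the same function,
// so two environments can only differ in the inputs passed in.
type EnvironmentName = "test" | "staging" | "production";

interface EnvironmentSpec {
  name: EnvironmentName;
  region: string;
  bucketName: string;
  queueName: string;
  instanceCount: number;
}

function describeEnvironment(
  project: string,
  name: EnvironmentName,
  region: string
): EnvironmentSpec {
  const prefix = `${project}-${name}-${region}`;
  return {
    name,
    region,
    bucketName: `${prefix}-docs`,
    queueName: `${prefix}-events`,
    // Assumed sizing rule: only production runs more than one instance.
    instanceCount: name === "production" ? 3 : 1,
  };
}

const staging = describeEnvironment("shop", "staging", "eu-west-1");
const production = describeEnvironment("shop", "production", "eu-west-1");
console.log(staging.bucketName);
console.log(production.instanceCount);
```

In a real project the same idea would probably be driven by one stack per environment; moving a project to a client's account then means changing inputs, not re-creating resources by hand.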
Avoiding cloud vendor lock-in
The pulumi.cloud library currently implements many of the basic building blocks of a cloud service in a cloud-agnostic way.
This lets you abstract a lot of the cloud-specific tools to facilitate deploying a multi-cloud project. However, wherever the current APIs can't reach, it's just code -- you can leverage abstraction to avoid locking your infrastructure to a particular cloud vendor.
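As a rough illustration of the "it's just code" point, here is a hand-rolled abstraction layer in plain TypeScript. Nothing here comes from Pulumi: `StorageRequest`, `VendorResource`, `planStorage`, and the shape of the `settings` objects are hypothetical and do not mirror any real provider schema.

```typescript
// Hypothetical sketch of a cloud-agnostic layer; none of these names come
// from Pulumi itself.
type Vendor = "aws" | "gcp";

// What the application needs, expressed without vendor vocabulary.
interface StorageRequest {
  name: string;
  keepOldVersions: boolean;
}

// What we would hand to a vendor-specific resource constructor.
interface VendorResource {
  vendor: Vendor;
  kind: string;
  settings: Record<string, unknown>;
}

type Translator = (req: StorageRequest) => VendorResource;

// One translator per vendor; the settings shapes are illustrative only.
const translators: Record<Vendor, Translator> = {
  aws: (req) => ({
    vendor: "aws",
    kind: "s3-bucket",
    settings: { bucket: req.name, versioning: { enabled: req.keepOldVersions } },
  }),
  gcp: (req) => ({
    vendor: "gcp",
    kind: "gcs-bucket",
    settings: { name: req.name, versioning: { enabled: req.keepOldVersions } },
  }),
};

function planStorage(vendor: Vendor, req: StorageRequest): VendorResource {
  return translators[vendor](req);
}

const request: StorageRequest = { name: "docs", keepOldVersions: true };
console.log(planStorage("aws", request).kind);
console.log(planStorage("gcp", request).kind);
```

Because `translators` is typed as `Record<Vendor, Translator>`, adding a new vendor to the `Vendor` union without writing its translator is a compile error, which keeps the abstraction honest as it grows.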
Serverless and magic functions
Pulumi has the concept of a magic function. It's a closure that will be packaged into a Lambda or cloud function, and it can reference elements of the infrastructure. These elements are resolved at provisioning time, so they are not dynamic, even though they are as flexible as if they were.
import * as aws from '@pulumi/aws'

const inputBucket = new aws.s3.Bucket('input')
const outputBucket = new aws.s3.Bucket('output')

// processFile and uploadFile are application helpers, assumed to be defined elsewhere.
inputBucket.onObjectCreated('process', e => {
  for (const rec of e.Records || []) {
    const processed = processFile(rec.s3.object.key)
    // bucketName.get() is resolved when the closure is packaged.
    uploadFile(outputBucket.bucketName.get(), processed)
  }
})
In this example, every time a file is uploaded to inputBucket, we call a Lambda that processes the file and uploads it to another bucket. The closure we use to define the Lambda closes over outputBucket.bucketName and calls get() on it. This code, at provisioning time, will be compiled to a Lambda that will contain the literal name of the output bucket, statically.
We get all the benefit of seemingly dynamic references between elements of our infrastructure, without the hassle of configuration management --misconfigured environment variables and so on. These references are checked at compile time, before provisioning time, and obviously way before runtime.
Magic functions help you build the kind of glue code that is so tedious to write in serverless, seamlessly and with the power of a full programming language and its tooling at your disposal.
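The observable effect can be mimicked with an ordinary closure. The sketch below is a plain-TypeScript analogy, not a description of how Pulumi packages functions: `buildHandler`, the record shape, and the bucket name `output-1a2b3c` are made up for illustration.

```typescript
// Plain-TypeScript analogy only: this is NOT how Pulumi serializes closures.
// It shows the observable effect: the handler carries a fixed value that was
// resolved before the handler ever ran.
type S3Record = { key: string };
type Handler = (records: S3Record[]) => string[];

function buildHandler(resolvedOutputBucket: string): Handler {
  // The bucket name is captured here, once, before any event arrives.
  return (records) =>
    records.map((rec) => `copy ${rec.key} -> ${resolvedOutputBucket}/${rec.key}`);
}

// Hypothetical bucket name, standing in for the value known at provisioning time.
const handler = buildHandler("output-1a2b3c");
console.log(handler([{ key: "report.pdf" }]));
```

The handler never looks the bucket name up at runtime, so there is no environment variable to misconfigure; the value it needs travels with the function.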
Encoding best practices
When it comes to managing infrastructure as a team, standardizing best practices becomes essential. Which settings do we use for storage buckets? Do we always use queues to decouple production and consumption? What durability settings are good defaults?
By abstracting common infrastructure patterns in our own company library, these questions are brought into the team, and answered in the form of Pull Requests, code reviews, and technical discussions around real, tangible code. This way we move away from the one person who knows how things are done around here, to version-controlled best practices owned by the whole team.
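A company library of this kind can start very small. The following is a sketch in plain TypeScript under stated assumptions: `BucketSettings`, `TEAM_BUCKET_DEFAULTS`, `teamBucket`, and the default values are hypothetical and stand in for whatever your team agrees on in review.

```typescript
// Hypothetical team-wide settings for storage buckets.
interface BucketSettings {
  versioned: boolean;
  encrypted: boolean;
  publicRead: boolean;
  retentionDays: number;
}

// The defaults the team agreed on; changing them goes through a Pull Request.
const TEAM_BUCKET_DEFAULTS: Readonly<BucketSettings> = {
  versioned: true,
  encrypted: true,
  publicRead: false,
  retentionDays: 90,
};

// Callers may override anything except encryption, which the type forbids.
type BucketOverrides = Partial<Omit<BucketSettings, "encrypted">>;

function teamBucket(
  name: string,
  overrides: BucketOverrides = {}
): { name: string; settings: BucketSettings } {
  return { name, settings: { ...TEAM_BUCKET_DEFAULTS, ...overrides } };
}

const invoices = teamBucket("invoices");
const scratch = teamBucket("scratch", { retentionDays: 7 });
console.log(invoices.settings.retentionDays, scratch.settings.retentionDays);
```

The answer to "which settings do we use for storage buckets?" now lives in one reviewed constant, and a call such as `teamBucket("x", { encrypted: false })` is rejected by the compiler.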
Granular security by default
In traditional infrastructure management, people usually default to creating a superuser first, and worry about tightening permissions later, when all the parts are defined and stable. Does it sound familiar?
To avoid that, having infrastructure patterns in code means you can abstract granular permissions around resources. Whenever you create a queue, the code that does that can make sure that only a specific storage bucket can publish events to it, or that only a specific machine or lambda can read from it.
Once it's in the library, its users (team members) never have to worry about loose permissions again. Write once, worry never again.
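To make the queue example concrete, here is a sketch of a helper that builds an AWS-style queue policy letting exactly one bucket send messages. It is plain TypeScript that only produces a JSON document; the function name `queuePolicyForBucket` and the example ARNs are hypothetical, and how the policy gets attached to the queue is left out.

```typescript
// Shape of the policy document this sketch produces.
interface PolicyStatement {
  Effect: "Allow" | "Deny";
  Principal: { Service: string };
  Action: string;
  Resource: string;
  Condition: { ArnEquals: { "aws:SourceArn": string } };
}

interface PolicyDocument {
  Version: "2012-10-17";
  Statement: PolicyStatement[];
}

// Allow S3 to send messages to the queue, but only on behalf of one bucket.
function queuePolicyForBucket(queueArn: string, bucketArn: string): PolicyDocument {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: { Service: "s3.amazonaws.com" },
        Action: "sqs:SendMessage",
        Resource: queueArn,
        Condition: { ArnEquals: { "aws:SourceArn": bucketArn } },
      },
    ],
  };
}

// Example ARNs are placeholders, not real resources.
const policy = queuePolicyForBucket(
  "arn:aws:sqs:eu-west-1:123456789012:events",
  "arn:aws:s3:::docs"
);
console.log(JSON.stringify(policy, null, 2));
```

Wrapping queue creation in a library function that always calls a helper like this means nobody on the team has to remember to tighten the policy afterwards.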
Bonus points: type safety
At Codegram, TypeScript is our language of choice to manage infrastructure. The myriad of options that each cloud provider exposes in their wide range of offerings, and the fact that these change often, makes compiler help invaluable.
The reason is, most of the parameter validation in cloud providers is done at actual provisioning time of the particular resource, which means, for N resources, it can easily degenerate into a really long feedback loop.
By having most of the parameter validation at compile time, you don't even need to run the code to make sure it matches. Plus, you get a lot of help from your tooling in terms of auto-completion.
On top of that, TypeScript's type system is powerful enough to allow parametric polymorphism, which helps greatly in abstracting away common patterns, and helps a lot in the way of multi-cloud abstractions.
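A small sketch of what parametric polymorphism buys you here, in plain TypeScript with hypothetical names (`perEnvironment`, `QueueArgs`, and the timeout values are not from any library): one generic helper builds any kind of per-environment value, and the compiler tracks the result type for you.

```typescript
type Environment = "staging" | "production";

// Build one value of any type T per environment.
function perEnvironment<T>(make: (env: Environment) => T): Record<Environment, T> {
  return { staging: make("staging"), production: make("production") };
}

// Hypothetical argument shape for a queue.
interface QueueArgs {
  name: string;
  visibilityTimeoutSeconds: number;
}

// T is QueueArgs here: a missing or misspelled field fails at compile time.
const queues = perEnvironment<QueueArgs>((env) => ({
  name: `events-${env}`,
  visibilityTimeoutSeconds: env === "production" ? 120 : 30,
}));

// The same helper works for a completely different T, inferred from the callback.
const replicaCounts = perEnvironment((env) => (env === "production" ? 3 : 1));

console.log(queues.production.name, replicaCounts.staging);
// Accessing queues.development would not compile: it is not an Environment.
```

The helper is written once and reused for queues, buckets, or plain numbers, which is the kind of reuse that makes multi-cloud and multi-environment abstractions manageable.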
Conclusion
Managing infrastructure used to be a tedious task that few people enjoyed doing. Since we started using Pulumi, it's fused with the process of developing a project from start to end, and it's given us extra confidence in the portability and repeatability of our production environments.
And now, go forth and terraform the world! Pun intended.
Cover photo by Ricardo Gómez Ángel on Unsplash